File Import & Upload Design
Document Type: Domain Design (Tier 2)
Domain: FILEIMPORT, UPLOAD
Domain Character: Integration
SRS Reference: fileimport.md, upload-runs.md
Status: Draft
Last Updated: 2026-01-25
1. Overview
1.1 Purpose
The File Import subsystem handles ingestion, parsing, validation, and processing of PCR run files from thermocycler instruments. It integrates with external systems (S3 storage, Parser API, DXAI Analyser) to transform raw instrument data into analyzable run records.
This is an integration-heavy domain with responsibilities spanning:
- File upload acceptance and validation
- S3 folder management and file routing
- Thermocycler data parsing via external Parser API
- Well data extraction and validation
- DXAI Analyser integration for classification
- Error handling and duplicate detection
1.2 Requirements Covered
| REQ ID | Title | Priority |
|---|---|---|
| REQ-FILEIMPORT-001 | Import Run Files from Monitored Folder | Must |
| REQ-FILEIMPORT-002 | Parse Thermocycler Data to Database Variables | Must |
| REQ-FILEIMPORT-003 | Analyze Run Data Using DXAI Analyser | Must |
| REQ-FILEIMPORT-004 | Validate Well Data During Import | Must |
| REQ-FILEIMPORT-005 | Parse Run File Names for Well Properties | Must |
| REQ-FILEIMPORT-006 | Import Observation Properties from Run Files | Must |
| REQ-FILEIMPORT-007 | Identify Crossover Wells During Import | Must |
| REQ-FILEIMPORT-008 | Identify Targets Using Wildcard Matching | Must |
| REQ-FILEIMPORT-009 | Support Prepending Fake Cycles for Analysis | Must |
| REQ-FILEIMPORT-010 | Manage Import Folder Structure | Must |
| REQ-FILEIMPORT-011 | Prevent Duplicate File Imports | Must |
| REQ-FILEIMPORT-012 | Persist Invalid Well Data for Visibility | Must |
| REQ-FILEIMPORT-013 | Maintain File Traceability | Must |
| REQ-UPLOAD-001 | Accept Run File Uploads | Must |
| REQ-UPLOAD-002 | Validate Uploaded File Types | Must |
| REQ-UPLOAD-003 | Display Upload Progress | Must |
| REQ-UPLOAD-004 | Navigate to Uploaded File | Must |
| REQ-UPLOAD-005 | Display Upload Errors | Must |
| REQ-UPLOAD-006 | Cancel Upload in Progress | Must |
| REQ-UPLOAD-007 | Handle Bulk Uploads | Must |
1.3 Constraints
Tier 2 Constraint: This document describes ownership, patterns, and design rationale. It links to reference docs for full schemas and API specifications.
1.4 Dependencies
| Direction | Domain/Component | Purpose |
|---|---|---|
| Consumes | S3 Storage | File storage and folder management |
| Consumes | Parser API | Thermocycler file parsing |
| Consumes | DXAI Analyser | Classification and CT determination |
| Consumes | Client Configuration | Import settings (approach, formats) |
| Consumes | Mix/Target Config | Target matching and calibration |
| Consumes | Thermocycler Config | Instrument identification |
| Provides to | RUNRPT | Created run records |
| Provides to | Analyzer | Well data for analysis rules |
2. Component Architecture
2.1 Component Diagram
2.2 Component Responsibilities
| Component | Layer | Responsibility | REQ Trace |
|---|---|---|---|
| RunFilesController | Controller | Accept uploads, validate extensions | REQ-UPLOAD-001, REQ-UPLOAD-002 |
| StoreRunFilesAction | Action | Store file to S3, create progress record | REQ-FILEIMPORT-001, REQ-FILEIMPORT-010, REQ-FILEIMPORT-013 |
| DispatchNextFileToParsingAction | Action | Queue next file for processing | REQ-FILEIMPORT-001 |
| RunDataParseJob | Job | Orchestrate parse-to-import pipeline | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002 |
| RunDataParseAction | Action | Parse file, detect duplicates, handle errors | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002, REQ-FILEIMPORT-010, REQ-FILEIMPORT-011 |
| ParsedJson | Service | Transform parsed JSON to importable format | REQ-FILEIMPORT-002, REQ-FILEIMPORT-003, REQ-FILEIMPORT-005, REQ-FILEIMPORT-006, REQ-FILEIMPORT-007, REQ-FILEIMPORT-008, REQ-FILEIMPORT-009 |
| WildCardTargetNameMatcher | Service | Match targets with wildcard patterns | REQ-FILEIMPORT-008 |
| GetAutomaticBaselineFromMixChannel | Service | Extract auto_baseline property | REQ-FILEIMPORT-006 |
| WellLabel | Normalizer | Parse well label tags | REQ-FILEIMPORT-007 |
| RunFileName | Normalizer | Parse run file name for properties | REQ-FILEIMPORT-005 |
| CreateRunAction | Action | Create run, wells, observations | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002, REQ-FILEIMPORT-004, REQ-FILEIMPORT-012 |
| RunFileImportProgress | Model | Track upload/import progress | REQ-UPLOAD-003, REQ-UPLOAD-005 |
| RunFileImportProgressBroadcast | Event | Real-time progress updates | REQ-UPLOAD-003 |
| RunFileImported | Event | Signal successful import | REQ-FILEIMPORT-013 |
3. Data Design
3.1 Entities
This domain creates and manages the following entities:
| Entity | Owner | Usage |
|---|---|---|
| run_file_import_progresses | FILEIMPORT | Track upload/import status |
| runs | RUNRPT | Created by import process |
| wells | RUNRPT | Created with validation errors if applicable |
| observations | RUNRPT | Created with analysis results |
| run_targets | RUNRPT | Created for each mix/target combination |
| run_mixes | RUNRPT | Created for each mix in the run |
See Database Reference for full schema.
3.2 Data Structures
Run File Import Progress
```typescript
interface RunFileImportProgress {
  id: string;              // Unique identifier (from filename)
  original_name: string;   // Original filename for traceability
  file_path: string;       // Current S3 path
  status: ImportStatus;    // CONVERTING | IMPORTED | DUPLICATE | PARSE_ERROR | IMPORT_ERROR
  status_message?: string; // Error details if failed
  user_id: string;         // Uploading user
  site_id: string;         // Site context
  run_id?: string;         // Created run (if successful)
  uploaded_at: DateTime;   // Upload timestamp
}

type ImportStatus = 'CONVERTING' | 'IMPORTED' | 'DUPLICATE' | 'PARSE_ERROR' | 'IMPORT_ERROR';
```
Parsed Run Data (Internal)
```typescript
interface ParsedRunData {
  run_info: {
    run_name: string;
    thermocycler_id?: string;
    runfile_created_at: DateTime;
    file_hash: string;
    custom?: Record<string, any>;
  };
  wells: ParsedWell[];
  targets: ParsedTarget[];
  observations: ParsedObservation[];
}

interface ParsedWell {
  well_number: string;
  label: string;
  well_uuid: string;
  error_code?: string;
  concentration_factor?: number;
}

interface ParsedObservation {
  readings: number[];
  ct?: number;
  cls?: Classification;
  dxai_ct?: number;
  dxai_cls?: Classification;
  target?: string;
  dye?: string;
  baseline_start?: number;
  baseline_end?: number;
  quantity?: number;
}
```
Well Label Structure
```typescript
// Tag mapping for Well Label import approach
const WELL_LABEL_TAGS = {
  'A': 'accession',
  'C': 'testcode_name',
  'D': 'extraction_date',
  'E': 'extraction_settings', // Format: instrument-batch
  'T': 'mix',
  'R': 'role_alias',
  'N': 'note',
  'X': 'quantity_multiplier',
  'Y': 'crossover',           // Value: 'XOVER'
  'W': 'tissue_weight',
};
```
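The tag mapping above can be sketched as a small parser. The delimiter between tags is not specified in this document, so the space-separated `<tag>:<value>` format below is an assumption, as are the function name and the example label.

```typescript
// Sketch only: assumes tags appear as space-separated "<tag>:<value>" pairs.
const WELL_LABEL_TAGS: Record<string, string> = {
  A: 'accession', C: 'testcode_name', D: 'extraction_date',
  E: 'extraction_settings', T: 'mix', R: 'role_alias', N: 'note',
  X: 'quantity_multiplier', Y: 'crossover', W: 'tissue_weight',
};

function parseWellLabel(label: string): Record<string, string> {
  const props: Record<string, string> = {};
  for (const segment of label.split(' ')) {
    const field = WELL_LABEL_TAGS[segment[0]];
    if (!field || segment[1] !== ':') continue; // skip unrecognized segments
    const value = segment.slice(2);
    if (segment[0] === 'E') {
      // E tag destructures to extraction_instrument + batch_number ("instrument-batch")
      const [instrument, batch = ''] = value.split('-');
      props['extraction_instrument'] = instrument;
      props['batch_number'] = batch;
    } else {
      props[field] = value;
    }
  }
  return props;
}
```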
3.3 State Transitions
Import Progress State Machine
| State | Description | File Location |
|---|---|---|
| CONVERTING | File received, queued for processing | {site}/Runs/Parsing/ |
| IMPORTED | Successfully imported | {archive-bucket}/ |
| DUPLICATE | File hash matches existing run | {site}/Runs/Problem_Files/ |
| PARSE_ERROR | Parser API failed | {site}/Runs/Problem_Files/ |
| IMPORT_ERROR | Validation or database error | {site}/Runs/Problem_Files/ |
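The routing implied by this table can be sketched as a single lookup; the function name and the literal `{archive-bucket}` placeholder are illustrative, not actual code in the system.

```typescript
type ImportStatus = 'CONVERTING' | 'IMPORTED' | 'DUPLICATE' | 'PARSE_ERROR' | 'IMPORT_ERROR';

// Maps an import status to the S3 location where the file should reside.
function fileLocationFor(site: string, status: ImportStatus): string {
  switch (status) {
    case 'CONVERTING':
      return `${site}/Runs/Parsing/`;
    case 'IMPORTED':
      return '{archive-bucket}/'; // separate archive bucket
    default: // DUPLICATE, PARSE_ERROR, IMPORT_ERROR
      return `${site}/Runs/Problem_Files/`;
  }
}
```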
4. Interface Design
4.1 APIs Consumed
| Service | Endpoint/Method | Purpose | REQ Trace |
|---|---|---|---|
| Parser API | parse_run_file() | Parse thermocycler file format | REQ-FILEIMPORT-002 |
| Parser API | extend_json() | Get DXAI analysis results | REQ-FILEIMPORT-003 |
| S3 | Storage::cloud() | File storage operations | REQ-FILEIMPORT-010 |
Parser API Contract
Function: parse_run_file(file_path, api_url, api_key)
Input:
- file_path: Local path to thermocycler file
- api_url: Parser service URL
- api_key: Authentication key
Output:
- object: Parsed JSON structure containing runs, wells, channels, data
DXAI Extend API Contract
Function: extend_json(json_path, api_url, version, api_key)
Input:
- json_path: Path to prepared JSON with targets, wells, observations
- Targets include calibration_S3_uri for each target
Output:
- object: Extended data with dxai_cls and dxai_ct per observation
4.2 APIs Provided
| Endpoint | Method | Purpose | REQ Trace |
|---|---|---|---|
| /api/run-files | POST | Upload run file | REQ-UPLOAD-001 |
| /api/run-files/{id} | DELETE | Cancel upload | REQ-UPLOAD-006 |
| /api/run-file-import-progresses | GET | List progress records | REQ-UPLOAD-003 |
Upload Request
```
POST /api/run-files
Content-Type: application/json

{
  "file_path": "temp/upload-uuid.sds",
  "original_name": "MyRun.sds"
}
```
Validation:
- original_name: required, ends_with:.sds,.SDS,.ixo,.IXO,.pcrd,.PCRD,.eds,.EDS
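The extension rule amounts to a case-insensitive suffix check over the four accepted formats; a minimal sketch (function name assumed):

```typescript
// Accepted thermocycler file extensions, matched case-insensitively.
const ALLOWED_EXTENSIONS = ['.sds', '.ixo', '.pcrd', '.eds'];

function hasValidExtension(originalName: string): boolean {
  const lower = originalName.toLowerCase();
  return ALLOWED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```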
4.3 Events
| Event | Direction | Payload | Purpose |
|---|---|---|---|
| RunFileImportProgressBroadcast | Pusher → Frontend | { id, status, progress } | Real-time progress |
| RunFileImported | Internal | { user, run } | Import complete notification |
4.4 S3 Folder Structure
| Folder | Purpose | REQ Trace |
|---|---|---|
| {site}/Runs/toPcrai/ | Upload destination | REQ-FILEIMPORT-010 |
| {site}/Runs/Parsing/ | Processing queue | REQ-FILEIMPORT-010 |
| {site}/Runs/Problem_Files/ | Failed imports | REQ-FILEIMPORT-010 |
| {archive-bucket}/ | Successful imports (separate bucket) | REQ-FILEIMPORT-010 |
| {site}/LIMS_Reports/ | Export destination | REQ-FILEIMPORT-010 |
5. Behavioral Design
5.1 File Upload Flow
Algorithm: Upload Run File
Inputs:
- original_name: string - Original filename
- file_path: string - Temporary upload path
- user: User - Authenticated user
Outputs:
- void (async processing begins)
Assumptions:
- User has upload permission (Junior User, Senior User, Super-Admin)
- File extension already validated by controller
Steps:
1. Get site from user.loggedInSite
2. If site storage directory doesn't exist, create it
3. Generate unique filename from original_name
4. Copy file from temp to {site}/Runs/Parsing/{unique_name}
5. Create RunFileImportProgress record:
- status: 'CONVERTING'
- original_name: preserved for traceability
- file_path: new S3 path
- uploaded_at: now()
6. If no parsing job currently running:
- Dispatch RunDataParseJob for next queued file
Notes:
- Files are processed sequentially to avoid resource contention
- Original filename preserved for audit (REQ-FILEIMPORT-013)
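The step-6 guard (dispatch only when no parsing job is running, oldest file first) can be sketched as below. The record shape, field names, and FIFO-by-upload-time ordering are assumptions; only "sequential, one file at a time" is stated above.

```typescript
// Sketch of the sequential-dispatch guard: at most one file parses at a time,
// and the oldest queued (CONVERTING) record is dispatched next.
interface ProgressRecord {
  id: string;
  status: string;
  uploaded_at: number; // epoch millis, used for FIFO ordering
}

function nextFileToParse(
  records: ProgressRecord[],
  parsingJobRunning: boolean,
): ProgressRecord | null {
  if (parsingJobRunning) return null; // sequential processing: wait for the current job
  const queued = records
    .filter((r) => r.status === 'CONVERTING')
    .sort((a, b) => a.uploaded_at - b.uploaded_at);
  return queued[0] ?? null;
}
```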
5.2 Parse and Import Flow
Algorithm: Parse Run Data
Inputs:
- filePath: string - S3 path to file
- site: Site - Processing context
Outputs:
- void (creates Run on success, updates status on failure)
Assumptions:
- File exists at filePath
- Parser API is available
- DXAI Analyser is available
Steps:
1. Calculate file hash (MD5)
2. Check for duplicate:
a. Query runs.file_hash WHERE site_id = site.id
b. If exists, move to Problem_Files with status 'DUPLICATE'
c. Return early
3. Parse file via Parser API:
a. Download from S3 to temp
b. Call parse_run_file()
c. On failure, move to Problem_Files with status 'PARSE_ERROR'
4. Transform via ParsedJson:
a. Match mixes using well_properties_import_approach
b. Extract well properties per approach
c. Prepare observations for DXAI
d. Call extend_json() for classification
5. Import via CreateRunAction:
a. Create Run record
b. Normalize well data
c. Validate wells (generate error codes)
d. Analyze and summarize
e. Persist wells and observations
6. Update progress to 'IMPORTED' with run_id
7. Move file to archive bucket
8. Dispatch next file for parsing
Error Handling:
- DuplicateRunImportException: status = 'DUPLICATE'
- Parser exception: status = 'PARSE_ERROR'
- Import exception: status = 'IMPORT_ERROR' with message
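Steps 1–2 (hash-based duplicate detection) can be sketched with Node's crypto module. The lookup against `runs.file_hash` scoped to a site is stubbed here as an in-memory set; the real implementation queries the database.

```typescript
import { createHash } from 'node:crypto';

// MD5 of the file contents identifies the file for duplicate detection
// (REQ-FILEIMPORT-011).
function fileHash(contents: Buffer | string): string {
  return createHash('md5').update(contents).digest('hex');
}

// Stand-in for the runs.file_hash lookup scoped to the current site.
function isDuplicate(hash: string, existingSiteHashes: Set<string>): boolean {
  return existingSiteHashes.has(hash);
}
```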
5.3 Well Properties Import Approaches
The system supports three approaches for extracting well properties:
| Approach | Mix Source | Thermocycler Source | Crossover Support |
|---|---|---|---|
| Well Label | T: tag in label | JSON field or label | Yes (Y:XOVER tag) |
| Meta Data | File metadata | JSON field | No |
| Run File Name | Filename segment | Filename segment | No |
Algorithm: Determine Well Properties Import Approach
Steps:
1. Read client_configurations.well_properties_import_approach
2. If 'Label':
a. Parse label using WellLabel class
b. Extract tags: A, C, D, E, T, R, N, X, Y, W
c. E tag destructures to extraction_instrument + batch_number
3. If 'Meta Data':
a. Use mix.id from file metadata
b. Get thermocycler from JSON field
4. If 'Run File Name':
a. Parse: {mix}.{thermocycler}.{protocol}.{batch}.{analyst}.{ext}
b. Generate virtual labels for "Unknown" wells: {batch}-{well_number}
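Step 4 (the Run File Name approach) can be sketched as follows; the function names are assumptions, but the segment order and the virtual-label format follow the pattern stated above.

```typescript
// Parses {mix}.{thermocycler}.{protocol}.{batch}.{analyst}.{ext}
function parseRunFileName(fileName: string) {
  const [mix, thermocycler, protocol, batch, analyst] = fileName.split('.');
  return { mix, thermocycler, protocol, batch, analyst };
}

// Virtual label for "Unknown" wells: {batch}-{well_number}
function virtualLabel(batch: string, wellNumber: string): string {
  return `${batch}-${wellNumber}`;
}
```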
5.4 Wildcard Target Matching
Algorithm: Match Target with Wildcard
Inputs:
- configTarget: string - Configured target name (may contain *)
- runTarget: string - Target name from run file
Outputs:
- boolean - Whether targets match
Steps:
1. Convert configTarget to regex:
- Replace * with .*
- Wrap with /^...$/
2. Test runTarget against regex
3. Return match result
Examples:
- "Target*" matches "Target A", "Target B", "Target123"
- "HBV*" matches "HBV-FAM", "HBV-Cy5"
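The algorithm above as a sketch. Escaping regex metacharacters other than `*` is a defensive assumption added here, not a step stated in the algorithm.

```typescript
// Matches a run target against a configured name that may contain '*' wildcards.
function doesMatch(configTarget: string, runTarget: string): boolean {
  // Escape regex metacharacters, then turn each '*' into '.*'
  const escaped = configTarget.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
  return pattern.test(runTarget);
}
```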
5.5 Prepend Fake Cycles
Algorithm: Prepare Readings with Fake Cycles
Inputs:
- readings: number[] - Original observation readings
- target: Target - Target with prepend_fake_cycles_count
Outputs:
- number[] - Readings with prepended fake cycles
Assumptions:
- prepend_fake_cycles_count >= 0
Steps:
1. If target.prepend_fake_cycles_count == 0:
- Return original readings
2. Create fake readings array:
- Length: prepend_fake_cycles_count
- Value: repeat first reading value
3. Concatenate fake readings + original readings
4. Return combined array
Post-Processing:
- After DXAI returns dxai_ct, subtract prepend_fake_cycles_count
- This adjusts CT back to original cycle numbering
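The prepend/adjust pair above can be sketched as two small functions (names assumed):

```typescript
// Prepend fake cycles by repeating the first reading (REQ-FILEIMPORT-009).
function prependFakeCycles(readings: number[], count: number): number[] {
  if (count <= 0 || readings.length === 0) return readings;
  return [...Array(count).fill(readings[0]), ...readings];
}

// After DXAI returns dxai_ct, shift CT back to the original cycle numbering.
function adjustCt(dxaiCt: number, count: number): number {
  return dxaiCt - count;
}
```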
6. Error Handling
6.1 Validation Errors
| Error Code | Condition | Detection Point | User Impact |
|---|---|---|---|
| SAMPLE_LABEL_IS_BAD | Unrecognized label format | ParsedJson | Well shows error |
| ACCESSION_MISSING | Patient well lacks accession | PatientValidator | Well shows error |
| EXTRACTION_BATCH_MISSING | Missing E tag or batch | WellLabel, PatientValidator | Well shows error |
| EXTRACTION_INSTRUMENT_MISSING | Missing extraction instrument | PatientValidator | Well shows error |
| EXTRACTION_DATE_MISSING | Missing extraction date | PatientValidator | Well shows error |
| INVALID_EXTRACTION_DATE | Date after file creation | PatientValidator | Well shows error |
| TESTCODE_MISSING | Missing test code (Label approach) | PatientValidator | Well shows error |
| UNKNOWN_TESTCODE | Test code not configured | TestcodeExists rule | Well shows error |
| MIX_DIDNT_MATCH_TC | Mix doesn't match test code | MixMatchesTestCode rule | Well shows error |
| MIX_MISSING | Mix not found | ParsedJson | Well shows error |
| THERMOCYCLER_UNKNOWN | Thermocycler not configured | ParsedJson, CreateRunAction | All wells show error |
| INVALID_PASSIVE_READINGS | Zero in passive dye readings | ParsedJson | Well shows error |
| INVALID_TISSUE_WEIGHT | Weight format/range invalid | PatientValidator | Well shows error |
| CROSSOVER_LABEL_ERROR | Crossover label missing tags | PatientValidator | Well shows error |
6.2 Import-Level Errors
| Condition | Status | File Routing | REQ Trace |
|---|---|---|---|
| Duplicate file hash | DUPLICATE | Problem_Files | REQ-FILEIMPORT-011 |
| Parser API failure | PARSE_ERROR | Problem_Files | REQ-FILEIMPORT-002 |
| DXAI API failure | IMPORT_ERROR | Problem_Files | REQ-FILEIMPORT-003 |
| Database error | IMPORT_ERROR | Problem_Files | REQ-FILEIMPORT-001 |
6.3 Invalid Data Persistence
Per REQ-FILEIMPORT-012, invalid well data is persisted to allow user correction:
```typescript
// Well model includes invalid_raw_data JSON field
interface WellInvalidData {
  accession?: string;
  extraction_date?: string;
  extraction_instrument?: string;
  tissue_weight?: string;
  crossover_role_alias?: string;
  batch_number?: string;
  testcode_name?: string;
  quantity_multiplier?: string;
}
```
Invalid values are stored in the invalid_raw_data JSON column and surfaced in the UI so users can see and correct them.
7. Configuration
| Setting | Location | Default | Effect | REQ Trace |
|---|---|---|---|---|
| well_properties_import_approach | client_configurations | Label | Import method selection | REQ-FILEIMPORT-005, REQ-FILEIMPORT-006, REQ-FILEIMPORT-007 |
| thermocycler_json_field | client_configurations | None | JSON path to serial number | REQ-FILEIMPORT-002 |
| use_passive_dye | client_configurations | false | Divide readings by passive | REQ-FILEIMPORT-003 |
| prepend_fake_cycles_count | targets | 0 | Cycles to prepend per target | REQ-FILEIMPORT-009 |
| calibration_file_path | targets | null | S3 URI for DXAI calibration | REQ-FILEIMPORT-003 |
| parser.url | .env | - | Parser API endpoint | REQ-FILEIMPORT-002 |
| parser.api_key | .env | - | Parser API authentication | REQ-FILEIMPORT-002 |
| parser.version | .env | - | Parser API version | REQ-FILEIMPORT-003 |
See Configuration Reference for details.
8. Implementation Mapping
8.1 Code Locations
| Component | Type | Path |
|---|---|---|
| RunFilesController | Controller | code/app/Http/Controllers/RunFilesController.php |
| RunFileImportProgressesController | Controller | code/app/Http/Controllers/RunFileImportProgressesController.php |
| StoreRunFilesAction | Action | code/app/Actions/RunFiles/StoreRunFilesAction.php |
| DispatchNextFileToParsingAction | Action | code/app/Actions/RunFiles/DispatchNextFileToParsingAction.php |
| RunDataParseAction | Action | code/app/Actions/RunDataParseAction.php |
| CreateRunAction | Action | code/app/Actions/Runs/CreateRunAction.php |
| RunDataParseJob | Job | code/app/Jobs/RunDataParseJob.php |
| ParsedJson | Service | code/app/RunFileConverter/ParsedJson.php |
| WildCardTargetNameMatcher | Service | code/app/RunFileConverter/Support/WildCardTargetNameMatcher.php |
| GetAutomaticBaselineFromMixChannel | Service | code/app/RunFileConverter/GetAutomaticBaselineFromMixChannel.php |
| WellLabel | Normalizer | code/app/Analyzer/Normalizer/WellLabel.php |
| RunFileName | Normalizer | code/app/Analyzer/Normalizer/RunFileName.php |
| PatientValidator | Validator | code/app/Analyzer/Validators/PatientValidator.php |
| RunFileImportProgress | Model | code/app/RunFileImportProgress.php |
| RunFileImportProgressBroadcast | Event | code/app/Events/RunFileImportProgressBroadcast.php |
| RunFileImported | Event | code/app/Events/RunFileImported.php |
| DuplicateRunImportException | Exception | code/app/Exceptions/DuplicateRunImportException.php |
8.2 Requirement Traceability
| REQ ID | Design Section | Code Location |
|---|---|---|
| REQ-FILEIMPORT-001 | §5.1, §5.2 | StoreRunFilesAction, RunDataParseJob, RunDataParseAction |
| REQ-FILEIMPORT-002 | §5.2 | RunDataParseAction.parse(), ParsedJson |
| REQ-FILEIMPORT-003 | §5.2 | ParsedJson.getExtendJsonData() |
| REQ-FILEIMPORT-004 | §6.1 | PatientValidator, CreateRunAction |
| REQ-FILEIMPORT-005 | §5.3 | RunFileName, ParsedJson.alterMixesUsingWellMetaData() |
| REQ-FILEIMPORT-006 | §5.3 | ParsedJson.getBaselineStart/End(), GetAutomaticBaselineFromMixChannel |
| REQ-FILEIMPORT-007 | §5.3 | WellLabel.getIsCrossover(), ParsedJson.alterMixesUsingWellLabel() |
| REQ-FILEIMPORT-008 | §5.4 | WildCardTargetNameMatcher.doesMatch() |
| REQ-FILEIMPORT-009 | §5.5 | ParsedJson.prepareReadingsForExtend(), ParsedJson.getCtFromExtendedData() |
| REQ-FILEIMPORT-010 | §4.4 | StoreRunFilesAction, RunDataParseAction.moveFileTo*() |
| REQ-FILEIMPORT-011 | §5.2 step 2 | RunDataParseAction.checkForDuplicateImport() |
| REQ-FILEIMPORT-012 | §6.3 | CreateRunAction (invalid_raw_data field) |
| REQ-FILEIMPORT-013 | §5.1 step 5 | StoreRunFilesAction (original_name), RunFileImported |
| REQ-UPLOAD-001 | §5.1 | RunFilesController.store(), StoreRunFilesAction |
| REQ-UPLOAD-002 | §4.2 | RunFilesController.store() validation |
| REQ-UPLOAD-003 | §4.3 | RunFileImportProgressBroadcast, RunFileImportProgress |
| REQ-UPLOAD-004 | §3.1 | RunFileImportProgress.run_id |
| REQ-UPLOAD-005 | §6.2 | RunFileImportProgress.status_message |
| REQ-UPLOAD-006 | §4.2 | RunFilesController.destroy() |
| REQ-UPLOAD-007 | §5.1 | StoreRunFilesAction (per-file processing) |
9. Design Decisions
| Decision | Rationale | Alternatives Considered |
|---|---|---|
| Sequential file processing | Prevents resource contention with Parser/DXAI APIs | Parallel processing (rejected: API rate limits, memory) |
| MD5 for duplicate detection | Fast, sufficient collision resistance for file identification | SHA-256 (rejected: slower, no benefit for this use case) |
| Preserve invalid data | Users need visibility to correct problems | Reject entirely (rejected: loses context for correction) |
| Three import approaches | Different labs have different thermocycler configurations | Single approach (rejected: incompatible with existing workflows) |
| Wildcard regex for targets | Flexible matching for variant target names | Exact match only (rejected: too rigid for real-world naming) |
| Separate archive bucket | Completed files shouldn't be accessible to clients | Same bucket subfolder (rejected: access control complexity) |
| Queue-based processing | Decouples upload from processing, better resilience | Synchronous processing (rejected: timeout risk, poor UX) |
10. Related Documents
| Document | Relevant Sections |
|---|---|
| SRS: fileimport.md | Requirements source |
| SRS: upload-runs.md | Upload requirements source |
| SDS: Architecture | Queue/job processing |
| SDS: KITCFG Domain | Mix/target configuration |
| SDS: RUNRPT Domain | Created run records |
| SDD: Algorithms | Legacy algorithm documentation |