File Import & Upload Design
Document Type: Domain Design (Tier 2)
Domain: FILEIMPORT, UPLOAD
Domain Character: Integration
SRS Reference: fileimport.md, upload-runs.md
Status: Draft
Last Updated: 2026-01-25
1. Overview
1.1 Purpose
The File Import subsystem handles ingestion, parsing, validation, and processing of PCR run files from thermocycler instruments. It integrates with external systems (S3 storage, Parser API, DXAI Analyser) to transform raw instrument data into analyzable run records.
This is an integration-heavy domain with responsibilities spanning:
- File upload acceptance and validation
- S3 folder management and file routing
- Thermocycler data parsing via external Parser API
- Well data extraction and validation
- DXAI Analyser integration for classification
- Error handling and duplicate detection
1.2 Requirements Covered
| REQ ID | Title | Priority |
|---|---|---|
| REQ-FILEIMPORT-001 | Import Run Files from Monitored Folder | Must |
| REQ-FILEIMPORT-002 | Parse Thermocycler Data to Database Variables | Must |
| REQ-FILEIMPORT-003 | Analyze Run Data Using DXAI Analyser | Must |
| REQ-FILEIMPORT-004 | Validate Well Data During Import | Must |
| REQ-FILEIMPORT-005 | Parse Run File Names for Well Properties | Must |
| REQ-FILEIMPORT-006 | Import Observation Properties from Run Files | Must |
| REQ-FILEIMPORT-007 | Identify Crossover Wells During Import | Must |
| REQ-FILEIMPORT-008 | Identify Targets Using Wildcard Matching | Must |
| REQ-FILEIMPORT-009 | Support Prepending Fake Cycles for Analysis | Must |
| REQ-FILEIMPORT-010 | Manage Import Folder Structure | Must |
| REQ-FILEIMPORT-011 | Prevent Duplicate File Imports | Must |
| REQ-FILEIMPORT-012 | Persist Invalid Well Data for Visibility | Must |
| REQ-FILEIMPORT-013 | Maintain File Traceability | Must |
| REQ-UPLOAD-001 | Accept Run File Uploads | Must |
| REQ-UPLOAD-002 | Validate Uploaded File Types | Must |
| REQ-UPLOAD-003 | Display Upload Progress | Must |
| REQ-UPLOAD-004 | Navigate to Uploaded File | Must |
| REQ-UPLOAD-005 | Display Upload Errors | Must |
| REQ-UPLOAD-006 | Cancel Upload in Progress | Must |
| REQ-UPLOAD-007 | Handle Bulk Uploads | Must |
1.3 Constraints
Tier 2 Constraint: This document describes ownership, patterns, and design rationale. It links to reference docs for full schemas and API specifications.
1.4 Dependencies
| Direction | Domain/Component | Purpose |
|---|---|---|
| Consumes | S3 Storage | File storage and folder management |
| Consumes | Parser API | Thermocycler file parsing |
| Consumes | DXAI Analyser | Classification and CT determination |
| Consumes | Client Configuration | Import settings (approach, formats) |
| Consumes | Mix/Target Config | Target matching and calibration |
| Consumes | Thermocycler Config | Instrument identification |
| Provides to | RUNRPT | Created run records |
| Provides to | Analyzer | Well data for analysis rules |
2. Component Architecture
2.1 Component Diagram
2.2 Component Responsibilities
| Component | Layer | Responsibility | REQ Trace |
|---|---|---|---|
| RunFilesController | Controller | Accept uploads, validate extensions | REQ-UPLOAD-001, REQ-UPLOAD-002 |
| StoreRunFilesAction | Action | Store file to S3, create progress record | REQ-FILEIMPORT-001, REQ-FILEIMPORT-010, REQ-FILEIMPORT-013 |
| DispatchNextFileToParsingAction | Action | Queue next file for processing | REQ-FILEIMPORT-001 |
| RunDataParseJob | Job | Orchestrate parse-to-import pipeline | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002 |
| RunDataParseAction | Action | Parse file, detect duplicates, handle errors | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002, REQ-FILEIMPORT-010, REQ-FILEIMPORT-011 |
| ParsedJson | Service | Transform parsed JSON to importable format | REQ-FILEIMPORT-002, REQ-FILEIMPORT-003, REQ-FILEIMPORT-005, REQ-FILEIMPORT-006, REQ-FILEIMPORT-007, REQ-FILEIMPORT-008, REQ-FILEIMPORT-009 |
| WildCardTargetNameMatcher | Service | Match targets with wildcard patterns | REQ-FILEIMPORT-008 |
| GetAutomaticBaselineFromMixChannel | Service | Extract auto_baseline property | REQ-FILEIMPORT-006 |
| WellLabel | Normalizer | Parse well label tags | REQ-FILEIMPORT-007 |
| RunFileName | Normalizer | Parse run file name for properties | REQ-FILEIMPORT-005 |
| CreateRunAction | Action | Create run, wells, observations | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002, REQ-FILEIMPORT-004, REQ-FILEIMPORT-012 |
| RunFileImportProgress | Model | Track upload/import progress | REQ-UPLOAD-003, REQ-UPLOAD-005 |
| RunFileImportProgressBroadcast | Event | Real-time progress updates | REQ-UPLOAD-003 |
| RunFileImported | Event | Signal successful import | REQ-FILEIMPORT-013 |
3. Data Design
3.1 Entities
This domain creates and manages the following entities:
| Entity | Owner | Usage |
|---|---|---|
| run_file_import_progresses | FILEIMPORT | Track upload/import status |
| runs | RUNRPT | Created by import process |
| wells | RUNRPT | Created with validation errors if applicable |
| observations | RUNRPT | Created with analysis results |
| run_targets | RUNRPT | Created for each mix/target combination |
| run_mixes | RUNRPT | Created for each mix in the run |
See Database Reference for full schema.
3.2 Data Structures
Run File Import Progress
```typescript
interface RunFileImportProgress {
  id: string;              // Unique identifier (from filename)
  original_name: string;   // Original filename for traceability
  file_path: string;       // Current S3 path
  status: ImportStatus;    // CONVERTING | IMPORTED | DUPLICATE | PARSE_ERROR | IMPORT_ERROR
  status_message?: string; // Error details if failed
  user_id: string;         // Uploading user
  site_id: string;         // Site context
  run_id?: string;         // Created run (if successful)
  uploaded_at: DateTime;   // Upload timestamp
}

type ImportStatus = 'CONVERTING' | 'IMPORTED' | 'DUPLICATE' | 'PARSE_ERROR' | 'IMPORT_ERROR';
```
Parsed Run Data (Internal)
```typescript
interface ParsedRunData {
  run_info: {
    run_name: string;
    thermocycler_id?: string;
    runfile_created_at: DateTime;
    file_hash: string;
    custom?: Record<string, any>;
  };
  wells: ParsedWell[];
  targets: ParsedTarget[];
  observations: ParsedObservation[];
}

interface ParsedWell {
  well_number: string;
  label: string;
  well_uuid: string;
  error_code?: string;
  concentration_factor?: number;
}

interface ParsedObservation {
  readings: number[];
  ct?: number;
  cls?: Classification;
  dxai_ct?: number;
  dxai_cls?: Classification;
  target?: string;
  dye?: string;
  baseline_start?: number;
  baseline_end?: number;
  quantity?: number;
}
```
Well Label Structure
```typescript
// Tag mapping for Well Label import approach
const WELL_LABEL_TAGS = {
  'A': 'accession',
  'C': 'testcode_name',
  'D': 'extraction_date',
  'E': 'extraction_settings', // Format: instrument-batch
  'T': 'mix',
  'R': 'role_alias',
  'N': 'note',
  'X': 'quantity_multiplier',
  'Y': 'crossover',           // Value: 'XOVER'
  'W': 'tissue_weight',
};
```
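The tag mapping above can be sketched as a small parser. The delimiter between tags is not specified in this document, so the space-separated `<tag>:<value>` format below is an assumption, as are the function name and the example label.

```typescript
// Sketch only: assumes tags appear as space-separated "<tag>:<value>" pairs.
const WELL_LABEL_TAGS: Record<string, string> = {
  A: 'accession', C: 'testcode_name', D: 'extraction_date',
  E: 'extraction_settings', T: 'mix', R: 'role_alias', N: 'note',
  X: 'quantity_multiplier', Y: 'crossover', W: 'tissue_weight',
};

function parseWellLabel(label: string): Record<string, string> {
  const props: Record<string, string> = {};
  for (const segment of label.split(' ')) {
    const field = WELL_LABEL_TAGS[segment[0]];
    if (!field || segment[1] !== ':') continue; // skip unrecognized segments
    const value = segment.slice(2);
    if (segment[0] === 'E') {
      // E tag destructures to extraction_instrument + batch_number ("instrument-batch")
      const [instrument, batch = ''] = value.split('-');
      props['extraction_instrument'] = instrument;
      props['batch_number'] = batch;
    } else {
      props[field] = value;
    }
  }
  return props;
}
```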
3.3 State Transitions
Import Progress State Machine
| State | Description | File Location |
|---|---|---|
| CONVERTING | File received, queued for processing | {site}/Runs/Parsing/ |
| IMPORTED | Successfully imported | {archive-bucket}/ |
| DUPLICATE | File hash matches existing run | {site}/Runs/Problem_Files/ |
| PARSE_ERROR | Parser API failed | {site}/Runs/Problem_Files/ |
| IMPORT_ERROR | Validation or database error | {site}/Runs/Problem_Files/ |
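The routing implied by this table can be sketched as a single lookup; the function name and the literal `{archive-bucket}` placeholder are illustrative, not actual code in the system.

```typescript
type ImportStatus = 'CONVERTING' | 'IMPORTED' | 'DUPLICATE' | 'PARSE_ERROR' | 'IMPORT_ERROR';

// Maps an import status to the S3 location where the file should reside.
function fileLocationFor(site: string, status: ImportStatus): string {
  switch (status) {
    case 'CONVERTING':
      return `${site}/Runs/Parsing/`;
    case 'IMPORTED':
      return '{archive-bucket}/'; // separate archive bucket
    default: // DUPLICATE, PARSE_ERROR, IMPORT_ERROR
      return `${site}/Runs/Problem_Files/`;
  }
}
```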
4. Interface Design
4.1 APIs Consumed
| Service | Endpoint/Method | Purpose | REQ Trace |
|---|---|---|---|
| Parser API | parse_run_file() | Parse thermocycler file format | REQ-FILEIMPORT-002 |
| Parser API | extend_json() | Get DXAI analysis results | REQ-FILEIMPORT-003 |
| S3 | Storage::cloud() | File storage operations | REQ-FILEIMPORT-010 |
Parser API Contract
Function: parse_run_file(file_path, api_url, api_key)
Input:
- file_path: Local path to thermocycler file
- api_url: Parser service URL
- api_key: Authentication key
Output:
- object: Parsed JSON structure containing runs, wells, channels, data
DXAI Extend API Contract
Function: extend_json(json_path, api_url, version, api_key)
Input:
- json_path: Path to prepared JSON with targets, wells, observations
- Targets include calibration_S3_uri for each target
Output:
- object: Extended data with dxai_cls and dxai_ct per observation
4.2 APIs Provided
| Endpoint | Method | Purpose | REQ Trace |
|---|---|---|---|
| /api/run-files | POST | Upload run file | REQ-UPLOAD-001 |
| /api/run-files/{id} | DELETE | Cancel upload | REQ-UPLOAD-006 |
| /api/run-file-import-progresses | GET | List progress records | REQ-UPLOAD-003 |
Upload Request
```
POST /api/run-files
Content-Type: application/json

{
  "file_path": "temp/upload-uuid.sds",
  "original_name": "MyRun.sds"
}
```
Validation:
- original_name: required, ends_with:.sds,.SDS,.ixo,.IXO,.pcrd,.PCRD,.eds,.EDS
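The extension rule amounts to a case-insensitive suffix check over the four accepted formats; a minimal sketch (function name assumed):

```typescript
// Accepted thermocycler file extensions, matched case-insensitively.
const ALLOWED_EXTENSIONS = ['.sds', '.ixo', '.pcrd', '.eds'];

function hasValidExtension(originalName: string): boolean {
  const lower = originalName.toLowerCase();
  return ALLOWED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```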
4.3 Events
| Event | Direction | Payload | Purpose |
|---|---|---|---|
| RunFileImportProgressBroadcast | Pusher → Frontend | { id, status, progress } | Real-time progress |
| RunFileImported | Internal | { user, run } | Import complete notification |
4.4 S3 Folder Structure
| Folder | Purpose | REQ Trace |
|---|---|---|
| {site}/Runs/toPcrai/ | Upload destination | REQ-FILEIMPORT-010 |
| {site}/Runs/Parsing/ | Processing queue | REQ-FILEIMPORT-010 |
| {site}/Runs/Problem_Files/ | Failed imports | REQ-FILEIMPORT-010 |
| {archive-bucket}/ | Successful imports (separate bucket) | REQ-FILEIMPORT-010 |
| {site}/LIMS_Reports/ | Export destination | REQ-FILEIMPORT-010 |
5. Behavioral Design
5.1 File Upload Flow
Algorithm: Upload Run File
Inputs:
- original_name: string - Original filename
- file_path: string - Temporary upload path
- user: User - Authenticated user
Outputs:
- void (async processing begins)
Assumptions:
- User has upload permission (Junior User, Senior User, Super-Admin)
- File extension already validated by controller
Steps:
1. Get site from user.loggedInSite
2. If site storage directory doesn't exist, create it
3. Generate unique filename from original_name
4. Copy file from temp to {site}/Runs/Parsing/{unique_name}
5. Create RunFileImportProgress record:
- status: 'CONVERTING'
- original_name: preserved for traceability
- file_path: new S3 path
- uploaded_at: now()
6. If no parsing job currently running:
- Dispatch RunDataParseJob for next queued file
Notes:
- Files are processed sequentially to avoid resource contention
- Original filename preserved for audit (REQ-FILEIMPORT-013)
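The step-6 guard (dispatch only when no parsing job is running, oldest file first) can be sketched as below. The record shape, field names, and FIFO-by-upload-time ordering are assumptions; only "sequential, one file at a time" is stated above.

```typescript
// Sketch of the sequential-dispatch guard: at most one file parses at a time,
// and the oldest queued (CONVERTING) record is dispatched next.
interface ProgressRecord {
  id: string;
  status: string;
  uploaded_at: number; // epoch millis, used for FIFO ordering
}

function nextFileToParse(
  records: ProgressRecord[],
  parsingJobRunning: boolean,
): ProgressRecord | null {
  if (parsingJobRunning) return null; // sequential processing: wait for the current job
  const queued = records
    .filter((r) => r.status === 'CONVERTING')
    .sort((a, b) => a.uploaded_at - b.uploaded_at);
  return queued[0] ?? null;
}
```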
5.2 Parse and Import Flow
Algorithm: Parse Run Data
Inputs:
- filePath: string - S3 path to file
- site: Site - Processing context
Outputs:
- void (creates Run on success, updates status on failure)
Assumptions:
- File exists at filePath
- Parser API is available
- DXAI Analyser is available
Steps:
1. Calculate file hash (MD5)
2. Check for duplicate:
a. Query runs.file_hash WHERE site_id = site.id
b. If exists, move to Problem_Files with status 'DUPLICATE'
c. Return early
3. Parse file via Parser API:
a. Download from S3 to temp
b. Call parse_run_file()
c. On failure, move to Problem_Files with status 'PARSE_ERROR'
4. Transform via ParsedJson:
a. Match mixes using well_properties_import_approach
b. Extract well properties per approach
c. Prepare observations for DXAI
d. Call extend_json() for classification
5. Import via CreateRunAction:
a. Create Run record
b. Normalize well data
c. Validate wells (generate error codes)
d. Analyze and summarize
e. Persist wells and observations
6. Update progress to 'IMPORTED' with run_id
7. Move file to archive bucket
8. Dispatch next file for parsing
Error Handling:
- DuplicateRunImportException: status = 'DUPLICATE'
- Parser exception: status = 'PARSE_ERROR'
- Import exception: status = 'IMPORT_ERROR' with message
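Steps 1–2 (hash-based duplicate detection) can be sketched with Node's crypto module. The lookup against `runs.file_hash` scoped to a site is stubbed here as an in-memory set; the real implementation queries the database.

```typescript
import { createHash } from 'node:crypto';

// MD5 of the file contents identifies the file for duplicate detection
// (REQ-FILEIMPORT-011).
function fileHash(contents: Buffer | string): string {
  return createHash('md5').update(contents).digest('hex');
}

// Stand-in for the runs.file_hash lookup scoped to the current site.
function isDuplicate(hash: string, existingSiteHashes: Set<string>): boolean {
  return existingSiteHashes.has(hash);
}
```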
5.3 Well Properties Import Approaches
The system supports three approaches for extracting well properties:
| Approach | Mix Source | Thermocycler Source | Crossover Support |
|---|---|---|---|
| Well Label | T: tag in label | JSON field or label | Yes (Y:XOVER tag) |
| Meta Data | File metadata | JSON field | No |
| Run File Name | Filename segment | Filename segment | No |
Algorithm: Determine Well Properties Import Approach
Steps:
1. Read client_configurations.well_properties_import_approach
2. If 'Label':
a. Parse label using WellLabel class
b. Extract tags: A, C, D, E, T, R, N, X, Y, W
c. E tag destructures to extraction_instrument + batch_number
3. If 'Meta Data':
a. Use mix.id from file metadata
b. Get thermocycler from JSON field
4. If 'Run File Name':
a. Parse: {mix}.{thermocycler}.{protocol}.{batch}.{analyst}.{ext}
b. Generate virtual labels for "Unknown" wells: {batch}-{well_number}
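Step 4 (the Run File Name approach) can be sketched as follows; the function names are assumptions, but the segment order and the virtual-label format follow the pattern stated above.

```typescript
// Parses {mix}.{thermocycler}.{protocol}.{batch}.{analyst}.{ext}
function parseRunFileName(fileName: string) {
  const [mix, thermocycler, protocol, batch, analyst] = fileName.split('.');
  return { mix, thermocycler, protocol, batch, analyst };
}

// Virtual label for "Unknown" wells: {batch}-{well_number}
function virtualLabel(batch: string, wellNumber: string): string {
  return `${batch}-${wellNumber}`;
}
```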
5.4 Wildcard Target Matching
Algorithm: Match Target with Wildcard
Inputs:
- configTarget: string - Configured target name (may contain *)
- runTarget: string - Target name from run file
Outputs:
- boolean - Whether targets match
Steps:
1. Convert configTarget to regex:
- Replace * with .*
- Wrap with /^...$/
2. Test runTarget against regex
3. Return match result
Examples:
- "Target*" matches "Target A", "Target B", "Target123"
- "HBV*" matches "HBV-FAM", "HBV-Cy5"
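The algorithm above as a sketch. Escaping regex metacharacters other than `*` is a defensive assumption added here, not a step stated in the algorithm.

```typescript
// Matches a run target against a configured name that may contain '*' wildcards.
function doesMatch(configTarget: string, runTarget: string): boolean {
  // Escape regex metacharacters, then turn each '*' into '.*'
  const escaped = configTarget.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
  return pattern.test(runTarget);
}
```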
5.5 Prepend Fake Cycles
Algorithm: Prepare Readings with Fake Cycles
Inputs:
- readings: number[] - Original observation readings
- target: Target - Target with prepend_fake_cycles_count
Outputs:
- number[] - Readings with prepended fake cycles
Assumptions:
- prepend_fake_cycles_count >= 0
Steps:
1. If target.prepend_fake_cycles_count == 0:
- Return original readings
2. Create fake readings array:
- Length: prepend_fake_cycles_count
- Value: repeat first reading value
3. Concatenate fake readings + original readings
4. Return combined array
Post-Processing:
- After DXAI returns dxai_ct, subtract prepend_fake_cycles_count
- This adjusts CT back to original cycle numbering
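The prepend/adjust pair above can be sketched as two small functions (names assumed):

```typescript
// Prepend fake cycles by repeating the first reading (REQ-FILEIMPORT-009).
function prependFakeCycles(readings: number[], count: number): number[] {
  if (count <= 0 || readings.length === 0) return readings;
  return [...Array(count).fill(readings[0]), ...readings];
}

// After DXAI returns dxai_ct, shift CT back to the original cycle numbering.
function adjustCt(dxaiCt: number, count: number): number {
  return dxaiCt - count;
}
```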
6. Error Handling
6.1 Validation Errors
| Error Code | Condition | Detection Point | User Impact |
|---|---|---|---|
| SAMPLE_LABEL_IS_BAD | Unrecognized label format | ParsedJson | Well shows error |
| ACCESSION_MISSING | Patient well lacks accession | PatientValidator | Well shows error |
| EXTRACTION_BATCH_MISSING | Missing E tag or batch | WellLabel, PatientValidator | Well shows error |
| EXTRACTION_INSTRUMENT_MISSING | Missing extraction instrument | PatientValidator | Well shows error |
| EXTRACTION_DATE_MISSING | Missing extraction date | PatientValidator | Well shows error |
| INVALID_EXTRACTION_DATE | Date after file creation | PatientValidator | Well shows error |
| TESTCODE_MISSING | Missing test code (Label approach) | PatientValidator | Well shows error |
| UNKNOWN_TESTCODE | Test code not configured | TestcodeExists rule | Well shows error |
| MIX_DIDNT_MATCH_TC | Mix doesn't match test code | MixMatchesTestCode rule | Well shows error |
| MIX_MISSING | Mix not found | ParsedJson | Well shows error |
| THERMOCYCLER_UNKNOWN | Thermocycler not configured | ParsedJson, CreateRunAction | All wells show error |
| INVALID_PASSIVE_READINGS | Zero in passive dye readings | ParsedJson | Well shows error |
| INVALID_TISSUE_WEIGHT | Weight format/range invalid | PatientValidator | Well shows error |
| CROSSOVER_LABEL_ERROR | Crossover label missing tags | PatientValidator | Well shows error |
6.2 Import-Level Errors
| Condition | Status | File Routing | REQ Trace |
|---|---|---|---|
| Duplicate file hash | DUPLICATE | Problem_Files | REQ-FILEIMPORT-011 |
| Parser API failure | PARSE_ERROR | Problem_Files | REQ-FILEIMPORT-002 |
| DXAI API failure | IMPORT_ERROR | Problem_Files | REQ-FILEIMPORT-003 |
| Database error | IMPORT_ERROR | Problem_Files | REQ-FILEIMPORT-001 |
6.3 Invalid Data Persistence
Per REQ-FILEIMPORT-012, invalid well data is persisted to allow user correction:
```typescript
// Well model includes invalid_raw_data JSON field
interface WellInvalidData {
  accession?: string;
  extraction_date?: string;
  extraction_instrument?: string;
  tissue_weight?: string;
  crossover_role_alias?: string;
  batch_number?: string;
  testcode_name?: string;
  quantity_multiplier?: string;
}
```
Invalid values are stored in the invalid_raw_data JSON column and surfaced in the UI so users can see and correct them.
7. Configuration
| Setting | Location | Default | Effect | REQ Trace |
|---|---|---|---|---|
| well_properties_import_approach | client_configurations | Label | Import method selection | REQ-FILEIMPORT-005, REQ-FILEIMPORT-006, REQ-FILEIMPORT-007 |
| thermocycler_json_field | client_configurations | None | JSON path to serial number | REQ-FILEIMPORT-002 |
| use_passive_dye | client_configurations | false | Divide readings by passive | REQ-FILEIMPORT-003 |
| prepend_fake_cycles_count | targets | 0 | Cycles to prepend per target | REQ-FILEIMPORT-009 |
| calibration_file_path | targets | null | S3 URI for DXAI calibration | REQ-FILEIMPORT-003 |
| parser.url | .env | - | Parser API endpoint | REQ-FILEIMPORT-002 |
| parser.api_key | .env | - | Parser API authentication | REQ-FILEIMPORT-002 |
| parser.version | .env | - | Parser API version | REQ-FILEIMPORT-003 |
See Configuration Reference for details.
8. Implementation Mapping
8.1 Code Locations
| Component | Type | Path |
|---|---|---|
| RunFilesController | Controller | code/app/Http/Controllers/RunFilesController.php |
| RunFileImportProgressesController | Controller | code/app/Http/Controllers/RunFileImportProgressesController.php |
| StoreRunFilesAction | Action | code/app/Actions/RunFiles/StoreRunFilesAction.php |
| DispatchNextFileToParsingAction | Action | code/app/Actions/RunFiles/DispatchNextFileToParsingAction.php |
| RunDataParseAction | Action | code/app/Actions/RunDataParseAction.php |
| CreateRunAction | Action | code/app/Actions/Runs/CreateRunAction.php |
| RunDataParseJob | Job | code/app/Jobs/RunDataParseJob.php |
| ParsedJson | Service | code/app/RunFileConverter/ParsedJson.php |
| WildCardTargetNameMatcher | Service | code/app/RunFileConverter/Support/WildCardTargetNameMatcher.php |
| GetAutomaticBaselineFromMixChannel | Service | code/app/RunFileConverter/GetAutomaticBaselineFromMixChannel.php |
| WellLabel | Normalizer | code/app/Analyzer/Normalizer/WellLabel.php |
| RunFileName | Normalizer | code/app/Analyzer/Normalizer/RunFileName.php |
| PatientValidator | Validator | code/app/Analyzer/Validators/PatientValidator.php |
| RunFileImportProgress | Model | code/app/RunFileImportProgress.php |
| RunFileImportProgressBroadcast | Event | code/app/Events/RunFileImportProgressBroadcast.php |
| RunFileImported | Event | code/app/Events/RunFileImported.php |
| DuplicateRunImportException | Exception | code/app/Exceptions/DuplicateRunImportException.php |
8.2 Requirement Traceability
| REQ ID | Design Section | Code Location |
|---|---|---|
| REQ-FILEIMPORT-001 | §5.1, §5.2 | StoreRunFilesAction, RunDataParseJob, RunDataParseAction |
| REQ-FILEIMPORT-002 | §5.2 | RunDataParseAction.parse(), ParsedJson |
| REQ-FILEIMPORT-003 | §5.2 | ParsedJson.getExtendJsonData() |
| REQ-FILEIMPORT-004 | §6.1 | PatientValidator, CreateRunAction |
| REQ-FILEIMPORT-005 | §5.3 | RunFileName, ParsedJson.alterMixesUsingWellMetaData() |
| REQ-FILEIMPORT-006 | §5.3 | ParsedJson.getBaselineStart/End(), GetAutomaticBaselineFromMixChannel |
| REQ-FILEIMPORT-007 | §5.3 | WellLabel.getIsCrossover(), ParsedJson.alterMixesUsingWellLabel() |
| REQ-FILEIMPORT-008 | §5.4 | WildCardTargetNameMatcher.doesMatch() |
| REQ-FILEIMPORT-009 | §5.5 | ParsedJson.prepareReadingsForExtend(), ParsedJson.getCtFromExtendedData() |
| REQ-FILEIMPORT-010 | §4.4 | StoreRunFilesAction, RunDataParseAction.moveFileTo*() |
| REQ-FILEIMPORT-011 | §5.2 step 2 | RunDataParseAction.checkForDuplicateImport() |
| REQ-FILEIMPORT-012 | §6.3 | CreateRunAction (invalid_raw_data field) |
| REQ-FILEIMPORT-013 | §5.1 step 5 | StoreRunFilesAction (original_name), RunFileImported |
| REQ-UPLOAD-001 | §5.1 | RunFilesController.store(), StoreRunFilesAction |
| REQ-UPLOAD-002 | §4.2 | RunFilesController.store() validation |
| REQ-UPLOAD-003 | §4.3 | RunFileImportProgressBroadcast, RunFileImportProgress |
| REQ-UPLOAD-004 | §3.1 | RunFileImportProgress.run_id |
| REQ-UPLOAD-005 | §6.2 | RunFileImportProgress.status_message |
| REQ-UPLOAD-006 | §4.2 | RunFilesController.destroy() |
| REQ-UPLOAD-007 | §5.1 | StoreRunFilesAction (per-file processing) |
9. Design Decisions
| Decision | Rationale | Alternatives Considered |
|---|---|---|
| Sequential file processing | Prevents resource contention with Parser/DXAI APIs | Parallel processing (rejected: API rate limits, memory) |
| MD5 for duplicate detection | Fast, sufficient collision resistance for file identification | SHA-256 (rejected: slower, no benefit for this use case) |
| Preserve invalid data | Users need visibility to correct problems | Reject entirely (rejected: loses context for correction) |
| Three import approaches | Different labs have different thermocycler configurations | Single approach (rejected: incompatible with existing workflows) |
| Wildcard regex for targets | Flexible matching for variant target names | Exact match only (rejected: too rigid for real-world naming) |
| Separate archive bucket | Completed files shouldn't be accessible to clients | Same bucket subfolder (rejected: access control complexity) |
| Queue-based processing | Decouples upload from processing, better resilience | Synchronous processing (rejected: timeout risk, poor UX) |
10. Related Documents
| Document | Relevant Sections |
|---|---|
| SRS: fileimport.md | Requirements source |
| SRS: upload-runs.md | Upload requirements source |
| SDS: Architecture | Queue/job processing |
| SDS: KITCFG Domain | Mix/target configuration |
| SDS: RUNRPT Domain | Created run records |
| SDD: Algorithms | Legacy algorithm documentation |