Version: 3.0.1

File Import & Upload Design

Document Type: Domain Design (Tier 2)
Domain: FILEIMPORT, UPLOAD
Domain Character: Integration
SRS Reference: fileimport.md, upload-runs.md
Status: Draft
Last Updated: 2026-01-25


1. Overview

1.1 Purpose

The File Import subsystem handles ingestion, parsing, validation, and processing of PCR run files from thermocycler instruments. It integrates with external systems (S3 storage, Parser API, DXAI Analyser) to transform raw instrument data into analyzable run records.

This is an integration-heavy domain with responsibilities spanning:

  • File upload acceptance and validation
  • S3 folder management and file routing
  • Thermocycler data parsing via external Parser API
  • Well data extraction and validation
  • DXAI Analyser integration for classification
  • Error handling and duplicate detection

1.2 Requirements Covered

| REQ ID | Title | Priority |
| --- | --- | --- |
| REQ-FILEIMPORT-001 | Import Run Files from Monitored Folder | Must |
| REQ-FILEIMPORT-002 | Parse Thermocycler Data to Database Variables | Must |
| REQ-FILEIMPORT-003 | Analyze Run Data Using DXAI Analyser | Must |
| REQ-FILEIMPORT-004 | Validate Well Data During Import | Must |
| REQ-FILEIMPORT-005 | Parse Run File Names for Well Properties | Must |
| REQ-FILEIMPORT-006 | Import Observation Properties from Run Files | Must |
| REQ-FILEIMPORT-007 | Identify Crossover Wells During Import | Must |
| REQ-FILEIMPORT-008 | Identify Targets Using Wildcard Matching | Must |
| REQ-FILEIMPORT-009 | Support Prepending Fake Cycles for Analysis | Must |
| REQ-FILEIMPORT-010 | Manage Import Folder Structure | Must |
| REQ-FILEIMPORT-011 | Prevent Duplicate File Imports | Must |
| REQ-FILEIMPORT-012 | Persist Invalid Well Data for Visibility | Must |
| REQ-FILEIMPORT-013 | Maintain File Traceability | Must |
| REQ-UPLOAD-001 | Accept Run File Uploads | Must |
| REQ-UPLOAD-002 | Validate Uploaded File Types | Must |
| REQ-UPLOAD-003 | Display Upload Progress | Must |
| REQ-UPLOAD-004 | Navigate to Uploaded File | Must |
| REQ-UPLOAD-005 | Display Upload Errors | Must |
| REQ-UPLOAD-006 | Cancel Upload in Progress | Must |
| REQ-UPLOAD-007 | Handle Bulk Uploads | Must |

1.3 Constraints

Tier 2 Constraint: This document describes ownership, patterns, and design rationale. It links to reference docs for full schemas and API specifications.

1.4 Dependencies

| Direction | Domain/Component | Purpose |
| --- | --- | --- |
| Consumes | S3 Storage | File storage and folder management |
| Consumes | Parser API | Thermocycler file parsing |
| Consumes | DXAI Analyser | Classification and CT determination |
| Consumes | Client Configuration | Import settings (approach, formats) |
| Consumes | Mix/Target Config | Target matching and calibration |
| Consumes | Thermocycler Config | Instrument identification |
| Provides to | RUNRPT | Created run records |
| Provides to | Analyzer | Well data for analysis rules |

2. Component Architecture

2.1 Component Diagram

2.2 Component Responsibilities

| Component | Layer | Responsibility | REQ Trace |
| --- | --- | --- | --- |
| RunFilesController | Controller | Accept uploads, validate extensions | REQ-UPLOAD-001, REQ-UPLOAD-002 |
| StoreRunFilesAction | Action | Store file to S3, create progress record | REQ-FILEIMPORT-001, REQ-FILEIMPORT-010, REQ-FILEIMPORT-013 |
| DispatchNextFileToParsingAction | Action | Queue next file for processing | REQ-FILEIMPORT-001 |
| RunDataParseJob | Job | Orchestrate parse-to-import pipeline | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002 |
| RunDataParseAction | Action | Parse file, detect duplicates, handle errors | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002, REQ-FILEIMPORT-010, REQ-FILEIMPORT-011 |
| ParsedJson | Service | Transform parsed JSON to importable format | REQ-FILEIMPORT-002, REQ-FILEIMPORT-003, REQ-FILEIMPORT-005, REQ-FILEIMPORT-006, REQ-FILEIMPORT-007, REQ-FILEIMPORT-008, REQ-FILEIMPORT-009 |
| WildCardTargetNameMatcher | Service | Match targets with wildcard patterns | REQ-FILEIMPORT-008 |
| GetAutomaticBaselineFromMixChannel | Service | Extract auto_baseline property | REQ-FILEIMPORT-006 |
| WellLabel | Normalizer | Parse well label tags | REQ-FILEIMPORT-007 |
| RunFileName | Normalizer | Parse run file name for properties | REQ-FILEIMPORT-005 |
| CreateRunAction | Action | Create run, wells, observations | REQ-FILEIMPORT-001, REQ-FILEIMPORT-002, REQ-FILEIMPORT-004, REQ-FILEIMPORT-012 |
| RunFileImportProgress | Model | Track upload/import progress | REQ-UPLOAD-003, REQ-UPLOAD-005 |
| RunFileImportProgressBroadcast | Event | Real-time progress updates | REQ-UPLOAD-003 |
| RunFileImported | Event | Signal successful import | REQ-FILEIMPORT-013 |

3. Data Design

3.1 Entities

This domain creates and manages the following entities:

| Entity | Owner | Usage |
| --- | --- | --- |
| run_file_import_progresses | FILEIMPORT | Track upload/import status |
| runs | RUNRPT | Created by import process |
| wells | RUNRPT | Created with validation errors if applicable |
| observations | RUNRPT | Created with analysis results |
| run_targets | RUNRPT | Created for each mix/target combination |
| run_mixes | RUNRPT | Created for each mix in the run |
See Database Reference for full schema.

3.2 Data Structures

Run File Import Progress

```typescript
interface RunFileImportProgress {
  id: string;              // Unique identifier (from filename)
  original_name: string;   // Original filename for traceability
  file_path: string;       // Current S3 path
  status: ImportStatus;    // CONVERTING | IMPORTED | DUPLICATE | PARSE_ERROR | IMPORT_ERROR
  status_message?: string; // Error details if failed
  user_id: string;         // Uploading user
  site_id: string;         // Site context
  run_id?: string;         // Created run (if successful)
  uploaded_at: DateTime;   // Upload timestamp
}

type ImportStatus = 'CONVERTING' | 'IMPORTED' | 'DUPLICATE' | 'PARSE_ERROR' | 'IMPORT_ERROR';
```

Parsed Run Data (Internal)

```typescript
interface ParsedRunData {
  run_info: {
    run_name: string;
    thermocycler_id?: string;
    runfile_created_at: DateTime;
    file_hash: string;
    custom?: Record<string, any>;
  };
  wells: ParsedWell[];
  targets: ParsedTarget[];
  observations: ParsedObservation[];
}

interface ParsedWell {
  well_number: string;
  label: string;
  well_uuid: string;
  error_code?: string;
  concentration_factor?: number;
}

interface ParsedObservation {
  readings: number[];
  ct?: number;
  cls?: Classification;
  dxai_ct?: number;
  dxai_cls?: Classification;
  target?: string;
  dye?: string;
  baseline_start?: number;
  baseline_end?: number;
  quantity?: number;
}
```

Well Label Structure

```typescript
// Tag mapping for Well Label import approach
const WELL_LABEL_TAGS = {
  'A': 'accession',
  'C': 'testcode_name',
  'D': 'extraction_date',
  'E': 'extraction_settings', // Format: instrument-batch
  'T': 'mix',
  'R': 'role_alias',
  'N': 'note',
  'X': 'quantity_multiplier',
  'Y': 'crossover', // Value: 'XOVER'
  'W': 'tissue_weight',
};
```
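To make the tag mapping concrete, the following sketch parses a well label into named properties. The `Tag:value` form and space delimiter are assumptions for illustration (the real WellLabel normalizer may tokenize differently); the E-tag destructuring into instrument and batch follows the `instrument-batch` format noted above. The mapping is repeated so the sketch is self-contained.

```typescript
// Hypothetical label parser; delimiter and "Tag:value" syntax are assumed.
const WELL_LABEL_TAGS: Record<string, string> = {
  A: 'accession', C: 'testcode_name', D: 'extraction_date',
  E: 'extraction_settings', T: 'mix', R: 'role_alias', N: 'note',
  X: 'quantity_multiplier', Y: 'crossover', W: 'tissue_weight',
};

function parseWellLabel(label: string): Record<string, string> {
  const props: Record<string, string> = {};
  for (const segment of label.split(/\s+/)) {
    const [tag, ...rest] = segment.split(':');
    const field = WELL_LABEL_TAGS[tag];
    if (field && rest.length > 0) {
      props[field] = rest.join(':'); // values may themselves contain ':'
    }
  }
  // The E tag destructures to extraction_instrument + batch_number
  if (props.extraction_settings) {
    const [instrument, batch] = props.extraction_settings.split('-');
    props.extraction_instrument = instrument;
    props.batch_number = batch;
  }
  return props;
}
```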

3.3 State Transitions

Import Progress State Machine

| State | Description | File Location |
| --- | --- | --- |
| CONVERTING | File received, queued for processing | {site}/Runs/Parsing/ |
| IMPORTED | Successfully imported | {archive-bucket}/ |
| DUPLICATE | File hash matches existing run | {site}/Runs/Problem_Files/ |
| PARSE_ERROR | Parser API failed | {site}/Runs/Problem_Files/ |
| IMPORT_ERROR | Validation or database error | {site}/Runs/Problem_Files/ |
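The table implies a single non-terminal state: CONVERTING fans out to one of four terminal statuses, and terminal statuses never change. A minimal sketch of that transition rule (function and map names are ours, not from the codebase):

```typescript
// Transition rule implied by the state table: only CONVERTING may move.
type ImportStatus = 'CONVERTING' | 'IMPORTED' | 'DUPLICATE' | 'PARSE_ERROR' | 'IMPORT_ERROR';

const ALLOWED_TRANSITIONS: Record<ImportStatus, ImportStatus[]> = {
  CONVERTING: ['IMPORTED', 'DUPLICATE', 'PARSE_ERROR', 'IMPORT_ERROR'],
  IMPORTED: [],      // terminal
  DUPLICATE: [],     // terminal
  PARSE_ERROR: [],   // terminal
  IMPORT_ERROR: [],  // terminal
};

function canTransition(from: ImportStatus, to: ImportStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```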

4. Interface Design

4.1 APIs Consumed

| Service | Endpoint/Method | Purpose | REQ Trace |
| --- | --- | --- | --- |
| Parser API | parse_run_file() | Parse thermocycler file format | REQ-FILEIMPORT-002 |
| Parser API | extend_json() | Get DXAI analysis results | REQ-FILEIMPORT-003 |
| S3 | Storage::cloud() | File storage operations | REQ-FILEIMPORT-010 |

Parser API Contract

Function: parse_run_file(file_path, api_url, api_key)

Input:
- file_path: Local path to thermocycler file
- api_url: Parser service URL
- api_key: Authentication key

Output:
- object: Parsed JSON structure containing runs, wells, channels, data

DXAI Extend API Contract

Function: extend_json(json_path, api_url, version, api_key)

Input:
- json_path: Path to prepared JSON with targets, wells, observations
- Targets include calibration_S3_uri for each target

Output:
- object: Extended data with dxai_cls and dxai_ct per observation

4.2 APIs Provided

| Endpoint | Method | Purpose | REQ Trace |
| --- | --- | --- | --- |
| /api/run-files | POST | Upload run file | REQ-UPLOAD-001 |
| /api/run-files/{id} | DELETE | Cancel upload | REQ-UPLOAD-006 |
| /api/run-file-import-progresses | GET | List progress records | REQ-UPLOAD-003 |

Upload Request

```
POST /api/run-files
Content-Type: application/json

{
  "file_path": "temp/upload-uuid.sds",
  "original_name": "MyRun.sds"
}
```

Validation:
- original_name: required, ends_with:.sds,.SDS,.ixo,.IXO,.pcrd,.PCRD,.eds,.EDS
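The extension rule above accepts four formats in either case. A minimal client-side sketch of the same check (the function name and case-insensitive comparison are our assumptions; the server uses the `ends_with` validation rule):

```typescript
// Accepted thermocycler file extensions, per the upload validation rule.
const ALLOWED_EXTENSIONS = ['.sds', '.ixo', '.pcrd', '.eds'];

function hasAllowedExtension(originalName: string): boolean {
  const lower = originalName.toLowerCase(); // rule lists both cases explicitly
  return ALLOWED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}
```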

4.3 Events

| Event | Direction | Payload | Purpose |
| --- | --- | --- | --- |
| RunFileImportProgressBroadcast | Pusher → Frontend | { id, status, progress } | Real-time progress |
| RunFileImported | Internal | { user, run } | Import complete notification |

4.4 S3 Folder Structure

| Folder | Purpose | REQ Trace |
| --- | --- | --- |
| {site}/Runs/toPcrai/ | Upload destination | REQ-FILEIMPORT-010 |
| {site}/Runs/Parsing/ | Processing queue | REQ-FILEIMPORT-010 |
| {site}/Runs/Problem_Files/ | Failed imports | REQ-FILEIMPORT-010 |
| {archive-bucket}/ | Successful imports (separate bucket) | REQ-FILEIMPORT-010 |
| {site}/LIMS_Reports/ | Export destination | REQ-FILEIMPORT-010 |

5. Behavioral Design

5.1 File Upload Flow

Algorithm: Upload Run File

Inputs:
- original_name: string - Original filename
- file_path: string - Temporary upload path
- user: User - Authenticated user

Outputs:
- void (async processing begins)

Assumptions:
- User has upload permission (Junior User, Senior User, Super-Admin)
- File extension already validated by controller

Steps:
1. Get site from user.loggedInSite
2. If site storage directory doesn't exist, create it
3. Generate unique filename from original_name
4. Copy file from temp to {site}/Runs/Parsing/{unique_name}
5. Create RunFileImportProgress record:
- status: 'CONVERTING'
- original_name: preserved for traceability
- file_path: new S3 path
- uploaded_at: now()
6. If no parsing job currently running:
- Dispatch RunDataParseJob for next queued file

Notes:
- Files are processed sequentially to avoid resource contention
- Original filename preserved for audit (REQ-FILEIMPORT-013)
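Step 6 and the sequential-processing note can be sketched as a simple dispatch rule: pick the oldest queued (CONVERTING) record, and only when no parse is already in flight. The `ProgressRecord` shape and function name here are illustrative, not the actual DispatchNextFileToParsingAction signature.

```typescript
// Illustrative sequential dispatch: one file parses at a time, FIFO order.
interface ProgressRecord {
  id: string;
  status: string;       // e.g. 'CONVERTING'
  uploaded_at: number;  // epoch millis, stands in for the timestamp column
}

function nextFileToDispatch(
  records: ProgressRecord[],
  parsingInFlight: boolean,
): ProgressRecord | undefined {
  if (parsingInFlight) return undefined; // a job is already running
  return records
    .filter((r) => r.status === 'CONVERTING')
    .sort((a, b) => a.uploaded_at - b.uploaded_at)[0]; // oldest first
}
```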

5.2 Parse and Import Flow

Algorithm: Parse Run Data

Inputs:
- filePath: string - S3 path to file
- site: Site - Processing context

Outputs:
- void (creates Run on success, updates status on failure)

Assumptions:
- File exists at filePath
- Parser API is available
- DXAI Analyser is available

Steps:
1. Calculate file hash (MD5)
2. Check for duplicate:
a. Query runs.file_hash WHERE site_id = site.id
b. If exists, move to Problem_Files with status 'DUPLICATE'
c. Return early
3. Parse file via Parser API:
a. Download from S3 to temp
b. Call parse_run_file()
c. On failure, move to Problem_Files with status 'PARSE_ERROR'
4. Transform via ParsedJson:
a. Match mixes using well_properties_import_approach
b. Extract well properties per approach
c. Prepare observations for DXAI
d. Call extend_json() for classification
5. Import via CreateRunAction:
a. Create Run record
b. Normalize well data
c. Validate wells (generate error codes)
d. Analyze and summarize
e. Persist wells and observations
6. Update progress to 'IMPORTED' with run_id
7. Move file to archive bucket
8. Dispatch next file for parsing

Error Handling:
- DuplicateRunImportException: status = 'DUPLICATE'
- Parser exception: status = 'PARSE_ERROR'
- Import exception: status = 'IMPORT_ERROR' with message
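The error-handling rules above map each failure class to a terminal status. A sketch of that mapping (DuplicateRunImportException is the real exception class listed in §8.1; `ParserException` is an illustrative stand-in for whatever the Parser API call throws):

```typescript
// Map pipeline failures to terminal import statuses, per §5.2 Error Handling.
class DuplicateRunImportException extends Error {}
class ParserException extends Error {} // stand-in for Parser API failures

function statusForError(err: Error): 'DUPLICATE' | 'PARSE_ERROR' | 'IMPORT_ERROR' {
  if (err instanceof DuplicateRunImportException) return 'DUPLICATE';
  if (err instanceof ParserException) return 'PARSE_ERROR';
  return 'IMPORT_ERROR'; // validation/database errors, with message preserved
}
```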

5.3 Well Properties Import Approaches

The system supports three approaches for extracting well properties:

| Approach | Mix Source | Thermocycler Source | Crossover Support |
| --- | --- | --- | --- |
| Well Label | T: tag in label | JSON field or label | Yes (Y:XOVER tag) |
| Meta Data | File metadata | JSON field | No |
| Run File Name | Filename segment | Filename segment | No |

Algorithm: Determine Well Properties Import Approach

Steps:
1. Read client_configurations.well_properties_import_approach
2. If 'Label':
a. Parse label using WellLabel class
b. Extract tags: A, C, D, E, T, R, N, X, Y, W
c. E tag destructures to extraction_instrument + batch_number
3. If 'Meta Data':
a. Use mix.id from file metadata
b. Get thermocycler from JSON field
4. If 'Run File Name':
a. Parse: {mix}.{thermocycler}.{protocol}.{batch}.{analyst}.{ext}
b. Generate virtual labels for "Unknown" wells: {batch}-{well_number}
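For the Run File Name approach, step 4 splits the filename into dot-delimited segments and synthesizes labels for "Unknown" wells. A sketch under the assumption that the filename contains exactly the six segments listed (the real RunFileName normalizer may be more tolerant):

```typescript
// Parse {mix}.{thermocycler}.{protocol}.{batch}.{analyst}.{ext} (assumed strict).
interface RunFileNameParts {
  mix: string;
  thermocycler: string;
  protocol: string;
  batch: string;
  analyst: string;
  ext: string;
}

function parseRunFileName(fileName: string): RunFileNameParts | null {
  const parts = fileName.split('.');
  if (parts.length !== 6) return null; // unexpected segment count
  const [mix, thermocycler, protocol, batch, analyst, ext] = parts;
  return { mix, thermocycler, protocol, batch, analyst, ext };
}

// Virtual label for an "Unknown" well: {batch}-{well_number}
function virtualLabel(batch: string, wellNumber: string): string {
  return `${batch}-${wellNumber}`;
}
```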

5.4 Wildcard Target Matching

Algorithm: Match Target with Wildcard

Inputs:
- configTarget: string - Configured target name (may contain *)
- runTarget: string - Target name from run file

Outputs:
- boolean - Whether targets match

Steps:
1. Convert configTarget to regex:
- Replace * with .*
- Wrap with /^...$/
2. Test runTarget against regex
3. Return match result

Examples:
- "Target*" matches "Target A", "Target B", "Target123"
- "HBV*" matches "HBV-FAM", "HBV-Cy5"

5.5 Prepend Fake Cycles

Algorithm: Prepare Readings with Fake Cycles

Inputs:
- readings: number[] - Original observation readings
- target: Target - Target with prepend_fake_cycles_count

Outputs:
- number[] - Readings with prepended fake cycles

Assumptions:
- prepend_fake_cycles_count >= 0

Steps:
1. If target.prepend_fake_cycles_count == 0:
- Return original readings
2. Create fake readings array:
- Length: prepend_fake_cycles_count
- Value: repeat first reading value
3. Concatenate fake readings + original readings
4. Return combined array

Post-Processing:
- After DXAI returns dxai_ct, subtract prepend_fake_cycles_count
- This adjusts CT back to original cycle numbering
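The prepend and the post-processing CT adjustment can be sketched as a pair of pure functions (names are ours; the real logic lives in ParsedJson.prepareReadingsForExtend() and getCtFromExtendedData()):

```typescript
// Prepend `count` copies of the first reading, per the algorithm above.
function prependFakeCycles(readings: number[], count: number): number[] {
  if (count === 0 || readings.length === 0) return readings;
  return [...Array<number>(count).fill(readings[0]), ...readings];
}

// After DXAI returns dxai_ct, shift it back to the original cycle numbering.
function adjustCt(dxaiCt: number, count: number): number {
  return dxaiCt - count;
}
```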

6. Error Handling

6.1 Validation Errors

| Error Code | Condition | Detection Point | User Impact |
| --- | --- | --- | --- |
| SAMPLE_LABEL_IS_BAD | Unrecognized label format | ParsedJson | Well shows error |
| ACCESSION_MISSING | Patient well lacks accession | PatientValidator | Well shows error |
| EXTRACTION_BATCH_MISSING | Missing E tag or batch | WellLabel, PatientValidator | Well shows error |
| EXTRACTION_INSTRUMENT_MISSING | Missing extraction instrument | PatientValidator | Well shows error |
| EXTRACTION_DATE_MISSING | Missing extraction date | PatientValidator | Well shows error |
| INVALID_EXTRACTION_DATE | Date after file creation | PatientValidator | Well shows error |
| TESTCODE_MISSING | Missing test code (Label approach) | PatientValidator | Well shows error |
| UNKNOWN_TESTCODE | Test code not configured | TestcodeExists rule | Well shows error |
| MIX_DIDNT_MATCH_TC | Mix doesn't match test code | MixMatchesTestCode rule | Well shows error |
| MIX_MISSING | Mix not found | ParsedJson | Well shows error |
| THERMOCYCLER_UNKNOWN | Thermocycler not configured | ParsedJson, CreateRunAction | All wells show error |
| INVALID_PASSIVE_READINGS | Zero in passive dye readings | ParsedJson | Well shows error |
| INVALID_TISSUE_WEIGHT | Weight format/range invalid | PatientValidator | Well shows error |
| CROSSOVER_LABEL_ERROR | Crossover label missing tags | PatientValidator | Well shows error |

6.2 Import-Level Errors

| Condition | Status | File Routing | REQ Trace |
| --- | --- | --- | --- |
| Duplicate file hash | DUPLICATE | Problem_Files | REQ-FILEIMPORT-011 |
| Parser API failure | PARSE_ERROR | Problem_Files | REQ-FILEIMPORT-002 |
| DXAI API failure | IMPORT_ERROR | Problem_Files | REQ-FILEIMPORT-003 |
| Database error | IMPORT_ERROR | Problem_Files | REQ-FILEIMPORT-001 |

6.3 Invalid Data Persistence

Per REQ-FILEIMPORT-012, invalid well data is persisted to allow user correction:

```typescript
// Well model includes invalid_raw_data JSON field
interface WellInvalidData {
  accession?: string;
  extraction_date?: string;
  extraction_instrument?: string;
  tissue_weight?: string;
  crossover_role_alias?: string;
  batch_number?: string;
  testcode_name?: string;
  quantity_multiplier?: string;
}
```

Invalid values are stored in the invalid_raw_data JSON column and displayed in the UI for user visibility.


7. Configuration

| Setting | Location | Default | Effect | REQ Trace |
| --- | --- | --- | --- | --- |
| well_properties_import_approach | client_configurations | Label | Import method selection | REQ-FILEIMPORT-005, REQ-FILEIMPORT-006, REQ-FILEIMPORT-007 |
| thermocycler_json_field | client_configurations | None | JSON path to serial number | REQ-FILEIMPORT-002 |
| use_passive_dye | client_configurations | false | Divide readings by passive | REQ-FILEIMPORT-003 |
| prepend_fake_cycles_count | targets | 0 | Cycles to prepend per target | REQ-FILEIMPORT-009 |
| calibration_file_path | targets | null | S3 URI for DXAI calibration | REQ-FILEIMPORT-003 |
| parser.url | .env | - | Parser API endpoint | REQ-FILEIMPORT-002 |
| parser.api_key | .env | - | Parser API authentication | REQ-FILEIMPORT-002 |
| parser.version | .env | - | Parser API version | REQ-FILEIMPORT-003 |

See Configuration Reference for details.


8. Implementation Mapping

8.1 Code Locations

| Component | Type | Path |
| --- | --- | --- |
| RunFilesController | Controller | code/app/Http/Controllers/RunFilesController.php |
| RunFileImportProgressesController | Controller | code/app/Http/Controllers/RunFileImportProgressesController.php |
| StoreRunFilesAction | Action | code/app/Actions/RunFiles/StoreRunFilesAction.php |
| DispatchNextFileToParsingAction | Action | code/app/Actions/RunFiles/DispatchNextFileToParsingAction.php |
| RunDataParseAction | Action | code/app/Actions/RunDataParseAction.php |
| CreateRunAction | Action | code/app/Actions/Runs/CreateRunAction.php |
| RunDataParseJob | Job | code/app/Jobs/RunDataParseJob.php |
| ParsedJson | Service | code/app/RunFileConverter/ParsedJson.php |
| WildCardTargetNameMatcher | Service | code/app/RunFileConverter/Support/WildCardTargetNameMatcher.php |
| GetAutomaticBaselineFromMixChannel | Service | code/app/RunFileConverter/GetAutomaticBaselineFromMixChannel.php |
| WellLabel | Normalizer | code/app/Analyzer/Normalizer/WellLabel.php |
| RunFileName | Normalizer | code/app/Analyzer/Normalizer/RunFileName.php |
| PatientValidator | Validator | code/app/Analyzer/Validators/PatientValidator.php |
| RunFileImportProgress | Model | code/app/RunFileImportProgress.php |
| RunFileImportProgressBroadcast | Event | code/app/Events/RunFileImportProgressBroadcast.php |
| RunFileImported | Event | code/app/Events/RunFileImported.php |
| DuplicateRunImportException | Exception | code/app/Exceptions/DuplicateRunImportException.php |

8.2 Requirement Traceability

| REQ ID | Design Section | Code Location |
| --- | --- | --- |
| REQ-FILEIMPORT-001 | §5.1, §5.2 | StoreRunFilesAction, RunDataParseJob, RunDataParseAction |
| REQ-FILEIMPORT-002 | §5.2 | RunDataParseAction.parse(), ParsedJson |
| REQ-FILEIMPORT-003 | §5.2 | ParsedJson.getExtendJsonData() |
| REQ-FILEIMPORT-004 | §6.1 | PatientValidator, CreateRunAction |
| REQ-FILEIMPORT-005 | §5.3 | RunFileName, ParsedJson.alterMixesUsingWellMetaData() |
| REQ-FILEIMPORT-006 | §5.3 | ParsedJson.getBaselineStart/End(), GetAutomaticBaselineFromMixChannel |
| REQ-FILEIMPORT-007 | §5.3 | WellLabel.getIsCrossover(), ParsedJson.alterMixesUsingWellLabel() |
| REQ-FILEIMPORT-008 | §5.4 | WildCardTargetNameMatcher.doesMatch() |
| REQ-FILEIMPORT-009 | §5.5 | ParsedJson.prepareReadingsForExtend(), ParsedJson.getCtFromExtendedData() |
| REQ-FILEIMPORT-010 | §4.4 | StoreRunFilesAction, RunDataParseAction.moveFileTo*() |
| REQ-FILEIMPORT-011 | §5.2 step 2 | RunDataParseAction.checkForDuplicateImport() |
| REQ-FILEIMPORT-012 | §6.3 | CreateRunAction (invalid_raw_data field) |
| REQ-FILEIMPORT-013 | §5.1 step 5 | StoreRunFilesAction (original_name), RunFileImported |
| REQ-UPLOAD-001 | §5.1 | RunFilesController.store(), StoreRunFilesAction |
| REQ-UPLOAD-002 | §4.2 | RunFilesController.store() validation |
| REQ-UPLOAD-003 | §4.3 | RunFileImportProgressBroadcast, RunFileImportProgress |
| REQ-UPLOAD-004 | §3.1 | RunFileImportProgress.run_id |
| REQ-UPLOAD-005 | §6.2 | RunFileImportProgress.status_message |
| REQ-UPLOAD-006 | §4.2 | RunFilesController.destroy() |
| REQ-UPLOAD-007 | §5.1 | StoreRunFilesAction (per-file processing) |

9. Design Decisions

| Decision | Rationale | Alternatives Considered |
| --- | --- | --- |
| Sequential file processing | Prevents resource contention with Parser/DXAI APIs | Parallel processing (rejected: API rate limits, memory) |
| MD5 for duplicate detection | Fast, sufficient collision resistance for file identification | SHA-256 (rejected: slower, no benefit for this use case) |
| Preserve invalid data | Users need visibility to correct problems | Reject entirely (rejected: loses context for correction) |
| Three import approaches | Different labs have different thermocycler configurations | Single approach (rejected: incompatible with existing workflows) |
| Wildcard regex for targets | Flexible matching for variant target names | Exact match only (rejected: too rigid for real-world naming) |
| Separate archive bucket | Completed files shouldn't be accessible to clients | Same bucket subfolder (rejected: access control complexity) |
| Queue-based processing | Decouples upload from processing, better resilience | Synchronous processing (rejected: timeout risk, poor UX) |

10. Related Documents

| Document | Relevant Sections |
| --- | --- |
| SRS: fileimport.md | Requirements source |
| SRS: upload-runs.md | Upload requirements source |
| SDS: Architecture | Queue/job processing |
| SDS: KITCFG Domain | Mix/target configuration |
| SDS: RUNRPT Domain | Created run records |
| SDD: Algorithms | Legacy algorithm documentation |