Comprehensive benchmarking of methods for mutation calling in circulating tumor DNA
This study provides a comprehensive benchmarking resource for somatic variant detection in cell-free DNA (cfDNA) from cancer patients. Longitudinal plasma samples from colorectal and breast cancer cohorts were selected to create patient-matched dilution series spanning ultra-low to high circulating-tumour-DNA (ctDNA) fractions, while preserving each individual’s germline and clonal haematopoiesis background. Deep whole-genome sequencing (150×) and ultra-deep whole-exome sequencing (2,000×) generated a reference call set of ~37,000 single-nucleotide variants and ~58,000 insertions/deletions. These data enabled systematic evaluation of nine somatic variant callers across variable ctDNA levels and sequencing depths, and were further used to explore machine-learning–guided parameter tuning. The resulting dataset offers an openly accessible framework for developers and clinicians to assess and optimize somatic variant calling in liquid biopsy applications.
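As a rough illustration of what evaluating a caller against a reference call set can look like, the sketch below scores one caller's output by precision, recall, and F1. It is a minimal example under stated assumptions, not the study's pipeline: this page does not say which metrics were used, the function name `score_calls` and the `(chrom, pos, ref, alt)` variant key are hypothetical, and the variants shown are made-up toy values rather than data from the study.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Score(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_calls(called: Set[Variant], reference: Set[Variant]) -> Score:
    """Compare a caller's variants with a reference call set by exact match."""
    tp = len(called & reference)   # called and present in the reference
    fp = len(called - reference)   # called but absent from the reference
    fn = len(reference - called)   # in the reference but missed by the caller
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Score(precision, recall, f1)


# Toy inputs for illustration only (not taken from the dataset).
reference = {
    ("chr1", 100, "A", "T"),
    ("chr1", 250, "G", "C"),
    ("chr2", 75, "C", "CT"),
    ("chr3", 10, "T", "G"),
}
called = {
    ("chr1", 100, "A", "T"),
    ("chr2", 75, "C", "CT"),
    ("chr5", 5, "G", "A"),
}

result = score_calls(called, reference)
print(f"precision={result.precision:.3f} recall={result.recall:.3f} f1={result.f1:.3f}")
```

Repeating such a comparison for each caller at each ctDNA level and sequencing depth would give the kind of grid the abstract describes. Exact-key matching is a simplification: insertions/deletions usually need left-alignment and normalization before they can be compared reliably.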
- Type: Cancer Genomics
- Archiver: European Genome-Phenome Archive (EGA)
Click on a Dataset ID in the table below to learn more and to find out who to contact about access to these data.
| Dataset ID | Description | Technology | Samples |
|---|---|---|---|
| EGAD50000001870 | | Illumina NovaSeq 6000 | 12 |
