Sarcomas are cancers of the bone and soft tissue often defined by their gene fusions. However, the timing, context, and processes by which these pathogenic fusions arise are unknown. We explored this in Ewing sarcoma (ES), a cancer driven by EWSR1-ETS fusions, with very few cooperating mutations. Combining whole-genome sequencing with enhanced informatics, we found that the EWSR1-ETS fusion arose from striking rearrangement clusters in 42% of cases (52/124). Notably, these were organized in loops that universally contained the fusion at their center, while also weaving up to 18 genes together with it. We found the same pattern of rearrangements in three additional types of sarcoma. From these data, we define a new signature for sarcoma fusions that precedes other somatic changes, in the earliest replicating DNA of the genome. This dramatic, sudden process impinges on many genes – generating multiple coding changes that profoundly affect the transcriptome, with the disease-defining gene fusion at its core. These rearrangement loops emerge in an early ES clone, from which both the primary tumor and the lethal relapse arose and then evolved in parallel until clinically detected.
Infiltration of human cancers by T cells is generally interpreted as a sign of immune recognition, and there is a growing effort to reactivate dysfunctional T cells at such tumor sites. However, these efforts only have value if the intratumoral T cell receptor (TCR) repertoire of such cells is intrinsically tumor-reactive, and this has not been established in an unbiased manner for most human cancers. To address this issue, we analyzed the intrinsic tumor-reactivity of the intratumoral TCR repertoire of CD8+ T cells in ovarian and colorectal cancer – two tumor types for which T cell infiltrates form a positive prognostic marker. Data obtained demonstrate that a capacity to recognize autologous tumor is limited to approximately 10% of intratumoral CD8+ T cells. Furthermore, in two out of four patient samples tested, no tumor-reactive TCRs were identified, despite infiltration of their tumors by T cells. These data indicate that the intrinsic capacity of intratumoral T cells to recognize adjacent tumor tissue can be rare and variable, and suggest that clinical efforts to reactivate intratumoral T cells will benefit from approaches that simultaneously increase the quality of the intratumoral TCR repertoire.
Testicular germ cell tumour (TGCT) is the most common cancer in young men [1,2], and is characterised by strong inherited genetic risk factors [3]. Here we have undertaken a large-scale genome-wide association study (GWAS) for TGCT, encompassing ~7,500 cases and ~23,000 controls, through the Oncoarray consortium. We identified 19 novel loci, approximately doubling the number of known TGCT risk loci to 44 (P < 5 × 10^-8) [4-14], and conducted deep, high-throughput functional annotation of all risk loci. We establish a network of physical interactions for all risk SNPs to candidate causal genes in 3D space, using high-throughput chromosome conformation capture techniques (HiC) in TGCT cells. Firstly, functional evidence reveals widespread disruption of developmental transcriptional regulators, consistent with failed primordial germ cell differentiation as an initiating step in TGCT oncogenesis [15]. Secondly, we observe multiple risk loci associated with defective microtubule assembly, compatible with the high level of aneuploidy observed in TGCTs, suggesting gross chromosomal instability. Finally, KIT-MAPK signalling features as a recurrently dysregulated pathway. In summary, our findings substantially increase the number of known TGCT risk alleles, and provide a functional basis for disease susceptibility.
Synovial sarcoma (SS) is defined by a recurrent t(X;18) chromosomal translocation, which produces the hallmark SS18-SSX oncogenic fusion. Incorporation of SS18-SSX into BAF complexes renders BAF complexes aberrant in two distinct manners: the addition of 78aa of SSX onto SS18, and concomitant loss of BAF47 assembly. However, the importance and functional contributions of each of these perturbations on BAF complex targeting and gene expression regulation remain unclear. Here we use an integrative set of genomic approaches in human cancer cell lines and primary tumor samples to define the mechanistic consequences of the SS18-SSX fusion oncoprotein. We find that SS18-SSX hijacks BAF complexes to broad polycomb domains to activate bivalent genes, driving a unique gene expression program distinct from other loss-of-function BAF complex malignancies. Importantly, restoration of BAF47 rescues enhancer activation but is dispensable for proliferative arrest in cell lines. These results demonstrate that gain-of-function SS18-SSX-mediated BAF complex targeting and gene activation is the driving event in SS, and present a mechanism by which distinct functions of BAF complexes can be co-opted to drive oncogenesis.
Transcriptional deregulation is a central event in the development of acute myeloid leukemia (AML). To identify potential disturbances in gene regulation, we conducted an unbiased screen of allele-specific expression (ASE) in 209 AML cases. The gene encoding GATA binding protein 2 (GATA2) displayed ASE more often than any other myeloid or cancer-related gene. GATA2 ASE was strongly associated with CEBPA double mutations (CEBPA DM), with 95% of cases presenting GATA2 ASE. In CEBPA DM AML with GATA2 mutations, the mutated allele was preferentially expressed. We found that GATA2 ASE is a somatic event lost in complete remission, supporting the notion that it plays a role in CEBPA DM AML. Acquisition of GATA2 ASE involved silencing of one allele via promoter methylation, compensated by overactivation of the other allele, thereby preserving expression levels. Notably, promoter methylation was also lost in remission together with GATA2 ASE. In summary, we propose that GATA2 ASE is acquired by epigenetic mechanisms and is a prerequisite for the development of AML with CEBPA DM. This finding constitutes a novel example of an epigenetic hit cooperating with a genetic hit in the pathogenesis of AML.
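As a rough illustration of what an allele-specific expression screen tests, the sketch below applies an exact two-sided binomial test to RNA read counts at a single heterozygous site, under the null hypothesis that both alleles are expressed equally. This is a generic approach and not necessarily the method used in the study; the function names, the 0.5 null proportion, and the example read counts are all assumptions made for illustration.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k. Intended for modest read depths (n up to about
    1000), since the binomial coefficients are converted to floats.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so that ties are not lost to rounding.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def major_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the more abundant allele (0.5 = balanced)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return max(ref_reads, alt_reads) / total


# Hypothetical read counts at one heterozygous SNP (illustrative only).
ref_reads, alt_reads = 46, 4
fraction = major_allele_fraction(ref_reads, alt_reads)
p_value = binom_two_sided_p(alt_reads, ref_reads + alt_reads)
print(f"major allele fraction = {fraction:.2f}, p = {p_value:.3g}")
```

In practice, a screen of this kind would also need to correct for multiple testing and for reference-allele mapping bias, which this sketch leaves out.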
Single-molecule molecular inversion probes (smMIPs) provide a modular and cost-effective platform for high-multiplex targeted next-generation sequencing (NGS). Nevertheless, translating the raw smMIP-derived sequencing data into accurate and meaningful information currently requires proficient computational skills and a large amount of computational work, prohibiting wide-scale adoption of smMIP-based technologies. To enable easy, efficient, and accurate interrogation of smMIP-derived data, we developed SmMIP-tools, a computational toolset that combines the critical analytic steps for smMIP data interpretation into a single computational pipeline. Here, we describe in detail two major components of the software. The first is a read processing tool that performs quality control steps, generates read-smMIP linkages and retrieves molecular tags. The second is an error-aware variant caller capable of detecting single nucleotide variants (SNVs) and short insertions and deletions (indels). Using a cell-line DNA dilution series and a cohort of blood cancer patients, we benchmarked SmMIP-tools and evaluated its performance against clinical sequencing reports. We anticipate that SmMIP-tools will increase accessibility to smMIP-technology, enabling cost-effective genetic research to push personalized medicine forward.
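To make the idea of retrieving molecular tags and linking reads to probes more concrete, the sketch below shows one common way such tags are used: reads are grouped by probe and tag, and each group is collapsed to a majority-vote consensus. It is not SmMIP-tools code and does not reflect that tool's interface or algorithm. The read layout (tag in the first 8 bases), the `UMI_LENGTH` constant, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

UMI_LENGTH = 8  # hypothetical: assumes the tag occupies the first 8 bases


def split_umi(read_seq: str, umi_length: int = UMI_LENGTH) -> tuple[str, str]:
    """Split a read into (molecular tag, remaining insert sequence)."""
    if len(read_seq) <= umi_length:
        raise ValueError("read is too short to contain a tag and an insert")
    return read_seq[:umi_length], read_seq[umi_length:]


def group_by_tag(reads):
    """Group insert sequences by (probe_id, tag).

    `reads` is an iterable of (probe_id, sequence) pairs, where probe_id
    is assumed to have been assigned by an upstream read-to-probe linkage
    step.
    """
    families = defaultdict(list)
    for probe_id, seq in reads:
        umi, insert = split_umi(seq)
        families[(probe_id, umi)].append(insert)
    return families


def consensus(inserts: list[str]) -> str:
    """Majority base at each position, up to the shortest read's length."""
    length = min(len(s) for s in inserts)
    return "".join(
        Counter(s[i] for s in inserts).most_common(1)[0][0]
        for i in range(length)
    )


# Toy input: three reads from one molecule (one carries a likely error
# in the last base) and one read from a different probe.
reads = [
    ("probe1", "AAAAAAAAACGT"),
    ("probe1", "AAAAAAAAACGT"),
    ("probe1", "AAAAAAAAACGA"),
    ("probe2", "CCCCCCCCGGGG"),
]
families = group_by_tag(reads)
collapsed = {key: consensus(seqs) for key, seqs in families.items()}
```

Collapsing each read family separates errors introduced during amplification or sequencing, which tend to appear in a minority of reads from one molecule, from true variants, which should be present in all of them.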
Ductal Carcinoma In Situ (DCIS) is the most common form of pre-invasive breast cancer, and despite treatment, a small fraction (5-10%) of DCIS patients present with invasive disease many years later. A fundamental biologic question is whether the invasive disease recurring in the same breast is established by tumor cells in the initial DCIS or represents new unrelated disease. To address this question, we performed genomic analyses on the initial pure DCIS lesion and paired invasive recurrent tumors in 95 patients together with single cell DNA sequencing in a subset of cases. Our data show that in 75% of cases the invasive recurrence was clonally related to the initial DCIS, suggesting that the tumor cells were not eliminated during the initial treatment with surgery +/- radiotherapy. Surprisingly, however, 18% were clonally unrelated to the DCIS, representing new independent lineages, and 7% of cases were ambiguous. Our findings show that although DCIS is often the precursor of invasive recurrence, a significant fraction of invasive recurrences are unrelated to the initial DCIS. This knowledge is essential for accurate risk evaluation of DCIS treatment de-escalation strategies and the identification of predictive biomarkers.
To elucidate the timing and mechanism of the clonal expansion of somatic mutations in cancer-associated genes in the normal endometrium, we conducted target sequencing of 112 genes for 1,298 endometrial glands and matched blood samples from 36 women. By collecting endometrial glands from different parts of the endometrium, we showed that multiple glands with the same somatic mutations occupied substantial areas of the endometrium. The 112 genes are as follows: ABCC1, ACRC, ANK3, ARHGAP35, ARID1A, ARID5B, ATCAY, ATM, ATR, BARD1, BCOR, BRCA1, BRCA2, BRD4, BRIP1, CAMTA1, CDC23, CDYL, CFAP54, CHD4, CHEK1, CHEK2, CTCF, CTNNB1, CUX1, DGKA, DISP2, DYNC2H1, EMSY, FAAP24, FAM135B, FAM175A, FAM65C, FANCA, FANCB, FANCC, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCL, FANCM, FAT1, FAT3, FBN2, FBXW7, FGFR2, FRG1, GPR50, HEATR1, HIST1H4B, HNRNPCL1, HOOK3, KIAA1109, KIF26A, KMT2B, KMT2C, KRAS, LAMA2, LRP1B, MLH1, MON2, MRE11A, MSH2, MSH6, MTOR, NBN, PALB2, PHEX, PIK3CA, PIK3R1, PLXNB2, PLXND1, PMS2, POLE, POLR3B, PPP2R1A, PTEN, PTPN13, RAD50, RAD51, RAD51B, RAD51C, RAD51D, RAD52, RAD54B, RAD54L, RICTOR, SACS, SIGLEC9, SLC19A1, SLX4, SPEG, STT3A, TAF1, TAF2, TAS2R31, TFAP2C, TNC, TONSL, TP53, TTC6, UBA7, VNN1, WT1, XIRP2, ZBED6, ZC3H13, ZFHX3, ZFHX4, ZMYM4.
Precision oncology approaches employing genomics-guided targeted therapies for individual patients have provided significant survival benefits in several cancer types. However, low response rates in most solid malignancies, many patients without actionable genomic lesions, and increasing evidence that non-genomic mechanisms may play an important role in tumors indicate that genomics alone is often insufficient to inform and guide the clinical care of patients. Here, we show for the first time that comprehensive (phospho)proteome profiling is feasible and informative in a real-world prospective precision oncology setting. We developed a novel tumor pathway activity (TUPAC) scoring methodology that provides a holistic perspective on dysregulated receptor tyrosine kinase (RTK) signaling in individual patients. Based on 919 tumor tissue profiles of patients enrolled in the national NCT/DKTK MASTER study and the INFORM registry trial, we illustrate how TUPAC scoring can be used to uncover individual tumor biology in rare cancers and demonstrate that TUPAC methodology identifies EGFR-driven sarcomas despite the absence of genetic EGFR alterations. Overall, our work demonstrates the utility of the additional phosphoproteome data layer to enhance therapeutic recommendations in molecular tumor boards.