Gulhan Lab

Research

Mutational Processes

Similar to how archaeologists piece together human history from relics and ruins, we dig into cancer genomes and track down the footprints of biological processes using signature analysis techniques. The observation underlying this approach is that the biological processes driving cancer leave distinctive mutational patterns on the genome, referred to as signatures. Cancer genomes record the cumulative effect of these processes over time, allowing us to detect them decades after their activity. We gauge the activity levels of the different processes contributing to malignant transformation and observe how they change over time. Most importantly, the link between mutational signatures and biological mechanisms provides a window into the tumors' molecular identity.
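As a minimal sketch of the idea above (not the lab's own method), a tumor's mutation catalogue can be modelled as a non-negative mixture of signatures, and the activity of each process estimated by fitting that mixture. The toy example assumes two made-up signatures over four mutation channels, where real catalogues use many more channels, and the function name `fit_exposures` is hypothetical. The fit uses standard expectation-maximization updates for a Poisson mixture.

```python
def fit_exposures(catalogue, signatures, n_iter=200):
    """Estimate one non-negative activity per signature.

    catalogue  : mutation counts per channel
    signatures : list of probability vectors (each sums to 1), one per process
    """
    n_sig = len(signatures)
    n_channels = len(catalogue)
    # Start from equal activities that add up to the total mutation count.
    exposures = [sum(catalogue) / n_sig] * n_sig
    for _ in range(n_iter):
        # Expected count in each channel under the current activities.
        expected = [
            sum(exposures[k] * signatures[k][c] for k in range(n_sig))
            for c in range(n_channels)
        ]
        # EM update: reassign the observed counts to signatures in proportion
        # to each signature's share of the expected count in that channel.
        exposures = [
            exposures[k] * sum(
                signatures[k][c] * catalogue[c] / expected[c]
                for c in range(n_channels) if expected[c] > 0
            )
            for k in range(n_sig)
        ]
    return exposures


# Hypothetical toy data: two signatures over four channels.
toy_signatures = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
# Catalogue built as 300 mutations from the first signature plus 100 from the second.
toy_catalogue = [220, 40, 40, 100]

print([round(x, 1) for x in fit_exposures(toy_catalogue, toy_signatures)])
```

Because the toy catalogue is an exact mixture of the two signatures, the fitted activities approach the 300 and 100 mutations used to build it. Real data are noisy and signatures overlap, which is one reason the modelling choices matter.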

Although signature analysis has been instrumental in transforming our understanding of the mechanisms of genomic instability in cancer, current statistical modelling approaches still have several important shortcomings. Addressing these challenges requires advanced computational methods that can incorporate the complexity of mutagenesis in cancer. Our lab contributes to the development of the next generation of signature analysis methods by rethinking the design of the models we use.

An important consideration is the dependence of signatures on genomic locus and time (see next section). The activity of each process depends on various features such as replication/transcription timing and strand, histone marks, and genome organization. A big contributor to this variability is the repair pathways, whose activity is highly influenced by the epigenome, replication, transcription, and the cell cycle. Relying on the vast history of research on these links and combining it with the findings from locus-dependent signature analysis enables improved mechanistic interpretation and helps in annotating the etiology of signatures. Additionally, these methods are instrumental in determining how signatures contribute to key cancer-promoting events.

Mutational patterns are jointly shaped by error mechanisms (such as DNA damage, replication errors, and error-prone repair) and by the correction mechanisms of DNA proofreading and repair. Isolating these factors will significantly enhance our understanding of repair deficiencies. We aim to decouple the contributions of damage, repair, and replication to the mutational signatures that we traditionally use. This is a complex problem, in particular because repair pathway activity is not uniform across cell cycle stages and differs depending on the underlying damage or error process. However, it is a very important problem to solve, because in repair-deficient tumors the signature of a given mutational process appears different from the one observed in repair-proficient tumors. As the cancer datasets we mine grow over time, we are gaining sensitivity to the nonlinear effects in repair-deficient tumors. This poses challenges for reliably discovering a unique pattern of mutations for each biological process.

Our methods aim to improve the interpretability of signature analysis and thereby facilitate the use of signatures as clinical biomarkers. We discuss the clinical applications we pursue in the next section.

Tumor Evolution

Interpreting large-scale genomic rearrangements is more challenging than interpreting point mutations because of their long-range effects, which cause alterations to intersect. As a result, the temporal sequence of alterations matters, yet existing methods do not consider temporal dynamics when predicting and interpreting the patterns of large-scale genomic alterations. This can lead to mutation types that are not defined accurately. Unlike point mutations, large-scale alterations can overlap with each other, and they do so frequently. For example, two overlapping copy number alterations produce three segments, none of which has the length distribution of the original segments whose copy numbers were changed (shown in the right-hand figure). Through a deconvolution in time, it is possible to resolve such ambiguities. Once again, we exploit large cancer datasets to train reinforcement learning algorithms that aim to predict the sequence of alterations while inferring the characteristic alteration patterns that govern the processes.
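The overlap example can be made concrete with a small sketch. It assumes a deliberately simplified additive model in which each event adds or removes copies over a half-open interval, so the order of events does not change the final profile. That is exactly the simplification that real rearrangements break. The function name `observed_segments` and the coordinates are hypothetical.

```python
def observed_segments(events, baseline=2):
    """Return the copy number profile seen after all events are applied.

    events : list of (start, end, delta) tuples on half-open intervals [start, end)
    Returns a list of (start, end, copy_number) segments between breakpoints.
    """
    breakpoints = sorted({pos for start, end, _ in events for pos in (start, end)})
    segments = []
    for left, right in zip(breakpoints, breakpoints[1:]):
        # Sum the copy number changes of every event that covers this interval.
        copies = baseline + sum(
            delta for start, end, delta in events if start <= left and right <= end
        )
        # Merge with the previous segment when the copy number is unchanged.
        if segments and segments[-1][1] == left and segments[-1][2] == copies:
            segments[-1] = (segments[-1][0], right, copies)
        else:
            segments.append((left, right, copies))
    return segments


# Two overlapping single-copy gains of length 50 and 60 (arbitrary units).
events = [(10, 60, +1), (40, 100, +1)]
for start, end, copies in observed_segments(events):
    print(start, end, copies, "length", end - start)
```

The two gains produce three observed segments of lengths 30, 20, and 40, none of which matches the original event lengths of 50 and 60. Recovering the original events from such a profile is the ambiguity that a deconvolution in time is meant to resolve.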

Point mutations can be used to time copy number amplifications under the assumption that mutation counts reflect a molecular clock, meaning that older cells carry more mutations. This is based on the simple observation that early amplifications carry higher counts of late point mutations (generated after the amplification, on single copies of the chromosome), while late amplifications carry higher counts of early mutations (duplicated on multiple chromosomes). We implement an algorithm that uses basic linear algebra to find analytical bounds for copy number amplification timing.
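To illustrate the reasoning, here is a sketch of the simplest case only: a single gain of one allele under a constant per-copy mutation rate. It is a textbook-style point estimate under those stated assumptions, not the lab's algorithm or its analytical bounds, and the function name `gain_time` is hypothetical. With rate mu per copy and the gain at fraction t of molecular time, mutations on two copies number mu*t, and mutations on one copy number minor*mu*t + (2 + minor)*mu*(1 - t), where minor is the copy number of the other allele (0 or 1). Solving the two linear equations gives t = (2 + minor) * n_double / (n_single + 2 * n_double).

```python
def gain_time(n_single, n_double, minor_copies=1):
    """Estimate when a single-allele gain occurred, as a fraction of molecular time.

    n_single     : mutations present on one copy in the gained region
    n_double     : mutations present on two copies (acquired before the gain)
    minor_copies : copies of the non-gained allele (1 for a 2+1 state,
                   0 for copy-neutral loss of heterozygosity, 2+0)
    Assumes a constant mutation rate per copy (a simplification).
    """
    if minor_copies not in (0, 1):
        raise ValueError("this sketch only covers 2+1 and 2+0 states")
    weighted_total = n_single + 2 * n_double
    if weighted_total == 0:
        raise ValueError("no mutations to time the gain with")
    return (2 + minor_copies) * n_double / weighted_total


# Hypothetical counts for a 2+1 region.
print(gain_time(n_single=240, n_double=30))   # few duplicated mutations: early gain
print(gain_time(n_single=120, n_double=90))   # many duplicated mutations: late gain
```

Many single-copy mutations relative to duplicated ones push the estimate toward an early gain, and the reverse pushes it toward a late gain, which matches the observation described above.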

We apply our algorithms to infer the evolutionary trajectories of tumors. In particular, we focus on early-stage cancer development and late-stage evolution, which are less well understood. For this purpose, we collaborate with clinical researchers (see below).

Genomic Instability

Classifying a patient's tumor in terms of the active mechanisms of genomic instability is instrumental in personalizing cancer treatments. The goal is to pick the right drug to exploit the specific vulnerabilities that these underlying mechanisms create. These strategies go hand in hand with drug development and can substantially improve patient outcomes, because without the right biomarker, new treatments might not show the desired efficacy in the clinic.

We develop patient classification tools tailored to the sequencing methodologies utilized in clinical practice. Our approaches are rooted in a comprehensive data mining initiative that involves the detailed examination of genomic instability subtypes within vast cancer genome cohorts from large consortia.

Dysfunctional DNA repair pathways and cell-cycle checkpoints are mutational processes that can be detected using signatures and can aid in classifying patients for targeted therapies.

Additionally, there are intricate links between genomic instability and anti-tumor immune responses. Genomic instability triggers innate immune responses and leads to an elevated number of neoantigens, so its detection is pivotal for immunotherapies such as immune checkpoint blockade.

Figure Caption: The complex genomic landscape of a germline BRCA1 carrier. The image was created using Chromoscope.

We mine large public cancer datasets to catalogue different types of genomic instability. We enrich these resources with genomes from unique cohorts of patients through close collaborators. In particular, we spend substantial time working on archetypal samples of repair deficiencies, such as BRCA1/2 mutant tumors as gold-standard representatives of homologous recombination deficient samples. Our previous work on homologous recombination repair and mismatch repair deficient cancers exemplifies our approach. We are expanding this to a larger variety of genomic instability mechanisms. We use archetypal samples first to identify the characteristic signatures for a given genomic instability type and then to train our algorithms.

In addition to classifying the main categories of genomic instability, it is important to appreciate the vast diversity of genotypes among tumors that share a given mechanism. We investigate whether this diversity might be linked with molecular subtypes of genomic instability and whether signatures can aid in their delineation. For instance, can we use structural variant signatures to define homologous recombination deficiency subtypes with differential responses to PARP inhibitors and chemotherapy?

A comprehensive suite of tools for cataloguing genomic instability of various types is very helpful for exploring the diversity within a group, because this diversity can often be explained by the presence of multiple types of genomic instability. Instead of taking a multi-class approach, we score each tumor against a wide set of mechanisms, with the goal of providing a comprehensive, multifaceted characterization.
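The difference between the two approaches can be sketched with a toy example. It assumes each mechanism already has a raw score (a logit) from some upstream model. The mechanism names and numbers are hypothetical and are not outputs of the lab's tools. A multi-class softmax forces the mechanisms to compete for a single label, while independent per-mechanism scores allow a tumor to score high on several at once.

```python
import math


def multiclass_probabilities(logits):
    """Softmax: probabilities sum to 1, so mechanisms compete with each other."""
    peak = max(logits.values())
    weights = {name: math.exp(z - peak) for name, z in logits.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def per_mechanism_scores(logits):
    """Independent sigmoid per mechanism: each score stands on its own."""
    return {name: 1.0 / (1.0 + math.exp(-z)) for name, z in logits.items()}


# Hypothetical tumor with strong evidence for two mechanisms at once.
tumor_logits = {"HRD": 3.0, "MMRD": 3.0, "other": -2.0}

print(multiclass_probabilities(tumor_logits))
print(per_mechanism_scores(tumor_logits))
```

In this toy case the softmax splits its probability between the two mechanisms, so neither exceeds one half, whereas the independent scores flag both as likely active.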

Liquid Biopsies

By expanding our signature analysis toolkit to include applications tailored for sequencing data from liquid biopsies alongside tissue biopsies, we aspire to introduce non-invasive diagnostic tools and novel early cancer detection strategies, with a particular focus on patients with a genetic predisposition. In addition to signature analysis on mutations, we use similar strategies to design fragmentomics signatures, which open up a whole new world of signatures that bring us new insights into the mechanisms of cell-free DNA generation and increase our sensitivity for detecting tumors at low allelic fractions. Stay tuned for more!

Clinical Collaborations

Our research is heavily influenced by collaborations with clinical teams, as we strive to develop methods that translate directly to precision oncology applications. Notably, our team plays a key role in bioinformatics analysis within the Investigational Cancer Therapeutics team at the Termeer Center at Mass General Hospital (MGH). The longitudinal and sequential biopsy samples from the Rapid Autopsy Program provide invaluable data from late-stage cancer patients enrolled in first-in-human trials. Using this rich resource, we investigate the drivers of metastasis to vital organs such as the liver and brain, which often results in patient mortality. Employing our novel mutation timing and tumor phylogeny reconstruction algorithms, we construct genomic trajectories. We integrate this information with the transcriptional profiles and immune microenvironment that we infer from bulk and single-cell RNA sequencing data, thereby obtaining a holistic understanding of the underlying biology.

Another vital partnership is with the Cancer Early Detection and Diagnostics Clinic at MGH, focusing on early cancer detection in individuals with genetic predispositions via circulating tumor DNA. We are working on a prospective study in which tissue and blood samples are collected from patients with a genetic predisposition in order to develop novel early detection algorithms. Our goal is to recognize early-stage cancers and potentially pre-cancerous conditions through non-invasive blood tests. This project also allows us to study the early stages of cancer development.

Are you interested in our research and want to be a part of the team?

CNY 149 13th Street | Charlestown, MA 02129, USA

© Gulhan Lab