73. AmpliconArchitect: Reconstruction of Complex Rearrangements of Tumor Gene Amplification

Department: Computer Science & Engineering
Faculty Advisor(s): Vineet Bafna

Primary Student
Name: Viraj Balkrishna Deshpande
Email: vdeshpan@ucsd.edu
Phone: 979-224-6028
Grad Year: 2017

Abstract
We show that genomic amplifications, including oncogene amplifications, mediated through extrachromosomal episome formation drive as many as 40% of all human cancers, far higher than the previous estimate of 1.4% of all cancers. These gene amplifications appear to be hotspots for genomic rearrangements. Additionally, virus-mediated cancers, which account for 15% of all cancer cases, display integration of viral DNA into the human genome followed by chimeric amplification of viral and human DNA. Traditional structural variation analysis tools provide only primitive information about genomic rearrangements. We present AmpliconArchitect (AA), a computational approach to elucidate the entire structure of a focal amplification. AA takes as input whole genome sequencing (WGS) paired-end reads aligned to a combined human-virus reference genome and proceeds through the following steps: (a) Use discordant read-pair alignments and coverage information to iteratively identify and extend connected genomic regions with high copy numbers. (b) Segment the amplified regions using a mean-shift technique to detect copy number changes. (c) Use discordant read-pair clusters to construct a breakpoint graph connecting the segments. (d) Compute a maximum likelihood network flow to estimate the copy counts of the genomic segments. (e) Report a decomposition of the graph into paths and cycles that identify the dominant linear and circular structures involved in the chimeric human-viral amplicon. We demonstrate that AA can accurately predict the true architecture by simulating 320 focal amplifications across a diverse set of parameters, including size, copy number, sequence coverage, and number of rearrangement events. To compare the "cycle decompositions" produced in step (e), we propose the "Repeat-DCJ" problem, a special version of the double-cut-and-join problem that measures rearrangement complexity by allowing graph edge swaps only across repeated segments.
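The decomposition in step (e) can be illustrated with a small sketch. The abstract does not describe how AA performs this step, so the code below is only a generic greedy flow decomposition under simplifying assumptions: the breakpoint graph is reduced to a plain directed graph whose nodes are segments, each edge carries an integer copy count that is already balanced at every node (as a flow from step (d) would be), and only cycles are extracted, not linear paths. The function name `decompose_cycles` and the input layout are hypothetical.

```python
from collections import defaultdict


def decompose_cycles(edge_counts):
    """Greedily peel cycles off a balanced, copy-count-weighted directed graph.

    edge_counts maps (u, v) -> positive integer copy count. The counts are
    assumed to be balanced (total in == total out at every node); if they are
    not, the walk can reach a dead end and max() raises ValueError.
    Returns a list of (cycle_nodes, copy_count) in the order they were found.
    """
    remaining = {edge: c for edge, c in edge_counts.items() if c > 0}
    cycles = []
    while remaining:
        # Rebuild adjacency from the edges that still carry copies.
        adjacency = defaultdict(list)
        for u, v in remaining:
            adjacency[u].append(v)

        # Start at the tail of the heaviest edge and always follow the
        # heaviest outgoing edge until some node is visited twice.
        start = max(remaining, key=remaining.get)[0]
        path = [start]
        position = {start: 0}
        node = start
        while True:
            nxt = max(adjacency[node], key=lambda v: remaining[(node, v)])
            if nxt in position:
                cycle = path[position[nxt]:]
                break
            position[nxt] = len(path)
            path.append(nxt)
            node = nxt

        # The cycle's copy count is its bottleneck edge; subtract it.
        edges = list(zip(cycle, cycle[1:] + cycle[:1]))
        count = min(remaining[e] for e in edges)
        for e in edges:
            remaining[e] -= count
            if remaining[e] == 0:
                del remaining[e]
        cycles.append((cycle, count))
    return cycles


# Toy example: a 3-segment circle at 3 copies plus a 2-segment circle at 2.
example = {("A", "B"): 5, ("B", "C"): 3, ("C", "A"): 3, ("B", "A"): 2}
print(decompose_cycles(example))
# prints [(['A', 'B', 'C'], 3), (['A', 'B'], 2)]
```

Following the heaviest edge first is one way to surface the dominant structures early, but a greedy decomposition of this kind is not unique, which is part of why a measure for comparing decompositions is useful.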
We present an analysis of amplicon reconstruction in 117 cancer samples across 13 different cancer types and contrast these results with The Cancer Genome Atlas database of 11079 samples. We show that oncogenes are enriched on these amplifications. Additionally, we found that 13/21 cases of cervical cancer displayed chimeric amplification of human papillomavirus (HPV) along with portions of the human genome. These chimeric amplifications contain oncogenes from the viral DNA as well as the human DNA. In conclusion, AA makes major advances towards the reconstruction of complex rearrangements and provides valuable biological insights into focal amplification in human cancer.

Industry Application Area(s)
Life Sciences/Medical Devices & Instruments
