Computational Cancer Genomics

Section Leader
Rendong Yang, PhD
Assistant Professor


Developing novel algorithms for indel detection. Our most recent work has been the development of clinical genomic variant detection pipelines for the customized oncology gene panels in the University of Minnesota Molecular Diagnostic Lab. Briefly, we developed a new algorithm named ScanIndel to accurately detect insertion and deletion (indel) mutations in the human genome from next-generation sequencing data. In particular, ScanIndel reliably detects medium-size and large indels. With this method, indels that contribute to the pathogenesis of constitutional and somatic diseases can be identified quickly and accurately, which is important for targeted therapy and patient prognosis.

Detecting gene rearrangements, splicing and epigenetic regulators of AR in prostate cancer. We developed an integrated pipeline to detect structural rearrangements in the AR gene, which encodes the androgen receptor. Through this work, we identified diverse genomic-dependent and genomic-independent AR splicing variants expressed in prostate cancer. Additionally, our early work focused on prostate cancer epigenomic studies, analyzing ChIP-seq data for AR and BRD4 in the VCaP prostate cancer cell line. Our study revealed crosstalk between AR and BRD4 signaling in prostate cancer progression.

Developing algorithms for detecting cancer biomarkers. We have been developing the EgoNet algorithm, which integrates gene expression microarray data and protein interaction networks to identify network modules that can distinguish different breast cancer subtypes.
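The idea of scoring network modules by combining expression data with an interaction network can be illustrated with a small sketch. This is not the EgoNet implementation: it assumes that a module is an "ego network" (a seed gene plus its direct interaction partners) and that a module is scored by the mean absolute difference in average expression between two subtypes. The function names, the scoring rule and the toy data are all hypothetical.

```python
from statistics import mean

def ego_network(ppi, seed):
    """Return the seed gene plus its direct interaction partners (assumed module definition)."""
    return {seed} | ppi.get(seed, set())

def module_score(module, expr, labels, subtype_a, subtype_b):
    """Mean absolute difference in average expression between two subtypes.

    This scoring rule is an illustrative assumption, not the published method.
    """
    idx_a = [i for i, lab in enumerate(labels) if lab == subtype_a]
    idx_b = [i for i, lab in enumerate(labels) if lab == subtype_b]
    diffs = []
    for gene in module:
        if gene not in expr:
            continue  # skip genes without expression measurements
        values = expr[gene]
        diffs.append(abs(mean(values[i] for i in idx_a) - mean(values[i] for i in idx_b)))
    return mean(diffs) if diffs else 0.0

def rank_modules(ppi, expr, labels, subtype_a, subtype_b):
    """Score the ego network of every gene in the network and sort by score, best first."""
    scored = [
        (seed, module_score(ego_network(ppi, seed), expr, labels, subtype_a, subtype_b))
        for seed in sorted(ppi)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy protein interaction network (undirected, stored as adjacency sets).
ppi = {
    "G1": {"G2", "G3"},
    "G2": {"G1"},
    "G3": {"G1"},
    "G4": {"G5"},
    "G5": {"G4"},
}
# Toy expression matrix: gene -> one value per sample.
expr = {
    "G1": [5, 5, 1, 1],
    "G2": [4, 4, 1, 1],
    "G3": [3, 3, 1, 1],
    "G4": [2, 2, 2, 2],
    "G5": [2, 2, 2, 1],
}
labels = ["A", "A", "B", "B"]  # hypothetical subtype label per sample

ranked = rank_modules(ppi, expr, labels, "A", "B")
for seed, score in ranked:
    print(seed, round(score, 3))
```

In this toy example the modules seeded at G1, G2 and G3 separate the two subtypes far better than those seeded at G4 and G5, which is the kind of contrast a module-level score is meant to expose.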

Identifying specific subtypes of prostate cancer and the distinct patterns of mutations associated with them will enhance the development of precise diagnostic tools that detect specific genetic aberrations, allowing doctors to reliably predict a patient’s outcome and prescribe personalized treatment.

Current Research Projects

  1. Detecting novel indels in the prostate cancer genome

Insertions and deletions (indels) are a major class of genomic variation associated with human disease. Indels are primarily detected from DNA sequencing (DNA-seq) data but their transcriptional consequences remain unexplored due to challenges in discriminating medium-sized and large indels from splicing events in RNA-seq data.

We developed transIndel, a splice-aware algorithm that parses the chimeric alignments predicted by a short read aligner and reconstructs the mid-sized insertions and large deletions based on the linear alignments of split reads from DNA-seq or RNA-seq data. TransIndel exhibits competitive or superior performance over eight state-of-the-art indel detection tools on benchmarks using both synthetic and real DNA-seq data. We applied transIndel to DNA-seq and RNA-seq datasets from 333 primary prostate cancer patients from The Cancer Genome Atlas (TCGA) and 59 metastatic prostate cancer patients from AACR-PCF Stand-Up-To-Cancer (SU2C) studies. TransIndel enhanced the taxonomy of DNA- and RNA-level alterations in prostate cancer by identifying recurrent FOXA1 indels (Figure 1) as well as exitron splicing in genes implicated in disease progression (Figure 2).
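To make the split-read idea concrete, the sketch below shows how a deletion could be inferred from the two alignment segments of one chimeric read. It is a simplified illustration under stated assumptions, not the transIndel code: the `Segment` type and the function names are hypothetical, coordinates are 0-based, both segments are assumed to lie on the same chromosome and strand, and splice-awareness, insertions and read-support filtering are left out.

```python
import re
from typing import NamedTuple, Optional, Tuple

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

class Segment(NamedTuple):
    """One aligned piece of a split read (hypothetical container)."""
    chrom: str
    ref_start: int  # 0-based leftmost reference position
    cigar: str
    strand: str

def parse_cigar(cigar):
    """Turn a CIGAR string such as '50M50S' into [(50, 'M'), (50, 'S')]."""
    return [(int(length), op) for length, op in CIGAR_RE.findall(cigar)]

def ref_end(seg):
    """Reference position just past the last aligned base."""
    return seg.ref_start + sum(n for n, op in parse_cigar(seg.cigar) if op in "MDN=X")

def query_span(seg):
    """Half-open interval of read bases covered by this segment."""
    ops = parse_cigar(seg.cigar)
    start = ops[0][0] if ops[0][1] in "SH" else 0  # leading clip
    aligned = sum(n for n, op in ops if op in "MI=X")
    return start, start + aligned

def infer_deletion(a: Segment, b: Segment, min_size: int = 1) -> Optional[Tuple[str, int, int, int]]:
    """Return (chrom, del_start, del_end, size) if the two segments imply a deletion."""
    if a.chrom != b.chrom or a.strand != b.strand:
        return None
    left, right = sorted((a, b), key=lambda s: s.ref_start)
    left_q, right_q = query_span(left), query_span(right)
    if left_q[0] >= right_q[0]:
        return None  # segments are not in the same order on read and reference
    if right_q[0] > left_q[1]:
        return None  # unaligned read bases between segments; not a clean deletion
    # Read bases aligned by both segments (e.g. microhomology) are assigned
    # to the right segment, so the left breakpoint is pulled back.
    overlap = left_q[1] - right_q[0]
    del_start = ref_end(left) - overlap
    del_end = right.ref_start
    size = del_end - del_start
    if size < min_size:
        return None
    return a.chrom, del_start, del_end, size

# A 100-base read: first half maps at 1000, second half 200 bases downstream.
primary = Segment("chrX", 1000, "50M50S", "+")
supplementary = Segment("chrX", 1250, "50S50M", "+")
call = infer_deletion(primary, supplementary)
print(call)
```

In practice the segments would come from the aligner's primary and supplementary records for the same read, and a call would only be reported when several reads support the same breakpoints.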


  2. Delineating the lncRNA landscape in the prostate cancer genome

Prostate cancer (PCa) is the most commonly diagnosed cancer in men in the United States, with significant health impact. Clinical management is complicated by the lack of biomarkers and effective treatments for aggressive disease, particularly castration-resistant prostate cancer (CRPC). We have gained much insight into the biology of PCa through studying protein-coding genes, but they represent only a small fraction of our genome. It is now well accepted that the vast majority of the human genome (about 75%) is actively transcribed, but protein-coding genes account for only about 2% of the genome. This means the majority of the human transcriptome consists of noncoding RNAs (ncRNAs). Among ncRNAs, long noncoding RNAs (lncRNAs), typically >200 bp and commonly characterized by polyadenylation, splicing of multiple exons, promoter trimethylation of histone H3 at lysine 4 (H3K4me3), and transcription by RNA polymerase II, have increasingly been recognized as playing essential roles in tumor biology, representing a new focus in cancer research.

Many lncRNAs have been shown to be either up- or downregulated in various cancers, including PCa. Several PCa-specific or PCa-associated lncRNAs have been identified to date, but only a few have been validated in independent patient cohorts or approved for clinical practice. Our current research is developing novel computational methods to assemble the first complete compendium of CRPC-associated lncRNAs and to reveal the dynamic interplay between lncRNAs and tumorigenesis, progression and metastasis, which will highlight the importance of lncRNAs in the etiology of PCa.


  3. Detecting novel biomarkers for cancer immunotherapy

Immune checkpoint blockade therapy has proved effective in a number of cancer types, such as skin, lung and kidney cancer. However, only a subset of patients respond to immunotherapy drugs. We have developed a novel computational algorithm that can sensitively detect previously missed splicing events in the human transcriptome from RNA-seq data. We use whole-exome sequencing and RNA-seq data from renal cell carcinoma, lung cancer and melanoma to correlate the expression of the detected splicing events with response or resistance to immune checkpoint therapy. This study aims to improve the computational methodology for detecting and quantifying novel alternative splicing events and to determine their involvement in immunotherapy-associated phenotypes. Integrative analysis of DNA mutations and RNA splicing events in responders and non-responders can identify a list of candidate genomic-independent alternative splicing events that underlie resistance to immunotherapy in non-responders or its efficacy in responders.
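One simple way to test whether a splicing event is associated with treatment response is to compare its per-patient quantification between responders and non-responders. The sketch below uses an exact two-sided permutation test on the difference in group means. It is only an illustration: the choice of test, the function name, and the toy values (which could stand for a percent-spliced-in style measure) are assumptions and do not describe the analysis used in the project.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small cohorts.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total

# Toy quantification of one splicing event per patient (hypothetical values).
responders = [0.10, 0.15, 0.20, 0.12]
non_responders = [0.60, 0.70, 0.65, 0.80]

p_value = exact_permutation_pvalue(responders, non_responders)
print(f"difference in means: {mean(non_responders) - mean(responders):.3f}")
print(f"exact permutation p-value: {p_value:.4f}")
```

When many events are tested in this way, the resulting p-values would need correction for multiple testing before any event is treated as a candidate biomarker.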


PhRMA Foundation Research Starter Grant, 2018


Yang R, Van Etten JL, Dehm SM. Indel detection from DNA and RNA sequencing data with transIndel. BMC Genomics. 2018;19:270.