ECE Seminar with Sreeram Kannan

  • Starts: 4:00 pm on Monday, March 3, 2014
Optimal RNA Assembly: From Information Theory to Software With Dr. Sreeram Kannan, University of California, Berkeley Faculty Host: Bobak Nazer Refreshments will be served outside Room 339 at 3:45 p.m. Abstract: High throughput sequencing of RNA has emerged in the last few years as a powerful method that enables discovery of novel transcripts and alternatively spliced isoforms of genes, along with accurate estimates of gene expression. In this work, we study the fundamental limits of de novo transcriptome assembly using RNA shotgun-sequencing, where the sequencing technology extracts short reads from the RNA transcripts. We propose a new linear-time algorithm for transcriptome reconstruction and derive sufficient conditions on the length of reads under which the algorithm will succeed. We also derive fundamental information-theoretic conditions for reconstruction by any algorithm and show that the proposed algorithm is near-optimal on a real data set. Along the way, we show that the NP-hard problem of decomposing a flow into the fewest number of paths can be solved in linear time for a family of instances, and biologically relevant instances tend to fall in this family. We also describe the construction of a software package for RNA assembly based on this theory and show that it obtains significant improvements in reconstruction accuracy over state-of-the-art software. About the Speaker: High throughput sequencing of RNA has emerged in the last few years as a powerful method that enables discovery of novel transcripts and alternatively spliced isoforms of genes, along with accurate estimates of gene expression. In this work, we study the fundamental limits of de novo transcriptome assembly using RNA shotgun-sequencing, where the sequencing technology extracts short reads from the RNA transcripts. We propose a new linear-time algorithm for transcriptome reconstruction and derive sufficient conditions on the length of reads under which the algorithm will succeed. We also derive fundamental information-theoretic conditions for reconstruction by any algorithm and show that the proposed algorithm is near-optimal on a real data set. Along the way, we show that the NP-hard problem of decomposing a flow into the fewest number of paths can be solved in linear time for a family of instances, and biologically relevant instances tend to fall in this family. We also describe the construction of a software package for RNA assembly based on this theory and show that it obtains significant improvements in reconstruction accuracy over state-of-the-art software.
Location:
Photonics Center, 8 Saint Mary’s St., Room 339
Registration:
http://www.bu.edu/ece/files/2014/02/Kannan.pdf

Back to Calendar