Starts: 10:30 am on Thursday, October 11, 2018

Title: "Development of Reproducible Genomics Data Analysis Pipelines and Their Application to Transcription-related Datasets"

Committee: Fred Winston, PhD - HMS Genetics (Co-Advisor); Ahmad Khalil, PhD - BU BME (Co-Advisor, Chair); John Ngo, PhD - BU BME; Daniel Segre, PhD - BU Biology; L. Stirling Churchman, PhD - HMS Genetics.

Abstract: The complexity of genomics data and its analysis makes errors likely and the validity of reported results difficult to assess. This makes it critical to provide a complete and reproducible record of how results were obtained, which others can use to decide how strongly to believe the results, to find errors so the record can be corrected, and to make improvements for further analyses. To this end, the major work described in this prospectus is the development of analysis pipelines based on the Snakemake platform, which lend themselves to reproduction by others. I have developed pipelines for a number of data types, including transcription start site sequencing (TSS-seq), high-resolution ChIP (ChIP-nexus), nascent transcript sequencing (NET-seq), RNA sequencing (RNA-seq), and micrococcal nuclease sequencing (MNase-seq). I describe results obtained by applying these pipelines to data from three projects in progress relating to the biological process of transcription: 1) a study of the transcription elongation factor Spt6 and the phenomenon of intragenic transcription; 2) a study characterizing possible functions of intragenic transcription in stress conditions; and 3) a study of the transcription elongation factor Spt5.

Location: NRB room 350, 77 Avenue Louis Pasteur, Boston, MA 02115