BU Bioinformatics Program’s 10th Anniversary Symposium (October 3-4, 2009)


The fall of 2009 marked the 10th anniversary of the inaugural year of the Boston University Bioinformatics Graduate Program, an event the Program celebrated with a two-day symposium on October 3 and 4. The symposium, open to the general public as well as alumni, was held on the BU campus and featured talks given primarily by alumni of the Bioinformatics Program describing their most recent work.

A local Program Committee made up of Bioinformatics Graduate Program alumni invited speakers who represent several different research areas, as well as different employment sectors (academic, business, industry, law) and different years of entry into the Program.

The BU Bioinformatics 10th Anniversary Program Committee consisted of William Blake, Adnan Derti, Farren Isaacs, Melissa Landon, Jason Laramie, Joe Mellor, Yan Meng, Esther Rheinbay, Boris Shakhnovich, Evan Snitkin, and Yu Zheng. The Program thanks them for all their careful planning efforts.

The schedule of events unfolded as follows:

Schedule

October 3, 2009

10:00 a.m. Welcome
Charles DeLisi, Arthur G.B. Metcalf Professor of Science and Engineering
Andrei Ruckenstein, Vice President and Associate Provost for Research
Thomas Tullius, Interim Director, Bioinformatics Program
Session I Topic: New Techniques in Biotechnology
Moderator: Zhiping Weng, Professor, UMass Medical School
10:30 a.m. Chunming Ding, Principal Investigator, Singapore Institute for Clinical Sciences
Title: Towards Non-Invasive Prenatal Diagnosis of Down Syndrome
11:00 a.m. Yu Zheng, Staff Scientist, New England Biolabs
Title: Enzyme Mining and Engineering – The Ending of the Stone Age
11:30 a.m. Joe Mellor, Postdoctoral Fellow, Harvard Medical School
Title: Detecting Genetic Interaction with High-Capacity Sequencing
12 NOON Lunch on the COM Lawn
Session II Topic: Industry and Consulting
Moderator: Sandor Vajda, Professor, Department of Biomedical Engineering
1:30 p.m. Jason Laramie, Principal Scientist, Pfizer Inc.
Title: Target Discovery in the Era of Genomics
2:00 p.m. Gabriel Eichler, Innovation Development Manager, InnoCentive
Title: Open Innovation and Bioinformatics at InnoCentive
2:30 p.m. Karl Clodfelter, Scientist, Forma Therapeutics
Title: Bringing CSMap Technology from Academia to Industry
3:00 p.m. Break
Session III Topic: Genetics and Genomes
Moderator: Gary Benson, Associate Professor of Bioinformatics
3:15 p.m. Jeremiah Faith, Postdoctoral Scholar, Washington University in St. Louis
Title: Predicting a Mammalian Microbiota’s Response to Changes in Host Diet
3:45 p.m. Elinor Karlsson, Postdoctoral Fellow, FAS Center for Systems Biology, Harvard University
Title: Natural Selection and Infectious Disease in Humans
4:15 p.m. Farren Isaacs, Research Fellow, Harvard Medical School
Title: Programming Cells by Multiplex Genome Engineering, Accelerated Evolution & New Genetic Codes
4:45 p.m. Poster Session in LSEB Lobby
6:30 p.m. Dinner in the SMG Dining Room, followed by a performance by the Galileo Players (RSVP required)

October 4, 2009

10:00 a.m. Welcome
Simon Kasif, Professor, Department of Biomedical Engineering
Session IV Topic: Proteins
Moderator: Scott Mohr, Director of Graduate Studies, Bioinformatics Program
10:15 a.m. Melissa Landon, Scientist, Cubist Pharmaceuticals, Inc.
Title: Identification of Novel Stabilizing Agents of Glucocerebrosidase via Computational and Experimental Methods
10:45 a.m. Julian Mintseris, Research Fellow, Harvard Medical School
Title: Deriving Protein Structure Constraints from Chemical Cross-Linking and Mass Spectrometry
Session V Topic: Small Biotech
Moderator: Daniel Segrè, Assistant Professor, Bioinformatics
11:15 a.m. Dmitriy Leyfer, Senior Scientist, Pfizer, Inc.
Title: The Art of NaPPing: Holistic Approach to Network and Pathway Profiling
11:45 a.m. Boris Shakhnovich, Founder, President and CEO, MyMetaLab Inc.
Title: From Here to Orwik: Why I Left Academia to Become an Entrepreneur
12:15 p.m. Lunch in LSEB 103
Session VI Topic: Bioinformatics and Human Health
Moderator: Yu Xia, Assistant Professor, Bioinformatics Program
1:45 p.m. Yaoyu Wang, Research Fellow, Harvard Medical School
Title: Identifying High Fitness Cost Regions in Human Immunodeficiency Virus by Evolutionary Analysis
2:15 p.m. Tim Reddy, Postdoctoral Fellow, HudsonAlpha Institute for Biotechnology
Title: Genomic Determination of the Glucocorticoid Response Reveals Unexpected Mechanisms of Gene Regulation
2:45 p.m. Closing Remarks
Scott Mohr

Abstracts

Towards Non-Invasive Prenatal Diagnosis of Down Syndrome

Presenter: Chunming Ding, Principal Investigator, Singapore Institute for Clinical Sciences
Abstract: The discoveries of fetal DNA and RNA in the maternal circulation during pregnancy offer a new window for definitive and non-invasive prenatal diagnosis of Trisomy 21 and many other genetic disorders. We undertake two approaches. The first approach is based on the hypothesis that fetal DNA and maternal DNA in the maternal blood may be differentially methylated such that fetal DNA can be specifically enriched. The second approach is based on the hypothesis that fetal RNA and maternal RNA, given that they are derived from different tissues, are present at different levels for different genes in the maternal blood. It is thus potentially possible to identify transcripts that are fetal-specific in the maternal blood. We further developed a polymorphism-based approach to directly assess the fetal chromosome dosage. In summary, we show that fetal RNA transcripts may be robust biomarkers for non-invasive prenatal diagnosis of Trisomy 21.

Enzyme mining and engineering — the ending of the Stone Age

Presenter: Yu Zheng, Staff Scientist, New England Biolabs Inc.
Abstract: Enzymes, especially those that act on nucleic acids, not only lay the foundation of emerging molecular biotechnology but oftentimes also set a framework that restrains researchers’ creativity in solving complex biological problems. Nucleic acid enzymes are found naturally in the genomes of microbes and viruses and are used in their wild-type forms, much as prehistoric human beings used handy tools found in the wild during the Stone Age. Luckily, the diversity of nature provides many kinds of homologous enzymes with different properties. However, in many cases the enzymes one finds in nature are imperfect in one way or another, so there is a need to find novel enzymes or to engineer existing ones toward desired properties. In this talk, I will briefly discuss a few case studies on enzyme mining and engineering.

Detecting Genetic Interaction with High-Capacity Sequencing

Presenter: Joseph Mellor, Postdoctoral Fellow, Harvard Medical School
Abstract: Genetic interactions (surprising phenotypes that sometimes occur when multiple genes are simultaneously perturbed) help shape our understanding of nearly all known biological pathways, providing clues about parallel action, collective action, and order of action by genes in pathways. Methods that accelerate the rate of genetic interaction screening are crucial for producing comprehensive genetic interaction maps in a broad spectrum of environmental conditions. In the model organism S. cerevisiae, barcoded strains harboring individual gene deletions are often used for pooled measurements of single-deletion strain growth phenotypes. Here, we describe an approach based on sequencing of pairs of fused barcode tags that represent the double-deletion genotypes of many possible mutant combinations growing simultaneously in complex pools. We show that this method can be used to measure individual strain growth and detect genetic interaction in large heterogeneous pools containing thousands of strains.

Target Discovery in the Era of Genomics

Presenter: Jason Laramie, Principal Scientist, Pfizer Inc.

Open Innovation and Bioinformatics at InnoCentive

Presenter: Gabriel Eichler Ph.D., Innovation Development Manager, InnoCentive
Abstract: The computational world recently experienced the thrill of combining competition and innovation as it observed the conclusion and awarding of the Netflix Prize. This event underscores the value of a new, open paradigm of innovation known as Open Innovation or Crowdsourcing. In this presentation, I introduce the audience to Open Innovation and explain how InnoCentive is working to promote Open Innovation in some of the world’s largest R&D organizations and leading nonprofits. I will go into detail about how some of my clients are using Open Innovation to solve the bioinformatics challenges they encounter.

Bringing CSMap Technology from Academia to Industry

Presenter: Karl Clodfelter, Scientist, Forma Therapeutics
Abstract: Academic innovation is a frequent source for technological development in any industry. It can provide the flexibility and experimental directions that yield novel scientific insights. During my graduate studies under Sandor Vajda and David Waxman, I was fortunate to help create and deploy a computational technology for small molecule binding prediction, CSMap. As part of my thesis, I applied the technology to analyze the binding pocket of the cytochrome P450 family of proteins. Following graduation, I participated in transitioning CSMap from an academic project into a critical component of the drug discovery platform at SolMap Pharmaceuticals, now a part of Forma Therapeutics. The result is that Forma Therapeutics has established an innovative discovery pipeline and CSMap has transitioned from academic pursuit to industrial technology.

Predicting a mammalian microbiota’s response to changes in host diet

Presenter: Jeremiah Faith, Postdoctoral Scholar, Washington University in St. Louis
Abstract: The healthy mammalian gut is a densely packed microbial enzyme factory specialized in energy harvest from a myriad of carbon sources. Culture-independent characterization of these microbial gut communities (microbiotas) has yielded associations between configurations of the microbiota and several complex human diseases, including obesity and inflammatory bowel disease. Despite these associations, the daily functioning and operations of the microbiota remain largely unknown. Here we show that the relative proportion of each species in the microbiota is largely dictated by host diet. We introduced a synthetic microbial community, composed of 10 sequenced bacteria isolated from the human gut, into gnotobiotic mice and measured their response to randomized diet oscillations. Using these responses, we are developing a model to predict the community’s response to new host diets with the goal of enabling food-based therapies to configure the microbiota into desired states.

Natural Selection and Infectious Disease in Humans

Presenter: Elinor K. Karlsson, Postdoctoral Fellow, FAS Center for Systems Biology, Harvard
Shari Grossman1,2, Kristian Andersen1,2, Eric Phelan1, Ilya Shlyakhter1,2, Regina C. LaRocque3, Christian Happi4 and Pardis Sabeti1,2
1 FAS Center for Systems Biology, Harvard University, Cambridge MA USA
2 The Broad Institute, Cambridge MA USA
3 Division of Infectious Diseases, Massachusetts General Hospital, Departments of Medicine
4 Institute for Medical Research and Training, University of Ibadan, Ibadan, Nigeria
Abstract: Comprehensive studies of genetic variation in human populations can find the marks left in our genome by tens of thousands of years of positive natural selection, as our species responded to forces such as infectious disease, changes in diet and new environments. Striking examples uncovered by early studies of selection include lactose tolerance, adaptations in skin pigmentation, and resistance to malaria. However, these initial surveys generally identified relatively large candidate regions with thousands of polymorphisms, and thus the actual functional variant is known in just a handful of cases. In addition, these tests of selection provide little indication of the source of the selection pressure.

We have developed an approach, called the Composite of Multiple Signals (CMS), which combines tests for different signals of selection to give 20-100x better resolution than any individual test. The CMS test is most powerful when given full sequence data for a population, such as that now being generated by the 1000 Genomes Project. Even on the sparser data offered by the Human Haplotype Map, we have substantially narrowed 177 candidate regions, finding both known functional variants and many novel candidates. Currently, we are working on combining the CMS test for natural selection with studies of phenotype association to identify genetic variants associated with resistance to infectious diseases, including Lassa fever in West Africa and cholera in Bangladesh.

Programming Cells by Multiplex Genome Engineering, Accelerated Evolution & New Genetic Codes

Presenter: Farren Isaacs, Research Fellow, Department of Genetics, Harvard Medical School
Abstract: The breadth of diversity found among biological systems makes them well suited to solve some of the world’s defining challenges, such as producing new drugs to alleviate human disease and generating cell-based alternative sources of energy to ensure environmental sustainability. Although in vitro and directed evolution methods have created genetic variants with usefully altered phenotypes, these methods are limited to laborious and serial manipulation of single genes and are not used for parallel and continuous directed evolution of gene networks or genomes. Thus, foundational technologies that expand our ability to engineer cells are needed. To address this challenge, I describe Multiplex Automated Genome Engineering (MAGE), a technology that combines small (nt)- and large (MB+)-scale genome manipulations for programming and evolution of cells. MAGE simultaneously targets many locations on the chromosome for modification in a single cell or across a population of cells, thus producing combinatorial genomic diversity. Because the process is cyclical and scalable, we constructed prototype devices that automate the MAGE technology to facilitate rapid and continuous generation of a diverse set of genetic changes (mismatches, insertions, deletions). We applied MAGE to optimize the 1-deoxy-D-xylulose-5-phosphate (DXP) biosynthesis pathway in E. coli to overproduce the industrially important isoprenoid lycopene. Twenty-four genetic components in the DXP pathway were modified simultaneously using a complex pool of synthetic DNA, creating over 4.3 billion combinatorial genomic variants per day. We isolated variants with a more than fivefold increase in lycopene production within 3 days, a significant improvement over existing metabolic engineering techniques. We also applied MAGE to engineer strains of E. coli in which the entire genome is recoded, replacing the 314 UAG stop codons with UAA synonymous major stop codons. These changes to the genetic code allow us to construct safer and multi-virus resistant strains and enhance the incorporation of non-natural amino acids into proteins. Our multiplex approach embraces engineering in the context of evolution by expediting the design and evolution of organisms with new and improved properties.

Identification of novel stabilizing agents of Glucocerebrosidase via computational and experimental methods

Presenter: Melissa Landon, Scientist, Cubist Pharmaceuticals, Inc.
Co-Authors: Raquel M. Lieberman, Sharotka M. Simon, Gregory A. Petsko, and Dagmar Ringe.
Abstract: Gaucher’s disease (GD) is a rare neurological disorder that results from loss of function of the enzyme Glucocerebrosidase (GCase). Currently the only treatment for GD is enzyme replacement therapy; however, there is hope that a small molecule pharmacological chaperone (PC) could be developed to stabilize GCase. To this end, the company Amicus Therapeutics has developed isofagomine, a small molecule that stabilizes the catalytic region of GCase and may serve as the first commercially available PC. Here we investigate the potential of stabilizing agents of GCase that do not bind to the catalytic region, thus avoiding possible interference with enzyme function. We applied both the multiple solvent crystal structures (MSCS) technique and the FTMap algorithm to the identification of novel small molecule binding regions on the surface of GCase. These studies revealed a novel site near the N-terminus of GCase. Virtual screening was then employed to identify potential binders to this region; select compounds from these studies were then screened for stabilizing effects on GCase. From our screening efforts, a molecule that stabilizes GCase was identified for further exploration.

Deriving Protein Structure Constraints from Chemical Cross-linking and Mass Spectrometry

Presenter: Julian Mintseris, Research Fellow, Harvard Medical School
Abstract: Detailed understanding of cellular function requires a 3-dimensional view of the architecture of protein complexes and their components. Traditional approaches to protein structure determination remain labor-intensive, costly and require large amounts of starting material. As an alternative, we focused on developing covalent chemical cross-linking approaches that, when coupled with sequencing by mass spectrometry, would generate sufficient numbers of geometrical constraints for low-resolution structural models. Combining existing methods with novel reagents, we performed detailed studies on model proteins. We developed a new search algorithm that scores different types of cross-linked peptides alongside regular peptides and mono-linked derivatives as well as statistical methods to identify correct interaction sites. In proof-of-principle experiments starting with only microgram amounts of proteins we observed strong, often complete, band shifts on SDS-PAGE gels, indicating efficient cross-linking. Database searching of MS/MS spectra yielded dozens of unique cross-links, bridging residues far apart in sequence and with a mean distance less than 12 angstroms between respective C-beta atoms.

Chemical cross-linking has traditionally required extensive optimization for each protein due to the scarcity of potential links, but our approach promises much more generality. We believe that this method represents a key advance in the field of structural biology and will help us get a more detailed view of proteins and their interactions in the cell.

The art of NaPPing: holistic approach to Network and Pathway Profiling.

Presenter: Dmitriy Leyfer, Senior Scientist, Pfizer Inc.
The GIST of NaPPing
Strategy: holistic approach to experimentation and analysis
Tactics: integration of prior biological knowledge; simulatable gene network models; knowledge gap discovery; novel omics approaches; unified computational & experimental workflow
Goal: realistic computational models of cellular processes
TINC: Target Identification by Network Connectivity
Ugur Guner and Dmitriy Leyfer
Motivation: In the pharmaceutical industry, an exciting target is often brought to clinical trials only to reveal serious safety concerns or a lack of efficacy. A gene in the same or a parallel pathway could be a good alternative; however, not all pathways are known, and finding an alternative target using existing tools can become labor-intensive. A method that automatically finds similarities between targets according to published information could significantly accelerate the search.
Method: Targets were compared based on their nearest neighbors in the literature networks. The result is a rank-ordered list of targets that are most similar to the original query. The method can be generalized to annotating nodes using edges in biological or social networks. It can be used to annotate diseases with similar etiology, reposition existing drugs, discover adverse events for the targets, and more. The results can be further clustered to create groups of similar nodes.
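
As a rough illustration of the idea described in the Method, the sketch below ranks targets by how much their sets of network neighbors overlap with those of a query target. It is not the authors' TINC implementation: the use of Jaccard similarity, the function names, and the toy network are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two neighbor sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar_targets(
    neighbors: Dict[str, Set[str]], query: str
) -> List[Tuple[str, float]]:
    """Rank every other target by neighbor-set similarity to the query."""
    query_neighbors = neighbors[query]
    scores = [
        (target, jaccard(query_neighbors, target_neighbors))
        for target, target_neighbors in neighbors.items()
        if target != query
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Toy literature network (invented data): target -> terms it co-occurs with.
toy_network = {
    "GENE_A": {"inflammation", "kinase", "apoptosis"},
    "GENE_B": {"inflammation", "kinase", "fibrosis"},
    "GENE_C": {"transport", "membrane"},
}

ranking = rank_similar_targets(toy_network, "GENE_A")
```

Here GENE_B shares two of four distinct neighbors with GENE_A and so ranks ahead of GENE_C, which shares none. A real literature network would likely need weighted edges and a similarity measure chosen to suit the data.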

From here to Orwik: Why I left academia to become an entrepreneur

Presenter: Boris Shakhnovich, Founder, President and CEO, MyMetaLab Inc.
Abstract: In this talk, I will discuss how I left BU to do experimental biology and then left the ivory tower of academia to join the trenches of startup life. I will describe the roller coaster highs and lows of forming a company, a team and a product. I will talk about how my experiences at BU prepared me for raising money in the worst recession since the 1930s. Finally, I will describe Orwik, a fully community site for collaborating and publishing data, and why I had to build it. This talk will be of interest to those curious about the relationship between entrepreneurship and science.

Identifying High Fitness Cost Regions in Human Immunodeficiency Virus by Evolutionary Analysis

Presenter: Yaoyu Wang, Research Fellow, Harvard Medical School
Abstract: The control of HIV-1 associated with particular HLA class I alleles suggests that some CD8+ T cell responses may be more effective at containing viral infection. Unfortunately, substantial diversity in the breadth, magnitude and function of these responses has impaired our ability to identify the responses most critical to this control. It has been proposed that CD8 responses targeting conserved regions of the virus may be particularly effective, since the development of CTL escape mutations in these regions may significantly impair viral replication. To address this hypothesis at the population level, we derived full-length viral genomes from 98 genetically well-characterized, chronically infected patients to begin characterizing the relationship between human leukocyte antigen (HLA)-associated escape mutations and natural disease progression. We identified a large number of HLA class I-associated mutations across the viral genome, reflecting strong CTL selection pressure shaping genome-wide sequence evolution. Since HLA-associated escapes occur at both conserved and variable positions, we hypothesize that, while escape mutations occurring at variable residues may have minimal impact on viral replication, escape mutations that significantly impair viral replication capacity would often require additional covarying mutations to rescue it. This analysis revealed that covariation networks of different complexity arise as a result of HLA-associated escape mutations, suggesting that the network structure may have a strong influence on HIV viral fitness as well as patient disease outcome.

Genome-Scale Identification of Human Glucocorticoid Receptor Binding and Related Expression Changes in Response to Dexamethasone

Presenter: Timothy Reddy, Postdoctoral Fellow, HudsonAlpha Institute for Biotechnology
Timothy E. Reddy, Florencia Pauli, Rebekka O. Sprouse, Kimberly M. Newberry, Norma F. Neff, Jason Dilocker, Mike Muratet, Barbara Pusey and Richard M. Myers.
HudsonAlpha Institute for Biotechnology, Huntsville, AL.
Abstract: Cortisol is a steroid hormone released by the adrenal glands in response to acute stress and as a messenger in circadian rhythms. The glucocorticoid receptor (GR) is a nuclear receptor responsible for mediating the genomic response to cortisol throughout the body. Upon binding cortisol in the cytoplasm, GR translocates into the nucleus where it acts as a transcription factor. Owing to their anti-inflammatory activity and bioavailability, synthetic corticosteroids targeting GR are used to treat a wide range of acute and chronic inflammatory diseases. However, chronic exposure to elevated levels of corticosteroids has significant side effects including weight gain, insomnia, depression, diabetes and bone loss. To better understand the genomic effects of GR, we used ChIP-sequencing and RNA-sequencing to pinpoint GR binding and gene expression response to the synthetic glucocorticoid dexamethasone (DEX). In doing so, we gained insight into the different mechanisms responsible for gene induction and repression in response to corticosteroids. Additionally, we find evidence of dose-specific responses to DEX that help to explain the dual nature of corticosteroids: low-level cortisol fluctuations may control circadian rhythms, whereas high-dose releases may control inflammation and vasoconstriction in response to stress. Finally, I will discuss ongoing advances in high-throughput sequencing and their implications for future research.