Boston University

Ultra fast DNA sequencing

In 2004 the National Human Genome Research Institute (NHGRI) at NIH launched a program for the development of revolutionary genome sequencing technologies - the "$1,000 Genome". The program focuses on applied research toward ultra fast and cheap DNA sequencing technologies. Although "$1,000 per genome" should only be taken as a figure of merit, it sets the scale of the ultimate goal - sequencing that is between 4 and 6 orders of magnitude cheaper and faster than current state-of-the-art technologies.

We are developing a novel single-molecule DNA sequencing technique based on the optical readout of DNA molecule translocations through nanometer-scale pores ("nanopores"). Sequence-specific fluorescent tags are conjugated to the DNA molecules before they are read. These tags are stripped off, one at a time, by threading the DNA molecule through pores about 2 nanometers wide, which are too small to admit double-stranded DNA but large enough to allow single-stranded DNA to enter (see movie). Multi-wavelength laser-induced fluorescence is used to read the color of each tag as it is stripped off by the nanopore. The resulting color sequence reveals the DNA sequence. (To learn more about our DNA unzipping studies click here.)

To achieve the phenomenal throughput set by the "$1,000 genome" project, a high degree of multiplexing is required. Simultaneous optical detection from hundreds of nanopores in nanopore arrays will allow us to achieve this parallelism (Figure 2). To learn more about nanopore array fabrication click here.
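The benefit of parallel readout can be illustrated with a back-of-the-envelope estimate. The sketch below is only an illustration: the function name and the example numbers (genome size, pore count, per-pore read rate) are hypothetical placeholders and not measured values from this project. It also ignores coverage depth, sample preparation, and dead time between molecules.

```python
def genome_read_time_hours(genome_bases, pores, bases_per_pore_per_second):
    """Estimate the time (in hours) to read genome_bases once.

    Assumes every pore reads continuously at the same rate, which is an
    idealization; all inputs are illustrative, not measured.
    """
    if pores <= 0 or bases_per_pore_per_second <= 0:
        raise ValueError("pores and read rate must be positive")
    total_rate = pores * bases_per_pore_per_second  # bases per second, all pores
    return genome_bases / total_rate / 3600.0


# Hypothetical comparison: one pore versus an array of several hundred pores.
single = genome_read_time_hours(3.0e9, pores=1, bases_per_pore_per_second=100)
array = genome_read_time_hours(3.0e9, pores=500, bases_per_pore_per_second=100)
print(f"speed-up from the array: {single / array:.0f}x")
```

Under this idealized model the read time scales inversely with the number of pores, which is why array size matters as much as the per-pore read rate.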

To achieve single-nucleotide resolution, we are collaborating with LingVitae (www.lingvitae.com), which developed a biochemical method for transforming DNA from its native state into the "design polymers" format. Design polymers are DNA molecules that encode the same sequence information as the original DNA molecules from which they are derived. In the design polymers format, however, each base ("A", "C", "G" or "T") is represented by two code units, each of which is ~10 bases long. There are only two types of code units, so the four possible two-unit combinations represent the four bases. The use of a binary code, rather than the four-letter DNA alphabet, and the magnification of each single base to ~20 bases greatly simplify single-molecule detection.
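The binary code described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the text says only that each base maps to two code units of two possible types, so the specific assignment of unit pairs to bases (BASE_TO_UNITS), the two tag colors (COLOR_TO_UNIT), and all function names here are hypothetical.

```python
# Hypothetical assignment of two-unit codes to bases (not from the source).
BASE_TO_UNITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
UNITS_TO_BASE = {units: base for base, units in BASE_TO_UNITS.items()}

# Hypothetical tag colors, one per code-unit type.
COLOR_TO_UNIT = {"red": "0", "green": "1"}


def encode(sequence):
    """Convert a DNA sequence into its string of binary code units."""
    return "".join(BASE_TO_UNITS[base] for base in sequence.upper())


def decode(units):
    """Convert a string of code units back into a DNA sequence."""
    if len(units) % 2 != 0:
        raise ValueError("expected two code units per base")
    pairs = (units[i:i + 2] for i in range(0, len(units), 2))
    return "".join(UNITS_TO_BASE[pair] for pair in pairs)


def colors_to_sequence(colors):
    """Translate a detected color sequence (one color per tag) into bases."""
    return decode("".join(COLOR_TO_UNIT[color] for color in colors))


units = encode("GATC")
print(units, decode(units))
print(colors_to_sequence(["green", "red", "red", "red"]))
```

Because the readout only has to distinguish two tag colors, each detection event is a binary decision, and pairs of decisions recover one of the four bases.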

This work is done in collaboration with Zhiping Weng's group at Boston University (zlab.bu.edu/zlab/index.shtml) and with LingVitae (www.lingvitae.com).

References:

1. Soni, G. and A. Meller. (2007). Progress towards ultrafast DNA sequencing using solid state nanopores. Clin. Chem. 53, 1996-2001.

2. Lee, J.W. and A. Meller. (2007). Rapid sequencing by direct nanoscale reading of nucleotide bases in individual DNA chains. In: "Perspectives in Bioanalysis", Elsevier, Ed. K. Mitchelson.

Support:

National Institutes of Health
Figure 1
Figure 2