Bioinformatics

Options

Volunteer, Independent Funding Available

Overview

This project involves the design and implementation of parallel algorithms for comparing biological sequences such as DNA and protein sequences. Sequence comparison is a fundamental task in the analysis of biological data, and the standard technique is sequence alignment by dynamic programming. However, that method is too slow for the enormous quantities of data now being generated, especially by the newer high-throughput genome sequencing technologies (next-generation sequencing). Our lab has recently developed several very fast methods that compute sequence alignments using a technique called bit-parallelism, in which the computation is carried out with logic and addition operations on the bits of computer words. These algorithms are now the fastest available on standard CPUs and have been further enhanced by the use of advanced SIMD (single instruction, multiple data) instructions. We are now extending the bit-parallel approach to another level of parallelism by designing bit-parallel algorithms for graphics processing units (GPUs).

One application focus of the lab is the detection and analysis of tandem repeats in human DNA sequences. Tandem repeats are a common type of repetitive sequence that evolves rapidly and has been shown to have regulatory effects on gene expression. The GPU algorithms will be applied to our analysis of tandem repeats using human DNA sequence data from the 1000 Genomes Project.
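To give a feel for what bit-parallelism means in practice, the following is a minimal Python sketch of a well-known bit-parallel recurrence for the length of the longest common subsequence (LCS) of two strings. It is an illustration only and is not the lab's own alignment algorithm. The function name lcs_length_bitparallel and its variable names are hypothetical, and a Python integer stands in for a machine word (or a chain of words) so that the sequence length is not limited to 64 characters.

```python
def lcs_length_bitparallel(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Illustrative sketch of a classic bit-parallel LCS recurrence.
    Each character of `a` owns one bit position; one step of the loop
    updates a whole column of the dynamic programming table using only
    AND, OR, NOT and a single addition.
    """
    m = len(a)
    all_ones = (1 << m) - 1  # bit mask covering the m positions of `a`

    # match[c] has bit i set exactly when a[i] == c
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    v = all_ones  # a zero bit in v marks a row where the LCS length grows
    for ch in b:
        mask = match.get(ch, 0)
        u = v & mask
        # The addition propagates carries through runs of one bits,
        # which is what replaces the inner loop of the usual DP.
        v = ((v + u) | (v & ~mask)) & all_ones

    # The LCS length equals the number of zero bits left in v.
    return m - bin(v).count("1")
```

For example, lcs_length_bitparallel("AGCAT", "GAC") returns 2. In a compiled implementation the same recurrence would run on fixed-width machine words, with carries passed between words for longer sequences; that word-level detail is omitted here.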
