openSESAME (Search of Expression Signatures Across Many Experiments) identifies that loss of p63 is associated with metastatic melanoma.
Handout with more details.
High-throughput technologies for genetics and genomics have produced massive data sets and spurred new algorithms for statistical analysis and data mining. As a result, HPC resources, including large-scale, high-performance storage systems, have become a critical component of current medical research. LinGA consists of 1000 compute cores and ¾ PByte of high-performance storage. More than 120 specialized software packages are maintained on the system, allowing end users to focus on their research rather than on computational issues. The LinGA resource supports genetic and genomic research in a wide variety of disease-related areas, including Alzheimer's disease, Parkinson's disease, cardiovascular disease, pulmonary function and lung cancer, addiction, and diabetes. Selected examples of computationally intensive tasks include:
Imputation to the 1000 Genomes reference set. These memory-intensive analyses require hundreds of GBytes of RAM and week-long processing times on 16+ core SMP machines.
Genome-wide Association Analyses within the deeply phenotyped Framingham Heart Study. Accounting for the relationships among study participants requires specialized statistical models and extensive processing time. Over 8,000 such genome-wide association analyses, consuming over 300 CPU years, have been run on the LinGA system since the SNP Health Association Resource (SHARe) genotype data became available in 2009.
Using the processing power of LinGA, billions of reads were aligned to the human genome to perform differential expression analysis of RNA-Seq data. In addition, de novo assembly of RNA-Seq reads led to the identification of new genes and gene structures.
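To put the Framingham Heart Study figures in perspective, the short sketch below works through the arithmetic. It uses only numbers quoted above (over 8,000 analyses, over 300 CPU years, 1000 compute cores). Both totals are stated as lower bounds, so the results are rough estimates. The function names are illustrative, and the whole-cluster figure assumes every core is kept busy with perfect parallel efficiency, which is an idealization rather than a description of how the jobs were actually scheduled.

```python
# Back-of-envelope estimates from the figures quoted in the text.
# Both totals are lower bounds ("over"), so treat the results as rough.
ANALYSES = 8_000        # genome-wide association analyses run
CPU_YEARS = 300         # total CPU time consumed
CORES = 1_000           # compute cores in LinGA
DAYS_PER_YEAR = 365.25  # average calendar year


def cpu_days_per_analysis(cpu_years: float, analyses: int) -> float:
    """Average CPU time per analysis, in CPU days."""
    return cpu_years * DAYS_PER_YEAR / analyses


def full_cluster_days(cpu_years: float, cores: int) -> float:
    """Wall-clock days if every core ran at once (idealized assumption)."""
    return cpu_years * DAYS_PER_YEAR / cores


per_analysis = cpu_days_per_analysis(CPU_YEARS, ANALYSES)
cluster_days = full_cluster_days(CPU_YEARS, CORES)

print(f"Average per analysis: {per_analysis:.1f} CPU days")
print(f"Whole cluster, fully used: {cluster_days:.1f} days")
```

Under these assumptions, a single analysis averages roughly two CPU weeks, and the full workload corresponds to several months of the entire cluster running at capacity.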