See-Kiong Ng

Automating Computational Molecular Genetics: Solving the microsatellite genotyping problem
Degree Type: Ph.D. in Computer Science
Advisor(s): Mark Perlin
Graduated: May 1998

Abstract:

The Human Genome Project has extended the reach of modern genetics by providing an infrastructure of high-resolution genetic maps. Scientists can now find genes using these maps by genotyping -- experimentally assaying the genome at mapped genetic markers. To track the inheritance patterns of a genetic disorder, individual genomes are genotyped at high resolution using densely distributed genetic markers, such as microsatellites. However, because of the complexity associated with the inheritance patterns of most common human genetic diseases, hundreds of thousands of genotyping experiments are typically required to genetically localize even one disorder on the genome.

The full automation of microsatellite-based genotyping is currently limited by the human scoring bottleneck: every experiment must be viewed by a human eye. The intricate genotyping data, densely multiplexed for throughput, is confounded by intrinsic data artifacts such as PCR stuttering. Human experts are required to visually decipher the highly complex data patterns that result. It is estimated that over half the cost of microsatellite-based genotyping is due to this human scoring effort.

We have developed and implemented novel computer-based analysis methods that computationally solve the various problems associated with the microsatellite scoring bottleneck. Our system, FAST-MAP, is a platform-independent, fully automated genotyping system that accurately calls alleles from quantitative microsatellite data. FAST-MAP has been extensively tested and used by scientists worldwide to generate genotypes with high accuracy from real data generated in high-throughput genetic laboratories. With FAST-MAP, we have shown that by appropriately modeling and representing genotype data, powerful computational strategies can overcome key molecular biology bottlenecks and significantly advance the rapid localization of genes across the whole human genome.

Thesis Committee:
Mark W. Perlin (Chair)
Scott E. Fahlman
James H. Morris
Robert E. Ferrell (Department of Human Genetics, University of Pittsburgh)

James Morris, Head, Computer Science Department
Raj Reddy, Dean, School of Computer Science

Keywords:
Artificial intelligence, automation software, biotechnology, computational biology, molecular genetics, microsatellite genotyping, pattern matching, FAST-MAP

CMU-CS-98-105.pdf (1.93 MB) (401 pages)