Dr. Wu Receives NSF Funding for Genomic Models

Assistant professor of Computer Science & Engineering Dr. Yufeng Wu and two collaborators from the University of California – Davis captured a three-year grant totaling $900,000 from the National Science Foundation to develop algorithms that will allow researchers to better understand the origin and manifestations of genomic traits and to develop more accurate biological models. UConn will receive $300,000 of the grant monies.

Dr. Wu, who joined the University of Connecticut in 2007, is collaborating with Drs. Dan Gusfield, a computer scientist, and Charles Langley, a population geneticist. Their initial work will involve gaining a clear understanding of the array of genetic variables and networks that produced the traits seen in current populations.

A genome is the complete set of hereditary information encoded in an organism's DNA. The genetic variability evident within species today arises from genetic mutations and from meiotic recombination, in which paired chromosomes exchange segments during meiosis, the cell division that halves the total number of chromosomes, yielding a chromosome with a unique combination of parental material. This recombinant chromosome is then passed on to the next generation. This complex process makes it impossible to characterize a genetic history using the traditional tree methodology, according to Dr. Wu. Instead, researchers represent the derivation of genomes using an Ancestral Recombination Graph (ARG). In developing an accurate algorithmic model, Dr. Wu and his colleagues will incorporate concepts from cell biology, computer science, graph theory, mathematics, and algorithm and software engineering.
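For readers who want a concrete sense of why a single tree can fail, the sketch below applies the four-gamete test, a standard population-genetics check. It is not described in the article as part of Dr. Wu's method; it is included only as an illustration. It assumes binary sites that each mutated at most once (the infinite-sites assumption). Under that assumption, two sites that show all four combinations 00, 01, 10 and 11 across the sampled chromosomes cannot both fit one tree, so at least one recombination must have occurred between them. The function name and the toy data are hypothetical.

```python
from itertools import combinations


def incompatible_site_pairs(haplotypes):
    """Return the pairs of site indices (i, j) that show all four gametes.

    haplotypes: list of equal-length strings of '0'/'1', one per sampled
    chromosome (an assumed toy encoding, not a real file format).
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    pairs = []
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct two-site combinations seen in the sample.
        gametes = {(h[i], h[j]) for h in haplotypes}
        # With binary sites, four distinct combinations means 00, 01, 10, 11.
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


# Hypothetical toy sample: four chromosomes, three binary sites.
toy_sample = ["000", "010", "100", "110"]
print(incompatible_site_pairs(toy_sample))  # [(0, 1)]
```

In this toy sample, sites 0 and 1 cannot be explained by one tree without recombination, which is the kind of signal an ARG is designed to represent.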

Dr. Wu explained that the key computational problem addressed in the project is “how to infer the hidden genealogical history of DNA sequences from ‘unrelated’ individuals. DNA sequences of individuals from a given population are related; however, the number of possible evolutionary histories of these DNA sequences is huge and thus, figuring out which histories make more biological sense is a challenging problem. This project aims to develop effective methods to reconstruct the genealogical history for the DNA sequences of the current population. The developed methods may have many applications, such as locating disease-causing genes.”
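To give a rough sense of how quickly the number of candidate histories grows, the sketch below counts only the simplest kind of history: rooted binary trees whose leaves are labelled by the sampled sequences. For n sequences that count is (2n-3)!!, the product of the odd numbers up to 2n-3. This is an illustration rather than a figure from the project, and it understates the problem, since histories that include recombination are not trees and are more numerous still. The function name is hypothetical.

```python
def rooted_binary_tree_count(n_leaves):
    """Count rooted binary trees with n labelled leaves: (2n-3)!!."""
    if n_leaves < 1:
        raise ValueError("need at least one sequence")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (empty product for n <= 2).
    for k in range(3, 2 * n_leaves - 2, 2):
        count *= k
    return count


for n in (4, 10, 20):
    print(n, rooted_binary_tree_count(n))
```

Ten sequences already admit 34,459,425 such trees, which is why exhaustively listing histories is impractical for realistic sample sizes.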

In seeking to validate their algorithmic model, the team will compare its predictions against a combination of simulated data and genuine biological data gathered from globe-spanning genetic studies, such as the International HapMap Project and the 1000 Genomes Project. The HapMap Project is a collaboration among researchers in Canada, China, Japan, Nigeria, the UK and the U.S. whose aim is to develop a haplotype map of the human genome that accurately describes the common patterns of human genetic variation. The 1000 Genomes Project is another international collaboration that aims to sequence the genomes of at least a thousand people from around the world.

Dr. Wu’s work in developing accurate algorithms will offer researchers insights into biological problems and will enable them to map genes that are linked to diseases and important “economic” traits.