A Deep Dive into the Gene Pool

A UConn engineer and his lab have won funding from NSF to develop robust computational models for ancestry hunting using DNA.

Young Asian woman seated outside, smiling, looking at an iPad.

Image by Jess Foami from Pixabay.

Genealogy is the second most popular hobby in the United States after gardening, so it comes as no surprise that ancestry hunting has become a billion-dollar industry. Recently, commercial vendors like Ancestry and 23andMe have attracted millions of customers seeking to learn more about their family history and genetic makeup.

Despite the popularity of DNA ancestry tests and of technologies that collect genetic data, computational methods for turning that data into usable ancestry information are still lacking.

Yufeng Wu, an associate professor in the Department of Computer Science and Engineering at the University of Connecticut, has received a $411,141 grant from the National Science Foundation to develop computational methods for DNA ancestry inference.

Wu and his team plan to build on their previous research, which resulted in the novel PedMix approach. PedMix, which was developed in collaboration with Rasmus Nielsen, a population geneticist at the University of California, Berkeley, allows for inferences about the ancestry of recent ancestors, such as parents and grandparents, from a living individual’s genome.

For instance, imagine an adopted child who would like to know more about her heritage. Existing approaches would reveal only the genetic composition of the child, not the heritage of her biological parents. PedMix, on the other hand, can perform such inference, letting the adopted child learn her parents' particular genetic makeup in addition to her own.

Thus, PedMix will allow for a more complete genealogy report on an individual. Development of this technology was spearheaded by Jingwen Pei, a former graduate student in the Wu lab. Pei went on to work as a computational biologist at Ancestry.

This project will allow the Wu lab to improve the performance of the PedMix approach, so that it produces more accurate results and is more applicable to real genetic tests. Ultimately, the team also aims to develop inference methods that can learn ancestry information for more distant ancestors.

Normally, as a single individual’s ancestry is traced back over more generations, the person's genome contains less and less information about any one ancestor. Improved computational methods developed by Wu and his team will allow more information to be gathered about the genes of an individual’s ancestors. As part of this research, the Wu lab will also evaluate the effectiveness of new ancestry inference formulations that have not been thoroughly studied before.
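
To see why distant ancestors are harder to study, consider some back-of-the-envelope arithmetic. Each generation back doubles the number of ancestors and, on average, halves the share of the genome that comes from any one of them. The short Python sketch below works through those numbers. It is only an illustration of this general rule and is not part of PedMix; the function name is hypothetical, and the figures are expected averages, since the actual share inherited from a given ancestor varies from person to person.

```python
from fractions import Fraction


def expected_fraction(generations: int) -> Fraction:
    """Expected share of a person's autosomal genome that comes from a
    single ancestor the given number of generations back (1 = a parent).

    Hypothetical helper for illustration only: this is the textbook
    average of 1 / 2**generations, not a PedMix computation.
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return Fraction(1, 2 ** generations)


if __name__ == "__main__":
    # Walk back from parents (1) to third-great-grandparents (5).
    for g in range(1, 6):
        share = expected_fraction(g)
        print(
            f"{g} generation(s) back: {2 ** g} ancestors, "
            f"an expected {share} of the genome from each"
        )
```

By five generations back, a person has 32 ancestors and each is expected to contribute only 1/32 of the genome, which is why inference about more distant ancestors calls for more powerful methods.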

By studying and developing PedMix and other methods, researchers can build more accurate algorithms and better software tools. The work could also pave the way for more accurate ancestry inferences from genomic data collected in large-scale initiatives.

Wu and his team plan to release these software tools to the research community. They hope this will facilitate innovative biological applications in DNA ancestry inference.

Yufeng Wu is an associate professor in the Department of Computer Science and Engineering at UConn. Wu received his Ph.D. from the University of California, Davis. His research lies in computational biology and bioinformatics, and he is also interested in issues related to phylogenetics and high-throughput sequencing. Currently, Wu’s research focuses on computational problems and population genomics.

NSF Grant #1909425
