Using Emerging Science to Tackle Emerging Disease

A new course in bioinformatics trains students on real-world data in an emerging scientific discipline

A computer-generated illustration of a coronavirus microbe

(Getty Images)

As we’ve all grown familiar with living our lives virtually, a new course lets students learn more about the epidemiology of infectious diseases like COVID-19 without ever setting foot in a lab.

Dong-Hun Lee, assistant professor of pathobiology and veterinary science in the College of Agriculture, Health and Natural Resources, has created a course on bioinformatics in molecular epidemiology of infectious diseases that gives students practical skills in the use of bioinformatics technologies for infectious disease epidemiology. The course provides hands-on training in scientific computing and data analysis, so the students understand the enormous potential of bioinformatics and genomics in molecular epidemiology of infectious diseases.

Bioinformatics is a relatively new discipline that combines mathematics, data science, and biology to help answer biological questions. It involves the development of software tools and algorithms to analyze and interpret biological data.

Two asian men seated looking at computer screens
Dong-Hun Lee, assistant professor of pathobiology in the College of Agriculture, Health and Natural Resources, with graduate student Junwon (Scott) Kim. (CAHNR photo)

“Bioinformatics, a computational biology approach, is central to biological research,” Lee says. “Because of a shortage of skilled biologists, both industry and public health sectors have been hard-pressed for trained graduates in computational biology, and as the field of biological and biomedical sciences has become increasingly data intensive, scientists must use a variety of computational tools to analyze, interpret, store, and share large amounts of biological data.”

Students will learn sequence analysis, databases, alignment, phylogenetic analysis, and visualization. These skills will allow them to trace the development of infectious diseases, track mutations, and understand viral transmission. The course should be of interest to students in various majors across UConn.
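To give a sense of what tracking mutations can look like in practice, here is a minimal Python sketch that compares an aligned sample sequence against a reference and lists the positions where they differ. It is an illustration only, not material from the course: the function names `find_substitutions` and `p_distance` and the two short toy sequences are assumptions made up for this example, and real analyses work on full genomes with dedicated alignment tools.

```python
from typing import List, Tuple


def find_substitutions(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Both sequences are assumed to be already aligned, so they must have
    the same length. Gap ("-") and ambiguous ("N") sites are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1):
        if ref_base in "-N" or sample_base in "-N":
            continue  # no usable comparison at this site
        if ref_base != sample_base:
            substitutions.append((position, ref_base, sample_base))
    return substitutions


def p_distance(reference: str, sample: str) -> float:
    """Proportion of compared sites that differ (a simple genetic distance)."""
    compared = sum(
        1 for r, s in zip(reference, sample) if r not in "-N" and s not in "-N"
    )
    if compared == 0:
        raise ValueError("no comparable sites")
    return len(find_substitutions(reference, sample)) / compared


if __name__ == "__main__":
    # Hypothetical toy sequences, not real viral data.
    reference = "ATGGCGTACGTTAG"
    sample = "ATGACGTACGCTAG"
    for position, ref_base, sample_base in find_substitutions(reference, sample):
        print(f"{ref_base}{position}{sample_base}")
    print(f"p-distance: {p_distance(reference, sample):.3f}")
```

Pairwise distances of this kind, computed across many sequences, are one of the simpler inputs that phylogenetic methods can use to group related samples.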

In a year when the country has been dealing with the COVID pandemic, advanced training for future infectious disease experts seems even more imperative. Lee drives home the relevance of the skills by immersing students in real data from infectious diseases that have disrupted society in recent years.

“I use real-world infectious diseases datasets, including COVID-19 and pandemic H1N1/2009 influenza virus, to teach students how to analyze the data for tracking their origin, evolution, and transmission,” Lee says.

Lee collected old hardware components from UConn surplus and assembled them into more than twenty machines for basic bioinformatics analyses, cloud computing to supercomputer servers, and general bioinformatics teaching purposes. One of his goals is to teach students the basics of computer assembly and configuration, open-source operating system installation, and the software requirements of bioinformatics applications.

Lee points out that many of these innovative tools are filling gaps in knowledge. Traditional scientific publishing practices cannot handle rapid dissemination of data. One such tool is Nextstrain, an online open-source project that uses genome data to monitor the evolution of viruses. This website provides real-time snapshots of a pathogen and is available to everyone from virologists and epidemiologists to public health officials. Nextstrain has been involved in following the COVID pandemic, tracking thousands of COVID-19 genomes to trace the outbreak and detect new mutations.

“This technology is transforming how laboratories diagnose and further characterize pathogens,” he explains. “Interdisciplinary combinations of genome sequencing, bioinformatics, and molecular epidemiology provide a platform to rapidly identify unknown or unexpected pathogens, identify the origins of an outbreak, and track spatiotemporal patterns of disease transmission. Much of this information would either remain unknown or take a very long time to reveal using traditional microbiology and epidemiology.”

During the pandemic, when many classes went remote, Lee developed step-by-step demonstrations with screen captures to familiarize students with the technology. Students are able to use their laptops to analyze small genome sequencing datasets using bioinformatics tools such as the Galaxy platform, an online open-source scientific analysis platform that is used by tens of thousands of scientists across the world to analyze large genome datasets.

“Molecular epidemiology of infectious disease is still an emerging field,” he continues. “The goal of this course is to provide students with practical computational biology skills useful in infectious disease research to enable future biologists without specialized bioinformatics expertise to assemble and analyze viral genomes from sequencing data.”