Although much of the human genome has been sequenced and assembled, scientists have hit roadblocks trying to map unassembled regions of DNA that consist mostly of repetitive sequences, including the centromere.
Now, for the first time, researchers from the University of Connecticut and University of Rochester have sequenced all the centromeres in a multicellular organism.
Published in the journal PLOS Biology, the study on fruit flies sheds light on a fundamental aspect of biology, and shows that genetic elements may play a larger role in centromere function than researchers previously thought.
“Centromeres continue to be widely considered the ‘black hole’ of genomics,” says Barbara Mellone, associate professor of molecular and cell biology at UConn and lead author on the study. “We break through these barriers and leverage the power of single molecule long-read sequencing and chromatin fiber imaging to discover the detailed organization of the centromeres.”
The fruit fly, Drosophila melanogaster, is one of the most revered model organisms in biology: a species that has been studied extensively in the lab for a long time, both to better understand its biology and to apply those lessons to human health. For centromere biology, Drosophila is especially powerful because it has only four pairs of chromosomes, as opposed to the 23 pairs in humans, and its centromeres are smaller than human centromeres and thus easier to sequence and assemble.
If centromeres, which are vital for cell division, don’t function properly, cells may divide with too few or too many chromosomes, a condition known as aneuploidy that can result in disorders like Down syndrome or contribute to tumor progression.
In many species, including humans, centromeres are often found near the center of the chromosome, embedded in large blocks of repetitive DNA known as satellite DNA. Satellite DNA and, in turn, centromeres are challenging to sequence because of their repetitive nature. When mapping a genome, traditional sequencing methods chop strands of DNA into pieces and read them, then try to infer the order of those pieces and assemble them back together. But pieces of repetitive DNA all look the same, so assembling them is like trying to put together a puzzle with nearly identical pieces. To solve this long-standing puzzle, the researchers combined their expertise in chromatin and repetitive DNA biology.
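For readers who want to see the puzzle problem concretely, here is a minimal toy sketch in Python. It is not the method used in the study, and the sequences, lengths, and the `shred` helper are made up for illustration. It builds two hypothetical genomes that differ only in where a short unique "island" sits inside a repeat array, then compares the pieces produced by short and long reads.

```python
from collections import Counter


def shred(genome: str, read_len: int) -> Counter:
    """Return the multiset of every window of length read_len in genome.

    Hypothetical helper: real sequencing samples reads randomly and with
    errors; here we take every possible error-free read for simplicity.
    """
    return Counter(
        genome[i:i + read_len] for i in range(len(genome) - read_len + 1)
    )


# Toy sequences, assumed for illustration only.
SATELLITE = "AATAT"   # repeat unit of the satellite array
ISLAND = "GCGTCCGA"   # short non-repetitive "island"

# Same ingredients, but the island sits at a different place in the array.
genome_a = SATELLITE * 3 + ISLAND + SATELLITE * 5
genome_b = SATELLITE * 5 + ISLAND + SATELLITE * 3

# Short reads: both genomes yield exactly the same pieces, so the reads
# alone cannot tell an assembler where the island belongs.
short_reads_match = shred(genome_a, 4) == shred(genome_b, 4)

# Long reads: some reads span the island plus enough flanking repeat
# to reveal how the two genomes differ.
long_reads_match = shred(genome_a, 30) == shred(genome_b, 30)

print(short_reads_match)
print(long_reads_match)
```

In this toy case the short reads are indistinguishable between the two genomes, while the 30-letter reads are not, because a read longer than the island plus its shorter flank captures the layout around it. Real centromeres are vastly larger and real reads contain errors, so this only illustrates the idea.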
Contrary to previous thought, the fruit fly centromeres are in fact made up of “islands” of complex DNA enriched in retroelements. These complex islands are embedded deep in satellite arrays, which hampered their discovery for more than two decades, say the researchers.
Sequencing the most repetitive parts of genomes is one of the “last frontiers of genome assembly,” says Amanda Larracuente, an assistant professor of biology at Rochester, and co-lead author.
Researchers recently presented their findings at the Centromere Biology Gordon Conference and the GSA Early Career Scientist Symposium “Cracking the Repetitive DNA Code.”
“The approaches we describe will be foundational for the discovery of centromeres in other animals,” says Mellone.
Other authors include
The study was supported by grants from the National Institutes of Health (NIH R01 GM108829, NIH R35 GM119515); National Science Foundation (NSF 1330667, DP1GM106412, R01HD091797, R01GM123289-01); and a Damon Runyon Cancer Research Foundation Howard Hughes Medical Institute Fellowship.