In the mountain forests of California and Oregon, a tiny foe is threatening the towering sugar pine. An invasive fungus known as white pine blister rust is jeopardizing the survival and reproduction of sugar pines, harming both the ecosystem and the industries that depend on the tallest pines in the world.
Conservation work has suggested that genetic variation affects how susceptible individual trees are to infection. But efforts to identify the genes involved were complicated by the enormous size of the sugar pine’s genome: at 31 billion base pairs, it is 10 times the size of the human genome and the largest genome that scientists have attempted to sequence.
Now a group of researchers, including UConn assistant professor Jill Wegrzyn, co-PI on the study, and postdocs Uzay Sezen, Daniel Gonzalez-Ibeas, and Robin Paul, has announced the complete sequence of the sugar pine genome, a significant technological feat that opens the way for researchers to develop an effective method to fight the fungus based on genetic resistance.
The study, published in the December issue of GENETICS (where it was featured on the cover) along with a companion paper in G3: Genes, Genomes, Genetics, highlights the evolutionary implications of such a massive genome. It also reveals candidate genes for blister rust resistance and a promising path to efficient selection of resistant individuals.
Despite its size, the sugar pine genome contains about the same number of protein-coding genes as the human genome. Its bulk comes instead from transposable elements, which make up no less than 79 percent of its DNA. These genetic parasites are stretches of DNA that exist only to proliferate within a genome. Rather than contributing to the sugar pine’s phenotype, they encode machinery that lets them make copies of themselves at new sites in the genome.
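To put those proportions in perspective, here is a rough back-of-the-envelope calculation using only the figures quoted in this article (31 billion base pairs, 79 percent transposable elements, 10 times the human genome). The variable names are illustrative, and the rounded inputs mean the results are approximate.

```python
# Rounded figures quoted in the article; results are approximate.
GENOME_BP = 31_000_000_000      # sugar pine genome size in base pairs
TE_PERCENT = 79                 # share made up of transposable elements
HUMAN_RATIO = 10                # sugar pine genome is ~10x the human genome

# Integer arithmetic avoids floating-point rounding on large counts.
te_bp = GENOME_BP * TE_PERCENT // 100
other_bp = GENOME_BP - te_bp
human_bp = GENOME_BP // HUMAN_RATIO

print(f"transposable elements: ~{te_bp / 1e9:.1f} billion bp")
print(f"everything else:       ~{other_bp / 1e9:.1f} billion bp")
print(f"implied human genome:  ~{human_bp / 1e9:.1f} billion bp")
```

On these numbers, transposable elements alone account for roughly 24.5 billion base pairs, about eight times the implied size of the entire human genome.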
Transposable elements are common in all eukaryotic genomes, but in conifers, and especially the sugar pine, they have multiplied to enormous numbers. In the sugar pine genome, the transposable elements are mostly non-functional relics. These genomic leftovers can tell researchers about the evolutionary history of the sugar pine, and provide insights into how genome size evolves. But they also create substantial problems for researchers trying to work with the sugar pine genome.
Transposable elements are highly repetitive, and when they are present in numbers as large as in the sugar pine, they are extremely difficult to sequence. Whole genome sequencing generally works by breaking a genome up into extremely small pieces and then putting the pieces back together one by one. Repetitive genetic sequences make this process incredibly difficult, because when the pieces are assembled, all the repeats look the same and end up incorrectly merged into a single sequence.
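The effect of repeats on assembly can be illustrated with a toy example. The sketch below is not the method the sugar pine team used. It assumes a made-up 46-base "genome" containing three copies of a hypothetical 10-base repeat, cuts it into overlapping 6-base fragments, and shows that fragments from inside the repeat are identical across copies, so an assembler cannot tell which flanking sequence should follow.

```python
from collections import Counter

def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

# Hypothetical toy data: one 10-base "transposable element" copied three times,
# separated by short unique stretches.
REPEAT = "ACGTTGCAAG"
GENOME = "TTCA" + REPEAT + "GGAC" + REPEAT + "CATG" + REPEAT + "AGTC"
K = 6  # fragment length, shorter than the repeat

fragments = kmers(GENOME, K)
counts = Counter(fragments)

# Fragments seen more than once cannot be traced back to a single copy.
ambiguous = {frag for frag, n in counts.items() if n > 1}

def successors(frag, pool):
    """Distinct fragments in pool that overlap frag by len(frag) - 1 bases."""
    return sorted(f for f in pool if f[:-1] == frag[1:])

# The last fragment inside the repeat has several valid continuations.
last_in_repeat = REPEAT[-K:]
branches = successors(last_in_repeat, set(fragments))

print(len(fragments), "fragments,", len(counts), "distinct")
print("fragments shared by all repeat copies:", len(ambiguous))
print("ways to continue past the repeat:", branches)
```

Because the last fragment of the repeat overlaps equally well with three different continuations, an assembler working only from short fragments has no basis for choosing among them, and the three copies tend to collapse into one. Fragments longer than the repeat, or paired reads spanning it, would resolve the ambiguity.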
To get around this problem, the researchers assembling the sugar pine genome used several strategies. They obtained most of the sequence data from a single haploid pine nut, avoiding the typical complications of sequencing two parental genomes in a diploid individual. They sequenced the transcriptome to identify those sequences that produce proteins, and then used those sequences to assemble the corresponding genes. They also used sequencing libraries specially prepared so that paired reads are known to lie a large distance apart in the genome, which helps link larger genomic structures and provides the big picture. These techniques, and others, allowed the researchers to build a working draft of the sugar pine genome.
Sequencing an entire genome, especially one as large as the sugar pine’s, is an impressive technological achievement. More importantly, it is a powerful research tool in the fight against white pine blister rust, a fungus that has been infecting multiple species of white pines in North America since it was accidentally introduced from Asia about a century ago.
White pine blister rust is a slow killer, taking years to destroy a large tree. An infection begins when fungal spores land on the surface of the tree and begin to germinate. They grow through openings into the twigs and branches, and very slowly make their way toward the main trunk of the tree. The infected branches swell up and large sacs of rusty orange-red spores burst through the branches. The fungal infection causes cankers, which prevent the tree from sending water and nutrients to its damaged limbs. Eventually, these limbs die. If cankers form on the main trunk, the entire tree may die.
Researchers and forest managers have been looking for a way to fight the spread of white pine blister rust for a long time. Some rare sugar pines carry genetic resistance to white pine blister rust, and have been used in reforestation efforts. In the 1970s, these rare individuals were used to identify a major locus of resistance called Cr1, but the daunting size of the sugar pine genome made further analysis difficult.
Using this new genome sequence, the research team was able to make a breakthrough in identifying this gene. They used the small amount of genetic information already known to find large Cr1-associated segments and identify previously unknown SNPs (single-nucleotide polymorphisms, a common type of genetic variation) that are closely associated with resistance. These markers can be used to quickly and cheaply identify trees carrying the resistant allele. Resistant trees can then be harvested for seeds to be used in reforestation.
Now armed with a roadmap, scientists can search the sugar pine genome for clues to help save these iconic trees and the ecosystems that depend on them.
The research team was led by Kristian Stevens of the University of California-Davis and UConn’s Wegrzyn, and included researchers from Children’s Hospital Oakland Research Institute, University of Florida, Johns Hopkins University, University of Maryland, Texas A&M University, USDA Forest Service Pacific Southwest Research Station, and Virginia Commonwealth University. The PineRefSeq-funded project, which is sequencing the genomes of the loblolly pine and the Douglas fir as well as the sugar pine, is led by David Neale at University of California-Davis.
This article was first published online in Genes to Genomes, a blog from the Genetics Society of America. Minor edits have been made to the post on UConn Today.