Published on GenomeWeb / December 8, 2016
NEW YORK (GenomeWeb) – A University of Connecticut and University of California, Davis-led team of researchers has sequenced both the genome and transcriptome of the sugar pine.
In a pair of studies appearing in Genetics and G3: Genes, Genomes, Genetics, the researchers report on the 31-billion-basepair genome of Pinus lambertiana Douglas and the transcriptomes of about a dozen of its tissues. With this data, the researchers explored the region harboring a pathogen resistance gene as well as lineage-specific Dicer-like proteins within the sugar pine that may give insight into its oversized genome.
“The recent availability of a draft P. lambertiana genome sequence, coupled with transcriptomics, offers opportunities to study basic questions about the biology of conifers as it relates to genome evolution and gene expression,” UConn’s Jill Wegrzyn and her colleagues wrote in the G3: Genes, Genomes, Genetics paper.
The UC Davis researchers first announced last year that they’d sequenced the California sugar pine. It’s one of the world’s tallest trees, reaching 76 meters in height, and may have a lifespan of more than 500 years. In recent decades, though, it has suffered damage from white pine blister rust, caused by the fungus Cronartium ribicola.
In their Genetics paper, UC Davis’ Charles Langley and his colleagues wrote that to sequence P. lambertiana, they adapted the approach they’d used to sequence the loblolly pine genome. In particular, they used haploid DNA from a single sugar pine megagametophyte to serve as the basis for their assembly, which they then filled in using mate-pair library reads from diploid needle tissue. From this, they generated some 1.9 trillion basepairs of sequence, reflecting 62X coverage of the P. lambertiana genome, which they estimated to be 31 gigabasepairs in size.
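As a rough check on those figures, coverage is simply the total sequenced bases divided by the genome size. The sketch below uses the rounded numbers quoted in this article, and the function name is hypothetical rather than anything from the paper, so it lands slightly under the reported 62X.

```python
def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Return average fold coverage: sequenced bases per base of genome.

    Hypothetical helper for illustration; not taken from the study.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Rounded figures as quoted in the article (assumed, not the exact values
# used by the researchers).
total_bases = 1.9e12   # about 1.9 trillion basepairs generated
genome_size = 31e9     # about 31 gigabasepairs estimated genome size

depth = sequencing_depth(total_bases, genome_size)
print(f"Approximate coverage: {depth:.1f}X")
```

The small gap between this estimate and the reported figure is what one would expect when both inputs have been rounded for publication.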
Much of the sugar pine genome, though, is repeats. Transposable elements make up 79 percent of the P. lambertiana genome, slightly higher than the 74 percent found in loblolly pine, the researchers reported. Of those transposable elements, two-thirds are long terminal repeat (LTR) retrotransposons. The researchers estimated the median LTR insertion time for P. lambertiana to be 16 million years ago, more recent than that of loblolly.
This high number of repeats and their age give credence, the researchers said, to the hypothesis that the sugar pine genome, like that of other conifers, got to its massive state by undergoing transposable element expansions.
While sugar pines in general have suffered from C. ribicola damage, some trees have resistance to the fungus. Previous work has mapped a biallelic locus, Cr1R/Cr1r, that confers resistance, and Langley and his colleagues used their newly generated genome sequence to uncover SNPs in association with Cr1R. They uncovered 14 genes annotated on the scaffolds genetically linked to Cr1R, one of which — PILA_lg017786 — stood out to the researchers as a candidate gene as it contains domains typically found in disease-resistance genes.
The researchers noted that being able to use SNP genotyping to uncover resistant trees would speed up reforestation efforts.
Meanwhile, in G3: Genes, Genomes, Genetics, UConn’s Wegrzyn and her colleagues reported that they conducted deep sequencing of RNA from tissues representing the tree’s embryo state, female cones nearing pollination, stems, and roots, among others, to generate the sugar pine transcriptome.
The researchers used a combination of Illumina MiSeq, HiSeq, and Pacific Biosciences sequencing to pull together the transcriptome. Overall, they uncovered 278,812 transcripts, 30,839 of which could be functionally annotated.
Because Dicer-like (DCL) proteins play a role in transposable element proliferation, which might in turn have influenced the hefty size of conifer genomes, Wegrzyn and her colleagues focused part of their analysis on them.
Within the P. lambertiana transcriptome, the researchers uncovered 12 transcripts that exhibited similarity to DCLs, six of which were supported by gene models. Through a phylogenetic analysis drawing on conifers, monocots, dicots, and an outgroup, the researchers identified DCLs that the sugar pine shared with other trees as well as DCLs it shared only with other conifers.
In sugar pine, conventional DCL1 transcripts and one DCL4 transcript were found across all samples the researchers analyzed, while the other DCL4 transcripts were found in cones and DCL3 expression was restricted to reproductive tissues. The profiles of conifer-specific DCL1 transcripts, they added, varied: one was barely expressed, one was ubiquitously expressed, and one had a differential profile in reproductive tissues.
“Expression analysis derived from sequencing data further supports a biological role of these variants,” the researchers said in their paper. “The results presented here highlight the peculiarities of this pathway in conifers and identifies similarities with ancient land plants.”