Sequencing Cannabis Genomes

Written by Lance Griffin

We all have a genome, which includes our DNA and genes. The genome is an information super-source. It is the special code that directs the growth, functioning, and maintenance of living organisms. Variation of about 0.1% in the human genome is largely what makes individuals unique. We’re not so different, after all.

Plants also have genomes, and genomic variation of Cannabis sativa L. is the main reason dispensaries are loaded with unique chemovars. It is also the reason hemp has low delta-9-tetrahydrocannabinol (THC). Cannabis genetics are paramount for cultivators. Accessing the cannabis plant genome represents a new dawn of precision cultivation.

A 2020 investigation (a preprint at the time of writing) involving 15 researchers across four universities sequenced and annotated the genomes of cannabis samples. [1] Sequencing refers to identifying the order of the DNA molecule’s four bases: adenine (A), thymine (T), cytosine (C), and guanine (G). Annotation means analyzing and labeling those sequences to glean biological meaning.
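To make the two terms concrete, here is a minimal sketch in Python. It treats a sequenced read as a string of the four bases, counts each base, and then does a toy version of annotation by locating a short motif. The read, the function names, and the choice of motif are illustrative assumptions, not data or methods from the study.

```python
from collections import Counter

BASES = "ATCG"  # adenine, thymine, cytosine, guanine


def base_composition(sequence):
    """Count how often each of the four bases appears in a DNA read."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(BASES)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    counts = Counter(sequence)
    return {base: counts.get(base, 0) for base in BASES}


def find_motif(sequence, motif):
    """Return the 0-based start positions of every match of motif."""
    sequence, motif = sequence.upper(), motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)  # allow overlapping matches
    return positions


# Made-up example read (hypothetical, not real cannabis sequence).
read = "ATGGCGTACGATGCTAG"

print(base_composition(read))  # {'A': 4, 'T': 4, 'C': 3, 'G': 6}
print(find_motif(read, "ATG"))  # [0, 10]
```

Real annotation pipelines go far beyond motif matching, but the idea is the same: the sequence is raw text, and annotation attaches meaning to positions within it.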

The study explored how cannabinoid synthesis affects pathogen resistance in cannabis at the genetic level. [1] The researchers initially shotgun sequenced and annotated cannabis relatives — a sibling pair (Jamaican Lion) and six offspring. These reference plants supplied “the first type II cannabis genome assembled containing both functional THCAS [(tetrahydrocannabinolic acid synthase)] and CBDAS [(cannabidiolic acid synthase)] alleles.” Relatively new technology allowed for “longer reads” that “greatly simplify the analysis of gene family expansions in cannabinoid and terpene synthase gene families.”

The researchers used the aforementioned plant genomes as controls to explore the genomes of 40 additional cannabis or hemp cultivars spanning males (9), females (29), and hermaphrodites (2). The cultivators of the cannabis samples provided reports on resistance to powdery mildew (PM). Several basic findings emerged at the intersection of these reports and the genomic analysis.

One of the more nuanced but provocative discoveries concerns Type III cannabis (high cannabidiol (CBD), low THC), which includes hemp bred to produce less than 0.3% THC for legal purposes.

These plants lack a functional THCAS gene and carry a CBDAS gene instead. But they may also carry CBCAS (cannabichromenic acid synthase) genes. The authors speculate that the CBCAS gene cluster explains how hemp plants continue to produce small amounts of “promiscuous” THC, as CBCAS “may produce THCA as a byproduct.”

Deleting CBCAS genes opens the door to another problem, namely that this genetic cluster also houses certain pathogen response genes. The researchers hypothesized that “[b]reeding for less than 0.3% THCA production may enrich for pathogen susceptibility and higher patient fungal exposure.” In other words, selective breeding for low-THC hemp may weaken its pathogen resistance.
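The chemotype labels used above can be sketched as a simple lookup on which synthase genes are functional. This is a simplified illustration, not the authors’ method: the function name and its boolean inputs are hypothetical, and the Type I label (functional THCAS only) comes from common usage rather than from this article.

```python
def classify_chemotype(has_functional_thcas, has_functional_cbdas):
    """Map functional synthase genes to a chemotype label (simplified)."""
    if has_functional_thcas and has_functional_cbdas:
        return "Type II"   # both functional, as in the Jamaican Lion reference
    if has_functional_cbdas:
        return "Type III"  # high CBD, low THC; includes hemp
    if has_functional_thcas:
        return "Type I"    # assumption: THC-dominant, label not from the article
    return "unclassified"  # neither gene functional; outside this sketch


print(classify_chemotype(True, True))   # Type II
print(classify_chemotype(False, True))  # Type III
```

The sketch deliberately ignores CBCAS. As the authors suggest, a Type III plant may still produce trace THC through that cluster, so a gene-level label does not guarantee zero THC.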

Overall, the research represents a concrete step toward unlocking the power of genomics in cannabis cultivation. The study authors conclude that such genomic mapping can “guide more stable and directed breeding efforts for desired chemotypes and pathogen-resistant cultivars.” [1]


  1. McKernan KJ, et al. “Sequence and Annotation of 42 Cannabis Genomes Reveals Extensive Copy Number Variation in Cannabinoid Synthesis and Pathogen Resistance Genes.” bioRxiv [preprint], 2020.
