The Genetic Structure of Marijuana and Hemp, Jason Sawler et al., 2015

The Genetic Structure of Marijuana and Hemp

Jason Sawler, Jake M. Stout, Kyle M. Gardner, Darryl Hudson, John Vidmar,
Laura Butler, Jonathan E. Page, Sean Myles

PLoS ONE, 2015, 10(8), e0133292.

doi: 10.1371/journal.pone.0133292


Despite its cultivation as a source of food, fibre and medicine, and its global status as the most used illicit drug, the genus Cannabis has an inconclusive taxonomic organization and evolutionary history. Drug types of Cannabis (marijuana), which contain high amounts of the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC), are used for medical purposes and as a recreational drug. Hemp types are grown for the production of seed and fibre, and contain low amounts of THC. Two species or gene pools (C. sativa and C. indica) are widely used in describing the pedigree or appearance of cultivated Cannabis plants. Using 14,031 single-nucleotide polymorphisms (SNPs) genotyped in 81 marijuana and 43 hemp samples, we show that marijuana and hemp are significantly differentiated at a genome-wide level, demonstrating that the distinction between these populations is not limited to genes underlying THC production. We find a moderate correlation between the genetic structure of marijuana strains and their reported C. sativa and C. indica ancestry and show that marijuana strain names often do not reflect a meaningful genetic identity. We also provide evidence that hemp is genetically more similar to C. indica type marijuana than to C. sativa strains.


Cannabis is one of humanity’s oldest crops, with records of use dating to 6000 years before present. Possibly because of its early origins, and due to restrictions on scientific inquiry brought about by drug policy, the evolutionary and domestication history of Cannabis remains poorly understood. Hillig (2005) proposed on the basis of allozyme variation that the genus consists of three species (C. sativa, C. indica, and C. ruderalis) [1], whereas an alternative viewpoint is that Cannabis is monotypic and that observable subpopulations represent subspecies of C. sativa [2]. The putative species C. ruderalis may represent feral populations of the other types or those adapted to northern regions.

The classification of Cannabis populations is confounded by many cultural factors, and tracing the history of a plant that has seen wide geographic dispersal and artificial selection by humans over thousands of years has proven difficult. Many hemp types have varietal names, while marijuana types lack an organized horticultural registration system and are referred to as strains. The draft genome and transcriptome of C. sativa were published in 2011 [3]; however, until now there has been no published investigation of Cannabis population structure using high-throughput genotyping methods. As both public opinion and legislation in many countries shift towards recognizing Cannabis as a plant of medical and agricultural value [4], the genetic characterization of marijuana and hemp becomes increasingly important for both clinical research and crop improvement efforts.

An important first step towards deeper evolutionary and functional analyses of Cannabis, including trait mapping and identification of functional genetic variation, is the characterization of the genetic structure of the genus. Here, we report the genotyping of a diverse collection of Cannabis germplasm and show that genetic differences between hemp and marijuana are not limited to genes involved in THC production, while the reported C. sativa and C. indica ancestries of marijuana strains only partially capture the main axes of marijuana’s genetic structure.
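To make the notion of genome-wide differentiation concrete, the following is a minimal sketch of how differentiation between two groups of samples could be summarized from SNP genotypes. It uses Hudson's FST estimator in its ratio-of-sums form, which is one common choice and is an assumption here, not necessarily the estimator used in this study. The function names (`allele_freqs`, `hudson_fst`) and the toy genotypes are hypothetical and are not data from the paper. Genotypes are assumed to be diploid, coded as 0, 1 or 2 copies of the alternate allele, with no missing calls.

```python
def allele_freqs(genotypes):
    """Per-SNP alternate-allele frequencies for rows of diploid 0/1/2 genotypes.

    Returns the list of frequencies and the number of sampled chromosomes.
    """
    n_chrom = 2 * len(genotypes)
    n_snps = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / n_chrom for j in range(n_snps)]
    return freqs, n_chrom


def hudson_fst(group_a, group_b):
    """Hudson's FST between two groups, as a ratio of sums across SNPs.

    Assumes at least two samples per group, no missing genotypes, and at
    least one SNP that varies (otherwise the denominator is zero).
    """
    p_a, n_a = allele_freqs(group_a)
    p_b, n_b = allele_freqs(group_b)
    num = 0.0
    den = 0.0
    for pa, pb in zip(p_a, p_b):
        # Squared frequency difference, corrected for finite sample size.
        num += (pa - pb) ** 2 - pa * (1 - pa) / (n_a - 1) - pb * (1 - pb) / (n_b - 1)
        # Probability that two alleles drawn from different groups differ.
        den += pa * (1 - pb) + pb * (1 - pa)
    return num / den


# Hypothetical toy genotypes (3 samples x 4 SNPs per group), for illustration only.
marijuana_like = [[2, 2, 0, 1], [2, 1, 0, 1], [2, 2, 0, 0]]
hemp_like = [[0, 0, 2, 1], [0, 1, 2, 1], [0, 0, 2, 2]]

print(hudson_fst(marijuana_like, hemp_like))
```

Values near 0 indicate little differentiation and values near 1 indicate nearly fixed differences; small negative values can occur by chance when the groups are very similar. Summing numerators and denominators across SNPs before dividing keeps low-frequency SNPs from dominating the estimate.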