As more states find legal ways for millions of potheads to consume marijuana, cannabis labs have more work to do.
Of course, they quantify how much THC, CBD, and other chemical constituents a particular strain contains, but any remotely dedicated user can tell you that a strain is about more than potency. There are nearly 400 different strains, each with its own smell, taste, effects, and medicinal properties. The interaction of thousands of different chemicals in a plant creates whatever distinction exists between Bubba Kush and, say, Sour Diesel. By comparison, Baskin-Robbins has only 31 flavors.
The first genome sequence of marijuana was published in 2011 by a group of Canadian scientists. Since then, the understanding of cannabis has advanced by leaps and bounds. However, because marijuana remains a Schedule I drug in the United States, most scientists are working with limited weed samples, and chemical profiles mostly come from the University of Mississippi and the National Institute on Drug Abuse.
But there are also a few commercial cannabis labs that test and analyze recreational and medical marijuana to assist the growing marijuana industry. Some of them have no trouble accessing all those chemical profiles. The key to modern agriculture, and at the same time its biggest challenge, is determining the plant's genotype. Knowing the genes behind desirable traits, researchers can breed plants with those traits with maximum precision and speed. This is called marker-assisted selection.
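As a rough illustration of marker-assisted selection, the sketch below screens seedlings by their genotype at a handful of markers instead of waiting for the trait to show up in a grown plant. The marker names, alleles, and seedling data are made up for the example, and real breeding programs use far larger marker panels and statistical models.

```python
from typing import Dict, List

# Hypothetical marker panel: marker name -> allele linked to the desired trait.
DESIRED_ALLELES: Dict[str, str] = {
    "marker_A": "G",
    "marker_B": "T",
}

# Made-up genotype calls for three seedlings (two alleles per marker).
SEEDLINGS: Dict[str, Dict[str, str]] = {
    "seedling_1": {"marker_A": "GG", "marker_B": "TT"},
    "seedling_2": {"marker_A": "GA", "marker_B": "TT"},
    "seedling_3": {"marker_A": "GG", "marker_B": "CT"},
}


def select_seedlings(genotypes: Dict[str, Dict[str, str]],
                     desired: Dict[str, str]) -> List[str]:
    """Return IDs of seedlings homozygous for the desired allele at every marker."""
    selected = []
    for seedling_id, calls in genotypes.items():
        # A seedling passes only if both copies carry the desired allele.
        if all(calls.get(marker) == allele * 2
               for marker, allele in desired.items()):
            selected.append(seedling_id)
    return selected


if __name__ == "__main__":
    print(select_seedlings(SEEDLINGS, DESIRED_ALLELES))
```

Requiring both copies of each marker to match is a deliberately strict choice for the sketch; a breeder might instead keep heterozygous plants for further crossing.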
The first attempts to sequence the marijuana genome yielded a huge number of tiny fragments, so many that nobody could stitch them together. Plant genomes are tricky to sequence, and cannabis DNA is particularly challenging.
In 2014, a marijuana analytics company called Steep Hill spent $1.1 million on a PacBio RS II long-read sequencing machine, a giant white box that reads pieces of DNA as long as 53,000 base pairs. This sequencing technology has already been used in over 850 scientific publications in a variety of research areas, including animal genomics, human biomedical research, and microbiology. At Steep Hill, the sequencer has already led to an exciting breakthrough: what is believed to be the first genome sequence of an isolated male cannabis plant.
According to weed scientists, putting together a good sequence (a reference genome) requires a sample that is homozygous, meaning that its two sets of chromosomes match. Without a solid, inbred strain, assembling a reference genome is not impossible, but it is very difficult. It is also expensive: sequencing the maize genome, for example, took about 33 labs, 160 researchers, $30 million, and four years.
It is all about profit. If someone invests in a solid reference genome, everyone else's sequences increase in value for free. If Steep Hill comes up with a better sequence using its PacBio machine, it automatically makes its competitors more powerful.
There is actually no need to have a full genome to find genetic markers: by looking at point mutations called single nucleotide polymorphisms (SNPs), cannabis scientists can construct a rough evolutionary tree for the plant. In practice, SNPs can distinguish one strain from another, helping weed labs authenticate a strain and fight counterfeits. For now, no one can patent an illegal product.
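To make the idea of telling strains apart by SNPs concrete, here is a minimal sketch that counts the sites at which two SNP profiles differ. The sample names and base calls are invented for illustration, and the simple mismatch count stands in for the more careful statistics a real lab would use.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Made-up SNP profiles: one base per SNP site, in the same site order per sample.
PROFILES: Dict[str, str] = {
    "sample_1": "ACGTTAGC",
    "sample_2": "ACGTTAGC",
    "sample_3": "ATGTCAGG",
}


def snp_distance(a: str, b: str) -> int:
    """Count the SNP sites at which two profiles carry different bases."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same SNP sites")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(profiles: Dict[str, str]) -> List[Tuple[str, str, int]]:
    """Return (name, name, distance) for every pair of samples, sorted by name."""
    return [(n1, n2, snp_distance(profiles[n1], profiles[n2]))
            for n1, n2 in combinations(sorted(profiles), 2)]


if __name__ == "__main__":
    for name_1, name_2, dist in pairwise_distances(PROFILES):
        print(name_1, name_2, dist)
```

A distance of zero suggests two samples may be the same strain, while larger distances point to more distant relatives. A table of such pairwise distances is also the usual starting point for building a rough evolutionary tree.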
But the situation could change dramatically. Somewhere, someone is trying to crack the code, and if they manage to do so, it will fundamentally change the $40 billion industry.
Today, there are so many strains that pot users can choose the exact high they want and the medication they need. Some biologists claim that in the near future, by changing the balance of terpenoids and cannabinoids, they will be able to calibrate the high and even the medicinal properties. Unfortunately, research on the neurochemistry of cannabis is still as far behind as the genetics.