Danny's DNA Discoveries
by Danny Miller, email@example.com
This project will attempt to document all of the mushroom species that occur in the Pacific Northwest based on DNA analysis funded by PSMS member Yi-Min Wang, the PSMS board and other entities fully acknowledged below. I define the PNW as Washington, Oregon, Idaho and southern British Columbia, although MycoMatch includes species only found in northern BC as well.
DOWNLOAD ALL MY FASTA DNA DATA AND MY TREES HERE
The above link has all of my data, but it may be hard to find what you are looking for. Typically, you will find one FASTA file per genus, but for large genera you may find one file per subgenus and for small genera, you might find one file per several genera, family, order or class. Every file is also available individually on the various pages linked from my big tree below. While this is a work in progress, I can't predict which version will be more up to date. Email me for the most up-to-date data, especially for genera whose links don't work yet.
Other free resources:
How to help out with this project:
If you have any information to add, or corrections to make, please contact me above!
Yi-Min Wang is providing ongoing funding through PSMS for DNA sequencing at a scale large enough to make serious progress toward this goal much more quickly than before. He is also coordinating with the public through social media groups to identify people who have collected valuable specimens and arrange to get them saved and sequenced. He uses the results to create valuable articles that educate people on how to better discriminate between difficult-to-identify species.
PSMS, the Puget Sound Mycological Society (http://www.psms.org), has funded the sequencing of all the collections saved at the NAMA (North American Mycological Association, http://www.namyco.org) foray near Mt. Rainier WA in 2014. PSMS and NAMA co-funded the sequencing of the collections saved at the NAMA foray near Salem OR in 2018. PSMS is also funding the sequencing of newer collections made at PSMS forays and events. Providing more than 1,000 sequences, these projects contributed greatly to our knowledge.
Stephen Russell, of Purdue University, the Hoosier Mushroom Society of Indiana and the University of Michigan, and co-founder of the North American MycoFlora Project (now FunDis), has done thousands of sequences for this project and others across the country at highly discounted rates. He has almost single-handedly developed the best techniques for using the new third-generation nanopore sequencing process, which has drastically reduced the cost of bulk sequencing, yet he continues to volunteer his time and expense to do thousands more sequences for us at or below even the lower costs of this process.
The Stuntz Foundation is funding over 400 sequences of important PNW material being sequenced by the new nanopore technology, once again through Stephen Russell.
The non-profit "FunDis" (http://www.fundis.org) was founded by members of the continent-wide "North American Mycological Association" of amateur mycologists partnered with the "Mycological Society of America" (the association of professional mycologists) to create an organization that could help all of the dozens of mushroom clubs across the continent study their local mushrooms. Specifically, they help educate clubs about how to do DNA sequencing projects and try to provide low cost services and funding assistance so that each club doesn't have to figure out how to do it on their own. They also try to ensure that all of the experts in the various groups of mushrooms have access to the data discovered by every club on the continent to make sure that the information learned gets disseminated and written up in papers for the greater good.
Matt Gordon is doing the very important work of sequencing our "type" collections (see definitions below), so we actually know what our mushrooms are and have something to compare future collections to. I consider this the very important first step of any genetic work that is done, upon which everything reliable will be based.
Ian Gibson created and curates the free PNW mushroom identification program “MycoMatch (MatchMaker)” (http://www.mycomatch.com) of which I am a co-author, which contains records of all known species found in the PNW. It also contains my pictorial key, referenced above. The results of these studies will be making their way into these programs.
What follows are definitions of the terms I will be using throughout the article when discussing how confident I am that a certain specimen is really a certain species or not, as well as a bit of background on how one identifies a mushroom after getting a DNA sequence.
The DNA of a mushroom is something on the order of 50 million base pairs (called nucleotides) long. That’s much shorter than human DNA, which has about 3 billion base pairs. It’s still too expensive to sequence (and unwieldy to analyze) that much DNA, so the scientific community has agreed on some short genes that can be sequenced that can more easily provide a good deal of valuable information. One of the most popular regions to sequence, called ITS (Internal Transcribed Spacer), is only about 700 nucleotides long, which is quite manageable. It is sometimes called a “junk” area, meaning it doesn’t really do anything (or at least its integrity is not as important to the survival of the organism as other critical areas). It is thought that it is allowed to mutate more quickly than other genes, as a mutation in ITS has little effect on the organism. Mutations in other critical areas might kill the organism, so those areas resist mutation. With ITS free to mutate as much as it wants, you can see even small differences between individuals, making it a great choice for telling whether your two specimens are the same species or not. It is not a good region to determine how two organisms are related from a larger perspective (same genus or family), only for determining the more subtle differences that differentiate two species, which is what I am mostly trying to accomplish. ITS has been nicknamed the “barcode” area because you can imagine scanning a mushroom to read its ITS DNA and comparing it to a database so that this hypothetical barcode scanner could then tell you what mushroom you have, like produce in the grocery store.
The biggest, most loaded question of all time is how much DNA difference does there have to be for something to be considered a different species? There is no good answer to that, and there may never be. When comparing ITS regions, although some people say that a difference of 3% probably indicates a different species, it very often turns out that a difference of 0.5% is enough to indicate a different species. That means 4 characters out of the 700 characters of an ITS sequence being different might be enough. 10 differences almost certainly does. Sometimes two mushrooms might have the exact same ITS DNA and still be different species. This can happen if other regions mutated faster than ITS did, even though that is not typically going to be the case. ITS is divided into 2 parts, ITS1 (up to 300 or so characters) and ITS2 (up to 400 or so characters). Ideally, we sequence both parts, but we often only have ITS2 data available for our local Russulas. Any more than 2 differences in ITS2 means the DNA is >0.5% different. So as I make judgment calls below as to whether or not our local species are unique, I will call out any species with more than 2 ITS2 differences as potentially being different.
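The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the method used for this project: the function names and the toy sequences are made up, the two sequences are assumed to be already aligned to the same length, and a gap is simply counted as one difference.

```python
# Illustrative sketch only; all names and the toy data are hypothetical.

def count_differences(seq_a, seq_b):
    """Count positions where two already-aligned sequences disagree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_difference(seq_a, seq_b):
    """Differences as a percentage of the aligned length."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    return 100.0 * count_differences(seq_a, seq_b) / len(seq_a)


def potentially_different(seq_a, seq_b, threshold_percent=0.5):
    """True when the difference exceeds the 0.5% rule of thumb."""
    return percent_difference(seq_a, seq_b) > threshold_percent


def substitute(seq, positions):
    """Return a copy of seq with a different base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    bases = list(seq)
    for i in positions:
        bases[i] = swap[bases[i]]
    return "".join(bases)


# A made-up 400-character stand-in for an ITS2 region.
reference = "ACGT" * 100
two_changes = substitute(reference, [10, 200])
three_changes = substitute(reference, [10, 200, 399])

print(percent_difference(reference, two_changes))        # 2 of 400 positions
print(potentially_different(reference, two_changes))     # exactly 0.5% does not exceed the threshold
print(potentially_different(reference, three_changes))   # 3 of 400 positions does
```

With a 400-character ITS2 region, two differences land exactly on 0.5% and a third pushes past it, which is why "more than 2" is the cutoff used for ITS2 here.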
In other words, identical DNA does not mean your species is the same, and vastly different DNA does not necessarily mean that your species is different.
Like humans, mushrooms are diploid organisms with two complete distinct sets of DNA. We may get a sequence of the first version, or “allele”, the second version, or both, with more than one possibility for a nucleotide at several locations. Thus, apparent differences between two sequences may be explained by different alleles of the same species. We may need to analyze more sequences of the mushroom to determine all the places where there may be more than one valid choice of nucleotide.
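As a rough illustration, a position with more than one valid nucleotide is often written with a standard IUPAC ambiguity code, for example R for "A or G". Assuming the sequences being compared use those codes (an assumption here, as is every function name and toy sequence below), a comparison that ignores differences explainable by alleles might look like this:

```python
# Illustrative sketch only; assumes aligned sequences written with
# standard IUPAC nucleotide codes. Names and data are hypothetical.

IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}


def compatible(base_a, base_b):
    """True if the two codes share at least one possible nucleotide."""
    return bool(IUPAC[base_a.upper()] & IUPAC[base_b.upper()])


def unexplained_differences(seq_a, seq_b):
    """Positions where no shared nucleotide could explain the mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if not compatible(a, b)]


allele_one = "ACGTAC"   # made-up read of the first allele
allele_two = "ACATAC"   # made-up read of the second allele
mixed_read = "ACRTAC"   # made-up read of both, with R meaning A or G

print(unexplained_differences(allele_one, allele_two))  # the two alleles differ at index 2
print(unexplained_differences(allele_one, mixed_read))  # R is compatible with G
print(unexplained_differences(allele_two, mixed_read))  # R is compatible with A
```

The two single-allele reads look different from each other, but each is consistent with the mixed read, which is the situation described above.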
It’s important to remember that DNA is not a magic bullet but only one tool to be used with other, more conventional research tools. DNA results are not comprehensive when only one short region is sequenced. Obtaining an accurate picture of how a group of sequences relate to each other may require sequences of six or more different genes. But even if you could look at the entire genome of 50,000,000 nucleotides, that won't answer all the questions. If there are differences in the way the mushroom looks, or microscopic or ecological differences, a few differences in DNA may be another clue that our mushroom is a unique species. But if there are no other differences, a few differences in DNA may not be significant. Conversely, if there are ecological, morphological and microscopic differences, but no DNA differences in certain genes, it may still be a separate species. You might have to sequence the entire genome to find the DNA differences. Ultimately, it’s a matter of opinion, and the more information we have, the better we can make an informed opinion.
As you might expect, there is an internet database that most people use to store their DNA sequences, called GenBank, so you can compare your sequence to this giant database to get an idea of what it might be. However, this is not very useful at all, as it turns out most GenBank entries are identified incorrectly, with the wrong mushroom name. Most of them! This always surprises people to learn. GenBank might be able to tell you what parts of the world your DNA was found in (without being able to identify it) but unfortunately, it often can’t even do that. Until recently, it has not been common for people to record in GenBank where their mushroom was found. This is a huge oversight that just goes to show how new and imperfect our technology and techniques are.
Only a small percentage of species have their official “type” specimen sequenced, which is a definitive way of knowing how your mushroom compares to the official “real” thing. One of the most important pieces of work being done is to sequence as many types as possible. This is a necessary first step before we’ll be able to make any definite conclusions on a large scale. But many of them are hundreds of years old or don’t exist anymore, so people will have to designate new types, or “neotypes” and make their best guess as to what the original mushroom was. From then on, the neotype will be the official specimen, and it will have to be forever assumed, rightly or wrongly, that it is the same as the original type.
So then, how do you figure out what your mushroom is? It is not easy. You have to look at every part of the world where your sequence is found, and every sequence with an identification of the species you think you have, all over the world. You will often find that a half dozen or so vastly different DNA sequences have come out of mushrooms that people thought were the same thing. At most one of them can be right! If every recorded specimen from the type area has the same DNA, and there are no specimens that look like it with different DNA, you might have found a reliable sequence of that species. This is only the barest of overviews of this process. For more information, please read my papers and watch my videos at the top left of this page under "Other free resources".
Any results from a single gene region, like ITS that I use, can only be considered preliminary. Definitive answers of whether or not our local species are the same or different from other species around the world must wait until more gene regions are sequenced and non-genetic morphological and ecological studies are done as well to corroborate what we find here.
You can download all of my sequences, plus rudimentary trees created from those sequences, but do not read too much into the trees, as they are full of inaccuracies. ITS DNA cannot show accurate relationships between anything but closely related species.
A few other terms I use:
Russula cf emetica – cf is “confer” in Latin, meaning “compare”. I use this term to mean the mushroom looks like R. emetica, but might not be. It makes no judgement as to whether it is genetically related to R. emetica, only that it looks like it.
Russula aff emetica – aff is “affinis” in Latin, meaning it has an affinity to it. I use this term to mean the mushroom is very closely genetically related to R. emetica, but may or may not be close enough to actually be R. emetica. There is a distinct possibility that it will turn out to be a different species in need of its own new name.
Russula 'xerampelina' - if I put single quotes around a name, it means that is the name we've been using for the mushroom, but it may not be correct, for one of the above reasons. In other words, it would be more correct to call it Russula cf xerampelina or Russula aff xerampelina, depending on whether or not it is actually closely related or just looks similar.
The species epithet has to agree with the gender of the genus if it is an adjective, so when a mushroom changes genus, it may require a slight spelling change to the end of the species epithet, such as Gymnopus peronatus becoming Collybiopsis peronata.
When I talk about a clade of mushrooms, I mean a group of species that are all each other's closest relatives. Other mushrooms may look just like the mushrooms in a clade, but be only distantly related, so they don’t count. Closely related mushrooms may share the same environmental benefits, health benefits and poisons, so it’s important to know how closely mushrooms are related to each other, not simply which look the same to the untrained eye. These articles will include mushrooms that there is genetic evidence for. We no doubt have additional species that I will not be mentioning, but that are either rare or have not been part of a genetic study. If you think you have found a specimen of any of the species that I say we need more information about, or anything that I haven't mentioned, please take good pictures and save it, and contact me at firstname.lastname@example.org.
When I talk about a group of mushrooms, I might mean mushrooms that look the same, even though it's possible they may not be all closely related. A group is not as specific as a clade.
When I talk about comparing DNA, I am specifically talking about comparing the short ITS regions of DNA unless I specifically say otherwise.
Monophyletic - you can find a node in the binary tree where every mushroom with that name (at whatever level - species, genus, family, sub-order, order, class, etc.) is past that node, and nothing with a different name is past that node. For instance, the genus Amanita is a monophyletic genus if there is a node past which everything is called Amanita, and no Amanita occur anywhere else in the tree of life. This is a requirement for considering that mushrooms have been properly named without controversy. But another requirement is that doing so does not keep any other name at the same level from being monophyletic (see paraphyletic).
Paraphyletic - Almost monophyletic, but there are one or more nodes inside your tree that contain names that are not the same. For instance, inside the tree for the genus Leucoagaricus, there is a node past which are found all of Leucocoprinus. Leucocoprinus is monophyletic, but it is keeping Leucoagaricus from being monophyletic because it is "inside" it instead of "beside" it where it wouldn't affect it at all. Leucocoprinus is rendering Leucoagaricus paraphyletic - there are one or more branches you have to prune from Leucoagaricus to make it monophyletic. Sticking to the rule that nothing be paraphyletic can be problematic because all of Leucocoprinus would have to be renamed to Leucoagaricus, and you won't have a good name for those very distinct mushrooms, and all just because they evolved at an inconvenient time from a Leucoagaricus ancestor instead of from an extinct ancestor of all Leucoagaricus.
Polyphyletic - species with that level name are found past multiple nodes of the tree. A node that had all of the species with that name past it would also include other names that couldn't be pruned away. Species that are only distantly related have been incorrectly given the same name, probably because they superficially looked the same. This is never allowed.
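The "find a node" test for monophyly can be sketched in code. This is only an illustration under several assumptions: the tree is written as nested Python tuples with a name at each tip, the function names are hypothetical, and the toy tree is made up to mirror the Leucoagaricus example rather than taken from real data. It checks monophyly and lists which other names sit inside the smallest clade; it does not try to decide between paraphyletic and polyphyletic, which takes more judgement.

```python
# Illustrative sketch only; tree shape and all names are hypothetical.

def tip_names(tree):
    """List the names at every tip past this node."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(tip_names(child))
    return names


def smallest_clade_containing(tree, name):
    """Return the smallest subtree past which every tip with `name` is found."""
    total = tip_names(tree).count(name)
    if total == 0:
        return None
    node = tree
    while not isinstance(node, str):
        # Step down while a single child still holds every tip with the name.
        for child in node:
            if tip_names(child).count(name) == total:
                node = child
                break
        else:
            break  # the tips are split across children, so this is the node
    return node


def is_monophyletic(tree, name):
    """True if nothing with a different name is past the name's node."""
    clade = smallest_clade_containing(tree, name)
    return clade is not None and set(tip_names(clade)) == {name}


def intruders(tree, name):
    """Other names found past the node that holds every tip with `name`."""
    clade = smallest_clade_containing(tree, name)
    if clade is None:
        return []
    return sorted(set(tip_names(clade)) - {name})


# Made-up tree: Leucocoprinus sits inside Leucoagaricus, Amanita beside both.
toy_tree = ("Amanita",
            ("Leucoagaricus",
             ("Leucoagaricus",
              ("Leucocoprinus", "Leucocoprinus"))))

print(is_monophyletic(toy_tree, "Leucocoprinus"))
print(is_monophyletic(toy_tree, "Leucoagaricus"))
print(intruders(toy_tree, "Leucoagaricus"))
```

In this toy tree Leucocoprinus passes the test, while Leucoagaricus fails it because Leucocoprinus is "inside" it, matching the paraphyletic situation described above.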
Russula s.l. (sensu lato) - meaning "in the wide sense" refers to all mushrooms once or currently called Russula including some that don't really belong there based on DNA evidence. Expect some of them to get a new genus name.
Russula s.s. (sensu stricto) - meaning "in the narrow sense" refers to only those Russula mushrooms that form a monophyletic clade together and properly belong in Russula. None of them should need a genus name change.
Basal clade - one of the earliest branches from a certain node in a tree. It's tempting to say it is the "oldest" or "most primitive" group of species, but that's not entirely correct. The extant species in the basal branch are around today, and may have evolved recently from other similar species, just like every other species on a supposedly "more evolved" branch. But there is a more direct line from the species in a basal clade back to the node that is the common ancestor of every species past that node, and the species in the basal clade may be more likely to share traits with the ancient species from which everything past that node evolved.
Ancestral trait - when some species past a node have one character yet others have a different one, the ancestral trait is the one that the (probably extinct) species back at the node had, implying that the species past the node with the different character evolved a change.
Derived trait - the other character, the one that wasn't true of the ancestor back at the node. When deciding which of two traits is ancestral or derived, we usually assume the fewest number of required mutations back and forth to explain the tree.
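One standard way to apply the "fewest required mutations" idea is Fitch's parsimony algorithm, sketched below. This is a swapped-in textbook technique, not necessarily how any tree on this site was analyzed. It assumes a strictly binary tree written as nested tuples, a single trait recorded at each tip, and equal cost for every change; the function name and the spore colour data are made up.

```python
# Illustrative sketch only; Fitch's bottom-up pass on a hypothetical tree.

def fitch(tree):
    """Return (possible ancestral states at this node, minimum changes)."""
    if isinstance(tree, str):
        return {tree}, 0              # a tip: its observed trait, no changes
    left, right = tree                # assumes exactly two branches per node
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    shared = left_states & right_states
    if shared:
        # Both branches can agree, so no extra change is needed here.
        return shared, left_changes + right_changes
    # The branches disagree: keep every option and count one change.
    return left_states | right_states, left_changes + right_changes + 1


# Made-up spore colours at the tips of a four-species tree.
toy_tree = (("white", "white"), ("white", "brown"))

root_states, changes = fitch(toy_tree)
print(root_states, changes)
```

For this toy tree the only state left at the root is white, with one change needed, so white would be treated as the ancestral trait and brown as the derived one. When the root set keeps more than one state, the data alone do not settle which trait is ancestral.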
And remember, when I talk about the taste of a mushroom for identification purposes, some people are comfortable tasting a small piece for 30 seconds and then spitting it out, if they are sure it is not a dangerous species. Do not swallow.
The arrangement of the tree that follows is still somewhat speculative, but based in part on the latest multi-gene studies, starting with this great site which shows what we know from the relatively few species that have had their entire genome sequenced. Otherwise, I try to go by the study that uses the most genes. The tree reads from basal to crown, top to bottom.
Eventually, there will be a report for every genus. Reports that have been written so far can be clicked on; their links begin with a • and are underlined. You can expand and contract branches of the tree by clicking on an entry with an arrow. Some interesting features that clades have evolved are noted in bold, like spore colour.