It’s All About the Validated Library and Reporting Process...
...when you choose to sequence your microorganism (BacSeq or FunITS) and analyze the data with an optimized, manual cGMP-compliant process. Once the DNA sequence is analyzed, it is compared to a relevant, validated library of known organisms, and an identification report is generated and interpreted.
Interpretation of the sequence, library comparison and report generation are critical parts of the process. There are no interpretation rules that can be applied universally, because the species cutoff varies from genus to genus. Sometimes the species that comprise a particular genus are very closely related (i.e., low genetic distance), but in other genera the species are only distantly related (i.e., high genetic distance). This variation makes species interpretation using a single genetic distance cutoff impossible. One must first look at the genetic distance of the unknown to its closest match and then determine whether that distance is equal to or less than the average distance that separates the known species of that particular genus. Thus, the education and experience of our Microbial Phylogeneticists are crucial to assigning the correct identification.
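The genus-specific comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual interpretation procedure, which relies on expert judgment. The function name within_species_range and all distance values are hypothetical and chosen purely for demonstration.

```python
from statistics import mean
from typing import Sequence


def within_species_range(distance_to_closest: float,
                         interspecies_distances: Sequence[float]) -> bool:
    """Return True if the unknown is no farther from its closest match than
    the known species of the genus are, on average, from one another."""
    if not interspecies_distances:
        raise ValueError("need at least one interspecies distance for the genus")
    return distance_to_closest <= mean(interspecies_distances)


# Hypothetical percent differences between known species pairs in two genera.
closely_related_genus = [0.3, 0.5, 0.4]    # low genetic distance
distantly_related_genus = [2.0, 3.0, 2.5]  # high genetic distance

# Hypothetical distance of the unknown to its closest match.
unknown_distance = 0.8

print(within_species_range(unknown_distance, closely_related_genus))    # False
print(within_species_range(unknown_distance, distantly_related_genus))  # True
```

The same distance of 0.8% falls outside the species range in the closely related genus but inside it in the distantly related one, which is why a single universal cutoff does not work.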
The sequencing identification report illustrates an alignment of the unknown organism to its 10 closest matches in order of increasing genetic distance. The distance measurement is a comparison of one sequence to another and is expressed as a percentage of nucleotide differences between the two sequences. First, the sequences are aligned to minimize the absolute number of differences between the two sequences. Gaps may be introduced into one or both of the sequences in order to achieve the optimal alignment. Next, the sequences are compared at every nucleotide position (pairwise comparison) and the percentage difference is calculated.
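The pairwise comparison step can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and are the same length. How gaps are scored is not stated here, so the sketch assumes that a gap opposite a nucleotide counts as one difference and that columns with a gap in both sequences are skipped. The function name percent_difference and the example sequences are hypothetical.

```python
def percent_difference(aligned_a: str, aligned_b: str) -> float:
    """Percentage of compared positions at which two aligned sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    differences = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # assumption: a column that is a gap in both is ignored
        compared += 1
        if a != b:    # assumption: a gap against a nucleotide is a difference
            differences += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differences / compared


# Hypothetical aligned fragments: one substitution in ten positions.
print(percent_difference("ACGTACGTAC", "ACGTTCGTAC"))  # 10.0
```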
The organisms in the alignment are also represented in a phylogenetic tree. The phylogenetic tree graphically illustrates the relationship of not only the sample to the top 10 matches, but also the top 10 matches to one another. The phylogenetic tree, generated for the purpose of determining the identification of an isolate, is distance based. We use the Neighbor Joining (NJ) Tree for data interpretation. The first step in generating a tree is a pairwise alignment of all of the sequences and calculation of the genetic distance for each pair. The resulting data are stored in a distance matrix. Using the data from the distance matrix, the algorithm then determines the tree topology that best represents the pairwise distances between all combinations. The distance along the horizontal lines connecting any two organisms is a close approximation of the sequence difference between that pair. Both the distribution and the branching order indicate how organisms actually relate to one another and are equally important for the interpretation.
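To show how a tree can be derived from a distance matrix, here is a compact Python sketch of the standard neighbor-joining algorithm. It is a teaching example, not the software used to produce the report. The function name neighbor_joining, the internal node labels (node0, node1, ...) and the five-organism distance matrix are hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Build an unrooted tree from a symmetric distance matrix.

    Returns a list of (node, node, branch_length) edges.
    """
    # Copy the matrix into nested dicts keyed by node name.
    dist = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
            for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other active nodes.
        totals = {a: sum(dist[a][b] for b in nodes if b != a) for a in nodes}

        def q(pair):
            a, b = pair
            return (n - 2) * dist[a][b] - totals[a] - totals[b]

        # Join the pair with the lowest Q value.
        i, j = min(combinations(nodes, 2), key=q)
        d_ij = dist[i][j]
        len_i = 0.5 * d_ij + (totals[i] - totals[j]) / (2 * (n - 2))
        len_j = d_ij - len_i

        new = f"node{next_id}"
        next_id += 1
        edges.append((new, i, len_i))
        edges.append((new, j, len_j))

        # Distances from the new internal node to every remaining node.
        dist[new] = {new: 0.0}
        for k in nodes:
            if k in (i, j):
                continue
            d = 0.5 * (dist[i][k] + dist[j][k] - d_ij)
            dist[new][k] = d
            dist[k][new] = d
        nodes = [k for k in nodes if k not in (i, j)] + [new]

    # Connect the last two nodes directly.
    a, b = nodes
    edges.append((a, b, dist[a][b]))
    return edges


# Hypothetical pairwise distances between five organisms.
names = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for edge in neighbor_joining(names, matrix):
    print(edge)
```

For this example matrix the branch lengths leading to the five organisms come out as a = 2, b = 3, c = 4, d = 2 and e = 1, and the distance along the tree between any two organisms reproduces the corresponding entry in the matrix.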
Confidence Levels Made Simple
The phylogenetic tree is used, along with the alignment, to assign an identification and confidence level to the organism. The level of confidence for the identification is given not as a percentage but as the phylogenetic level to which we are certain the organism belongs: species, genus, family, order or no match. We assign the highest level of taxonomic information possible for each sample.
Identify with the Leader
Our experienced Microbial Phylogeneticists are very adept at deciphering potentially confusing data in their interpretation process. Because the phylogenetic classification of many organisms is currently incorrect, interpretation can become very complex and must take into consideration the genetic variability and branching order of a group of organisms. However, based on our experience identifying over 500,000 unknown microorganisms, Charles River has accumulated the knowledge and expertise to transform this information into routine identifications governed by cGMP-compliant quality processes.