Relating Microbial Gene Sequences to Environmental Factors: Cutting Through the Trees

Microbial source tracking (MST) has relied on the construction of complex evolutionary models to describe gene survey data, which results in a variety of phylogenetic trees. Creating meaningful trees requires expertise on the part of the researcher, and extracting the relevant information from them is often intricate and frustrating. Moreover, it has been difficult to relate environmental and geochemical factors to phylogenies. We contend that it is time for the forest to be thinned out, or for trees to be chopped down altogether, if we want to address biogeochemical problems with gene sequence information.

Using a variety of large sequence databases that were constructed with relevant geochemical/ecological data, this project applied alignment-independent methods to reveal significant correlations not seen by using phylogenies. In addition, by using the word frequencies of short sequence fragments, this project was able to implement discriminant analysis routines to find variables that effectively identified microbial communities by environmental type (e.g., microbial source tracking). One advantage of these methods is that individual- and community-level correlations to environmental factors were differentiated and cross-validated without complex evolutionary models and assumptions. The implication of these findings is that geochemical similarities of unknown microbial environments may be inferred directly from sequence libraries.

(microbial source tracking); (MST); (bacterial source tracking); (BST)
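As a rough illustration of the alignment-independent idea described in the abstract, the sketch below counts short-word (k-mer) frequencies in each sequence and assigns an unknown sequence to the environment whose average profile is closest. It is a minimal sketch under stated assumptions, not the project's actual procedure: the word length k = 3, the toy sequences, the "soil" and "marine" labels, and every function name are invented for illustration, and a nearest-centroid rule stands in for the discriminant analysis routines mentioned above.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=3):
    """Relative frequency of each DNA word of length k, in a fixed order.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def centroid(profiles):
    """Element-wise mean of several equal-length profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def classify(profile, centroids):
    """Return the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


# Hypothetical training data: invented sequences grouped by invented labels.
training = {
    "soil": ["GCGCGGCCGCGGGCCGCG", "GGCCGCGCGGCGCCGGGC"],
    "marine": ["ATATTAAATTATAATATT", "TTAATATAAATTTATAAT"],
}

centroids = {
    label: centroid([kmer_profile(s) for s in seqs])
    for label, seqs in training.items()
}

# An unlabeled (also invented) sequence is assigned without any alignment or tree.
predicted = classify(kmer_profile("GCGGCCGCGGGCGCCGCG"), centroids)
print(predicted)  # soil
```

No alignment or phylogenetic tree is built at any point; each sequence is reduced to a fixed-length numeric vector, which is what makes standard multivariate routines applicable. In practice the profiles would be paired with measured geochemical variables and the classifier cross-validated, as the abstract describes.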

