
Efficient algorithms for phylogenetic post-analysis


Please use this identifier to cite or link to this item: http://hdl.handle.net/1928/10898


dc.contributor.author Pattengale, Nicholas
dc.date.accessioned 2010-06-28T22:25:41Z
dc.date.available 2010-06-28T22:25:41Z
dc.date.issued 2010-06-28T22:25:41Z
dc.date.submitted May 2010
dc.identifier.uri http://hdl.handle.net/1928/10898
dc.description.abstract A variety of tasks are typically performed after a phylogenetic reconstruction proper; these tasks fall under the category of phylogenetic post-analysis. In this dissertation, we present novel approaches and efficient algorithms for three post-analysis tasks: computing distances between trees (typically, between all pairs in a set of trees), bootstrapping, and building consensus trees. For instance, reconstruction often finds multiple plausible trees. One basic way of addressing this situation is to compute distances between pairs of trees, in order to understand the extent to which the trees disagree. The most frequently used measure of the distance between a pair of trees is the Robinson-Foulds metric, a natural dissimilarity measure between a pair of phylogenetic trees. We present a novel family of algorithms for efficiently computing the Robinson-Foulds metric. Bootstrapping is a post-analysis technique for assigning support values to tree edges, and is often used for assessing the extent to which the underlying data (e.g., molecular sequences) supports a reconstructed tree. The basis of the approach is to reconstruct many trees, called replicates, based on random subsampling of the original data. However, to date, the question of how many bootstrap replicates to generate has received little treatment in phylogenetics. We propose bootstopping criteria, which are designed to provide on-the-fly (i.e., runtime) guidance for determining when enough bootstrap replicates have been reconstructed. Another common post-analysis task is to build a consensus tree, a summary tree that attempts to capture the information agreed upon by bootstrap replicates. Unfortunately, the most popular consensus methods are susceptible to confusion by rogue taxa, i.e., taxa that cannot be placed with assurance anywhere within the tree. We present novel theory and efficient algorithms to identify rogue taxa, as well as a novel technique for interpreting the results (in the context of bootstrapping). en_US
dc.language.iso en_US en_US
dc.subject computational biology en_US
dc.subject algorithms en_US
dc.subject.lcsh Cladistic analysis--Data processing.
dc.subject.lcsh Decision trees.
dc.subject.lcsh Branching processes.
dc.subject.lcsh Computational biology.
dc.title Efficient algorithms for phylogenetic post-analysis en_US
dc.type Dissertation en_US
dc.description.degree Computer Science en_US
dc.description.level Doctoral en_US
dc.description.department University of New Mexico. Dept. of Computer Science en_US
dc.description.advisor Moret, Bernard
dc.description.advisor Saia, Jared
dc.description.committee-member Stamatakis, Alexandros
dc.description.committee-member Moore, Cristopher
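
Illustrative note on the Robinson-Foulds metric named in the abstract: the sketch below shows the textbook definition only, counting the bipartitions (splits) that appear in exactly one of two trees. It is not the family of algorithms presented in the dissertation. The nested-tuple tree encoding and the function names (leaves, splits, rf_distance) are assumptions made for this example.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names at the leaves of the tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial bipartitions induced by the edges of the tree.

    Each bipartition is stored as the side that does NOT contain a fixed
    anchor taxon, so the same split always gets the same representation.
    """
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if anchor not in below else all_taxa - below
        # Skip trivial splits (a single leaf against the rest, or the empty split at the root).
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(t1: Tree, t2: Tree) -> int:
    """Naive Robinson-Foulds distance: size of the symmetric difference of the split sets."""
    if leaves(t1) != leaves(t2):
        raise ValueError("both trees must be defined on the same set of taxa")
    return len(splits(t1) ^ splits(t2))


tree_a = (("A", "B"), "C", ("D", "E"))
tree_b = (("A", "C"), "B", ("D", "E"))
print(rf_distance(tree_a, tree_a), rf_distance(tree_a, tree_b))
```

The two example trees share the split separating D and E from the other taxa and disagree on one split each. Some authors halve this count; the sketch reports the unnormalized symmetric difference.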

Files in this item

Files Size Format View
pattengale.pdf 1.052Mb PDF View/Open

