Skip Navigation

Journal of Computational Biology

Not a subscriber? Get started...

Efficiently Computing the Robinson-Foulds Metric

To cite this article:
Nicholas D. Pattengale, Eric J. Gottlieb, and Bernard M.E. Moret. Journal of Computational Biology. July 2007, 14(6): 724-735. doi:10.1089/cmb.2007.R012.

Published in Volume: 14 Issue 6: August 12, 2007

Author information

Nicholas D. Pattengale
Department of Computer Science, University of New Mexico, Albuquerque, New Mexico.
Eric J. Gottlieb
Department of Computer Science, University of New Mexico, Albuquerque, New Mexico.
Bernard M.E. Moret
Department of Computer Science, University of New Mexico, Albuquerque, New Mexico.
School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne and Swiss Institute for Bioinformatics, Lausanne, Switzerland.

ABSTRACT

The Robinson-Foulds (RF) metric is the measure most widely used in comparing phylogenetic trees; it can be computed in linear time using Day's algorithm. When faced with the need to compare large numbers of large trees, however, even linear time becomes prohibitive. We present a randomized approximation scheme that provides, in sublinear time and with high probability, a (1 + ɛ) approximation of the true RF metric. Our approach is to use a sublinear-space embedding of the trees, combined with an application of the Johnson-Lindenstrauss lemma to approximate vector norms very rapidly. We complement our algorithm by presenting an efficient embedding procedure, thereby resolving an open issue from the preliminary version of this paper. We have also improved the performance of Day's (exact) algorithm in practice by using techniques discovered while implementing our approximation scheme. Indeed, we give a unified framework for edge-based tree algorithms in which implementation tradeoffs are clear. Finally, we present detailed experimental results illustrating the precision and running-time tradeoffs as well as demonstrating the speed of our approach. Our new implementation, FastRF, is available as an open-source tool for phylogenetic analysis.

Free first page

This paper was cited by:

Phylogenetic Search Algorithms for Maximum Likelihood
Alexandros Stamatakis
. Jan 2011: 547-577
CrossRef
How Many Bootstrap Replicates Are Necessary?
Nicholas D. Pattengale, Masoud Alipour, Olaf R.P. Bininda-Emonds, Bernard M.E. Moret, Alexandros Stamatakis
Journal of Computational Biology. Mar 2010, Vol. 17, No. 3: 337-354
Abstract | Full Text PDF | Reprints | Permissions
Phylogenetic inference under recombination using Bayesian stochastic topology selection
A. Webb, J. M. Hancock, C. C. Holmes
Bioinformatics. Jan 2009, Vol. 25, No. 2: 197-203
CrossRef

Users who read this article also read

full access
Mehmet Koyutürk, Wojciech Szpankowski, Ananth Grama
Journal of Computational Biology. July 2007: 747-764.
Abstract | Full Text PDF | Reprints | Permissions
full access
Cedric Chauve, Jean-Philippe Doyon, Nadia El-Mabrouk
Journal of Computational Biology. October 2008: 1043-1062.
Abstract | Full Text PDF | Reprints | Permissions
full access
Buhm Han, Banu Dost, Vineet Bafna, Shaojie Zhang
Journal of Computational Biology. June 2008: 489-504.
Abstract | Full Text PDF | Reprints | Permissions
full access
Catherine Eng, Charu Asthana, Bertrand Aigle, Sébastien Hergalant, Jean-François Mari, Pierre Leblond
Journal of Computational Biology. September 2009: 1211-1225.
Abstract | Full Text PDF | Supplementary Material | Reprints | Permissions
full access
Lucinda K. Southworth, Stuart K. Kim, Art B. Owen
Journal of Computational Biology. April 2009: 625-638.
Abstract | Full Text PDF | Reprints | Permissions
full access
Cuong Than, Derek Ruths, Hideki Innan, Luay Nakhleh
Journal of Computational Biology. May 2007: 517-535.
Abstract | Full Text PDF | Reprints | Permissions

Sign up for TOC Alerts


Publication Tools

  • Related articles in Liebert Online

Search:

for

Authors:

Keywords:

Go to Advanced Search