Newick

A Python module for parsing trees in the Newick file format.

Usage

The module contains a Python parser for the Newick tree format. It reads trees such as:

(
  ('Chimp' 1 : 0.052,
   'Human' 1 : 0.042) 0.71 : 0.007,
  'Gorilla' 1 : 0.060,
  ('Gibbon' 1 : 0.124,
   'Orangutan' 1 : 0.0971) 1 : 0.038
);

where parentheses denote sub-trees and each edge is annotated "bootstrap-value : length"; both the bootstrap value and the length are optional.

Leaf labels (identifiers) must be quoted (using '...') if they contain spaces; otherwise quoting is optional. Empty labels can be specified either as '' or by simply omitting the identifier, as in "(,(,,),);".
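To make the format rules above concrete, here is a minimal, self-contained sketch of a parser for this notation. It is not the implementation used by this module: the names parse_newick and leaf_labels are hypothetical, and the sketch assumes that a bare word in leaf position is always the label (even if it looks like a number) and that any bare word following a label or sub-tree is the bootstrap value.

```python
import re
from collections import deque

# A token is a quoted label, a punctuation mark, or a bare word
# (an unquoted label or a number).
TOKEN = re.compile(r"'[^']*'|[(),:;]|[^\s(),:;']+")
PUNCT = set("(),:;")


def parse_newick(text):
    """Parse one tree into nested (node, bootstrap, length) tuples.

    node is a label string for a leaf, or a list of child edges for a
    sub-tree. Hypothetical helper, written for illustration only.
    """
    tokens = deque(TOKEN.findall(text))

    def peek():
        return tokens[0] if tokens else None

    def take():
        return tokens.popleft() if tokens else None

    def expect(tok):
        got = take()
        if got != tok:
            raise ValueError("expected %r, got %r" % (tok, got))

    def number():
        tok = take()
        try:
            return float(tok)
        except (TypeError, ValueError):
            raise ValueError("expected a number, got %r" % (tok,))

    def edge():
        if peek() == "(":
            # Sub-tree: a comma-separated list of edges in parentheses.
            take()
            node = [edge()]
            while peek() == ",":
                take()
                node.append(edge())
            expect(")")
        elif peek() is not None and peek() not in PUNCT:
            # Leaf label, with surrounding quotes stripped if present.
            tok = take()
            node = tok[1:-1] if tok.startswith("'") else tok
        else:
            node = ""  # identifier left out: empty label
        # Optional "bootstrap : length" annotation; either part may be absent.
        bootstrap = length = None
        if peek() is not None and peek() not in PUNCT:
            bootstrap = number()
        if peek() == ":":
            take()
            length = number()
        return (node, bootstrap, length)

    root = edge()
    expect(";")
    return root


def leaf_labels(edge):
    """Collect leaf labels from left to right."""
    node = edge[0]
    if isinstance(node, list):
        return [name for child in node for name in leaf_labels(child)]
    return [node]
```

With these definitions, leaf_labels(parse_newick("(,(,,),);")) yields five empty strings, and a quoted label such as 'Homo sapiens' comes back without its quotes. The sketch does no error recovery beyond raising ValueError on an unexpected token.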

The simplest usage is to import parse_tree from the newick package and read the tree from a string:

from newick import parse_tree
print(parse_tree("(A,B);"))

A tree parsed this way can then be traversed and manipulated.

By using "handlers" it is possible to extract information from the tree through callbacks, without first building the entire tree. The following example uses this to calculate the sum of all branch lengths:

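import newick
import sys

class BranchLengthSum(newick.AbstractHandler):
    """Accumulate the lengths of all edges reported by the parser."""

    def __init__(self):
        self.sum = 0.0

    def new_edge(self, bootstrap, length):
        # Called once per edge; skip edges that carry no length.
        if length:
            self.sum += length

    def get_result(self):
        # The value returned here is what newick.parse() hands back.
        return self.sum

# Read a tree from standard input and print its total branch length.
print(newick.parse(sys.stdin.read(), BranchLengthSum()))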
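As a further illustration of the handler idea, the sketch below tracks the longest branch and counts edges without a length. To keep it runnable on its own it defines a stand-in AbstractHandler rather than importing the one from newick, and it simulates the callbacks by hand; the class name LongestBranch and the assumption that a missing length arrives as None are illustrative, not taken from the module.

```python
class AbstractHandler(object):
    """Stand-in for newick.AbstractHandler so this sketch runs alone."""

    def new_edge(self, bootstrap, length):
        pass

    def get_result(self):
        return None


class LongestBranch(AbstractHandler):
    """Track the longest edge seen and count edges with no length."""

    def __init__(self):
        self.longest = None
        self.missing = 0

    def new_edge(self, bootstrap, length):
        # Assumption: an edge without a length is reported as None.
        if length is None:
            self.missing += 1
        elif self.longest is None or length > self.longest:
            self.longest = length

    def get_result(self):
        return (self.longest, self.missing)


# Simulate the callbacks a parser might issue for three edges,
# the last of which has no length.
handler = LongestBranch()
for bootstrap, length in [(None, 0.1), (None, 0.2), (None, None)]:
    handler.new_edge(bootstrap, length)
result = handler.get_result()
```

When used with the module itself, the handler would presumably derive from newick.AbstractHandler and be passed to newick.parse in the same way as BranchLengthSum above.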

Installation

Download the source code and unpack it (tar xzf newick-a.b.c.tar.gz, where a.b.c is the version number of newick). Then run the setup script:

python setup.py install 

See python setup.py --help for more details.

Contact

Thomas Mailund, <mailund@birc.au.dk>, Bioinformatics Research Center, University of Aarhus.

Time-stamp: "2006-01-26 21:59:50 mailund"