FitchAln
INSTALLATION
After downloading FitchAln.tar to your computer, unpack it with the following command, which creates the FitchAln package directory.
tar xvf FitchAln.tar
FitchAln is free software developed on Linux. Compile it with g++ (4.4.1 or later) by running the following command in the package directory (tested on Fedora 11).
make
DESCRIPTION:
Given a binary tree T in Newick format and its corresponding multiple sequence alignment A of size n x m, where n is the number of leaf nodes in T and m is the length of the alignment, FitchAln generates an (n-1) x (m+1) Fitch score matrix giving the maximum parsimony number of mutations at each site for each internal node. The root of T is numbered 0, and the remaining internal nodes are numbered in breadth-first order. Row i of the matrix represents the internal node numbered i, so row 0 represents the root and there are n-1 rows in total. Each column of the matrix, except the last, represents a site of the alignment. For all 0<=i<=n-2 and 0<=j<=m-1, cell(i,j) is the number of mutations at site j of the alignment for the internal node numbered i. The last column gives the total number of mutations over all sites for that node: for all 0<=i<=n-2, cell(i,m) equals the sum of cell(i,0), ..., cell(i,m-1).
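The matrix layout described above can be illustrated with a minimal Python sketch. It is not the FitchAln source code. It assumes that the score of an internal node is the Fitch parsimony count of the subtree rooted at that node, and that children are visited left to right during breadth-first numbering. The tree is written as nested 2-tuples rather than parsed from Newick, and the names fitch_matrix, example_tree and example_alignment are hypothetical.

```python
def fitch_matrix(tree, alignment):
    """Return an (n-1) x (m+1) list of lists of Fitch scores.

    tree: nested 2-tuples; a leaf is a string naming a sequence (assumed encoding).
    alignment: dict mapping each leaf name to its aligned sequence of length m.
    """
    # List every node in breadth-first order; children[k] holds the list
    # positions of the two children of node k, or None when node k is a leaf.
    nodes, children = [tree], []
    k = 0
    while k < len(nodes):
        node = nodes[k]
        if isinstance(node, tuple):
            left, right = node  # the tree must be binary
            children.append((len(nodes), len(nodes) + 1))
            nodes.extend((left, right))
        else:
            children.append(None)
        k += 1

    m = len(next(iter(alignment.values())))
    internal = [k for k, c in enumerate(children) if c is not None]
    matrix = [[] for _ in internal]

    for j in range(m):
        state = [None] * len(nodes)
        cost = [0] * len(nodes)
        # In a breadth-first listing children always come after their parent,
        # so walking the list backwards visits children first.
        for k in reversed(range(len(nodes))):
            if children[k] is None:
                state[k] = {alignment[nodes[k]][j]}
            else:
                a, b = children[k]
                common = state[a] & state[b]
                # Fitch rule: keep the intersection if it is non-empty,
                # otherwise take the union and count one mutation.
                state[k] = common or (state[a] | state[b])
                cost[k] = cost[a] + cost[b] + (0 if common else 1)
        for row, k in zip(matrix, internal):
            row.append(cost[k])

    # Last column: total over all sites for each internal node.
    for row in matrix:
        row.append(sum(row))
    return matrix


example_tree = (("A", "B"), ("C", "D"))
example_alignment = {"A": "AC", "B": "AG", "C": "TC", "D": "TC"}
print(fitch_matrix(example_tree, example_alignment))
# prints [[1, 1, 2], [0, 1, 1], [0, 0, 0]]
```

With n = 4 leaves and m = 2 sites the result has 3 rows and 3 columns. Row 0 is the root, rows 1 and 2 are the subtrees (A,B) and (C,D), and the last entry of each row is the sum of the per-site scores.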
Input:
A binary tree T in Newick format and a multiple sequence alignment A (in clustalw or fasta format). T has n leaves, each leaf corresponds to exactly one sequence in A, and each sequence has m characters, so T has 2n-1 nodes in all and A has size n x m.
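Before running the program it can help to confirm that the alignment is rectangular and to see which sizes follow from it. The Python sketch below is only an illustration and is not part of FitchAln. It handles fasta input only, takes the first word of each header line as the sequence name (an assumption), and the function names read_fasta and expected_sizes are hypothetical.

```python
def read_fasta(text):
    """Parse fasta text into a dict mapping sequence name to sequence."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the sequence name is the first word of the header.
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def expected_sizes(seqs):
    """Return n, m, the node counts of T and the score matrix shape."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length, so this is not an alignment")
    n, m = len(seqs), lengths.pop()
    return {
        "n": n,
        "m": m,
        "total_nodes": 2 * n - 1,      # a binary tree with n leaves
        "internal_nodes": n - 1,
        "matrix_shape": (n - 1, m + 1),
    }


example_text = ">A\nAC\nGT\n>B\nACGT\n>C\nA-GT\n"
print(expected_sizes(read_fasta(example_text)))
```

For the three sequences of length 4 in the example, T must have 5 nodes, 2 of them internal, and the score matrix has 2 rows and 5 columns.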
Output:
Format in FIRST PART:
The first part has n-1 rows in total. Each row has two columns: the first column is node and the second column is subtree.
node: [vi], where i is an integer ranging from 0 to n-2, represents the internal node of T numbered i. Note that the root of T is numbered 0 and the remaining internal nodes are numbered in breadth-first order. As T has n-1 internal nodes, they are numbered from 0 to n-2.
subtree: the subtree rooted at the internal node numbered i, in Newick format.
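The numbering scheme in the first part can be sketched in Python as follows. This is an illustration under stated assumptions, not the FitchAln implementation: the tree is given as nested 2-tuples instead of a Newick string, children are visited left to right within each level, and the helper names to_newick and number_internal_nodes are hypothetical. The exact layout FitchAln prints may differ.

```python
from collections import deque


def to_newick(node):
    """Render a nested-tuple tree as a Newick string without branch lengths."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


def number_internal_nodes(tree):
    """Return (number, subtree) pairs for internal nodes in breadth-first order."""
    numbered = []
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, tuple):
            # The next free number goes to this internal node; the root gets 0.
            numbered.append((len(numbered), to_newick(node)))
            queue.extend(node)
    return numbered


example_tree = ((("A", "B"), "C"), ("D", "E"))
for number, subtree in number_internal_nodes(example_tree):
    print(f"[v{number}] {subtree}")
# prints:
# [v0] (((A,B),C),(D,E))
# [v1] ((A,B),C)
# [v2] (D,E)
# [v3] (A,B)
```

The example tree has n = 5 leaves, so there are n-1 = 4 internal nodes numbered 0 to 3. Note that (D,E) is numbered before (A,B) because it sits one level closer to the root.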
Format in SECOND PART:
This is an (n-1) x (m+1) Fitch score matrix giving the maximum parsimony number of mutations at each site of the alignment for each internal node. Row i of the matrix represents the internal node numbered i, and each column j, except the last one, represents site j of the alignment. For all 0<=i<=n-2 and 0<=j<=m-1, cell(i,j) is the Fitch score at site j of the alignment for the internal node numbered i. In the last column, for all 0<=i<=n-2, cell(i,m) is the sum of cell(i,0), ..., cell(i,m-1). Note that row 0 always represents the root (which is numbered 0).
PROGRAM:
./FitchAln -i alnfile -f alnformat -t treefile -o outfile
alnfile : the input multiple sequence alignment (MSA) file
alnformat : format of alnfile, 0 for clustalw, 1 for fasta
treefile : the input phylogenetic tree (binary, Newick format), each leaf of which represents a distinct sequence in the input MSA
outfile : the file to which the result is written
example:
./FitchAln -i example/T1.aln -f 0 -t example/T1.tree -o tmpout
Download:
Download FitchAln Package Here, updated on March 21 2010
Questions?
If you have any questions, please send an email to Yuan Li ( liy@cs.ucf.edu )