[R-sig-phylo] Transforming a data.frame into a phylogenetic tree

Liam J. Revell lrevell at nescent.org
Wed Jan 27 20:44:01 CET 2010


Hi Timothee,

Assuming that I figured out your tree format correctly, I have written 
you a function to translate your tree-as-data-frame into an object of 
class "phylo" in R.  I attach the disclaimer that this is programmed 
very inelegantly and may contain errors.  It also contains no method for 
checking your tree file for errors, thus will probably crash if your 
tree doesn't make any sense (say, for example, contains floating 
branches not connected to any other nodes).  It assumes that you have 
installed APE and that, for example (taking your table, below), node 
00000000 is the ancestor to node 10000000 and 00000001 (as well as 
several other nodes) with the branch length of 27 leading to node 
10000000 and 83 leading to 00000001.  If this is the correct 
interpretation of your format, the original data file, below, yields a 
Newick tree of ((5:78,9:140):27,11:185,12:196,10:176,7:159); where the 
taxon labels are determined by the row labels of the data frame.  (Note 
that this tree is highly polytomous at the root, because a number of 
"nodes" in the data frame, e.g., node 10000001, leave one and only one 
descendant.)

The input is just the table given below, with the lines above "Anc 
OffSpr. . . etc." excluded, which you can read in as follows:
 > tree.data<-read.table(filename,colClasses="character")

Then you can load the function from source, as follows:
 > source("tree.read.R") # this name is just to distinguish it from 
read.tree(), an ape function.

And run it as follows:
 > tree<-tree.read(tree.data)

Trees can be printed to file using, for example:
 > write.tree(tree,filename)

The function is as follows:

tree.read<-function(tree.data){

    n.tips=0; k=1; tip.label<-NA;
    edge<-matrix(0,nrow(tree.data)-1,2); edge.length<-NA;
    for (i in 1:nrow(tree.data)){
        external=1;
        for (j in 1:nrow(tree.data)){
            if(tree.data[i,2]==tree.data[j,1]){
                external=0;
            }
        }
        n.tips<-n.tips+external;
        if(external==1){       
            edge[i-1,2]<-n.tips;
            tip.label[k]<-row.names(tree.data[i,]);
            k=k+1;
        }
    }
    node.number<-n.tips;
    for (i in 2:nrow(tree.data)){
        for(j in 1:2){
            if(edge[i-1,j]==0){
                internal<-tree.data[i,j];
                node.number<-node.number+1;
                for(k in i:nrow(tree.data)){
                    for(l in 1:2){
                        if(tree.data[k,l]==internal){
                            edge[k-1,l]=node.number;
                        }
                    }
                }
            }   
            # find branch length
            if(j==2){
                anc.height=0;
                for(k in 1:nrow(tree.data)){
                    if(tree.data[i,1]==tree.data[k,2]){
                        anc.height<-as.numeric(tree.data[k,3]);
                    }
                }
                edge.length[i-1]<-as.numeric(tree.data[i,3])-anc.height;
            }
        }
    }

    Nnode<-node.number-n.tips;
   
    
obj<-list(edge=edge,tip.label=tip.label,edge.length=edge.length,Nnode=Nnode);
    class(obj) <- "phylo";
    obj<-collapse.singles(obj);

    obj;
}

I hope this works!

- Liam

Liam J. Revell
NESCent, Duke University
web: http://anolis.oeb.harvard.edu/~liam/
NEW email: lrevell at nescent.org



Timothee POISOT wrote:
> Dear users,
>
> I am currently doing a model of community assembly, and I would like to
> integrate phylogenetic information. For each organism, I know its ancestor
> and the time at which speciation occured. These informations are stored in
> a data.frame, an example of which is here :
>
>   
>> PhyloTree
>>     
>         Anc   OffSpr TimeApp
> 1       Anc 00000000       1
> 2  00000000 10000000      28
> 3  00000000 00000001      84
> 4  00000000 00000010     101
> 5  10000000 10100000     106
> 6  00000000 00100000     139
> 7  00000000 01000000     160
> 8  00000001 10000001     161
> 9  10000000 11000000     168
> 10 00100000 00110000     177
> 11 10000001 10010001     186
> 12 00000010 00000110     197
>
> Do you have any idea about how I can proceed to transform this object into
> a tree object, such as the one used by APE?
>
> Regards,
>
> Timothée
>
> _______________________________________________
> R-sig-phylo mailing list
> R-sig-phylo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>



More information about the R-sig-phylo mailing list