[R-sig-phylo] Transforming a data.frame into a phylogenetic tree
Liam J. Revell
lrevell at nescent.org
Wed Jan 27 20:44:01 CET 2010
Hi Timothee,
Assuming that I figured out your tree format correctly, I have written
you a function to translate your tree-as-data-frame into an object of
class "phylo" in R. I attach the disclaimer that this is programmed
very inelegantly and may contain errors. It also contains no method for
checking your tree file for errors, thus will probably crash if your
tree doesn't make any sense (say, for example, contains floating
branches not connected to any other nodes). It assumes that you have
installed APE and that, for example (taking your table, below), node
00000000 is the ancestor to node 10000000 and 00000001 (as well as
several other nodes) with the branch length of 27 leading to node
10000000 and 83 leading to 00000001. If this is the correct
interpretation of your format, the original data file, below, yields a
Newick tree of ((5:78,9:140):27,11:185,12:196,10:176,7:159); where the
taxon labels are determined by the row labels of the data frame. (Note
that this tree is highly polytomous at the root, because a number of
"nodes" in the data frame, e.g., node 10000001, leave one and only one
descendant.)
The input is just the table given below, with the lines above "Anc
OffSpr. . . etc." excluded, which you can read in as follows:
> tree.data<-read.table(filename,colClasses="character")
Then you can load the function from source, as follows:
> source("tree.read.R") # this name is just to distinguish it from
read.tree(), an ape function.
And run it as follows:
> tree<-tree.read(tree.data)
Trees can be printed to file using, for example:
> write.tree(tree,filename)
The function is as follows:
tree.read<-function(tree.data){
n.tips=0; k=1; tip.label<-NA;
edge<-matrix(0,nrow(tree.data)-1,2); edge.length<-NA;
for (i in 1:nrow(tree.data)){
external=1;
for (j in 1:nrow(tree.data)){
if(tree.data[i,2]==tree.data[j,1]){
external=0;
}
}
n.tips<-n.tips+external;
if(external==1){
edge[i-1,2]<-n.tips;
tip.label[k]<-row.names(tree.data[i,]);
k=k+1;
}
}
node.number<-n.tips;
for (i in 2:nrow(tree.data)){
for(j in 1:2){
if(edge[i-1,j]==0){
internal<-tree.data[i,j];
node.number<-node.number+1;
for(k in i:nrow(tree.data)){
for(l in 1:2){
if(tree.data[k,l]==internal){
edge[k-1,l]=node.number;
}
}
}
}
# find branch length
if(j==2){
anc.height=0;
for(k in 1:nrow(tree.data)){
if(tree.data[i,1]==tree.data[k,2]){
anc.height<-as.numeric(tree.data[k,3]);
}
}
edge.length[i-1]<-as.numeric(tree.data[i,3])-anc.height;
}
}
}
Nnode<-node.number-n.tips;
obj<-list(edge=edge,tip.label=tip.label,edge.length=edge.length,Nnode=Nnode);
class(obj) <- "phylo";
obj<-collapse.singles(obj);
obj;
}
I hope this works!
- Liam
Liam J. Revell
NESCent, Duke University
web: http://anolis.oeb.harvard.edu/~liam/
NEW email: lrevell at nescent.org
Timothee POISOT wrote:
> Dear users,
>
> I am currently doing a model of community assembly, and I would like to
> integrate phylogenetic information. For each organism, I know its ancestor
> and the time at which speciation occured. These informations are stored in
> a data.frame, an example of which is here :
>
>
>> PhyloTree
>>
> Anc OffSpr TimeApp
> 1 Anc 00000000 1
> 2 00000000 10000000 28
> 3 00000000 00000001 84
> 4 00000000 00000010 101
> 5 10000000 10100000 106
> 6 00000000 00100000 139
> 7 00000000 01000000 160
> 8 00000001 10000001 161
> 9 10000000 11000000 168
> 10 00100000 00110000 177
> 11 10000001 10010001 186
> 12 00000010 00000110 197
>
> Do you have any idea about how I can proceed to transform this object into
> a tree object, such as the one used by APE?
>
> Regards,
>
> Timothée
>
> _______________________________________________
> R-sig-phylo mailing list
> R-sig-phylo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>
More information about the R-sig-phylo
mailing list