[R-sig-phylo] adding a new taxa to an existing clade in a tree set

Emmanuel Paradis Emmanuel.Paradis at ird.fr
Thu Feb 16 12:41:32 CET 2012

Hi Annemarie,

Annemarie Verkerk wrote on 16/02/2012 09:57:
> Hi all,
> I have a question regarding the manual addition of a leaf to an existing
> clade in a set of trees. The reason I ask is because I want to test the
> influence of adding ancestral data to an ancestral state estimation. I
> have some ancestral data on the feature I want to investigate, and I
> know from other sources where the leaf is situated on the trees, but I
> don't have any data that I can use to actually build new trees which
> also incorporate this leaf at the moment. So I'm hoping to add it
> manually and see whether it makes a difference.
> I know it is possible to add a leaf to any of the external edges (I
> mean, edges leading to the leaves) of a tree using bind.tree() in ape.
> However, if I wanted to add a leaf to a lower position in the tree, say
> on a branch that splits and leads to two other leaves, it becomes
> difficult: not all the trees in the sample might feature the grouping of
> those two leaves, and even if they do, the relevant edge to which I
> would want to add the new leaf will have a different number in each tree.
> So I guess I have two questions:
> 1. Is there an easy way to select the trees that have a certain clade,
> so I can try to add a new leaf to only those trees? (I suspect a lot of
> error messages if I try to do it for the complete tree sample, as there
> will be trees that do not have the relevant clade.)

see ?is.monophyletic
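
As a rough sketch of that filtering step (the toy tree set `trees` and the clade `c("A", "B")` are made up for illustration; your own sample would be a "multiPhylo" object read from file):

```r
library(ape)

## Hypothetical toy sample: the clade (A, B) is present in the
## first and third trees, but not in the second.
trees <- read.tree(text = c("(((A,B),C),D);",
                            "(((A,C),B),D);",
                            "((A,B),(C,D));"))

clade <- c("A", "B")

## One logical value per tree: TRUE if the tips in 'clade'
## form a monophyletic group in that tree.
has_clade <- sapply(seq_along(trees),
                    function(i) is.monophyletic(trees[[i]], clade))

## Keep only the trees that contain the clade.
trees_with_clade <- trees[has_clade]
```

Note that is.monophyletic() reroots unrooted trees by default (see its 'reroot' argument), so it is worth checking whether your trees are rooted first.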

> 2. Is there a way (for those trees that have the clade I want to add the
> new leaf to) to define the relevant edge using only the leaf numbers and
> not the internal nodes/edge numbers? I.e., I want to add the leaf to the
> internal edge that leads to the leaves with numbers 5, 6, and 7, for
> instance, is there a way to identify that internal edge?

Yes, you can use mrca() like this:

m <- mrca(phy)
phy$edge[, 2] == m[tip1, tip2]
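
For instance, on a made-up four-tip tree (the tree and the tip labels "A" and "B" are only assumptions for the sake of the example), wrapping the comparison in which() gives the row of the edge matrix directly:

```r
library(ape)

phy <- read.tree(text = "(((A,B),C),D);")  # hypothetical toy tree

m <- mrca(phy)        # matrix of MRCA node numbers, indexed by tip label
anc <- m["A", "B"]    # node number of the MRCA of tips A and B

## Row of phy$edge whose second column (the descendant node) is that MRCA,
## i.e. the internal edge leading to the clade.
edge_row <- which(phy$edge[, 2] == anc)
phy$edge[edge_row, ]  # ancestor and descendant node of the selected edge
```

If the MRCA happens to be the root, which() returns an empty vector, since no edge leads to the root.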

A more general way (i.e., with clades made of 2 or more tips) is to use
prop.part(phy): it returns a list of the tips descended from each
node (in other words, the clades of the tree). This also lets you
combine both steps:

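list_of_tips <- sort(c(tip1, tip2))  ## tip numbers; ...etc.: add the other tips of the clade

pp <- prop.part(phy)  ## pp[[i]] holds the tips descended from node i + Ntip(phy)
node <- NULL
for (i in seq_along(pp)) {
     if (isTRUE(all.equal(sort(pp[[i]]), list_of_tips))) {
          node <- i + Ntip(phy)
          break
     }
}
if (is.null(node)) {
     ## clade not found: go to next tree
} else {
     phy$edge[, 2] == node  ## TRUE for the edge leading to the clade
     ## etc...
}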
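
To run this over a whole tree sample, the search can be wrapped in a small function. This is only a sketch: find_clade_node() is a hypothetical helper (not part of ape), it takes tip labels rather than tip numbers because numbering can differ between trees, and the toy trees are invented for the example.

```r
library(ape)

## Hypothetical helper: returns the number of the node whose descendant
## tips are exactly 'tips' (given as labels), or NULL if the tree has
## no such clade.
find_clade_node <- function(phy, tips) {
    target <- sort(match(tips, phy$tip.label))  # labels -> tip numbers
    if (anyNA(target)) return(NULL)             # some tip is absent from the tree
    pp <- prop.part(phy)
    for (i in seq_along(pp)) {
        if (identical(sort(as.integer(pp[[i]])), target))
            return(i + Ntip(phy))
    }
    NULL
}

## Hypothetical toy sample; the second tree lacks the clade (A, B).
trees <- read.tree(text = c("(((A,B),C),D);",
                            "(((A,C),B),D);",
                            "((A,B),(C,D));"))

## One entry per tree: a node number, or NULL where the clade is missing.
nodes <- lapply(seq_along(trees),
                function(i) find_clade_node(trees[[i]], c("A", "B")))
```

A non-NULL entry could then be passed as the 'where' argument of bind.tree() for the corresponding tree.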



> Many thanks for any suggestions,
> Annemarie
