[R-sig-phylo] phyDat format problem

Liam J. Revell liamjrevell at gmail.com
Mon Jan 9 18:46:21 CET 2012


Hi Jen.

Try converting the data frame to a matrix first, i.e.:
p<-as.matrix(p); p2<-phyDat(p,type="USER",levels=c(0,1))

I'm not sure why this works, but it seems to. - Liam

More details below.

When I do:
library(phangorn)
p<-matrix(c(1,0,0,0,1,0,1,0,1,0,
1,1,0,1,1,0,0,1,0,0,
1,0,0,1,0,1,0,0,1,0,
0,1,0,1,1,1,0,0,0,0,
0,0,0,1,1,1,1,0,1,0),5,10,byrow=TRUE)
dimnames(p)<-list(c("A","B","C","D","E"),paste("X",1:10,sep=""))
p2<-phyDat(p,type="USER",levels=c(0,1))

I get:
 > p2
5 sequences with 10 character and 9 different site patterns.
The states are 0 1

However:
p<-as.data.frame(p)
p2<-phyDat(p,type="USER",levels=c(0,1))

Gives me:
 > p2
10 sequences with 5 character and 5 different site patterns.
The states are 0 1
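
A possible explanation, offered as an assumption and not checked against the phangorn source: a data frame is a list of columns, so a function that walks it as a list will see the 10 characters (X1-X10) as the named elements, whereas a matrix keeps the 5 taxa as rows. The sketch below only illustrates that difference in base R; the phyDat call is skipped if phangorn is not installed, and the names df and m are just for illustration.

```r
# Toy data from this thread: 5 taxa (rows A-E) by 10 binary characters (X1-X10).
p <- matrix(c(1,0,0,0,1,0,1,0,1,0,
              1,1,0,1,1,0,0,1,0,0,
              1,0,0,1,0,1,0,0,1,0,
              0,1,0,1,1,1,0,0,0,0,
              0,0,0,1,1,1,1,0,1,0), 5, 10, byrow = TRUE)
dimnames(p) <- list(c("A","B","C","D","E"), paste("X", 1:10, sep = ""))
df <- as.data.frame(p)

# A data frame is a list of columns: list-style accessors see the characters.
length(df)        # 10 (one element per column)
names(df)         # column labels X1 ... X10

# Converting to a matrix keeps the taxa as rows.
m <- as.matrix(df)
dim(m)            # 5 rows, 10 columns
rownames(m)       # taxon labels A ... E

# t() on a data frame returns a matrix whose rows are the characters.
tm <- t(df)
is.matrix(tm)
rownames(tm)

# Only run the phangorn step when the package is available.
if (requireNamespace("phangorn", quietly = TRUE)) {
  p2 <- phangorn::phyDat(m, type = "USER", levels = c(0, 1))
  print(names(p2))  # sequence labels stored in the phyDat object
}
```

If that assumption holds, it would also account for the transposed version in the original message giving the same result: t() turns the data frame into a matrix with X1-X10 as rows, so those rows are again read as the sequences.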

-- 
Liam J. Revell
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.revell at umb.edu
blog: http://phytools.blogspot.com


On 1/9/2012 9:14 AM, J Greenwood wrote:
> Hi all,
>
> I am having a problem getting parsimony scores in phangorn and have
> found that the program is not reading in my data file the way I expect.
> I have 10 characters and 5 taxa, but phangorn seems to read these the
> opposite way round.
>
> Can anybody help? Details follow,
>
> thanks,
>
> Jen
>
>
> I made a test dataset to try and work out the problem which is the
> following:
>
>> p<-read.table(file="tab_phangorn.txt", header=T)
>> p
> X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
> A 1 0 0 0 1 0 1 0 1 0
> B 1 1 0 1 1 0 0 1 0 0
> C 1 0 0 1 0 1 0 0 1 0
> D 0 1 0 1 1 1 0 0 0 0
> E 0 0 0 1 1 1 1 0 1 0
>
> Where A-E are taxa and X1-X10 are characters.
> I then tried converting it to phyDat format which gives the following:
>
>> p2<-phyDat(p, type="USER", levels=c(0,1))
>> p2
>
> 10 sequences with 5 character and 5 different site patterns.
> The states are 0 1
>
>> str(p2)
> List of 10
> $ X1 : int [1:5] 2 2 2 1 1
> $ X2 : int [1:5] 1 2 1 2 1
> $ X3 : int [1:5] 1 1 1 1 1
> $ X4 : int [1:5] 1 2 2 2 2
> $ X5 : int [1:5] 2 2 1 2 2
> $ X6 : int [1:5] 1 1 2 2 2
> $ X7 : int [1:5] 2 1 1 1 2
> $ X8 : int [1:5] 1 2 1 1 1
> $ X9 : int [1:5] 2 1 2 1 2
> $ X10: int [1:5] 1 1 1 1 1
> - attr(*, "class")= chr "phyDat"
> - attr(*, "weight")= int [1:5] 1 1 1 1 1
> - attr(*, "nr")= int 5
> - attr(*, "nc")= int 2
> - attr(*, "index")= int [1:5] 1 2 3 4 5
> - attr(*, "levels")= num [1:2] 0 1
> - attr(*, "allLevels")= chr [1:3] "0" "1" "?"
> - attr(*, "type")= chr "USER"
> - attr(*, "contrast")= num [1:3, 1:2] 1 0 1 0 1 1
>
> It has read the data in the opposite way round to what I was
> expecting, which was odd. I transposed the matrix anyway, but got
> the same output:
>
>> ptrans<-t(p)
>> ptrans
> A B C D E
> X1 1 1 1 0 0
> X2 0 1 0 1 0
> X3 0 0 0 0 0
> X4 0 1 1 1 1
> X5 1 1 0 1 1
> X6 0 0 1 1 1
> X7 1 0 0 0 1
> X8 0 1 0 0 0
> X9 1 0 1 0 1
> X10 0 0 0 0 0
>
>> ptrans2
> 10 sequences with 5 character and 5 different site patterns.
> The states are 0 1
>> str(ptrans2)
> List of 10
> $ X1 : int [1:5] 2 2 2 1 1
> $ X2 : int [1:5] 1 2 1 2 1
> $ X3 : int [1:5] 1 1 1 1 1
> $ X4 : int [1:5] 1 2 2 2 2
> $ X5 : int [1:5] 2 2 1 2 2
> $ X6 : int [1:5] 1 1 2 2 2
> $ X7 : int [1:5] 2 1 1 1 2
> $ X8 : int [1:5] 1 2 1 1 1
> $ X9 : int [1:5] 2 1 2 1 2
> $ X10: int [1:5] 1 1 1 1 1
> - attr(*, "class")= chr "phyDat"
> - attr(*, "weight")= int [1:5] 1 1 1 1 1
> - attr(*, "nr")= int 5
> - attr(*, "nc")= int 2
> - attr(*, "index")= int [1:5] 1 2 3 4 5
> - attr(*, "levels")= num [1:2] 0 1
> - attr(*, "allLevels")= chr [1:3] "0" "1" "?"
> - attr(*, "type")= chr "USER"
> - attr(*, "contrast")= num [1:3, 1:2] 1 0 1 0 1 1
>
> Can anyone explain to me what I am doing wrong? How can I get the phyDat
> format to hold my information correctly?
>
> ----------------------
> J Greenwood
> jenny.greenwood at bristol.ac.uk
>
> _______________________________________________
> R-sig-phylo mailing list
> R-sig-phylo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>


