[R] Counting variables repeated in dataframe columns to create a presence-absence table
arun
smartpink111 at yahoo.com
Thu Nov 28 20:57:22 CET 2013
Hi,
Try:
data_m <- read.table(text="Abortusovis07918 Agona08561 Anatum08125 Arizonae65S Braenderup08488
1 S5305B_IGR S5305B_IGR S5305B_IGR S5305B_IGR S5305B_IGR
2 S5305A_IGR S5300A_IGR S5305A_IGR S5300A_IGR S5300A_IGR
3 S5300A_IGR S5300B_IGR S5300A_IGR S5300B_IGR S5300B_IGR
4 S5300B_IGR S5299B_IGR S5300B_IGR S5299B_IGR S5299B_IGR
5 S5299B_IGR S5299A_IGR S5299B_IGR S5829B_IGR S5299A_IGR",sep="",header=TRUE,stringsAsFactors=FALSE)
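# Add a constant column of 1s; this is the value that gets tabulated below
data_m$new <- 1
library(reshape2)
# Long format: one row per (serovar, gene) pair, in columns variable/value
dM <- melt(data_m, id.vars = "new")
# Genes in rows, serovars in columns; each cell sums `new`, so it is 1/0
# as long as no gene is repeated within a single column
xtabs(new ~ value + variable, dM)
# or, to get a data frame instead of a table:
dcast(dM, value ~ variable, value.var = "new", fill = 0)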
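If reshape2 is not available, roughly the same table can be sketched in
base R. This is only a minimal sketch, not a replacement for the code
above: the names dat, genes and pa are made up for illustration, the
sample data is rebuilt inline so the block runs on its own, and it
assumes any padding in shorter columns is NA or an empty string.

```r
# Same five-row sample as in the question, rebuilt inline (hypothetical name `dat`).
dat <- data.frame(
  Abortusovis07918 = c("S5305B_IGR", "S5305A_IGR", "S5300A_IGR", "S5300B_IGR", "S5299B_IGR"),
  Agona08561       = c("S5305B_IGR", "S5300A_IGR", "S5300B_IGR", "S5299B_IGR", "S5299A_IGR"),
  Anatum08125      = c("S5305B_IGR", "S5305A_IGR", "S5300A_IGR", "S5300B_IGR", "S5299B_IGR"),
  Arizonae65S      = c("S5305B_IGR", "S5300A_IGR", "S5300B_IGR", "S5299B_IGR", "S5829B_IGR"),
  Braenderup08488  = c("S5305B_IGR", "S5300A_IGR", "S5300B_IGR", "S5299B_IGR", "S5299A_IGR"),
  stringsAsFactors = FALSE
)

# All distinct gene IDs across every serovar column.
# Assumption: padding in shorter columns, if any, is NA or "" and is dropped.
genes <- unique(unlist(dat, use.names = FALSE))
genes <- genes[!is.na(genes) & nzchar(genes)]

# One 0/1 column per serovar: 1 if the gene ID occurs anywhere in that column.
pa <- sapply(dat, function(col) as.integer(genes %in% col))
rownames(pa) <- genes
pa
```

Because %in% only tests membership, a gene listed twice in the same
column still gives 1 here, whereas xtabs() would count it as 2.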
A.K.
On Thursday, November 28, 2013 12:18 PM, Gmail <o.irazoki at gmail.com> wrote:
Hi!
I'm new to R and am writing to ask for some guidance. I analyzed
comparative genomic microarray data from 56 Salmonella strains to
identify the genes absent in each of the serovars, and ended up with a
matrix that looks like this:
> data[1:5,1:5]
Abortusovis07918 Agona08561 Anatum08125 Arizonae65S Braenderup08488
1 S5305B_IGR S5305B_IGR S5305B_IGR S5305B_IGR S5305B_IGR
2 S5305A_IGR S5300A_IGR S5305A_IGR S5300A_IGR S5300A_IGR
3 S5300A_IGR S5300B_IGR S5300A_IGR S5300B_IGR S5300B_IGR
4 S5300B_IGR S5299B_IGR S5300B_IGR S5299B_IGR S5299B_IGR
5 S5299B_IGR S5299A_IGR S5299B_IGR S5829B_IGR S5299A_IGR
The entries are the genes identified as absent in each of the
serovars. I would like to create a presence-absence matrix of those
genes that compares all the serovars at once. I assume it should not
be complicated, but I don't know how to do it.
I would like a matrix similar to this one:
> data_m[1:5,1:5]
           Abortusovis07918 Agona08561 Anatum08125 Arizonae65S Braenderup08488
S5305B_IGR                1          1           1           1               1
S5305A_IGR                1          0           1           0               0
S5300A_IGR                1          1           1           1               1
Any help would be welcome, and thank you in advance,
Oihane
--
Oihane Irazoki Sanchez
PhD Student, Molecular Microbiology
Genetics and Microbiology Department, Faculty of Biosciences
Autonomous University of Barcelona
08193 Bellaterra (Barcelona), Spain
Telf: 34 - 935 811 665
E-mail: oihane.irazoki at uab.cat / o.irazoki at gmail.com