[R] Counting occurrences of a letter by a factor
Davis, Brian
Brian.Davis at uth.tmc.edu
Fri Sep 10 21:40:50 CEST 2010
I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame.
Ex.
> DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L", "U", "L", NA))
> colnames(DF)<-c("X", "Y")
> DF
     X    Y
1   CC    L
2   CC    U
3 <NA>    L
4   CG    U
5   GG    L
6   GC <NA>
I have an ugly solution, which works if you know the factor levels of Y in advance.
> ans<-rbind(table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'L', 1]), ""))),
+ table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'U', 1]), ""))))
> rownames(ans)<-c("L", "U")
> ans
  C G
L 2 2
U 3 1
I've played with table, xtabs, tabulate, aggregate, tapply, etc., but haven't found a combination that gives a more general solution to this problem.
Any ideas?
Brian
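
One possible direction, offered as a rough sketch rather than a definitive answer: split every genotype into its letters, repeat each row's Y value once per letter, and cross-tabulate the two resulting vectors, so the levels of Y never have to be named. The names letters_by_row, long, allele and group are made up for illustration; only DF, X and Y come from the example above.

```r
# Rebuild the example data (column names set directly instead of via colnames<-).
DF <- data.frame(X = c("CC", "CC", NA, "CG", "GG", "GC"),
                 Y = c("L", "U", "L", "U", "L", NA))

# Split each genotype into single letters; an NA genotype stays a single NA.
letters_by_row <- strsplit(as.character(DF$X), "")

# Repeat each row's Y once per letter so both vectors have the same length.
long <- data.frame(
  allele = unlist(letters_by_row),
  group  = rep(DF$Y, lengths(letters_by_row))
)

# table() drops NA by default, so the NA genotype and the NA group are ignored.
ans <- table(long$group, long$allele)
ans
# For the example data the counts are L: C = 2, G = 2 and U: C = 3, G = 1.
```

This assumes that rows with a missing genotype or a missing Y should simply be left out, which matches the counts in the two-level version above. If they should be counted, passing useNA = "ifany" to table() would keep them as their own row and column.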