[R] Counting occurrences of a letter by a factor
Darin A. England
england at cs.umn.edu
Fri Sep 10 22:11:18 CEST 2010
I fiddled around and found this solution, which is far from elegant,
but it doesn't require you to know the factor levels in advance.
t <- with(DF, tapply(as.character(X), Y, table))
# rep() repeats each genotype by its count, so duplicates within a level are kept
lapply(t, function(x)
  table(strsplit(paste(rep(names(x), x), collapse=""), split="")))
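
For comparison, here is a minimal sketch of another way to get the counts as a single table, without building the per-level tables first. It assumes the DF from the original question; the names ok, letters_by_row and counts are only illustrative, and lengths() needs R 3.2.0 or later. Each genotype string is split into letters, the Y value is repeated once per letter, and the two vectors are cross-tabulated, so a genotype that occurs more than once within a level is counted each time.

```r
# Example data from the original question
DF <- data.frame(X = c("CC", "CC", NA, "CG", "GG", "GC"),
                 Y = c("L", "U", "L", "U", "L", NA))

# Keep only rows where both the genotype and the factor are present
ok <- complete.cases(DF)

# Split each genotype string into its individual letters
letters_by_row <- strsplit(as.character(DF$X[ok]), "")

# Repeat each Y value once per letter, then cross-tabulate level by letter
counts <- table(rep(as.character(DF$Y[ok]), lengths(letters_by_row)),
                unlist(letters_by_row))
counts
```

For the example data this gives the same L/U by C/G counts as the table in the original question, and the factor levels are picked up from the data rather than typed in.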
Darin
On Fri, Sep 10, 2010 at 02:40:50PM -0500, Davis, Brian wrote:
> I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame.
>
> Ex.
> > DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L", "U", "L", NA))
> > colnames(DF)<-c("X", "Y")
> > DF
>      X    Y
> 1   CC    L
> 2   CC    U
> 3 <NA>    L
> 4   CG    U
> 5   GG    L
> 6   GC <NA>
>
> I have an ugly solution, which works if you know the factor levels of Y in advance.
>
> > ans<-rbind(table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'L', 1]), ""))),
> + table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'U', 1]), ""))))
> > rownames(ans)<-c("L", "U")
> > ans
>   C G
> L 2 2
> U 3 1
>
>
> I've played with table, xtabs, tabulate, aggregate, tapply, etc., but haven't found a combination that gives a more general solution to this problem.
>
> Any ideas?
>
> Brian