[R] frequency, count rows, data for heat map

Jan van der Laan djvanderlaan at gmail.com
Wed Aug 25 17:08:27 CEST 2010


Your problem is not completely clear to me, but perhaps something like

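# Toy data: a plays the role of the sample, b the role of the virus
data <- data.frame(
   a = rep(c(1,2), each=10),
   b = rep(c('a', 'b', 'c', 'd'), 5))
library(plyr)
# Count the rows for each a/b combination (a in rows, b in columns)
daply(data, a ~ b, nrow)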

does what you need.
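
If plyr is not at hand, a base-R sketch of the same idea, assuming the
two columns are called sample and virus (names I made up, as is the
file name in the comment): table() cross-tabulates the two columns and
reports 0 for combinations that never occur, whereas the daply() call,
as far as I can tell, leaves those cells as NA.

```r
# Example input from the original post. For the real file, something like
# read.delim("reads.txt", header = FALSE) should work (hypothetical file name).
reads <- read.table(text = "
111 abc
111 sdf
111 xyz
1079 abc
1079 xyz
1079 xyz
5576 abc
5576 sdf
5576 sdf
", col.names = c("sample", "virus"))

# Cross-tabulate: rows are samples, columns are viruses, cells are read counts.
counts <- table(reads$sample, reads$virus)
print(counts)

# heatmap() wants a plain numeric matrix, so drop the "table" class.
m <- unclass(counts)
heatmap(m, Rowv = NA, Colv = NA, scale = "none")
```

Rowv = NA and Colv = NA only switch off the clustering so the rows and
columns stay in the order of the table; drop them if you want the
dendrograms.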

Regards,
Jan

On Wed, Aug 25, 2010 at 4:53 PM, rtsweeney <tripsweeney at gmail.com> wrote:
>
> Hi all,
> I have read posts of heat map creation but I am one step prior --
> Here is what I am trying to do and wonder if you have any tips?
> We are trying to map sequence reads from tumors to viral genomes.
>
> Example input file :
> 111     abc
> 111     sdf
> 111     xyz
> 1079   abc
> 1079   xyz
> 1079   xyz
> 5576   abc
> 5576   sdf
> 5576   sdf
>
> How many xyz's are there for 1079 and 111? How many abc's, etc.?
> How many times did reads from sample 1079 align to virus xyz?
> In some cases there are thousands per virus in a given sample, sometimes one.
> The original file (two columns by tens of thousands of rows; 20 MB) is
> a tab-delimited text file.
>
> Output file:
>        abc  sdf  xyz
> 111      1    1    1
> 1079     1    0    2
> 5576     1    2    0
>
> Or, other ways to generate this data so I can then use it for heat map
> creation?
>
> Thanks for any help you may have,
>
> rtsweeney
> palo alto, ca