[R] frequency table across multiple variables
Philipp Pagel
p.pagel at wzw.tum.de
Fri Sep 19 10:22:19 CEST 2008
> I have a dataframe like this:
>
> x1<-c(1,2,3,4,NA ,NA ,NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 4)
> x2<-c(2,3,4,3,4,3,4,2,2,3,4,NA,NA,NA,NA,4,3)
> x3<-c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,1,2)
> m<-data.frame(x1,x2,x3)
>
> I would like to create a frequency table like this:
>
> x1 x2 x3
> NA
> 1
> 2
> 3
> 4
>
> where the values in each cell would be the count of the value for that
> variable.
> How can I do this?
The following will work IF all columns contain only positive integer values:
> apply(m, 2, function(x){tabulate(na.omit(x))})
     x1 x2 x3
[1,]  5  0  5
[2,]  3  3  5
[3,]  3  5  4
[4,]  3  5  3
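
The table above has no NA row, because na.omit() drops the missing values before they are counted. If the NA counts are wanted as well, one possible approach is table() on a factor with fixed levels. This is only a sketch: the vector `lev` (the values 1 to 4) and the name `freq` are my assumptions, not something given in the question.

```r
# Data from the original question
x1 <- c(1, 2, 3, 4, NA, NA, NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 4)
x2 <- c(2, 3, 4, 3, 4, 3, 4, 2, 2, 3, 4, NA, NA, NA, NA, 4, 3)
x3 <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 1, 2)
m <- data.frame(x1, x2, x3)

lev <- 1:4  # assumed set of values to count (hypothetical, adjust as needed)

# Fixed factor levels keep zero counts; useNA = "always" appends the NA count
freq <- sapply(m, function(x) {
  as.vector(table(factor(x, levels = lev), useNA = "always"))
})
rownames(freq) <- c(lev, "NA")

# Move the NA row to the top, as in the requested layout
freq <- freq[c("NA", lev), ]
print(freq)
```

For the example data this should give NA counts of 3, 4 and 0 for x1, x2 and x3, followed by the same four rows as above.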
Please note that apply() returns a list instead of a matrix if the columns do
not all share the same largest value, because tabulate() then returns vectors
of different lengths:
> x1<-as.integer(c(1,2,3,4,NA ,NA ,NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 5))
> m<-data.frame(x1,x2,x3)
> apply(m, 2, function(x){tabulate(na.omit(x))})
$x1
[1] 5 3 3 2 1

$x2
[1] 0 3 5 5

$x3
[1] 5 5 4 3
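
If a matrix is wanted in this case too, the nbins argument of tabulate() can be set to the same value for every column, so that all count vectors have the same length. A minimal sketch, assuming the overall maximum of the data is the right number of bins (the names `top` and `freq` are mine):

```r
# Data from the second example (x1 now ends in 5)
x1 <- as.integer(c(1, 2, 3, 4, NA, NA, NA, 3, 1, 1, 1, 1, 2, 2, 3, 4, 5))
x2 <- c(2, 3, 4, 3, 4, 3, 4, 2, 2, 3, 4, NA, NA, NA, NA, 4, 3)
x3 <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 1, 2)
m <- data.frame(x1, x2, x3)

# Largest value found in any column (5 for this data)
top <- max(unlist(m), na.rm = TRUE)

# Same number of bins for every column, so sapply() can build a matrix
freq <- sapply(m, function(x) tabulate(x[!is.na(x)], nbins = top))
print(freq)
```

Columns that never reach the largest value simply get zeros in the trailing rows.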
cu
Philipp
--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel