[R] Frequency table
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Wed Mar 17 16:27:35 CET 2004
Kai Hendry <hendry at cs.helsinki.fi> writes:
> This must be a FAQ, but I can't find it in the archives or with a site search.
>
> I am trying to construct a frequency table. I guess this should be done with
> table. Or perhaps factor and split. Or prop.table. cut? findInterval? Argh!
>
> Please correct me if what I am looking for is not called a "frequency table".
> Perhaps it's called grouped data.
>
> > zz$x9
> [1] 65 70 85 65 65 65 62 55 82 59 55 66 74 55 65 56 80 73 45 64 75 58 60 56 60
> [26] 65 53 63 72 80 90 95 55 70 79 62 57 65 60 47 61 53 80 75 72 87 52 72 80 85
> [51] 75 70 84 60 72 70 76 70 79 72 69 80 62 74 54 58 58 69 81 84
>
> I think I want it to look like:
>
> 40-49 2
> 50-59 15
> 60-69 20
> 70-79 19
> 80-89 12
> 90-99 2
>
> Or the other way around with transpose.
>
> classes = c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")
> would do for the rownames, but
>
> sum(zz$x9 >= 40 & zz$x9 < 50)
> for each frequency count is very laborious...
>
> I got this far:
> > table(cut(zz$x9, brk))
>
>  (40,50]  (50,60]  (60,70]  (70,80]  (80,90] (90,100]
>        2       19       21       19        8        1
> > brk
> [1] 40 50 60 70 80 90 100
> >
> > t(table(cut(zz$x9, brk)))
>      (40,50] (50,60] (60,70] (70,80] (80,90] (90,100]
> [1,]       2      19      21      19       8        1
>
> Still feels a million miles off.
>
> Now I could do with a little help please after spending a couple of hours
> working this out.
Hmm, interesting complication of the convention that tables are 1D
arrays there...
You got this far:
classes <- c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")
brk <- seq(40,100,10)
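However, cut() closes its intervals on the right by default, so (50,60]
counts a 60 among the 50s, which is the wrong end for your classes. The
default labels are ugly too, so try
table(cut(zz$x9,breaks=brk,right=FALSE,labels=classes))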
This at least gives you the right counts and labels:
40-49 50-59 60-69 70-79 80-89 90-99
    2    15    20    19    12     2
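
To see what right=FALSE changes, here is a minimal, self-contained sketch that tabulates the same data both ways. The vector x is just the data from the question retyped (it stands in for zz$x9), and the names by_right and by_left are made up for illustration.

```r
# Data retyped from the question; x stands in for zz$x9.
x <- c(65, 70, 85, 65, 65, 65, 62, 55, 82, 59, 55, 66, 74, 55, 65, 56, 80, 73,
       45, 64, 75, 58, 60, 56, 60, 65, 53, 63, 72, 80, 90, 95, 55, 70, 79, 62,
       57, 65, 60, 47, 61, 53, 80, 75, 72, 87, 52, 72, 80, 85, 75, 70, 84, 60,
       72, 70, 76, 70, 79, 72, 69, 80, 62, 74, 54, 58, 58, 69, 81, 84)
brk <- seq(40, 100, 10)
classes <- c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")

# Default (right = TRUE): intervals are (40,50], (50,60], ...,
# so a value of exactly 60 is counted with the 50s.
by_right <- table(cut(x, breaks = brk))

# right = FALSE: intervals are [40,50), [50,60), ...,
# so a value of exactly 60 is counted with the 60s, matching "60-69".
by_left <- table(cut(x, breaks = brk, right = FALSE, labels = classes))

print(by_right)
print(by_left)

# Both tabulations should account for every observation.
stopifnot(sum(by_right) == length(x), sum(by_left) == length(x))
```

The two tables differ only in where the values sitting exactly on a break (60, 70, 80, 90) end up.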
For a column display, you need to convert to a matrix somehow.
Transposing twice will actually do it, but I think I prefer
matrix(table(cut(zz$x9,breaks=brk,right=FALSE)),dimnames=list(age=classes,""))
which gives this:
age
  40-49  2
  50-59 15
  60-69 20
  70-79 19
  80-89 12
  90-99  2
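
For comparison, a small sketch of two other routes to a column layout: the double transpose mentioned above, and as.data.frame(), which is an alternative not used in this reply. As before, x is the question's data retyped in place of zz$x9, and the names tab, col_tab, freq and the column name "count" are illustrative choices.

```r
# Data retyped from the question; x stands in for zz$x9.
x <- c(65, 70, 85, 65, 65, 65, 62, 55, 82, 59, 55, 66, 74, 55, 65, 56, 80, 73,
       45, 64, 75, 58, 60, 56, 60, 65, 53, 63, 72, 80, 90, 95, 55, 70, 79, 62,
       57, 65, 60, 47, 61, 53, 80, 75, 72, 87, 52, 72, 80, 85, 75, 70, 84, 60,
       72, 70, 76, 70, 79, 72, 69, 80, 62, 74, 54, 58, 58, 69, 81, 84)
brk <- seq(40, 100, 10)
classes <- c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")
tab <- table(cut(x, breaks = brk, right = FALSE, labels = classes))

# Transposing twice: the first t() makes a 1 x 6 row, the second a 6 x 1 column.
col_tab <- t(t(tab))
print(col_tab)

# as.data.frame() gives one row per class with the counts in their own column.
freq <- as.data.frame(tab, responseName = "count")
names(freq)[1] <- "age"
print(freq)
```

The data frame is probably the handier form if the counts are to be used in further computation rather than just displayed.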
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907