[R] How to manipulate tables
William Dunlap
wdunlap at tibco.com
Sat Dec 26 21:00:08 CET 2009
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of James Rome
> Sent: Saturday, December 26, 2009 11:03 AM
> To: r-help at r-project.org
> Subject: [R] How to manipulate tables
>
> I am sorry to be bothering the list so much.
>
> I made a table of counts of flight arrivals by hour:
> cnts=tapply(Arrival4,list(Hour),table). There are up to 15 arrivals
> in a bin.
> > cnts
> $`0`
>
> 1 2 3 4 5 6 7 8 9 10 13
> 1 2 5 9 2 7 5 4 2 4 1
>
> $`1`
>
> 1 2 3 4
> 3 2 2 1
>
> $`2`
>
> 1 3
> 2 2
> . . .
>
> My first problem is how to get this table filled in with the 0 slots.
> E.g., I want
> $`0`
>
> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> 1 2 5 9 2 7 5 4 2 4 0 0 1 0 0
>
> for all 24 hours. The elements of the tables are lists, but I do not
> seem to be able to extract the names of each list, which I think would
> allow this manipulation.
Let's make a fake dataset to demonstrate things more easily.
set.seed(1)
data <- data.frame(Arrival4=rpois(50, 4)+1,
Hour=sample(0:4, replace=TRUE, size=50))
Your tapply(,FUN=table) approach with this data gives
> with(data, tapply(Arrival4,list(Hour),table))
$`0`
3 4 6 8
2 2 1 1
$`1`
3 4 5 6 7 8
3 2 2 3 1 3
... 3 more one-dimensional tables, `2`, `3` and `4` ...
I prefer making a two dimensional table so the rows and
columns have common meanings
> tab <- with(data, table(Hour, Arrival4))
> tab
Arrival4
Hour 1 2 3 4 5 6 7 8 11
0 0 0 2 2 0 1 0 1 0
1 0 0 3 2 2 3 1 3 0
2 0 2 0 2 3 3 0 0 0
3 0 0 0 0 2 4 4 0 1
4 1 0 2 3 1 2 0 0 0
> tab[2,] # 2nd row (Hour==1)
1 2 3 4 5 6 7 8 11
0 0 3 2 2 3 1 3 0
> tab["1",] # would give the same as tab[2,]
But this is missing columns for Arrival4 in c(9,10,12,13,14,15).
If you make Arrival4 a factor with levels 1:15 the table will
include entries for each level. In the following we don't actually
change Arrival4 itself, but create a factor using its data
when we call table().
> tab <- with(data, table(Hour,
Arrival4=factor(Arrival4,levels=1:15)))
> tab
Arrival4
Hour 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 0 0 2 2 0 1 0 1 0 0 0 0 0 0 0
1 0 0 3 2 2 3 1 3 0 0 0 0 0 0 0
2 0 2 0 2 3 3 0 0 0 0 0 0 0 0 0
3 0 0 0 0 2 4 4 0 0 0 1 0 0 0 0
4 1 0 2 3 1 2 0 0 0 0 0 0 0 0 0
> tab[2,] # 2nd row (Hour==1)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 0 3 2 2 3 1 3 0 0 0 0 0 0 0
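The same factor trick also fills in the zeros if you would rather keep
the list-of-tables layout from tapply(). The following is only a sketch
on made-up data: the vectors arrivals and hour and the levels 1:5 are
hypothetical stand-ins for Arrival4, Hour and 1:15.

```r
# Hypothetical toy data: arrivals per bin and the hour each bin falls in
arrivals <- c(1, 3, 3, 5, 2, 2)
hour     <- c(0, 0, 0, 0, 1, 1)

# Without explicit levels, only the observed values get a column
table(hour, arrivals)

# With explicit levels, unobserved values show up as zero counts
full <- table(hour, arrivals = factor(arrivals, levels = 1:5))
full

# The same idea applied to the list-of-tables approach
cnts <- tapply(factor(arrivals, levels = 1:5), hour, table)
cnts[["0"]]          # counts for hour 0, zeros included
names(cnts[["0"]])   # the bin labels, as character strings
```

names() on one element of the list returns the bin labels, which is
the piece the original question could not extract.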
If you have no data for certain hours, you may have
to use the same technique for Hour.
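As a sketch of what that might look like, assuming hours are coded
0 to 23 and using made-up data (arrivals, hour and tab24 are
hypothetical names, not from the original post):

```r
# Hypothetical toy data with arrivals recorded only in hours 0, 1 and 5
arrivals <- c(1, 3, 3, 5, 2, 2)
hour     <- c(0, 0, 1, 1, 5, 5)

# Give both variables explicit levels so every hour and every
# arrival count gets a row or column, observed or not
tab24 <- table(Hour     = factor(hour, levels = 0:23),
               Arrival4 = factor(arrivals, levels = 1:15))

dim(tab24)       # 24 rows (hours) by 15 columns (arrival counts)
tab24["2", ]     # an hour with no data: a row of zeros
```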
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> My second problem is that I want to compute probability
> distributions. I have
> lambda
>           0           1           2           4           5           6           7
> 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 0.160323887 0.401619433
>           8           9          10          11          12          13          14
> 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895 0.495546559 0.435627530
>          15          16          17          18          19          20          21
> 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802 0.585425101 0.519028340
>          22          23
> 0.556275304 0.503643725
>
> and I need to calculate lambda**cnts for each bin, and each hour. I am
> also unsure of how to do this.
>
> Thanks in advance kind people on this list
> Jim Rome