[R] Sum Question
Marc Schwartz
marc_schwartz at me.com
Thu Jun 30 19:30:27 CEST 2011
On Jun 30, 2011, at 11:20 AM, Edgar Alminar wrote:
>>> I did this:
>>>
>>> library(data.table)
>>>
>>> dd <- data.table(bl)
>>> dd[,sum(as.integer(CONTTIME)), by = SCRNO]
>>>
>>> (I used as.integer because I got an error message: sum not meaningful for factors)
>>>
>>> And got this:
>>>
>>> SCRNO V1
>>> [1,] HBA0020036 111
>>> [2,] HBA0020087 71
>>> [3,] HBA0020209 140
>>> [4,] HBA0020213 189
>>> [5,] HBA0020222 174
>>> [6,] HBA0020292 747
>>> [7,] HBA0020310 57
>>> [8,] HBA0020317 291
>>> [9,] HBA0020365 417
>>> [10,] HBA0020366 124
>>>
>>> All the sums are way too big. Is there something making it not add up correctly?
>>>
>>> Original dataset:
>>>
> RID SCRNO VISCODE RECNO CONTTIME
> 338 43 HBA0020036 bl 1 9
> 1187 95 HBA0020087 bl 1 3
> 3251 230 HBA0020209 bl 2 3
> 3258 230 HBA0020209 bl 1 28
> 3321 235 HBA0020213 bl 2 5
> 3351 235 HBA0020213 bl 1 6
> 3436 247 HBA0020222 bl 1 5
> 3456 247 HBA0020222 bl 2 4
> 4569 321 HBA0020292 bl 13 2
> 4572 321 HBA0020292 bl 5 13
> 4573 321 HBA0020292 bl 1 25
> 4576 321 HBA0020292 bl 7 5
> 4578 321 HBA0020292 bl 8 2
> 4581 321 HBA0020292 bl 4 4
> 4582 321 HBA0020292 bl 9 5
> 4586 321 HBA0020292 bl 12 2
> 4587 321 HBA0020292 bl 6 2
> 4590 321 HBA0020292 bl 10 3
> 4591 321 HBA0020292 bl 11 7
That is not the entire dataset. HBA0020366 is missing, for example.
I don't use the data.table package, but if you are getting an error indicating that CONTTIME is a factor, then something is wrong with either the data itself (there are non-numeric entries) or the way it was entered or imported into R.
Thus, I would first check your data for errors. Use str(YourDataSet) to review its structure and, if CONTTIME is a factor, look through the data to find out why.
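As a rough sketch of that check, assuming a hypothetical data frame 'bl_demo' in which one CONTTIME entry was mistyped as text (the values and the "n/a" entry are made up for illustration and are not taken from the real data):

```r
# Hypothetical data: the stray "n/a" means the column cannot be numeric,
# so with stringsAsFactors = TRUE it is stored as a factor.
bl_demo <- data.frame(
  SCRNO    = c("HBA0020036", "HBA0020087", "HBA0020209", "HBA0020209"),
  CONTTIME = c("9", "3", "n/a", "28"),
  stringsAsFactors = TRUE
)

str(bl_demo)                           # CONTTIME is listed as a Factor
is_fac <- is.factor(bl_demo$CONTTIME)

# Go through character and flag the entries that do not parse as numbers.
vals <- as.character(bl_demo$CONTTIME)
num  <- suppressWarnings(as.numeric(vals))
bad  <- bl_demo[is.na(num) & !is.na(vals), ]
print(bad)                             # rows whose CONTTIME is not numeric
```

Any rows printed by the last line are candidates for the entries that turned the column into a factor.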
Lastly, review this R FAQ:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f
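The point of that FAQ entry, sketched with made-up values: as.integer() on a factor returns the internal level codes, not the numbers that were typed in, so any sum of those codes is unrelated to the real total. The vector 'f' below is purely illustrative.

```r
# Illustrative values only. The levels are sorted as strings:
# "13" "2" "25" "5"
f <- factor(c("25", "13", "2", "5"))

codes   <- as.integer(f)                # internal level codes
values  <- as.numeric(as.character(f))  # the actual numbers
values2 <- as.numeric(levels(f))[f]     # equivalent form given in the R FAQ

print(codes)
print(values)
sum(codes)                              # sum of the codes, not of the data
sum(values)                             # sum of the data
```

With many distinct levels, as in a larger dataset, the codes can be far larger than the small values they stand for.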
Just as an alternative, with your data in 'DF':
> DF
RID SCRNO VISCODE RECNO CONTTIME
338 43 HBA0020036 bl 1 9
1187 95 HBA0020087 bl 1 3
3251 230 HBA0020209 bl 2 3
3258 230 HBA0020209 bl 1 28
3321 235 HBA0020213 bl 2 5
3351 235 HBA0020213 bl 1 6
3436 247 HBA0020222 bl 1 5
3456 247 HBA0020222 bl 2 4
4569 321 HBA0020292 bl 13 2
4572 321 HBA0020292 bl 5 13
4573 321 HBA0020292 bl 1 25
4576 321 HBA0020292 bl 7 5
4578 321 HBA0020292 bl 8 2
4581 321 HBA0020292 bl 4 4
4582 321 HBA0020292 bl 9 5
4586 321 HBA0020292 bl 12 2
4587 321 HBA0020292 bl 6 2
4590 321 HBA0020292 bl 10 3
4591 321 HBA0020292 bl 11 7
> aggregate(CONTTIME ~ DF$SCRNO, data = DF, sum)
DF$SCRNO CONTTIME
1 HBA0020036 9
2 HBA0020087 3
3 HBA0020209 31
4 HBA0020213 11
5 HBA0020222 9
6 HBA0020292 70
See ?aggregate
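For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds only the SCRNO and CONTTIME columns from the rows shown above. It assumes CONTTIME is numeric, and it uses the bare column name in the formula, which gives the same sums with a cleaner column header. tapply() is shown as one more base R alternative.

```r
# SCRNO and CONTTIME copied from the 19 rows listed above.
DF <- data.frame(
  SCRNO = c("HBA0020036", "HBA0020087",
            rep("HBA0020209", 2), rep("HBA0020213", 2),
            rep("HBA0020222", 2), rep("HBA0020292", 11)),
  CONTTIME = c(9, 3, 3, 28, 5, 6, 5, 4,
               2, 13, 25, 5, 2, 4, 5, 2, 2, 3, 7)
)

# Sum CONTTIME within each SCRNO; the result is a data frame.
res <- aggregate(CONTTIME ~ SCRNO, data = DF, FUN = sum)
print(res)

# Same sums as a named vector.
tot <- tapply(DF$CONTTIME, DF$SCRNO, sum)
print(tot)
```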
HTH,
Marc Schwartz