[R] creating conditional means
Sherri Heck
sheck at ucar.edu
Thu Dec 6 17:52:07 CET 2007
hi gabor,
i was able to get your suggestion to work. i have been going through
the R help tools to figure out what each step actually does because i
have something similar but hours 2,5,8,11,14,17 and 20 are missing. i
haven't had any luck. each "mean value" that is calculated is the
same. i keep getting the following error:
"> DF<- read.table(textConnection(Lines), header = TRUE)
Error in read.table(textConnection(Lines), header = TRUE) :
duplicate 'row.names' are not allowed
> aggregate(DF[2:4],
+ with(DF, data.frame(Year, Qtr = (Month - 3) %/% 3 + 1, Hour)),
+ mean) #skip=hour[2,5,8,11,14]
Error in data.frame(Year, Qtr = (Month - 3)%/%3 + 1, Hour) :
object "Year" not found
"
i am not clear why, in "aggregate(DF[#:#]", we are subsetting other
variables besides co2. i have been trying to subset just co2, without
success so far.
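
One possible reading of the two errors above, offered as a sketch rather than a confirmed diagnosis: read.table() raises "duplicate 'row.names' are not allowed" when the header line has one fewer field than the data rows, because it then uses the first column (here the repeated Year values) as row names. The second error would simply follow from the first, since DF was never created. On the column question, DF[4:6] selects columns 4 to 6 by position (co2, num1, num2), so DF["co2"] should be enough to average co2 alone. The sample below is a hypothetical six-row, four-column excerpt built from values in the data snippet further down; the name `res` is only for illustration.

```r
# Hypothetical excerpt: four columns only, hour 2 absent.
Lines <- "Year Month Hour co2
2005 1 0 386.1600708
2005 1 1 386.823056
2005 1 3 387.1335939
2005 2 0 386.6791667
2005 2 1 386.6597544
2005 2 3 386.5725303
"
# The header has as many names as each data row has fields, so
# read.table() does not take the first column as row names.
# text = Lines plays the same role as textConnection(Lines).
DF <- read.table(text = Lines, header = TRUE)

# DF["co2"] keeps only the co2 column (as a one-column data frame).
res <- aggregate(DF["co2"],
                 with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
                 mean)
print(res)
```

With this layout the result has one row per Year/Qtr/Hour combination that occurs in the data and a single co2 column of means.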
your original suggestion is below and a snippet of my data set is below
that. if you have any ideas or if you know of a help page that i may
not have found yet, that would be great (i've been using the "aggregate"
help pages mostly).
thanks for your help-
s.heck
Lines <- "Year Month Hour co2 num1 num2
2006 11 0 383.3709 28 28
2006 11 1 383.3709 28 28
2006 11 2 383.3709 28 28
2006 11 3 383.3709 28 28
2006 11 4 383.3709 28 28
2006 11 5 383.3709 28 28
2006 11 6 383.3709 28 28
2006 11 7 383.3709 28 28
2006 11 8 383.3709 28 28
2006 11 9 383.3709 27 27
2006 11 10 383.3709 28 28
"
DF <- read.table(textConnection(Lines), header = TRUE)
aggregate(DF[4:6],
  with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
  mean) #skip=hour[2,5,8,11,14,17,20]???
Year Month Hour co2
2005 1 0 386.1600708
2005 1 1 386.823056
2005 1 3 387.1335939
2005 1 4 387.0681103
2005 1 6 387.4750983
2005 1 7 388.3398313
2005 1 9 388.7545317
2005 1 10 388.0844451
2005 1 12 386.7929627
2005 1 13 385.5569521
2005 1 15 384.5523752
2005 1 16 385.0246721
2005 1 18 385.8646669
2005 1 19 386.2182493
2005 1 21 386.4820756
2005 1 22 386.6606276
2005 2 0 386.6791667
2005 2 1 386.6597544
2005 2 3 386.5725303
2005 2 4 387.0638611
2005 2 6 387.9293508
2005 2 7 388.3778991
2005 2 9 388.3721947
2005 2 10 387.8324642
2005 2 12 386.8404892
2005 2 13 385.6770345
2005 2 15 384.4798484
2005 2 16 384.6214677
2005 2 18 384.3044105
2005 2 19 383.3018709
2005 2 21 382.5837339
2005 2 22 382.2658036
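
A minimal sketch of the grouping for data like the snippet above, under two assumptions: that hours missing from the file need no special "skip" step, because aggregate() only forms groups for combinations that actually occur, and that the three-month intervals wanted are Dec-Feb, Mar-May, Jun-Aug and Sep-Nov. Note that (Month - 3) %/% 3 + 1 puts January and February in group 0 and December in group 4, so those months would not be averaged together. The data frame below is invented purely for illustration (it is not taken from the thread), and the names `Season` and `res` are hypothetical.

```r
# Invented illustration data: months 3 to 6, hours 0, 1 and 3 only
# (hour 2 is deliberately absent, as in the snippet above).
DF <- expand.grid(Hour = c(0, 1, 3), Month = 3:6, Year = 2005)
DF$co2 <- 380 + DF$Month + DF$Hour / 10   # made-up values

# Season index: Dec-Feb = 1, Mar-May = 2, Jun-Aug = 3, Sep-Nov = 4.
DF$Season <- (DF$Month %% 12) %/% 3 + 1

# Mean co2 per Year, Season and Hour; absent hours simply do not appear.
res <- aggregate(DF["co2"], DF[c("Year", "Season", "Hour")], mean)
print(res)
```

Year here is still the calendar year, so a December is grouped with the January and February of the same year; if a winter season should span the year boundary, the Year column would need adjusting first.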
Gabor Grothendieck wrote:
> Just adjust the formula for Qtr appropriately if your quarters
> are not Jan/Feb/Mar, Apr/May/Jun, Jul/Aug/Sep, Oct/Nov/Dec
> as I assumed.
>
> On Dec 1, 2007 5:21 PM, Sherri Heck <sheck at ucar.edu> wrote:
>
>> Hi Gabor,
>>
>> Thank you for your help. I think I need to clarify a bit more. I am
>> trying to say
>>
>> average all 2pms for months march + april + may (for example). I hope this is clearer.
>>
>> here's a larger subset of my data set:
>>
>> year, month, hour, co2(ppm), num1,num2
>>
>> 2006 1 0 384.2055 14 14
>> 2006 1 1 384.0304 14 14
>> 2006 1 2 383.9672 14 14
>> 2006 1 3 383.8452 14 14
>> 2006 1 4 383.8594 14 14
>> 2006 1 5 383.7318 14 14
>> 2006 1 6 383.6439 14 14
>> 2006 1 7 383.7019 14 14
>> 2006 1 8 383.7487 14 14
>> 2006 1 9 383.8376 14 14
>> 2006 1 10 383.8684 14 14
>> 2006 1 11 383.8301 14 14
>> 2006 1 12 383.8058 14 14
>> 2006 1 13 383.9419 14 14
>> 2006 1 14 383.7876 14 14
>> 2006 1 15 383.7744 14 14
>> 2006 1 16 383.8566 14 14
>> 2006 1 17 384.1014 14 14
>> 2006 1 18 384.1312 14 14
>> 2006 1 19 384.1551 14 14
>> 2006 1 20 384.099 14 14
>> 2006 1 21 384.1408 14 14
>> 2006 1 22 384.3637 14 14
>> 2006 1 23 384.1491 14 14
>> 2006 2 0 384.7082 27 27
>> 2006 2 1 384.6139 27 27
>> 2006 2 2 384.7453 26 26
>> 2006 2 3 384.9224 28 28
>> 2006 2 4 384.8581 28 28
>> 2006 2 5 384.9208 28 28
>> 2006 2 6 384.9086 28 28
>> 2006 2 7 384.837 28 28
>> 2006 2 8 384.6163 27 27
>> 2006 2 9 384.7406 28 28
>> 2006 2 10 384.7468 28 28
>> 2006 2 11 384.6992 28 28
>> 2006 2 12 384.6388 28 28
>> 2006 2 13 384.6346 28 28
>> 2006 2 14 384.6037 28 28
>> 2006 2 15 384.5295 28 28
>> 2006 2 16 384.5654 28 28
>> 2006 2 17 384.6466 28 28
>> 2006 2 18 384.6344 28 28
>> 2006 2 19 384.5911 28 28
>> 2006 2 20 384.6084 28 28
>> 2006 2 21 384.6318 28 28
>> 2006 2 22 384.6181 27 27
>> 2006 2 23 384.6087 27 27
>>
>>
>> thank you again for your assistance-
>>
>> s.heck
>>
>>
>>
>> Gabor Grothendieck wrote:
>>
>>> Try aggregate:
>>>
>>>
>>> Lines <- "Year Month Hour co2 num1 num2
>>> 2006 11 0 383.3709 28 28
>>> 2006 11 1 383.3709 28 28
>>> 2006 11 2 383.3709 28 28
>>> 2006 11 3 383.3709 28 28
>>> 2006 11 4 383.3709 28 28
>>> 2006 11 5 383.3709 28 28
>>> 2006 11 6 383.3709 28 28
>>> 2006 11 7 383.3709 28 28
>>> 2006 11 8 383.3709 28 28
>>> 2006 11 9 383.3709 27 27
>>> 2006 11 10 383.3709 28 28
>>> "
>>> DF <- read.table(textConnection(Lines), header = TRUE)
>>> aggregate(DF[4:6],
>>> with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
>>> mean)
>>>
>>> On Dec 1, 2007 3:57 PM, Sherri Heck <sheck at ucar.edu> wrote:
>>>
>>>
>>>> Hi all-
>>>>
>>>> I have a dataset (year, month, hour, co2(ppm), num1,num2)
>>>>
>>>>
>>>> [49,] 2006 11 0 383.3709 28 28
>>>> [50,] 2006 11 1 383.3709 28 28
>>>> [51,] 2006 11 2 383.3709 28 28
>>>> [52,] 2006 11 3 383.3709 28 28
>>>> [53,] 2006 11 4 383.3709 28 28
>>>> [54,] 2006 11 5 383.3709 28 28
>>>> [55,] 2006 11 6 383.3709 28 28
>>>> [56,] 2006 11 7 383.3709 28 28
>>>> [57,] 2006 11 8 383.3709 28 28
>>>> [58,] 2006 11 9 383.3709 27 27
>>>> [59,] 2006 11 10 383.3709 28 28
>>>>
>>>> that repeats in this style for each month. I would like to compute the
>>>> mean for each hour in three month intervals.
>>>> i.e. average all 2pms for each day for months march, april and may. and
>>>> then do this for each hour interval.
>>>> i have been messing around with 'for loops' but can't seem to get the
>>>> output I want.
>>>>
>>>> thanks in advance for any help-
>>>>
>>>> s.heck
>>>> CU, Boulder
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>>