[R] creating conditional means

Sherri Heck sheck at ucar.edu
Thu Dec 6 17:52:07 CET 2007


hi gabor,

i was able to get your suggestion to work.  i have been going through 
the R help tools to figure out what each step actually does because i 
have something similar but hours 2,5,8,11,14,17 and 20 are missing.  i 
haven't had any luck.  each "mean value" that is calculated is the 
same.  i keep getting the following error:

"> DF<- read.table(textConnection(Lines), header = TRUE)
Error in read.table(textConnection(Lines), header = TRUE) :
        duplicate 'row.names' are not allowed
 >   aggregate(DF[2:4],
+    with(DF, data.frame(Year, Qtr = (Month - 3) %/% 3 + 1, Hour)),
+    mean)    #skip=hour[2,5,8,11,14]
Error in data.frame(Year, Qtr = (Month - 3)%/%3 + 1, Hour) :
        object "Year" not found
"

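One way the first error can arise (a guess, since the Lines string that was actually used is not shown): when the header line has exactly one fewer field than the data rows, read.table() takes the first column as row names, and a repeated year then counts as a duplicate row name. The second error suggests that a DF left over from an earlier attempt had no column called Year; names(DF) would show what is there. The sketch below reproduces the first error with two rows from the data snippet in this message. The names Bad and Good are made up for illustration, and the text= argument assumes a reasonably recent R.

```r
# Hypothetical input: the header names only 3 columns, but each row has 4 fields.
Bad <- "Month Hour co2
2005 1 0 386.1600708
2005 1 1 386.823056
"
# read.table() then uses the first column (2005, 2005) as row names and fails.
msg <- tryCatch(
  { read.table(text = Bad, header = TRUE); "no error" },
  error = function(e) conditionMessage(e)
)
print(msg)

# With one header name per column the same rows read in cleanly.
Good <- "Year Month Hour co2
2005 1 0 386.1600708
2005 1 1 386.823056
"
DF <- read.table(text = Good, header = TRUE)
print(names(DF))
```
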
i am not clear why in "aggregate(DF[#:#]" that we are subsetting other 
variables besides co2.  i have been trying to just subset co2 without 
success though.
your original suggestion is below and a snippet of my data set is below 
that. if you have any ideas, or if you know of a help page that i may 
not have found yet, that would be great (i've been using the "aggregate" 
help pages mostly).
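
On the DF[#:#] question: the first argument of aggregate() picks the columns to be averaged, so DF[4:6] means co2, num1 and num2 in the six-column data, while DF[2:4] would average Month, Hour and co2 instead. A minimal sketch that averages co2 alone, using four rows quoted further down in this thread and the calendar quarters from the original suggestion (the names groups and res are just for illustration):

```r
Lines <- "Year Month Hour co2 num1 num2
2006 1 0 384.2055 14 14
2006 1 1 384.0304 14 14
2006 2 0 384.7082 27 27
2006 2 1 384.6139 27 27
"
DF <- read.table(text = Lines, header = TRUE)

# Grouping columns: year, calendar quarter (Jan-Mar = 1, ...) and hour.
groups <- with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour))

# Selecting by name keeps a one-column data frame, so only co2 is averaged.
res <- aggregate(DF["co2"], by = groups, FUN = mean)
print(res)
```

Selecting the column by name rather than by position also keeps working when the file has four columns instead of six.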

thanks for your help-

s.heck



Lines <- "Year Month Hour co2 num1 num2
 2006   11    0 383.3709   28   28
 2006   11    1 383.3709   28   28
 2006   11    2 383.3709   28   28
 2006   11    3 383.3709   28   28
 2006   11    4 383.3709   28   28
 2006   11    5 383.3709   28   28
 2006   11    6 383.3709   28   28
 2006   11    7 383.3709   28   28
 2006   11    8 383.3709   28   28
 2006   11    9 383.3709   27   27
 2006   11   10 383.3709   28   28
"
DF <- read.table(textConnection(Lines), header = TRUE)
aggregate(DF[4:6],
   with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
   mean)			#skip=hour[2,5,8,11,14,17,20]???
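
The Qtr expression is integer arithmetic on the month number: %/% is integer division, so (Month - 1) %/% 3 + 1 gives 1 for Jan-Mar up to 4 for Oct-Dec. For seasons that start in March, shifting by 3 alone splits the winter (Dec becomes 4, Jan and Feb become 0); wrapping with %% 12 first is one possible fix. A small sketch, where the function names are made up for illustration:

```r
month <- 1:12

# Calendar quarters: Jan-Mar = 1, Apr-Jun = 2, Jul-Sep = 3, Oct-Dec = 4.
calendar_qtr <- function(m) (m - 1) %/% 3 + 1

# Shifted by 3 without wrapping: Mar-May = 1, but Jan/Feb = 0 and Dec = 4.
shifted_qtr <- function(m) (m - 3) %/% 3 + 1

# Wrapped with %% 12: Mar-May = 1, Jun-Aug = 2, Sep-Nov = 3, Dec-Feb = 4.
season <- function(m) ((m - 3) %% 12) %/% 3 + 1

print(rbind(month,
            calendar = calendar_qtr(month),
            shifted  = shifted_qtr(month),
            season   = season(month)))
```

Even with the wrapped version, grouping by Year as well puts December together with January and February of the same calendar year, which may or may not be what is wanted.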


 



Year Month Hour co2
2005    1    0    386.1600708
2005    1    1    386.823056
2005    1    3    387.1335939
2005    1    4    387.0681103
2005    1    6    387.4750983
2005    1    7    388.3398313
2005    1    9    388.7545317
2005    1    10    388.0844451
2005    1    12    386.7929627
2005    1    13    385.5569521
2005    1    15    384.5523752
2005    1    16    385.0246721
2005    1    18    385.8646669
2005    1    19    386.2182493
2005    1    21    386.4820756
2005    1    22    386.6606276
2005    2    0    386.6791667
2005    2    1    386.6597544
2005    2    3    386.5725303
2005    2    4    387.0638611
2005    2    6    387.9293508
2005    2    7    388.3778991
2005    2    9    388.3721947
2005    2    10    387.8324642
2005    2    12    386.8404892
2005    2    13    385.6770345
2005    2    15    384.4798484
2005    2    16    384.6214677
2005    2    18    384.3044105
2005    2    19    383.3018709
2005    2    21    382.5837339
2005    2    22    382.2658036
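
With data like the snippet above, where hours 2, 5, 8, ... never occur, no skip step should be needed: aggregate() only forms groups from the Year/season/Hour combinations that are actually present. A sketch on the first three rows of each month above, assuming seasons that start in March (so Jan and Feb both land in season 4); the Season column name is an assumption:

```r
Lines <- "Year Month Hour co2
2005 1 0 386.1600708
2005 1 1 386.823056
2005 1 3 387.1335939
2005 2 0 386.6791667
2005 2 1 386.6597544
2005 2 3 386.5725303
"
DF <- read.table(text = Lines, header = TRUE)

# Season index: Mar-May = 1, Jun-Aug = 2, Sep-Nov = 3, Dec-Feb = 4.
groups <- with(DF, data.frame(Year,
                              Season = ((Month - 3) %% 12) %/% 3 + 1,
                              Hour))

# Only hours 0, 1 and 3 appear in the result; hour 2 is simply absent.
res <- aggregate(DF["co2"], by = groups, FUN = mean)
print(res)
```

Each co2 value in the result is the mean over the months in that season for one hour, which is different from the first example data set, where every co2 value was identical and so every mean came out the same.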

Gabor Grothendieck wrote:
> Just adjust the formula for Qtr appropriately if your quarters
> are not Jan/Feb/Mar, Apr/May/Jun, Jul/Aug/Sep, Oct/Nov/Dec
> as I assumed.
>
> On Dec 1, 2007 5:21 PM, Sherri Heck <sheck at ucar.edu> wrote:
>   
>> Hi Gabor,
>>
>> Thank you for your help.  I think I need to clarify a bit more.  I am
>> trying to say
>>
>> average all 2pms for months march + april + may (for example). I hope this is clearer.
>>
>> here's a larger subset of my data set:
>>
>> year, month, hour, co2(ppm), num1,num2
>>
>> 2006 1 0 384.2055 14 14
>> 2006 1 1 384.0304 14 14
>> 2006 1 2 383.9672 14 14
>> 2006 1 3 383.8452 14 14
>> 2006 1 4 383.8594 14 14
>> 2006 1 5 383.7318 14 14
>> 2006 1 6 383.6439 14 14
>> 2006 1 7 383.7019 14 14
>> 2006 1 8 383.7487 14 14
>> 2006 1 9 383.8376 14 14
>> 2006 1 10 383.8684 14 14
>> 2006 1 11 383.8301 14 14
>> 2006 1 12 383.8058 14 14
>> 2006 1 13 383.9419 14 14
>> 2006 1 14 383.7876 14 14
>> 2006 1 15 383.7744 14 14
>> 2006 1 16 383.8566 14 14
>> 2006 1 17 384.1014 14 14
>> 2006 1 18 384.1312 14 14
>> 2006 1 19 384.1551 14 14
>> 2006 1 20 384.099 14 14
>> 2006 1 21 384.1408 14 14
>> 2006 1 22 384.3637 14 14
>> 2006 1 23 384.1491 14 14
>> 2006 2 0 384.7082 27 27
>> 2006 2 1 384.6139 27 27
>> 2006 2 2 384.7453 26 26
>> 2006 2 3 384.9224 28 28
>> 2006 2 4 384.8581 28 28
>> 2006 2 5 384.9208 28 28
>> 2006 2 6 384.9086 28 28
>> 2006 2 7 384.837 28 28
>> 2006 2 8 384.6163 27 27
>> 2006 2 9 384.7406 28 28
>> 2006 2 10 384.7468 28 28
>> 2006 2 11 384.6992 28 28
>> 2006 2 12 384.6388 28 28
>> 2006 2 13 384.6346 28 28
>> 2006 2 14 384.6037 28 28
>> 2006 2 15 384.5295 28 28
>> 2006 2 16 384.5654 28 28
>> 2006 2 17 384.6466 28 28
>> 2006 2 18 384.6344 28 28
>> 2006 2 19 384.5911 28 28
>> 2006 2 20 384.6084 28 28
>> 2006 2 21 384.6318 28 28
>> 2006 2 22 384.6181 27 27
>> 2006 2 23 384.6087 27 27
>>
>>
>> thank you again for your assistance-
>>
>> s.heck
>>
>>
>>
>> Gabor Grothendieck wrote:
>>     
>>> Try aggregate:
>>>
>>>
>>> Lines <- "Year Month Hour co2 num1 num2
>>>  2006   11    0 383.3709   28   28
>>>  2006   11    1 383.3709   28   28
>>>  2006   11    2 383.3709   28   28
>>>  2006   11    3 383.3709   28   28
>>>  2006   11    4 383.3709   28   28
>>>  2006   11    5 383.3709   28   28
>>>  2006   11    6 383.3709   28   28
>>>  2006   11    7 383.3709   28   28
>>>  2006   11    8 383.3709   28   28
>>>  2006   11    9 383.3709   27   27
>>>  2006   11   10 383.3709   28   28
>>> "
>>> DF <- read.table(textConnection(Lines), header = TRUE)
>>> aggregate(DF[4:6],
>>>    with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
>>>    mean)
>>>
>>> On Dec 1, 2007 3:57 PM, Sherri Heck <sheck at ucar.edu> wrote:
>>>
>>>       
>>>> Hi all-
>>>>
>>>> I have a dataset (year, month, hour, co2(ppm), num1,num2)
>>>>
>>>>
>>>> [49,] 2006   11    0 383.3709   28   28
>>>> [50,] 2006   11    1 383.3709   28   28
>>>> [51,] 2006   11    2 383.3709   28   28
>>>> [52,] 2006   11    3 383.3709   28   28
>>>> [53,] 2006   11    4 383.3709   28   28
>>>> [54,] 2006   11    5 383.3709   28   28
>>>> [55,] 2006   11    6 383.3709   28   28
>>>> [56,] 2006   11    7 383.3709   28   28
>>>> [57,] 2006   11    8 383.3709   28   28
>>>> [58,] 2006   11    9 383.3709   27   27
>>>> [59,] 2006   11   10 383.3709   28   28
>>>>
>>>> that repeats in this style for each month.  I would like to compute the
>>>> mean for each hour in three month intervals.
>>>> i.e.  average all 2pms for each day for months march, april and may. and
>>>> then do this for each hour interval.
>>>> i have been messing around with 'for loops' but can't seem to get the
>>>> output I want.
>>>>
>>>> thanks in advance for any help-
>>>>
>>>> s.heck
>>>> CU, Boulder
>>>>


