[R] creating conditional means
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Dec 6 18:05:37 CET 2007
The error message says you have duplicate row names, which is
not allowed. Make sure each line of data has the same number of fields
as the header. If each data line has one more field than the header,
then the first field on each line will be regarded as the row name.
See ?count.fields
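
A minimal sketch of that check, using made-up data (the Lines string below is hypothetical, not the poster's actual file). count.fields reports the number of fields on every line, so a header that is one field shorter than the data lines stands out, and reading the same text with header = TRUE should then fail on the repeated first column:

```r
# Hypothetical input: the header names 4 columns, each data line has 5 fields
Lines <- "Year Month Hour co2
2005 1 0 386.16 14
2005 1 1 386.82 14
"

# Number of whitespace-separated fields on each line (header first)
con <- textConnection(Lines)
n <- count.fields(con)
close(con)
n

# With one extra field per data line, read.table takes the first column
# (here the repeated Year) as row names and stops with an error
con <- textConnection(Lines)
msg <- tryCatch(read.table(con, header = TRUE),
                error = function(e) conditionMessage(e))
close(con)
msg
```

If every entry of n is the same, the row-name problem probably lies elsewhere, for example in a row.names argument.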
The rest of your message is not clear.
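
If the question is how to average only co2 when some hours are absent from the data, one possible sketch follows. The sample values and the Season formula are assumptions made for illustration: (Month %% 12) %/% 3 groups Dec/Jan/Feb, Mar/Apr/May, Jun/Jul/Aug and Sep/Oct/Nov, which may or may not be the intervals wanted. Hours that never occur in the data simply produce no group, so nothing needs to be skipped by hand.

```r
# Hypothetical sample in the layout from the thread; hour 2 is absent
Lines <- "Year Month Hour co2
2005 1 0 386.16
2005 1 1 386.82
2005 1 3 387.13
2005 2 0 386.68
2005 2 1 386.66
2005 2 3 386.57
"
con <- textConnection(Lines)
DF <- read.table(con, header = TRUE)
close(con)

# DF["co2"] selects only the column to be averaged; the second argument
# supplies the grouping variables. Season is 0 for Dec/Jan/Feb,
# 1 for Mar/Apr/May, 2 for Jun/Jul/Aug and 3 for Sep/Oct/Nov.
res <- aggregate(DF["co2"],
                 with(DF, data.frame(Year, Season = (Month %% 12) %/% 3, Hour)),
                 mean)
res
```

Note that with this grouping a December value is averaged with January and February of the same calendar year, not the following one.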
On Dec 6, 2007 11:52 AM, Sherri Heck <sheck at ucar.edu> wrote:
> hi gabor,
>
> i was able to get your suggestion to work. i have been going through
> the R help tools to figure out what each step actually does because i
> have something similar but hours 2,5,8,11,14,17 and 20 are missing. i
> haven't had any luck. each "mean value" that is calculated is the
> same. i keep getting the following error:
>
> "> DF<- read.table(textConnection(Lines), header = TRUE)
> Error in read.table(textConnection(Lines), header = TRUE) :
> duplicate 'row.names' are not allowed
> > aggregate(DF[2:4],
> + with(DF, data.frame(Year, Qtr = (Month - 3) %/% 3 + 1, Hour)),
> + mean) #skip=hour[2,5,8,11,14]
> Error in data.frame(Year, Qtr = (Month - 3)%/%3 + 1, Hour) :
> object "Year" not found
> "
>
> i am not clear why in "aggregate(DF[#:#]" that we are subsetting other
> variables besides co2. i have been trying to just subset co2 without
> success though.
> your original suggestion is below and a snippet of my data set is below
> that. if you have any ideas or if you know of a help page that i may
> not have found yet that would be great (i've been using the "aggregate"
> help pages mostly).
>
> thanks for your help-
>
> s.heck
>
>
>
> Lines <- "Year Month Hour co2 num1 num2
> 2006 11 0 383.3709 28 28
> 2006 11 1 383.3709 28 28
> 2006 11 2 383.3709 28 28
> 2006 11 3 383.3709 28 28
> 2006 11 4 383.3709 28 28
> 2006 11 5 383.3709 28 28
> 2006 11 6 383.3709 28 28
> 2006 11 7 383.3709 28 28
> 2006 11 8 383.3709 28 28
> 2006 11 9 383.3709 27 27
> 2006 11 10 383.3709 28 28
> "
> DF <- read.table(textConnection(Lines), header = TRUE)
> aggregate(DF[4:6],
> with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
> mean) #skip=hour[2,5,8,11,14,17,20]???
>
>
>
>
>
>
> Year Month Hour co2
> 2005 1 0 386.1600708
> 2005 1 1 386.823056
> 2005 1 3 387.1335939
> 2005 1 4 387.0681103
> 2005 1 6 387.4750983
> 2005 1 7 388.3398313
> 2005 1 9 388.7545317
> 2005 1 10 388.0844451
> 2005 1 12 386.7929627
> 2005 1 13 385.5569521
> 2005 1 15 384.5523752
> 2005 1 16 385.0246721
> 2005 1 18 385.8646669
> 2005 1 19 386.2182493
> 2005 1 21 386.4820756
> 2005 1 22 386.6606276
> 2005 2 0 386.6791667
> 2005 2 1 386.6597544
> 2005 2 3 386.5725303
> 2005 2 4 387.0638611
> 2005 2 6 387.9293508
> 2005 2 7 388.3778991
> 2005 2 9 388.3721947
> 2005 2 10 387.8324642
> 2005 2 12 386.8404892
> 2005 2 13 385.6770345
> 2005 2 15 384.4798484
> 2005 2 16 384.6214677
> 2005 2 18 384.3044105
> 2005 2 19 383.3018709
> 2005 2 21 382.5837339
> 2005 2 22 382.2658036
>
>
> Gabor Grothendieck wrote:
> > Just adjust the formula for Qtr appropriately if your quarters
> > are not Jan/Feb/Mar, Apr/May/Jun, Jul/Aug/Sep, Oct/Nov/Dec
> > as I assumed.
> >
> > On Dec 1, 2007 5:21 PM, Sherri Heck <sheck at ucar.edu> wrote:
> >
> >> Hi Gabor,
> >>
> >> Thank you for your help. I think I need to clarify a bit more. I am
> >> trying to say
> >>
> >> average all 2pms for months march + april + may (for example). I hope this is clearer.
> >>
> >> here's a larger subset of my data set:
> >>
> >> year, month, hour, co2(ppm), num1,num2
> >>
> >> 2006 1 0 384.2055 14 14
> >> 2006 1 1 384.0304 14 14
> >> 2006 1 2 383.9672 14 14
> >> 2006 1 3 383.8452 14 14
> >> 2006 1 4 383.8594 14 14
> >> 2006 1 5 383.7318 14 14
> >> 2006 1 6 383.6439 14 14
> >> 2006 1 7 383.7019 14 14
> >> 2006 1 8 383.7487 14 14
> >> 2006 1 9 383.8376 14 14
> >> 2006 1 10 383.8684 14 14
> >> 2006 1 11 383.8301 14 14
> >> 2006 1 12 383.8058 14 14
> >> 2006 1 13 383.9419 14 14
> >> 2006 1 14 383.7876 14 14
> >> 2006 1 15 383.7744 14 14
> >> 2006 1 16 383.8566 14 14
> >> 2006 1 17 384.1014 14 14
> >> 2006 1 18 384.1312 14 14
> >> 2006 1 19 384.1551 14 14
> >> 2006 1 20 384.099 14 14
> >> 2006 1 21 384.1408 14 14
> >> 2006 1 22 384.3637 14 14
> >> 2006 1 23 384.1491 14 14
> >> 2006 2 0 384.7082 27 27
> >> 2006 2 1 384.6139 27 27
> >> 2006 2 2 384.7453 26 26
> >> 2006 2 3 384.9224 28 28
> >> 2006 2 4 384.8581 28 28
> >> 2006 2 5 384.9208 28 28
> >> 2006 2 6 384.9086 28 28
> >> 2006 2 7 384.837 28 28
> >> 2006 2 8 384.6163 27 27
> >> 2006 2 9 384.7406 28 28
> >> 2006 2 10 384.7468 28 28
> >> 2006 2 11 384.6992 28 28
> >> 2006 2 12 384.6388 28 28
> >> 2006 2 13 384.6346 28 28
> >> 2006 2 14 384.6037 28 28
> >> 2006 2 15 384.5295 28 28
> >> 2006 2 16 384.5654 28 28
> >> 2006 2 17 384.6466 28 28
> >> 2006 2 18 384.6344 28 28
> >> 2006 2 19 384.5911 28 28
> >> 2006 2 20 384.6084 28 28
> >> 2006 2 21 384.6318 28 28
> >> 2006 2 22 384.6181 27 27
> >> 2006 2 23 384.6087 27 27
> >>
> >>
> >> thank you again for your assistance-
> >>
> >> s.heck
> >>
> >>
> >>
> >> Gabor Grothendieck wrote:
> >>
> >>> Try aggregate:
> >>>
> >>>
> >>> Lines <- "Year Month Hour co2 num1 num2
> >>> 2006 11 0 383.3709 28 28
> >>> 2006 11 1 383.3709 28 28
> >>> 2006 11 2 383.3709 28 28
> >>> 2006 11 3 383.3709 28 28
> >>> 2006 11 4 383.3709 28 28
> >>> 2006 11 5 383.3709 28 28
> >>> 2006 11 6 383.3709 28 28
> >>> 2006 11 7 383.3709 28 28
> >>> 2006 11 8 383.3709 28 28
> >>> 2006 11 9 383.3709 27 27
> >>> 2006 11 10 383.3709 28 28
> >>> "
> >>> DF <- read.table(textConnection(Lines), header = TRUE)
> >>> aggregate(DF[4:6],
> >>> with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
> >>> mean)
> >>>
> >>> On Dec 1, 2007 3:57 PM, Sherri Heck <sheck at ucar.edu> wrote:
> >>>
> >>>
> >>>> Hi all-
> >>>>
> >>>> I have a dataset (year, month, hour, co2(ppm), num1,num2)
> >>>>
> >>>>
> >>>> [49,] 2006 11 0 383.3709 28 28
> >>>> [50,] 2006 11 1 383.3709 28 28
> >>>> [51,] 2006 11 2 383.3709 28 28
> >>>> [52,] 2006 11 3 383.3709 28 28
> >>>> [53,] 2006 11 4 383.3709 28 28
> >>>> [54,] 2006 11 5 383.3709 28 28
> >>>> [55,] 2006 11 6 383.3709 28 28
> >>>> [56,] 2006 11 7 383.3709 28 28
> >>>> [57,] 2006 11 8 383.3709 28 28
> >>>> [58,] 2006 11 9 383.3709 27 27
> >>>> [59,] 2006 11 10 383.3709 28 28
> >>>>
> >>>> that repeats in this style for each month. I would like to compute the
> >>>> mean for each hour in three month intervals.
> >>>> i.e. average all 2pms for each day for months march, april and may. and
> >>>> then do this for each hour interval.
> >>>> i have been messing around with 'for loops' but can't seem to get the
> >>>> output I want.
> >>>>
> >>>> thanks in advance for any help-
> >>>>
> >>>> s.heck
> >>>> CU, Boulder
> >>>>
> >>>> ______________________________________________
> >>>> R-help at r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>> and provide commented, minimal, self-contained, reproducible code.
> >>>>
> >>>>
> >>>>
>