[R] summarizing daily time-series data by month
Gabor Grothendieck
ggrothendieck at myway.com
Wed Jan 26 20:15:51 CET 2005
Benjamin M. Osborne <Benjamin.Osborne <at> uvm.edu> writes:
:
: Message: 63
: Date: Wed, 26 Jan 2005 04:28:51 +0000 (UTC)
: From: Gabor Grothendieck <ggrothendieck <at> myway.com>
: Subject: Re: [R] chron: parsing dates into a data frame using a
: forloop
: To: r-help <at> stat.math.ethz.ch
: Message-ID: <loom.20050126T052153-333 <at> post.gmane.org>
: Content-Type: text/plain; charset=us-ascii
:
: Benjamin M. Osborne <Benjamin.Osborne <at> uvm.edu> writes:
:
: :
: : I have one data frame with a column of dates and I want to fill another data
: : frame with one column of dates, one of years, one of months, one of a unique
: : combination of year and month, and one of days, but R seems to have some
: : problems with this. My initial data frame looks like this (ignore the NAs in
: : the other fields):
: :
: : > mans[1:10,]
: : date loc snow.new prcp tmin snow.dep tmax
: : 1 11/01/54 2 NA NA NA NA NA
: : 2 11/02/54 2 NA NA NA NA NA
: : 3 11/03/54 2 NA NA NA NA NA
: : 4 11/04/54 2 NA NA NA NA NA
: : 5 11/05/54 2 NA NA NA NA NA
: : 6 11/06/54 2 NA NA NA NA NA
: : 7 11/07/54 2 NA NA NA NA NA
: : 8 11/08/54 2 NA NA NA NA NA
: : 9 11/09/54 2 NA NA NA NA NA
: : 10 11/10/54 2 NA NA NA NA NA
: : >
: :
: : The code and resultant data frame look like this:
: :
: : > for(i in 1:10){
: : + mans.met$date[i]<-mans$date[i]
: : + mans.met$year[i]<-years(mans.met$date[i])
: : + mans.met$month[i]<-months(mans.met$date[i])
: : + mans.met$yearmo[i]<-cut(mans.met$date[i], "months")
: : + mans.met$day[i]<-days(mans.met$date[i])
: : + }
: : > mans.met[1:10,]
: : date year month yearmo day snow.new snow.dep prcp tmin tmax tmean
: : 1 11/01/54 1 11 1 1 NA NA NA NA NA NA
: : 2 11/02/54 1 11 1 2 NA NA NA NA NA NA
: : 3 11/03/54 1 11 1 3 NA NA NA NA NA NA
: : 4 11/04/54 1 11 1 4 NA NA NA NA NA NA
: : 5 11/05/54 1 11 1 5 NA NA NA NA NA NA
: : 6 11/06/54 1 11 1 6 NA NA NA NA NA NA
: : 7 11/07/54 1 11 1 7 NA NA NA NA NA NA
: : 8 11/08/54 1 11 1 8 NA NA NA NA NA NA
: : 9 11/09/54 1 11 1 9 NA NA NA NA NA NA
: : 10 11/10/54 1 11 1 10 NA NA NA NA NA NA
: : >
: :
: : The problem seems to be with assigning within the for loop, or making the
: : assignment into a data frame, since:
: :
: : > years(mans.met$date[5])
: : [1] 1954
: : Levels: 1954
: : > test<-years(mans.met$date[5])
: : > test
: : [1] 1954
: : Levels: 1954
: : >
: : > months(mans.met$date[5])
: : [1] Nov
: : 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
: : > test<-months(mans.met$date[5])
: : > test
: : [1] Nov
: : 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
: : >
: : > cut(mans.met$date[3], "months")
: : [1] Nov 54
: : Levels: Nov 54
: : > test<-cut(mans.met$date[3], "months")
: : > test
: : [1] Nov 54
: : Levels: Nov 54
: : >
: : > days(mans.met$date[4])
: : [1] 4
: : 31 Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < ... < 31
: : > test<-days(mans.met$date[4])
: : > test
: : [1] 4
: : 31 Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < 11 < 12 < 13 < ... < 31
: : >
: :
: : Any suggestions will be appreciated.
: : -Ben Osborne
:
: I guess you set up mans.met as numeric columns and when you
: assign your factors to numeric variables you get
: the underlying codes. Note that if f is a factor then as.numeric(f)
: gives the codes underlying the factor whereas as.character(f) gives
: the labels.
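
A minimal sketch of the codes-versus-labels point, using a made-up factor rather than the poster's data. The last two lines assume, as guessed above, that the target column is numeric:

```r
# A factor stores integer codes plus a vector of labels (levels).
f <- factor(c("1954", "1954", "1955"))

codes  <- as.numeric(f)     # underlying codes: 1 1 2
labels <- as.character(f)   # labels: "1954" "1954" "1955"

# To get the labels back as numbers, go through character first.
years_num <- as.numeric(as.character(f))   # 1954 1954 1955

# Assigning a factor element into a numeric vector keeps only the code,
# which is the effect seen in the loop output (year shows up as 1).
x <- numeric(3)
x[1] <- f[3]   # x[1] is now 2, not 1955
```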
:
: It would be better not to use a loop at all. I don't know whether or not you
: want factors, but at any rate here is something you could
: try. It creates data frame df2 without a loop.
:
: df2 <- data.frame(date = mans$date, yearmo = as.character(cut(mans$date, "m")))
: df2 <- cbind(df2, month.day.year(mans$date))
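
For comparison, a sketch of the same no-loop idea using base R Date objects instead of chron. This is a swapped-in technique, not the chron code above. The sample strings are hypothetical, and the century is assumed to be 19xx, because %y would read "54" as 2054:

```r
# Hypothetical sample in the same mm/dd/yy layout as mans$date.
raw <- c("11/01/54", "11/02/54", "12/01/54")

# Assumption: all years are 19xx, so prepend the century before parsing.
d <- as.Date(sub("([0-9]{2})$", "19\\1", raw), format = "%m/%d/%Y")

# Every column is computed on the whole vector at once; no loop is needed.
df2 <- data.frame(
  date   = d,
  year   = as.integer(format(d, "%Y")),
  month  = as.integer(format(d, "%m")),
  yearmo = format(d, "%Y-%m"),
  day    = as.integer(format(d, "%d"))
)
```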
:
: Finally, do you really want this redundant representation? I would tend to
: go with just storing the dates and computing any of the other quantities
: on-the-fly as needed.
:
: ##########
: The reason for the redundancy is that I will want to summarize these 50 years of
: daily time series data by month, so that records that share each unique year
: and month in the mans.met$yearmo column will be summed or averaged, etc. into a
: new row in another data frame (mans.monthly, having
: nrow=length(unique(mans.met$yearmo))). The way I would do this is again using
: a for loop, but the loop won't recognize:
: for (i in 1:(length(unique(mans.met$yearmo[i])))){
This seems circular. You are defining i in terms of i.
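
If a loop is still wanted, one way to avoid the circularity is to compute the unique keys once, before the loop, and iterate over those. A minimal sketch, with a made-up stand-in for mans.met (the values and the prcp.mean column name are assumptions for illustration):

```r
# Hypothetical stand-in for mans.met: a year-month key and one measurement.
mans.met <- data.frame(
  yearmo = c("Nov 54", "Nov 54", "Dec 54", "Dec 54", "Dec 54"),
  prcp   = c(1, 3, 2, 4, 6),
  stringsAsFactors = FALSE
)

# The loop bound comes from the whole column, so it does not depend on i.
keys <- unique(mans.met$yearmo)
mans.monthly <- data.frame(yearmo = keys, prcp.mean = NA_real_,
                           stringsAsFactors = FALSE)

for (i in seq_along(keys)) {
  rows <- mans.met$yearmo == keys[i]          # records in the i-th month
  mans.monthly$prcp.mean[i] <- mean(mans.met$prcp[rows], na.rm = TRUE)
}
```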
:
: What I really need to know is why I can call any ith of
: unique(mans.met$yearmo[i])
: by itself, but not in a loop.
:
: Or, perhaps there is an even easier way to extract the year and month from the
: date column on the fly to compute these summaries?
Look at ?aggregate, ?by and ?tapply, e.g.:
aggregate(mans[,-1], list(cut(mans$date, "m")), mean)
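
A self-contained sketch of that approach, assuming base R Date objects (where cut() takes "month") rather than chron dates. The data values are made up, and na.rm = TRUE is added here on the assumption that missing values should be skipped:

```r
# Hypothetical daily data standing in for mans.
mans <- data.frame(
  date = as.Date(c("1954-11-01", "1954-11-02", "1954-12-01", "1954-12-02")),
  prcp = c(1, 3, 2, NA),
  tmax = c(10, 12, 0, 4)
)

# Group key computed on the fly from the date column; no stored yearmo needed.
month <- cut(mans$date, "month")

# One row per month; na.rm = TRUE is passed through to mean().
monthly <- aggregate(mans[, -1], list(month = month), mean, na.rm = TRUE)

# tapply gives the per-month means for a single column.
prcp.by.month <- tapply(mans$prcp, month, mean, na.rm = TRUE)
```

Swapping mean for sum, max, etc. gives the other monthly summaries.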