[R] Missing data?
Kevin Burton
rkevinburton at charter.net
Mon Nov 28 02:10:59 CET 2011
This has been very helpful. Thank you.
At the risk of further confirming my ignorance and taxing your patience I
would like to add another question. How would I modify this code so that
each week starts with the same day of the week regardless of the year? I
would add this stipulation so that for multiple years I always get the same
'week-number' like
> format(as.Date("2011-11-27"), "%W-%w")
[1] "47-0"
The convention (at least for US culture) seems to be that the week starts
with Sunday (it is index 0 for day of week). So it would be convenient if
the code were modified so that each 'week' began on Sunday. The partial week
at the beginning of the year would simply start on whatever weekday January 1
falls on. I would still want to aggregate the 'week-number's that are greater
than 51, as you have shown.
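To make the request concrete, here is a minimal sketch of one way a Sunday-based week index could be computed in base R. The helper name `sunday_week` is hypothetical and not from this thread. It assumes week 0 is the (possibly partial) week that begins on January 1 and that every later week begins on a Sunday. Note that `%W` in the example above counts weeks from Monday, whereas `%U` counts them from Sunday.

```r
# Hypothetical helper (an assumption, not code from the thread):
# 0-based week-of-year index in which week 0 starts on January 1 and
# every later week starts on a Sunday.
sunday_week <- function(d) {
  lt <- as.POSIXlt(d)
  # Weekday of January 1 in the same year (0 = Sunday, ..., 6 = Saturday)
  jan1_wday <- (lt$wday - lt$yday) %% 7
  # Shift the day-of-year so that week boundaries fall on Sundays
  (lt$yday + jan1_wday) %/% 7
}

d <- as.Date(c("2011-01-01", "2011-01-02", "2011-11-26", "2011-11-27"))
wk <- sunday_week(d)
wk            # 0 1 47 48

# Clamp anything past week 51 into week 51, as in the earlier suggestion
pmin(wk, 51)

# For comparison, the built-in Sunday-based week number
format(as.Date("2011-11-27"), "%U-%w")   # "48-0"
```

If this behaves as intended, `pmin(sunday_week(time(z)), 51)` could stand in for `pmin(yday %/% 7, 51)` in the `yr.wk` expression from the quoted message. The `%U` format differs slightly: it puts the days before the first Sunday in week 00 and starts at 01 when January 1 is itself a Sunday.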
Thanks again.
Kevin
-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
Sent: Sunday, November 27, 2011 4:24 PM
To: Kevin Burton
Cc: r-help at r-project.org
Subject: Re: [R] Missing data?
On Sun, Nov 27, 2011 at 4:08 PM, Kevin Burton <rkevinburton at charter.net>
wrote:
> I admit it isn't reality but I was hoping through judicious use of these
functions I could approximate reality. For example in the years where there
are more than 53 weeks in a year I would be happy if there were a way to
recognize this and drop the last week of data. If there were less than 53 I
would "pad" the year with an extra dummy week. This is just about the same
as your suggestion of putting more than 7 days in the first and last weeks.
But I still need this kind of date manipulation to even know how many days
to add in to make the approximation viable. This kind of best approximation
to reality seems better than to settle for the resolution of a month just
because it is consistent. Daily would be too much data and even then there
would be an approximation due to leap years.
>
OK. As you are willing to regard days past the 364th as part of the last
week of the year then we can do this.
Create a zoo object z as test data. Then convert its time scale to
year + week/52 where 0 is the first week of the year and we replace any week
that is greater than 51 with 51. Then we aggregate z by week taking the
last data point in the week and convert it to ts. Because of the way we
constructed it the frequency will be 52.
library(zoo)
# test data
z <- zoo(1:100, Sys.Date() + 1:100)
yr.wk <- with(as.POSIXlt(time(z)), year + 1900 + pmin(yday %/% 7, 51) / 52)
z.wk <- aggregate(z, yr.wk, tail, 1)
z.ts <- as.ts(z.wk)
frequency(z.ts) # 52
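As a small illustration of the index arithmetic described above, the following base R sketch shows how `yday %/% 7` numbers the weeks and how `pmin(..., 51)` folds the trailing days of the year into the last week. The sample dates and the names `raw.wk` and `wk` are chosen here for illustration only.

```r
# Illustrative dates around the start and end of 2011
d <- as.Date(c("2011-01-01", "2011-01-08", "2011-12-23",
               "2011-12-24", "2011-12-31"))
lt <- as.POSIXlt(d)

raw.wk <- lt$yday %/% 7      # 0-based week index: 0 1 50 51 52
wk     <- pmin(raw.wk, 51)   # days past the 364th join week 51: 0 1 50 51 51

# Fractional-year time scale, so that 52 steps span exactly one year
yr.wk <- lt$year + 1900 + wk / 52

data.frame(date = d, raw.wk, wk, yr.wk)
```

Because every year is mapped onto exactly 52 equally spaced points, the aggregated series has a regular time index, which is why the frequency of the resulting ts object comes out as 52.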
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com