[R] need help with data management
analyst41 at hotmail.com
Mon Dec 27 18:00:09 CET 2010
On Dec 25, 1:36 pm, Gabor Grothendieck <ggrothendi... at gmail.com>
wrote:
> On Sat, Dec 25, 2010 at 8:08 AM, analys... at hotmail.com
> <analys... at hotmail.com> wrote:
> > I have a data frame that reads
>
> > client ID date transactions
>
> > 323232 11/1/2010 22
> > 323232 11/2/2010 0
> > 323232 11/3/2010 missing
> > 121212 11/10/2010 32
> > 121212 11/11/2010 15
> > .................................
>
> > I want to order the rows by client ID and date and using a black-box
> > forecasting method create the data fcst(client,date of forecast, date
> > for which forecast applies).
>
> > Assume that I have a function that given a time series
> > x(1),x(2),....x(k) will generate f(i,j) where f(i,j) = forecast j days
> > ahead, given data till date i.
>
> > How can the forecast data best be stored, and how would I go about the
> > task of processing all the clients and dates?
>
> This isn't quite what you asked but it seems more suitable to what you
> need. Instead of using long form data we transform it to wide form
> with one client per column. Try copying this from this post and
> pasting it into your R session:
>
> Lines <- "323232 11/1/2010 22
> 323232 11/2/2010 0
> 323232 11/3/2010 missing
> 121212 11/10/2010 32
> 121212 11/11/2010 15"
>
> library(zoo)
> library(chron)
>
> # read in. split = 1 converts to wide form
> # can use "myfile.dat" in place of textConnection(Lines) for real data
> z <- read.zoo(textConnection(Lines), split = 1, index = 2, FUN = chron,
> na.strings = "missing")
> # d is matrix with one row per date and one col per client
> d <- coredata(z)
>
> # just use last point as our forecast for next 3 dates
> naive.forecast <- function(x) rep(tail(x, 1), 3)
> pred <- apply(d, 2, naive.forecast)
>
> # put predictions together with the data
> rbind(d, pred)
>
> For the data you showed this gives:
>
> > rbind(d, pred)
>
>      121212 323232
> [1,]     NA     22
> [2,]     NA      0
> [3,]     NA     NA
> [4,]     32     NA
> [5,]     15     NA
> [6,]     15     NA
> [7,]     15     NA
> [8,]     15     NA
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
Thank you.
Everything works on my system (Windows), except that the final output
is:
     X121212 X323232
[1,]      NA      22
[2,]      NA       0
[3,]      NA      NA
[4,]      32      NA
[5,]      15      NA
[6,]      15      NA
[7,]      15      NA
[8,]      15      NA
That is, an "X" is prepended to each client ID in the column names.
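
An "X" prefix is what base R's make.names() puts in front of a name
that starts with a digit. That this is the cause here is an assumption,
and whether it gets applied may depend on the installed zoo version. A
minimal sketch of stripping the prefix afterwards, using a hypothetical
stand-in matrix m for the output above:

```r
# Stand-in for the matrix printed above (m is a hypothetical object name)
m <- matrix(c(NA, NA, NA, 32, 15, 22, 0, NA, NA, NA), ncol = 2,
            dimnames = list(NULL, c("X121212", "X323232")))

# make.names() is the base R function that turns "121212" into "X121212"
prefixed <- make.names("121212")

# Drop a leading "X" only where it is immediately followed by a digit,
# so that genuine names beginning with X are left alone
colnames(m) <- sub("^X(?=[0-9])", "", colnames(m), perl = TRUE)
colnames(m)
```

The same sub() call could be applied to colnames(d) before the rbind()
step in the quoted code.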
I'd also like to keep the date attached to each row. I'll try to follow
up along these lines.
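
One way to keep the dates, sketched here in base R only rather than
with zoo, is to leave the data in long form and store each forecast as
a row of (client, origin date, target date, value), which matches the
fcst(client, date of forecast, date for which forecast applies) layout
asked about originally. The names long, wide, fcst, naive_forecast and
h are made up for this sketch, and repeating the last non-missing value
is only a placeholder for the black-box method.

```r
Lines <- "323232 11/1/2010 22
323232 11/2/2010 0
323232 11/3/2010 missing
121212 11/10/2010 32
121212 11/11/2010 15"

# Long form: one row per client and date; "missing" is read as NA.
# Client IDs are kept as character so they are never used as numbers.
long <- read.table(text = Lines, na.strings = "missing",
                   col.names = c("client", "date", "transactions"),
                   colClasses = c("character", "character", "numeric"))
long$date <- as.Date(long$date, format = "%m/%d/%Y")
long <- long[order(long$client, long$date), ]

# Wide view with the dates kept as row names (one column per client)
wide <- with(long, tapply(transactions, list(format(date), client),
                          function(v) v[1]))

# Placeholder forecaster: repeat the last non-missing value h times
# (assumes each client has at least one non-missing observation)
naive_forecast <- function(x, h) rep(tail(x[!is.na(x)], 1), h)

h <- 3  # horizon in days, chosen arbitrarily for the sketch
fcst <- do.call(rbind, lapply(split(long, long$client), function(d) {
  origin <- max(d$date)
  data.frame(client = d$client[1],
             origin = origin,               # date the forecast is made
             target = origin + seq_len(h),  # date the forecast applies to
             forecast = naive_forecast(d$transactions, h))
}))
rownames(fcst) <- NULL
fcst
```

Forecasts made from other origin dates would simply be further rows in
fcst, so one table can hold every f(i, j) for every client.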