[R] need help with data management

analyst41 at hotmail.com analyst41 at hotmail.com
Mon Dec 27 18:00:09 CET 2010



On Dec 25, 1:36 pm, Gabor Grothendieck <ggrothendi... at gmail.com>
wrote:
> On Sat, Dec 25, 2010 at 8:08 AM, analys... at hotmail.com
> <analys... at hotmail.com> wrote:
> > I have a data frame that reads
>
> > client ID date transactions
>
> > 323232   11/1/2010 22
> > 323232   11/2/2010 0
> > 323232   11/3/2010 missing
> > 121212   11/10/2010 32
> > 121212    11/11/2010 15
> > .................................
>
> > I want to order the rows by client ID and date and, using a black-box
> > forecasting method, create the data fcst(client, date of forecast, date
> > for which forecast applies).
>
> >  Assume that I have a function that given a time series
> > x(1),x(2),....x(k) will generate f(i,j) where f(i,j) = forecast j days
> > ahead, given data till date i.
>
> > How can the forecast data best be stored, and how would I go about the
> > task of processing all the clients and dates?
>
> This isn't quite what you asked but it seems more suitable to what you
> need.  Instead of using long form data we transform it to wide form
> with one client per column.  Try copying this from this post and
> pasting it into your R session:
>
> Lines <- "323232   11/1/2010 22
> 323232   11/2/2010 0
> 323232   11/3/2010 missing
> 121212   11/10/2010 32
> 121212    11/11/2010 15"
>
> library(zoo)
> library(chron)
>
> # read in. split = 1 converts to wide form
> # can use "myfile.dat" in place of textConnection(Lines) for real data
> z <- read.zoo(textConnection(Lines), split = 1, index = 2, FUN = chron,
>       na.strings = "missing")
> # d is matrix with one row per date and one col per client
> d <- coredata(z)
>
> # just use last point as our forecast for next 3 dates
> naive.forecast <- function(x) rep(tail(x, 1), 3)
> pred <- apply(d, 2, naive.forecast)
>
> # put predictions together with the data
> rbind(d, pred)
>
> For the data you showed this gives:
>
> > rbind(d, pred)
>
>      121212 323232
> [1,]     NA     22
> [2,]     NA      0
> [3,]     NA     NA
> [4,]     32     NA
> [5,]     15     NA
> [6,]     15     NA
> [7,]     15     NA
> [8,]     15     NA
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

Thank you.

Everything works on my system (Windows), except that the final output
is

     X121212 X323232
[1,]      NA      22
[2,]      NA       0
[3,]      NA      NA
[4,]      32      NA
[5,]      15      NA
[6,]      15      NA
[7,]      15      NA
[8,]      15      NA

i.e., an "X" gets attached to the client name.
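
The prefix presumably comes from R's name checking (make.names), which
rewrites names that are not syntactically valid, such as names that
start with a digit. A minimal sketch of stripping it after the fact,
assuming the result carries the column names shown above; the object
name `out` is a hypothetical stand-in for the rbind(d, pred) result:

```r
# Hypothetical stand-in for the rbind(d, pred) result shown above
out <- matrix(c(NA, NA, NA, 32, 15, 15, 15, 15,
                22,  0, NA, NA, NA, NA, NA, NA),
              ncol = 2,
              dimnames = list(NULL, c("X121212", "X323232")))

# make.names() is the base R function that turns "121212" into "X121212"
make.names("121212")

# Drop a leading "X" only when it is directly followed by a digit,
# so that genuine names starting with "X" are left alone
colnames(out) <- sub("^X(?=[0-9])", "", colnames(out), perl = TRUE)
colnames(out)
```

This only renames the columns; it does not change how the data are read.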

I'd also like to retain the dates in each row.  I'll try to follow up
along these lines.
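
One possible way to keep the dates, sketched in base R only. The
assumptions here are mine, not part of the suggested code: the dates
are typed in by hand rather than taken from the zoo index, the three
forecast dates are taken to be the three calendar days after the last
observed date, and the names obs_dates, fc_dates, out and fcst are
hypothetical. The last step also shows a long layout with one row per
(client, date of forecast, date for which forecast applies), as in the
original question.

```r
# Dates of the five observed rows in the example data
obs_dates <- as.Date(c("11/1/2010", "11/2/2010", "11/3/2010",
                       "11/10/2010", "11/11/2010"), format = "%m/%d/%Y")

# Hypothetical stand-in for d: one row per date, one column per client
d <- matrix(c(NA, NA, NA, 32, 15,
              22,  0, NA, NA, NA),
            ncol = 2, dimnames = list(NULL, c("121212", "323232")))

# Same naive forecast as in the quoted code: repeat the last value 3 times
naive.forecast <- function(x) rep(tail(x, 1), 3)
pred <- apply(d, 2, naive.forecast)

# Assumed forecast dates: the 3 calendar days after the last observed date
fc_dates <- max(obs_dates) + 1:3

# Wide form: use the dates as row names so each row stays labelled
out <- rbind(d, pred)
rownames(out) <- format(c(obs_dates, fc_dates), "%m/%d/%Y")
out

# Long form: one row per client and target date
fcst <- data.frame(
  client   = rep(colnames(pred), each = nrow(pred)),
  made_on  = max(obs_dates),                      # date of forecast
  for_date = rep(fc_dates, times = ncol(pred)),   # date the forecast applies to
  forecast = as.vector(pred)                      # column-major, matches client
)
fcst
```

The long form grows by appending rows as forecasts are made on later
dates, which the wide matrix cannot represent directly.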


