[R] converting dataframe into multiple time series

John McKown john.archie.mckown at gmail.com
Sun Aug 24 19:17:40 CEST 2014


On Aug 23, 2014 7:54 PM, "Bill" <william108 at gmail.com> wrote:
>
> Hello. Can someone suggest how to do this:
>
> for (i in 2:length(colnames(allvar.df))) {
> var=colnames(allvar.df)[i]
> timeSeriesName = paste(var,".time.series")
> varRef=paste(var,".df$",var)
> varDate=paste(var,".df$date")
> timeSeriesName <- ts(varRef,
> start = c(year(min(varDate)),month(min(varDate))),
> end = c(year(max(varDate)),month(max(varDate))),
> frequency=12)
> }
>

Please don't post in HTML. I understand that most email clients default to
this. Thanks.

Instead of using R-like pseudo-code, you might want to just tell us, in
plain English, what your desired results are. Also, if you can, please
paste a small subset of your data from allvar.df, using the dput()
function. Perhaps from a command like:
dput(head(allvar.df));
That way we can easily cut and paste that into an R session for
experimentation.
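
As a minimal sketch of what that looks like, here is dput() applied to a
made-up data frame. The name example.df and its columns are hypothetical
stand-ins, since I have not seen your real data.

```r
# Hypothetical stand-in for allvar.df; the real columns are unknown.
example.df <- data.frame(
  date  = as.Date(c("2014-01-01", "2014-02-01", "2014-03-01")),
  sales = c(10, 12, 9),
  costs = c(7, 8, 6)
)

# dput() prints R code that recreates the object; this printed text
# is what you would paste into your email.
dput(head(example.df))

# Round trip through a temporary file: dput() writes the text,
# dget() reads it back into an equivalent object.
tmp <- tempfile(fileext = ".R")
dput(example.df, file = tmp)
rebuilt <- dget(tmp)
all.equal(rebuilt, example.df)
```

Anyone reading the post can then assign the pasted text to a variable and
work with the same data you have.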

=== guessing ===

My guess is that you have a data frame named allvar.df, which contains a
number of columns. The first column is called "date". The second and
subsequent columns are independent data. You want to separate each of
those columns into its own time series. You want each time series to exist
in the current environment (global?) as a separate variable whose name is
the name of the original column, suffixed with ".time.series". Some of
your code is confusing to me, however. For example, what are "varRef" and
"varDate"? I would have thought that they are columns in allvar.df, but
the code says otherwise: it implies that for each column in allvar.df
there already exists another data frame whose name is the column name
suffixed with ".df", and that this data frame contains the desired date
and data for the time series. This simply doesn't make sense to me.
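
To illustrate why those lines puzzle me, here is a small sketch of what
paste() actually returns in your loop. The column name "sales" is
hypothetical; I am only assuming that some column has a name like that.

```r
var <- "sales"  # hypothetical column name, for illustration only

# paste() joins its arguments with a space by default, so the result
# is a character string, not a reference to any data frame column.
varRef <- paste(var, ".df$", var)
varRef                # "sales .df$ sales"
is.character(varRef)  # TRUE

# The same is true of varDate: min() of a single string is that string,
# not the earliest date in some column.
varDate <- paste(var, ".df$date")
min(varDate)          # "sales .df$date"
```

So ts() in your loop is handed a piece of text rather than the data that
the text happens to name.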

I will assume that the data are indeed in columns in allvar.df. But that
raises the question of why varDate, start, and end are assigned inside the
for() loop. They should be invariant if they do come from allvar.df. In
any case, I will give you some code which echoes my, likely incorrect,
assumptions.

local({ # run all the below in a local environment
  # so as to not corrupt the surrounding environment.
 # get names of columns, dropping the first one ("date")
 all.col.names <- colnames(allvar.df)[-1];
 # make the date field a POSIXlt for ease of use later
 temp.date <- as.POSIXlt(allvar.df$date);
 # take year and month from the earliest and latest dates themselves;
 # min(year) paired with min(month) could mix two different dates
 first.date <- min(temp.date);
 last.date <- max(temp.date);
 ts.start <- c(first.date$year+1900, first.date$mon+1);
 ts.end <- c(last.date$year+1900, last.date$mon+1);
 for(col.name in all.col.names) {
   # [[ ]] extracts the column, by name, as a plain vector
   x <- ts(allvar.df[[col.name]],
           start=ts.start,
           end=ts.end,
           frequency=12);
 # create, or replace, a global variable using assign()
   assign(paste0(col.name,".time.series"),x, envir=.GlobalEnv);
 }
}); # end of local environment

Well, the above is my "best guess" at what you might want. I think that
what you really needed to know was about the assign() function to create a
new variable in an environment and the fact that you can reference the
column of a data.frame() simply by indexing by the column name as a
character string. Those are the "magic" ingredients above.
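
Here is a minimal, self-contained sketch of just those two ingredients.
The data frame demo.df and the column "sales" are made up for
illustration, and the list at the end is only an alternative I am
suggesting, not something you asked for.

```r
# Hypothetical data; stands in for allvar.df.
demo.df <- data.frame(
  date  = as.Date(c("2014-01-01", "2014-02-01", "2014-03-01")),
  sales = c(10, 12, 9)
)

col.name <- "sales"

# Ingredient 1: index a column by a name held in a character variable.
values <- demo.df[[col.name]]

# Ingredient 2: build the variable name as text, then create the
# variable with assign(). At top level this lands in the global
# environment.
new.name <- paste0(col.name, ".time.series")
assign(new.name, ts(values, start = c(2014, 1), frequency = 12))

# get() is the counterpart: look a variable up by its name as text.
get(new.name)

# Alternative: keep every series in one named list instead of creating
# many global variables.
series.list <- lapply(demo.df[-1], ts, start = c(2014, 1), frequency = 12)
series.list[["sales"]]
```

The list version is often easier to loop over later, since you do not
have to rebuild the variable names with paste0() each time.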
