[R] Time intervals is converted into seconds after converting list of dfs into a single Df.

Bert Gunter bgunter.4567 using gmail.com
Tue Dec 24 22:03:56 CET 2019


1. "Similar" or "same" column names? The former is probably not going to
work: rbind() on data frames matches columns by name.

2. Manipulations with data frames can consume a lot of memory. rbinding
8000 data frames is likely to be very slow, with a lot of time spent swapping
memory around (I think). Perhaps try taking smaller bites (say, 1000 at a
time) and then combining the results. Or have you already tried this? If you
do wish to do this, wait to give experts a chance to tell you that my
suggestion is completely useless before you attempt it.

3. I'll let someone else resolve your dates problem, as I have never used
lubridate.
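
For what it is worth, points 1 and 2 can be sketched in base R as follows.
This is only a minimal sketch under assumptions: rbind_in_chunks() is a
hypothetical helper name, the toy data frames stand in for your list, and I
have not checked how it behaves with lubridate interval columns or whether
it is actually faster on 8000 data frames.

```r
# Point 1: rbind() on data frames matches columns by name, so the names must
# be the same in every data frame; merely similar names raise an error.
a <- data.frame(id = 1, value = 10)
b <- data.frame(id = 2, Value = 20)   # "Value" differs from "value"
names_ok <- !inherits(try(rbind(a, b), silent = TRUE), "try-error")
names_ok   # FALSE: the names do not match

# Point 2: bind the list in smaller bites, then bind the partial results.
# rbind_in_chunks() is a hypothetical helper, not part of any package.
rbind_in_chunks <- function(dfs, chunk_size = 1000) {
  # Group the list positions into consecutive chunks of at most chunk_size
  groups <- split(seq_along(dfs), ceiling(seq_along(dfs) / chunk_size))
  # rbind the data frames within each chunk
  partial <- lapply(groups, function(idx) do.call(rbind, dfs[idx]))
  # rbind the per-chunk results into one data frame
  out <- do.call(rbind, unname(partial))
  rownames(out) <- NULL
  out
}

# Toy stand-in for the real list: 25 data frames with 2 rows each
dfs <- lapply(1:25, function(i) data.frame(id = i, value = c(i, i + 0.5)))
combined <- rbind_in_chunks(dfs, chunk_size = 10)
dim(combined)   # 50 rows, 2 columns
```

The chunk size of 1000 is just the guess from point 2; it would need to be
tuned against the real data.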

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)


On Tue, Dec 24, 2019 at 12:38 PM Allaisone 1 <Allaisone1 using hotmail.com> wrote:

> Dear Patrick,
>
> Thanks for your reply. Below is a reproducible example. First, I
> generated two similar data frames, each with one column that contains an
> interval. Then I put the two data frames in a list. Converting this list
> into a single data frame gives different results depending on the code
> used. See below for more details.
>
>
> library(lubridate)  # provides interval() and intersect() for intervals
>
> # Data frame 1
> id <- c(1,1)
> dates1 <- c("2010/2/4","2011/2/4")
> dates2 <- c("2010/9/4","2011/1/1")
> df1 <- data.frame(id,dates1,dates2)
> df1[,2] <- as.Date(df1[,2])
> df1[,3] <- as.Date(df1[,3])
> # Overlap of the dates1 interval and the dates2 interval
> df1$interaction <-
>   intersect(interval(df1[1,2],df1[2,2]),interval(df1[1,3],df1[2,3]))
>
>
>
> # Data frame 2
> id <- c(2,2)
> dates1 <- c("2010/1/4","2011/2/4")
> dates2 <- c("2010/10/4","2011/1/16")
> df2 <- data.frame(id,dates1,dates2)
> # Convert df2's own date columns
> df2[,2] <- as.Date(df2[,2])
> df2[,3] <- as.Date(df2[,3])
> # Overlap of the dates1 interval and the dates2 interval
> df2$interaction <-
>   intersect(interval(df2[1,2],df2[2,2]),interval(df2[1,3],df2[2,3]))
>
>
>
> # Two data frames in a list:
> ListOfDFs <- list(df1,df2)
>
> # Convert the list of data frames into a single data frame:
> library(plyr)
> SingDF <- ldply(ListOfDFs,data.frame)
>
> The interval has been converted into numbers, which is not what I want.
> But trying this code:
>
> SingDF <- do.call(rbind,ListOfDFs)
>
> works perfectly, but only with this example, as we have only 2 data
> frames. However, in my actual data I have around 8000 data frames. Applying
> this code to them makes R freeze; I waited for many hours, but it was
> still frozen with no results generated.
>
> Could anyone please suggest any alternative syntax or modifications to
> the code above?
>
> Kind Regards
>
>
>
>
> ________________________________
> From: Patrick (Malone Quantitative) <malone using malonequantitative.com>
> Sent: 24 December 2019 17:01:59
> To: Allaisone 1 <Allaisone1 using hotmail.com>
> Cc: r-help using r-project.org <r-help using r-project.org>
> Subject: Re: [R] Time intervals is converted into seconds after converting
> list of dfs into a single Df.
>
> You didn't provide a reproducible example for testing (or post in
> plain text), but lubridate has an as.interval() function. You'll need
> to be able to extract the start time, though, for use in the function.
>
> On Tue, Dec 24, 2019 at 11:54 AM Allaisone 1 <Allaisone1 using hotmail.com>
> wrote:
> >
> >
> > Dear group,
> >
> > I have a list of data frames with similar column names. I want to rbind
> > all the data frames so that I have a single data frame. One of the columns
> > in each data frame is of the 'Interval' class, generated with the
> > 'lubridate' package.
> >
> > The problem is that when I convert the list of data frames into a single
> > data frame using any of the code below:
> >
> > library(plyr)
> > MySingleDf <- ldply(MyListOfDfs, data.frame)
> > or
> > MySingleDf <- ldply(MyListOfDfs, rbind)
> > or
> > MySingleDf <- rbind.fill(MyListOfDfs)
> >
> > the time intervals, which look like 2010-4-5 UTC--2011-7-9 UTC, are
> > converted into a single numeric value, which seems to be the difference
> > between the two dates in seconds.
> >
> > When I use:
> > MySingleDf <- do.call("rbind", MyListOfDfs)
> >
> > the code freezes; it looks as if the data are being processed, but there
> > is no result. I have used this code previously for the same purpose with
> > another dataset, and it worked perfectly.
> >
> > What I want is for the time intervals to be shown as they are, not
> > converted into seconds.
> >
> > Could you please suggest any alternative syntax or modifications to my
> > code?
> >
> > Thank you so much in advance.
> >
> > Regards
> >
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
