[R] poor rbind performance
Tony Plate
tplate at acm.org
Wed Jul 18 19:32:55 CEST 2007
As Jim points out, building up a data frame by rbinding in a loop can be
a slow way to do things in R, because each rbind call copies all of the
rows accumulated so far.
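To make the cost concrete, here is a rough sketch that compares growing a data frame inside a loop with a single rbind over a list. The in-memory `pieces` list and the `grow` helper are made up for illustration (they stand in for files read from disk), and the timings will vary by machine and R version, so treat the numbers as indicative only.

```r
# Hypothetical stand-in for the files: 50 small data frames held in a list
pieces <- lapply(1:50, function(k) data.frame(id = k, x = rnorm(200)))

# Cumulative rbind: each iteration copies every row accumulated so far
grow <- function(dfs) {
  out <- dfs[[1]]
  for (df in dfs[-1]) out <- rbind(out, df)
  out
}

t.loop <- system.time(a <- grow(pieces))
t.once <- system.time(b <- do.call("rbind", pieces))

# Both approaches produce the same number of rows and columns
stopifnot(identical(dim(a), dim(b)))

# Elapsed seconds for each approach (no particular values are promised)
print(c(loop = t.loop[["elapsed"]], once = t.once[["elapsed"]]))
```

With larger pieces or more of them, the gap between the two timings would be expected to widen, since the loop re-copies an ever larger data frame.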
Here's an example of how you can easily read data frames into a list:
> # Create 3 files
> invisible(lapply(1:3, function(i)
      write.csv(file=paste("tmp",i,".csv",sep=""),
                data.frame(i=2*i+(1:2),c=letters[2*i+(1:2)]))))
> # Read the files into a list of data frames
> list.of.dfs <- lapply(paste("tmp",1:3,".csv",sep=""), read.csv,
                        row.names=1)
> # rbind the data frames
> myData <- do.call("rbind", list.of.dfs)
> myData
  i c
1 3 c
2 4 d
3 5 e
4 6 f
5 7 g
6 8 h
>
(and of course, these last two expressions can be composed into a single
expression if you want)
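For reference, the composed form might look like the sketch below. It recreates the three small example files first so that it runs on its own; writing them under `tempdir()` and the `files` variable are choices made here for illustration, not part of the example above.

```r
# Recreate the three small example files in a temporary directory
# (tempdir() is an assumption made to keep the sketch self-contained)
files <- file.path(tempdir(), paste("tmp", 1:3, ".csv", sep = ""))
invisible(lapply(1:3, function(i)
  write.csv(data.frame(i = 2*i + (1:2), c = letters[2*i + (1:2)]),
            file = files[i])))

# Read every file and rbind the results in a single expression
myData <- do.call("rbind", lapply(files, read.csv, row.names = 1))
print(myData)

# Remove the temporary files again
unlink(files)
```

The same pattern should carry over to the loop in the original question by swapping in `read.table` with `header=TRUE, sep=","` and the `/tmp/myData_k.txt` file names.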
-- Tony Plate
Aydemir, Zava (FID) wrote:
> Hi
>
> I rbind data frames in a loop in a cumulative way and the performance
> deteriorates very quickly.
>
> My code looks like this:
>
> for( k in 1:N)
> {
>     filename <- paste("/tmp/myData_",as.character(k),".txt",sep="")
>     myDataTmp <- read.table(filename,header=TRUE,sep=",")
>     if( k == 1) {
>         myData <- myDataTmp
>     }
>     else{
>         myData <- rbind(myData,myDataTmp)
>     }
> }
>
> Some more details:
> - the size of the stored text files is about 100,000 rows and 50 columns
> each
> - for k=1: rbind takes 0.0004 seconds
> - for k=2: rbind takes 13 seconds
> - for k=3: rbind takes 30 seconds
> - for k=4: rbind takes 36 seconds
> etc
>
> Any suggestions to improve speed?
>
> Thanks
>
> Zava