[R] Saving multiple rda-files as one rda-file

Dark info at software-solutions.nl
Mon Jul 22 13:18:05 CEST 2013


Hi all,

For a project we have to process some very large CSV files (up to 40 GB).
To reduce their size and speed up processing, I wanted to store them as
RData files.
Since the data was too big, I decided to split the CSV and save the parts as
separate .rda files.
So far so good. Now I want to bind them all together and save them as one
.rda file again, and this is surprisingly difficult.

First I load my rda files into my environment:
load(paste(rdaoutputdir, "file1.rda", sep=""))
load(paste(rdaoutputdir, "file2.rda", sep=""))
load(paste(rdaoutputdir, "file3.rda", sep=""))
etc
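
To make the loading step concrete, here is a small self-contained sketch. The temporary directory, the file names, and the two tiny data frames are made up for illustration and only stand in for my real files. It uses file.path() rather than paste(), which inserts the path separator itself, so it does not matter whether rdaoutputdir ends in a slash.

```r
# Hypothetical setup: a temporary directory with two small .rda files,
# standing in for rdaoutputdir and the real split files.
rdaoutputdir <- tempdir()
object1 <- data.frame(id = 1:2)
object2 <- data.frame(id = 3:4)
save(object1, file = file.path(rdaoutputdir, "file1.rda"))
save(object2, file = file.path(rdaoutputdir, "file2.rda"))
rm(object1, object2)

# load() restores each object under the name it was saved with
# and returns those names as a character vector.
files <- file.path(rdaoutputdir, c("file1.rda", "file2.rda"))
loaded <- character(0)
for (f in files) {
  loaded <- c(loaded, load(f))
}
print(loaded)
```

After the loop, object1 and object2 exist again in the calling environment, which is the state I start from below.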

Then I try to combine them into one object.

Using rbind like this gives memory allocation problems ('Error: cannot
allocate vector of size')
objectToSave <- rbind(object1, object2, object3)

Using pre-allocation gives me a factor level error. I used this code:
    nextrow <- nrow(object1) + 1
    object1[nextrow:(nextrow + nrow(object2) - 1), ] <- object2
    # we need to ensure unique row names
    row.names(object1) <- 1:nrow(object1)
    rm(object2)
    gc()

This produces 15 warning messages, for example:
1: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L,  ... :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, iseq, value = structure(c(1L,  ... :
  invalid factor level, NA generated
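
The same warning can be reproduced on toy data. This is only a sketch: the two small data frames and the column name grp are hypothetical stand-ins for my real objects. The warning appears when a factor column in the second part contains a level that the first part does not have. In this toy case, setting the levels to the union beforehand avoids the NA, but I do not know whether that is a sensible approach for the real data.

```r
# Toy stand-ins for object1 and object2 (hypothetical data, not the real files).
# The factor column "grp" has different level sets in the two parts.
object1 <- data.frame(id = 1:2, grp = factor(c("a", "b")))
object2 <- data.frame(id = 3:4, grp = factor(c("b", "c")))

# Same pre-allocation style assignment as in the code above.
combined <- object1
nextrow <- nrow(combined) + 1
combined[nextrow:(nextrow + nrow(object2) - 1), ] <- object2
# Warning: invalid factor level, NA generated
# "c" is not a level of combined$grp, so that value is stored as NA.
print(combined)

# Aligning the levels first avoids the NA in this toy case.
aligned <- object1
all_levels <- union(levels(object1$grp), levels(object2$grp))
aligned$grp <- factor(aligned$grp, levels = all_levels)
aligned[nextrow:(nextrow + nrow(object2) - 1), ] <- object2
print(aligned)
```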

What can I do?

Regards Derk





