[R] Importing binary data
Uwe Ligges
ligges at statistik.uni-dortmund.de
Tue Jun 1 13:49:54 CEST 2004
Uli Tuerk wrote:
> Hi everybody!
>
> I have a large dataset, about 2 million entries, that I would like to
> import into a data frame. Each entry has the format:
> <integer><integer><float><string><float><string><string>
>
> Because of the huge amount of data, I've chosen a binary format instead
> of a text format when exporting from Matlab.
> My import function is attached below. It works fine for a few entries
> but is painfully slow when trying to read the complete set.
>
> Does anybody have some pointers for improving the import or for handling
> such large data sets?
Suggestions:
a) Use a database!!!
And only if there are very strong reasons against a):
b) Rewrite your import code in C.
c) Optimize the code below by allocating the objects at full length up front
(e.g. imp.v <- numeric(n)) instead of growing them one element at a time.
Maybe you can read n from the header or derive it from the size of the file.
Uwe Ligges
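
A minimal sketch of suggestion c), assuming the record layout used by
read.DET.data in the quoted code: two 1-byte unsigned integers, two 4-byte
floats and three NUL-terminated strings per record. The function name
read_det_prealloc and the 13-byte lower bound per record are assumptions made
for illustration. Because the strings vary in length, the file size only gives
an upper bound on the number of records, so the vectors are allocated at that
bound and trimmed afterwards.

```r
# Hypothetical helper: same record layout as read.DET.data, but the
# vectors are allocated once instead of being grown inside the loop.
read_det_prealloc <- function(f) {
  # Assumption: a record is at least 2 one-byte integers, 2 four-byte
  # floats and 3 string terminators, i.e. 2 + 8 + 3 = 13 bytes.
  n_max <- file.info(f)$size %/% 13

  spk   <- integer(n_max)
  imp   <- integer(n_max)
  score <- numeric(n_max)
  th    <- numeric(n_max)
  type  <- character(n_max)
  ses   <- character(n_max)
  rec   <- character(n_max)

  fid <- file(f, "rb")
  on.exit(close(fid))

  n <- 0
  repeat {
    tempi <- readBin(fid, integer(), size = 1, signed = FALSE)
    if (length(tempi) == 0) break        # end of file
    n <- n + 1
    spk[n]   <- tempi
    imp[n]   <- readBin(fid, integer(), size = 1, signed = FALSE)
    score[n] <- readBin(fid, numeric(), size = 4)
    type[n]  <- readBin(fid, character())
    th[n]    <- readBin(fid, numeric(), size = 4)
    ses[n]   <- readBin(fid, character())
    rec[n]   <- readBin(fid, character())
  }

  keep <- seq_len(n)                     # drop the unused tail
  data.frame(spk = factor(spk[keep]), imp = factor(imp[keep]),
             score = score[keep], th = th[keep],
             ses = ses[keep], rec = rec[keep], type = type[keep])
}

# Usage: write three made-up records in the assumed layout, then read them.
tmp <- tempfile()
con <- file(tmp, "wb")
for (i in 1:3) {
  writeBin(as.integer(i), con, size = 1)
  writeBin(as.integer(10 + i), con, size = 1)
  writeBin(i / 2, con, size = 4)
  writeBin(paste0("type", i), con)
  writeBin(0.25, con, size = 4)
  writeBin("sesA", con)
  writeBin("recB", con)
}
close(con)
det <- read_det_prealloc(tmp)
```

If the exporter can write the record count into a header, that count could
replace the file-size bound and the trimming step.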
> Thanks in advance!
>
> Uli
>
>
>
> read.DET.data <- function ( f ) {
>     counter <- 1
>     spk.v <- c()
>     imp.v <- c()
>     score.v <- c()
>     th.v <- c()
>     ses.v <- c()
>     rec.v <- c()
>     type.v <- c()
>     fid <- file( f ,"rb")
>     tempi <- readBin(fid , integer(), size=1, signed=FALSE)
>     while ( length(tempi) != 0) {
>         spk.v[ counter ] <- tempi
>         imp.v[ counter ] <- readBin(fid, integer(), size=1, signed=FALSE)
>         score.v[ counter ] <- readBin(fid, numeric(), size=4)
>         type.v[ counter ] <- readBin(fid, character())
>         th.v[ counter ] <- readBin(fid, numeric(), size=4)
>         ses.v[ counter ] <- readBin(fid, character())
>         rec.v[ counter ] <- readBin(fid, character())
>         counter <- counter + 1
>         tempi <- readBin(fid, integer(), size=1, signed=FALSE)
>     }
>     close( fid )
>     spkf <- factor ( spk.v )
>     impf <- factor ( imp.v )
>
>     det.f <- data.frame( spk=spkf, imp=impf, score=score.v, th=th.v, ses=ses.v, rec=rec.v, type=type.v)
>
>     det.f
> }
>