[R] hdf5 package segfault when processing large data
Budi Mulyono
budi.mulyono at alvantage.com
Mon Aug 24 12:37:53 CEST 2009
Hi there,
I am currently working on something that uses the hdf5 library. I think
hdf5 is a great data format; I've used it fairly extensively in
Python via PyTables, and I was looking for something similar in R.
The closest I could find is the hdf5 package. While it does not work
the same way PyTables does, it is good enough to let the two
exchange data via an hdf5 file.
There is just one problem: I keep getting a segfault when trying to
process large files (>10MB), although that is by no means large by
hdf5 standards. I have included example code and data below. I have
tried different OSes (WinXP and Ubuntu 8.04), architectures (32- and
64-bit), and R versions (2.7.1, 2.7.2, and 2.9.1), and all of them
show the same problem. I was wondering if anyone has any clue as to
what's going on here, and perhaps can advise me on how to handle it.
Thank you, I appreciate any help I can get.
Cheers,
Budi
The example script
====================
library(hdf5)
fileName <- "sample.txt"
myTable <- read.table(fileName,header=TRUE,sep="\t",as.is=TRUE)
hdf5save("test.hdf", "myTable")
========
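One thing I have been experimenting with as a possible workaround (a sketch only, and untested against the crash; it assumes the fault occurs while serializing the whole data.frame in one call, and that hdf5save accepts several object names, as its help page suggests) is saving each column as its own top-level object so that no single write involves the full table:

```r
library(hdf5)
myTable <- read.table("sample.txt", header = TRUE, sep = "\t", as.is = TRUE)

# Promote each column to its own variable in the global environment,
# then save them individually instead of as one data.frame.
for (nm in names(myTable)) assign(nm, myTable[[nm]])
hdf5save("test2.hdf", "Date", "Time", "f1", "f2", "f3", "f4", "f5")
```

If the columns save fine on their own, that would at least narrow the problem down to the data.frame conversion path in the package.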
The data example; the data continues for more than 250,000 rows: sample.txt
========
Date Time f1 f2 f3 f4 f5
20070328 07:56 463 463.07 462.9 463.01 1100
20070328 07:57 463.01 463.01 463.01 463.01 200
....
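For anyone who wants to reproduce this without my original file, here is a base-R sketch (no hdf5 needed) that generates a sample.txt of the same shape; the seed, price walk, and volume values are made up, only the column names and row count follow the sample above:

```r
# Generate a tab-separated sample.txt with 250,000 rows matching the
# layout: Date Time f1 f2 f3 f4 f5
set.seed(42)
n <- 250000
minutes <- seq_len(n) %% 1440                       # wrap around a day
times <- sprintf("%02d:%02d", minutes %/% 60, minutes %% 60)
price <- round(463 + cumsum(rnorm(n, sd = 0.01)), 2)  # random walk near 463
df <- data.frame(Date = "20070328", Time = times,
                 f1 = price, f2 = price, f3 = price, f4 = price,
                 f5 = sample(c(100, 200, 1100), n, replace = TRUE))
write.table(df, "sample.txt", sep = "\t", row.names = FALSE, quote = FALSE)
```

Reading that back with the read.table call from my script should give a data.frame big enough to trigger the segfault in hdf5save.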
More information about the R-help mailing list