[R] bigmemory - extracting submatrix from big.matrix object
utkarshsinghal
utkarsh.singhal at global-analytics.com
Tue Jun 2 17:25:48 CEST 2009
I am using the bigmemory package to handle large datasets (around 1 GB)
and am facing the following problems. Any hints would be helpful.
_Problem-1:_
I am using "read.big.matrix" function to create a filebacked big matrix
of my data and get the following warning:
> x = read.big.matrix("/home/utkarsh.s/data.csv", header=T, type="double",
+     shared=T, backingfile="backup", backingpath="/home/utkarsh.s")
Warning message:
In filebacked.big.matrix(nrow = numRows, ncol = numCols, type = type, :
A descriptor file has not been specified. A descriptor named
backup.desc will be created.
However, "read.big.matrix" has no argument for specifying a descriptor
file. "as.big.matrix" does have a "descriptorfile" argument, but if I
pass it to "read.big.matrix" I get an "unused argument" error (as
expected).
_Problem-2:_
I want to get a filebacked *sub*matrix of "x" containing only selected
columns, say x[, 1:100]. Is there any way of doing that without actually
loading the data into R memory?
_Problem-3:_
There are functions such as summary, colmean, colsd, ... for standard
summary statistics. But is there any way to calculate other summaries,
say the number of missing values or the skewness of each variable,
without loading the whole data into R memory?
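To make Problem-3 concrete, one possible workaround is a column-at-a-time
loop, so that only a single column is held in R memory at any moment. The
sketch below is only an illustration, not a bigmemory feature: col_summary,
count_na and skewness are hypothetical helper names, skewness here means the
moment-based (population) skewness, and the code assumes that x[, j] on a
big.matrix returns one column as an ordinary vector, as it does for the
plain matrix used in the demonstration.

```r
# Hypothetical helper (not part of bigmemory): apply a summary function
# to each column, extracting only one column at a time via x[, j].
col_summary <- function(x, f) {
  vapply(seq_len(ncol(x)), function(j) f(x[, j]), numeric(1))
}

# Number of missing values in a vector.
count_na <- function(v) sum(is.na(v))

# Moment-based skewness, ignoring missing values.
skewness <- function(v) {
  v <- v[!is.na(v)]
  m <- mean(v)
  mean((v - m)^3) / mean((v - m)^2)^1.5
}

# Demonstrated on an ordinary matrix; the assumption is that a big.matrix
# can be passed in the same way because it supports ncol() and x[, j].
m <- cbind(a = c(1, 2, NA, 4), b = c(1, 1, 1, 10), c = c(NA, NA, 3, 5))
na_counts <- col_summary(m, count_na)
skews <- col_summary(m, skewness)
```

This trades speed for memory: each column is read separately, so it may be
slow for very wide data, but the full matrix is never copied into R.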
Regards
Utkarsh