[R] bigmemory - extracting submatrix from big.matrix object

utkarshsinghal utkarsh.singhal at global-analytics.com
Tue Jun 2 17:25:48 CEST 2009


I am using the bigmemory package to handle large datasets (around 1 GB) 
and am facing the following problems. Any hints would be helpful.

_Problem-1:_

I am using the "read.big.matrix" function to create a filebacked big matrix 
of my data and get the following warning:

 > x = read.big.matrix("/home/utkarsh.s/data.csv", header=TRUE,
 +       type="double", shared=TRUE, backingfile="backup",
 +       backingpath="/home/utkarsh.s")

Warning message:
In filebacked.big.matrix(nrow = numRows, ncol = numCols, type = type,  :
  A descriptor file has not been specified.  A descriptor named 
backup.desc will be created.

However, "read.big.matrix" has no argument for specifying a descriptor 
file. The function "as.big.matrix" does have a "descriptorfile" argument, 
but if I try to pass it to "read.big.matrix", I get an "unused argument" 
error (as expected).


_Problem-2:_

I want to get a filebacked *sub*matrix of "x", say only selected 
columns: x[, 1:100]. Is there any way of doing that without actually 
loading the data into R memory?
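
A minimal sketch of the kind of call this is asking for, assuming a bigmemory version that provides "sub.big.matrix" (it may not exist in the version used above). The backing file names are hypothetical and a tiny matrix stands in for the real data. The code is skipped if the package is not installed.

```r
# Sketch only: assumes an installed bigmemory that exports sub.big.matrix().
# "demo.bin" / "demo.desc" are hypothetical names for illustration.
if (requireNamespace("bigmemory", quietly = TRUE)) {
  library(bigmemory)
  dir <- tempdir()

  # Small file-backed stand-in for the real 1 GB matrix.
  x <- filebacked.big.matrix(nrow = 5, ncol = 4, type = "double", init = 0,
                             backingfile = "demo.bin",
                             descriptorfile = "demo.desc",
                             backingpath = dir)
  x[, 1] <- as.numeric(1:5)

  # A view onto columns 1-2 of the same backing file; the data are not
  # copied into an ordinary R matrix.
  sub <- sub.big.matrix(x, firstCol = 1, lastCol = 2, backingpath = dir)
  print(dim(sub))
}
```

This only covers a contiguous block of columns. For an arbitrary column selection, copying the columns into a new backing file (for instance with "deepcopy", if the installed version offers it) would presumably be needed instead.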

_Problem-3:_

There are functions available such as summary, colmean, colsd, ... for 
standard summary statistics. But is there any way to calculate other 
summaries, say the number of missing values or the skewness of each 
variable, without loading the whole data into R memory?
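
One generic way to do this is to walk over the rows in blocks and accumulate per-column sums, so that only one block is in R memory at a time. The sketch below is written in base R and assumes only that the object supports nrow(), ncol() and x[rows, , drop = FALSE] indexing (which a big.matrix is expected to, but that is an assumption here). The function name "chunked_summaries" and the chunk size are made up for illustration. Skewness is computed as the population value m3 / m2^1.5 from raw power sums, which can lose precision for data with a large mean relative to its spread.

```r
# Column-wise missing-value counts and skewness, computed block by block.
# Hypothetical helper; not part of bigmemory.
chunked_summaries <- function(x, chunk_rows = 10000L) {
  p <- ncol(x)
  n_missing <- numeric(p)
  n  <- numeric(p)   # non-missing count per column
  s1 <- numeric(p)   # sum of x
  s2 <- numeric(p)   # sum of x^2
  s3 <- numeric(p)   # sum of x^3

  for (start in seq(1L, nrow(x), by = chunk_rows)) {
    end   <- min(start + chunk_rows - 1L, nrow(x))
    block <- x[start:end, , drop = FALSE]   # only this block is in memory
    ok    <- !is.na(block)

    n_missing <- n_missing + colSums(!ok)
    n         <- n + colSums(ok)
    block[!ok] <- 0                          # missing cells add nothing
    s1 <- s1 + colSums(block)
    s2 <- s2 + colSums(block^2)
    s3 <- s3 + colSums(block^3)
  }

  m  <- s1 / n
  m2 <- s2 / n - m^2                         # central second moment
  m3 <- s3 / n - 3 * m * s2 / n + 2 * m^3    # central third moment
  list(n_missing = n_missing, skewness = m3 / m2^1.5)
}

# Usage on an ordinary matrix standing in for the big.matrix:
set.seed(1)
m <- matrix(rnorm(300), nrow = 100, ncol = 3)
m[c(2, 50), 1] <- NA
res <- chunked_summaries(m, chunk_rows = 7L)
```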


Regards
Utkarsh



