[R] help using scan on large matrix (caveats to what has been discussed before)

Martin Tomko martin.tomko at geo.uzh.ch
Thu Aug 12 11:30:37 CEST 2010


Dear all,
I have a few points that I am unsure about using scan. I know that it is 
covered in the intro to R, and also has been discussed here: 
http://www.mail-archive.com/r-help@r-project.org/msg04869.html
but nevertheless, I cannot get it to work.

I have a potentially very large matrix that I need to read in (35MB). I 
am about to run it on a server with 16G of memory etc, so I hope it will 
work. I ultimately only need to run image() on it, producing a heatmap.

read.table crashes on it, and is slow, so I would like to read it using 
scan.

The file where I store it has the following format:
"V1" "V2" "V3" "V4" "V5"
"1" 508 424 208 111 66
"2" 59 101 95 113 81
"3" 26 30 24 17 18
"4" 4 0 8 3 9
"5" 0 0 0 0 0
"6" 0 0 0 0 0

where the first line holds the column names and the first column the row names. 
read.table works perfectly without any parameters on this (the file has 
been output using write.table). I use:
rows <- length(R)
cols <- max(unlist(lapply(R, function(x)
    length(unlist(gregexpr(" ", x, fixed=TRUE, useBytes=TRUE))))))

c<-scan(file=f,what=list(c("",(rep(integer(0),cols)))), skip=1)
m<-matrix(c, nrow = rows, ncol=cols,byrow=TRUE);

For some reason I end up with a character matrix, which I don't want. Is 
this the proper way to skip the first column? I could not find this 
documented anywhere: how does one skip the first column in scan? Is my 
way of specifying "integer(0)" correct?
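
For reference, here is a minimal sketch of the result being aimed for, assuming the file really has the layout shown above. The temporary file and the names header and fields are hypothetical and only there to make the example self-contained. It gives scan one list component per field (character for the row name, integer for each data column) rather than a single vector inside a list, then binds the integer columns into a matrix. It is one possible approach, not necessarily the best one for a very large file.

```r
# Hypothetical sample file mirroring the format shown above
f <- tempfile(fileext = ".txt")
writeLines(c(
  '"V1" "V2" "V3" "V4" "V5"',
  '"1" 508 424 208 111 66',
  '"2" 59 101 95 113 81',
  '"3" 26 30 24 17 18',
  '"4" 4 0 8 3 9',
  '"5" 0 0 0 0 0',
  '"6" 0 0 0 0 0'
), f)

# Header line: one name per data column (the row-name column has no header)
header <- scan(f, what = "", nlines = 1, quiet = TRUE)
cols <- length(header)

# One list component per field: character for the row name,
# integer for each of the remaining columns
fields <- scan(f, what = c(list(""), rep(list(integer(0)), cols)),
               skip = 1, quiet = TRUE)

# Drop the row-name field and bind the integer columns into a matrix
m <- do.call(cbind, fields[-1])
dimnames(m) <- list(fields[[1]], header)
```

Because every data column is read as integer, typeof(m) should be "integer" rather than "character", and the result can be passed to image(m) directly.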

And finally - would any sparse matrix package be more appropriate, and 
can I use a sparse matrix for the image() function producing typical 
heatmaps? I have seen that some sparse matrix packages produce different 
looking outputs, which would not be appropriate.

Thanks
Martin


