[R] help using scan on large matrix (caveats to what has been discussed before)
Martin Tomko
martin.tomko at geo.uzh.ch
Thu Aug 12 11:30:37 CEST 2010
Dear all,
I am unsure about a few points in using scan. I know that it is
covered in the intro to R and has also been discussed here:
http://www.mail-archive.com/r-help@r-project.org/msg04869.html
but nevertheless, I cannot get it to work.
I have a potentially very large matrix that I need to read in (35 MB). I
am about to run it on a server with 16 GB of memory, so I hope it will
work. Ultimately I only need to run image() on it to produce a heatmap.
read.table is slow and crashes on it, so I would like to read it using
scan.
The file where I store it has the following format:
"V1" "V2" "V3" "V4" "V5"
"1" 508 424 208 111 66
"2" 59 101 95 113 81
"3" 26 30 24 17 18
"4" 4 0 8 3 9
"5" 0 0 0 0 0
"6" 0 0 0 0 0
where the first line holds the column names and the first column the
row names. read.table works perfectly without any parameters on this
format (the file was written using write.table). I use:
rows <- length(R)
cols <- max(unlist(lapply(R, function(x)
    length(unlist(gregexpr(" ", x, fixed=TRUE, useBytes=TRUE))))))
c <- scan(file=f, what=list(c("", (rep(integer(0), cols)))), skip=1)
m <- matrix(c, nrow=rows, ncol=cols, byrow=TRUE)
For some reason I end up with a character matrix, which I don't want. Is
this the proper way to skip the first column? I could not find this
documented anywhere: how does one skip the first column in scan? Is my
way of specifying "integer(0)" correct?
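
For reference, here is a minimal sketch of one way the read might be done
with scan, using the sample shown above. It assumes that `what` should carry
one list component per field: c("", rep(integer(0), cols)) collapses to a
single character vector, which would explain the character result. The names
col_names, what and fields are illustrative only, and the temporary file
stands in for f. ?scan also lists NULL as an allowed component type, which
should skip a field entirely; that variant is not shown here.

```r
# Write the sample from the post to a temporary file (a stand-in for the real file f).
f <- tempfile(fileext = ".txt")
writeLines(c(
  '"V1" "V2" "V3" "V4" "V5"',
  '"1" 508 424 208 111 66',
  '"2" 59 101 95 113 81',
  '"3" 26 30 24 17 18',
  '"4" 4 0 8 3 9',
  '"5" 0 0 0 0 0',
  '"6" 0 0 0 0 0'
), f)

# The header line gives the column names; data rows carry one extra leading field.
col_names <- scan(f, what = "", nlines = 1, quiet = TRUE)
cols <- length(col_names)

# One list component per field: character for the row name, integer for each value.
# list() keeps the types separate, whereas c() would coerce everything to character.
what <- c(list(""), rep(list(integer(0)), cols))
fields <- scan(f, what = what, skip = 1, quiet = TRUE)

# Drop the row-name field and bind the remaining integer vectors column-wise.
m <- do.call(cbind, fields[-1])
dimnames(m) <- list(fields[[1]], col_names)
```

With this approach the number of rows does not need to be known in advance,
since each field is returned as its own vector.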
And finally - would any sparse matrix package be more appropriate, and
can I use a sparse matrix for the image() function producing typical
heatmaps? I have seen that some sparse matrix packages produce different
looking outputs, which would not be appropriate.
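
As a rough illustration of the sparse option, the sketch below assumes the
Matrix package (a recommended package shipped with most R installations) is
available. It stores the sample as a sparse matrix and converts it back to a
dense one before calling base image(), so that the plot keeps the usual base
graphics look. The names sm and dense are illustrative.

```r
# Sample values from the post, entered by hand for a self-contained example.
m <- matrix(c(508, 424, 208, 111, 66,
               59, 101,  95, 113, 81,
               26,  30,  24,  17, 18,
                4,   0,   8,   3,  9,
                0,   0,   0,   0,  0,
                0,   0,   0,   0,  0),
            nrow = 6, byrow = TRUE)

# Sparse storage only keeps the non-zero entries (assumes Matrix is installed).
sm <- Matrix::Matrix(m, sparse = TRUE)

# Convert back to an ordinary dense matrix for base graphics.
dense <- as.matrix(sm)

# Null PDF device so the sketch also runs without a display; nothing is written.
pdf(NULL)
# image() draws rows of z along the x axis, so transpose and flip to get
# the first matrix row at the top of the plot.
image(t(dense)[, nrow(dense):1], axes = FALSE)
invisible(dev.off())
```

Converting back to dense gives up the memory saving at plot time, so this
only helps if the sparse form is needed for other steps.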
Thanks
Martin