[R] R crash with 'library(Matrix); as(x, "dgCMatrix")' ...
Martin Maechler
maechler at stat.math.ethz.ch
Fri Jul 7 10:34:07 CEST 2006
>>>>> "JohnT" == Thaden, John J <ThadenJohnJ at uams.edu>
>>>>> on Thu, 6 Jul 2006 12:29:42 -0500 writes:
JohnT> Martin Maechler replied to my query "Warning while subsetting...":
MartinM> >>>>> "JohnT" == Thaden, John J <ThadenJohnJ at uams.edu>
MartinM> >>>>> on Thu, 6 Jul 2006 00:02:10 -0500 writes:
JohnT> ...
JohnT> > # While subsetting x, I was surprised to get this warning:
JohnT> > y<-x[1:300,]
JohnT> Warning message:
JohnT> number of items to replace is not a multiple of
JohnT> replacement length
MartinM> and later
JohnT> Sorry, I omitted background information:
JohnT> R version: 2.3.0
JohnT> OS: Windows XP
JohnT> CPU: Pentium III,
JohnT> RAM: 768 MB
MartinM> You omitted the most pertinent information: The
MartinM> version of 'Matrix' you are using.
MartinM> The latest released version of Matrix does
MartinM> *not* show the behavior you mentioned. {So I have
MartinM> now spent 20 minutes just because you did not
MartinM> update 'Matrix'..}
JohnT> The Matrix package was version 0.995-10, now is 0.995-11.
JohnT> The R base was version 2.3.0, now is 2.3.1.
JohnT> Subsetting 'y <- x[1:300,]' now works. Please accept my apology.
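As a side note on the version question above, here is a minimal sketch of collecting the details worth quoting in such a report. The vector name `info` and its layout are only an illustration; `packageDescription()` gives NA (with a warning, suppressed here) when the package is not installed.

```r
## Sketch: gather R and 'Matrix' version strings for a bug report
info <- c(R      = R.version.string,
          Matrix = suppressWarnings(
                     utils::packageDescription("Matrix", fields = "Version")))
print(info)
```

sessionInfo() gives a fuller picture, including the versions of all attached packages.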
JohnT> Also, what command-line memory settings might prevent
JohnT> R from crashing while using the Matrix package to
JohnT> convert my 600 X 4482 dgTMatrix to the dgCMatrix class
JohnT> or to an expanded Matrix, via the as() function? I can
JohnT> do this with half of the matrix, 300 x 4482.
MartinM> It's hard to believe that you get a "crash"
MartinM> when coercing to 'dgC' -- but of course this
MartinM> really depends how much memory you have already
MartinM> gobbled up by other large objects in your R
MartinM> workspace, or by other applications running at
MartinM> the same time in Windows. Coercing to a full
MartinM> matrix will of course require 8 * 601 * 4482 =
MartinM> 21549456 extra bytes just for the numbers.
MartinM> That's only 21.5 Megabytes, so I wonder..
MartinM>
MartinM> I have never seen R crashes from using 'Matrix',
(actually that's not even true; at some point in time we had a
bug in 'Matrix' which led to spurious segmentation faults)
MartinM> but then I work with an operating system, not
MartinM> with M$ Windows.
MartinM>
MartinM> Maybe you meant you got an error message
MartinM> "... memory allocation .."?
JohnT> Testing again, I closed all applications; disabled antivirus;
JohnT> opened RGui; removed all R objects but 'x' (a 600x4482 dgTMatrix);
JohnT> opened WinXP's 'Task Manager'; saw only "Rgui" under
JohnT> 'Applications'; saw processes using a total of 287 MB of memory
JohnT> under 'Processes'; closed 'Task Manager'; and typed R commands:
JohnT> > # Steps leading to an R crash...
JohnT> > ls()
JohnT> [1] "x"
JohnT> > str(x)
JohnT> Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
JohnT> ..@ i : int [1:923636] 1 2 3 4 5 6 7 8 9 10 ...
JohnT> ..@ j : int [1:923636] 1 1 1 1 1 1 1 1 1 1 ...
JohnT> ..@ Dim : int [1:2] 600 4482
JohnT> ..@ Dimnames:List of 2
JohnT> .. ..$ : chr [1:601] "50" "51" "52" "53" ...
JohnT> .. ..$ : chr [1:4482] "1" "2" "3" "4" ...
JohnT> ..@ x : num [1:923636] 50.2 51.2 52.2 53.2 54.2 ...
JohnT> ..@ factors : list()
JohnT> > gc()
JohnT>          used (Mb) gc trigger (Mb) max used (Mb)
JohnT> Ncells  183529  5.0     407500 10.9   350000  9.4
JohnT> Vcells 1928101 14.8    2286173 17.5  1928652 14.8
JohnT> > library(Matrix)
JohnT> Loading required package: lattice
JohnT> > gc()
JohnT>          used (Mb) gc trigger (Mb) max used (Mb)
JohnT> Ncells  627772 16.8    1073225 28.7  1073225 28.7
JohnT> Vcells 2165773 16.6    3345184 25.6  2332013 17.8
JohnT> > search()
JohnT> [1] ".GlobalEnv" "package:Matrix" "package:lattice"
JohnT> [4] "package:methods" "package:stats" "package:graphics"
JohnT> [7] "package:grDevices" "package:utils" "package:datasets"
JohnT> [10] "Autoloads" "package:base"
JohnT> > # Now the line that causes crashes...
JohnT> > y <- as(x,"dgCMatrix")
JohnT> After ~10 seconds, R blinks off and a WinXP dialog appears:
JohnT> R for Windows GUI front-end has encountered
JohnT> a problem and needs to close. We are sorry
JohnT> for the inconvenience....Error signature:
JohnT> AppName: rgui.exe AppVer: 2.31.38247.0
JohnT> ModName: matrix.dll Offset: 0000ff31....
JohnT> Report error?
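One detail in the str(x) output above: 'Dim' gives 600 rows, while the first 'Dimnames' component has 601 entries. Whether that has anything to do with the crash is not established here. The following is only a sketch of how such a mismatch could be checked; `dims_consistent` is a hypothetical helper written for illustration, not a function of 'Matrix'.

```r
## Hypothetical helper: TRUE when every non-NULL dimnames component
## has the same length as the corresponding dimension
dims_consistent <- function(Dim, Dimnames) {
    ok <- mapply(function(n, nm) is.null(nm) || length(nm) == n,
                 Dim, Dimnames)
    all(ok)
}

## values as printed by str(x): 600 x 4482, but 601 row names ("50", "51", ...)
dims_consistent(c(600L, 4482L),
                list(as.character(50:650), as.character(1:4482)))
```

For an actual S4 object, validObject(x) from the 'methods' package runs the validity checks that the class itself defines.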
Thanks a lot, John, for the more detailed report.
I do wonder how it happens, since the memory allocation is not
really big. E.g., I can easily solve ``your'' (well, a
simulated version of it) problem on a machine with only 512 MB
RAM:
library("Matrix")
## MM: construct a matrix *as* John's :
d <- as.integer(c(600,4482))
n0 <- 923636
set.seed(1)
M <- new("dgTMatrix", Dim = d,
         i = sort(sample(0:(d[1]-1), size = n0, replace = TRUE)),
         j = sample(0:(d[2]-1), size = n0, replace = TRUE),
         x = round(rnorm(n0, mean = 50, sd = 10), 1))
dimnames(M) <- list(paste("r", 1:d[1], sep=''),
                    paste("C", 1:d[2], sep=''))
str(M)
M1.10 <- M[1:10,] # gave warning in earlier versions of 'Matrix'
## on 'nanny' which has just 512 MB (with other processes active, etc):
gc()
##           used (Mb) gc trigger (Mb) max used (Mb)
## Ncells  642690 17.2    1073225 28.7  1073225 28.7
## Vcells 3136547 24.0    8305047 63.4  7988501 61.0
mC <- as(M, "dgCMatrix")
## ---------
gc()
##           used (Mb) gc trigger (Mb) max used (Mb)
## Ncells  642721 17.2    1073225 28.7  1073225 28.7
## Vcells 4311327 32.9    8305047 63.4  7988501 61.0
## well, this will need a bit more memory, but should still work:
mm <- as(M, "matrix")
## -------
gc()
##-           used (Mb) gc trigger (Mb) max used (Mb)
##- Ncells  642725 17.2    1073225 28.7  1073225 28.7
##- Vcells 7000528 53.5    8438708 64.4  7988501 61.0
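As a rough cross-check of the gc() figures above, the storage needed can be estimated by hand. This is only a sketch: it assumes 8-byte doubles and 4-byte integers, ignores object headers and dimnames, and uses the 600 rows given by 'Dim'. The variable names are illustrative.

```r
## Rough storage estimates for a 600 x 4482 matrix with 923636 stored entries
nr  <- 600L
nc  <- 4482L
nnz <- 923636L

## triplet form (dgTMatrix): integer i, integer j and double x per entry
bytes_T <- nnz * (4 + 4 + 8)
## column-compressed form (dgCMatrix): integer i and double x per entry,
## plus nc + 1 integer column pointers
bytes_C <- nnz * (4 + 8) + (nc + 1) * 4
## dense matrix of doubles
bytes_dense <- 8 * nr * nc

## in Mb (2^20 bytes), the unit gc() reports
round(c(dgT = bytes_T, dgC = bytes_C, dense = bytes_dense) / 2^20, 1)
```

The dgC figure is an upper bound for the simulated M, since duplicated (i,j) pairs are summed on coercion. The dense case needs 600 * 4482 = 2689200 doubles, in line with the growth in used Vcells between the last two gc() calls.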
I see in the CHANGES file for {R for Windows}
>> R 2.3.1 patched
>> ===============
>>
>> [.........................]
>>
>> R could crash when very low on memory. (PR#8981)
So maybe you could even try running "R 2.3.1 patched" for Windows,
which you can get from here,
http://cran.us.r-project.org/bin/windows/base/rpatched.html
and see if your crashes go away?
Regards,
Martin