[R] speeding up loop and dealing with memory problems
ONKELINX, Thierry
Thierry.ONKELINX at inbo.be
Mon Jul 28 15:43:40 CEST 2008
Dear Denise,
It looks like you want to replace every NA in the dataset with 0. The
line below does that without any loop, and because the replacement is
vectorised it should be fast:
dat[is.na(dat)] <- 0
> dat <- matrix(rbinom(40, 1, 0.75), ncol = 4, nrow = 10)
> dat[dat == 0] <- NA
> dat
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1   NA    1
 [3,]   NA    1   NA   NA
 [4,]    1    1   NA    1
 [5,]    1    1    1   NA
 [6,]    1    1    1   NA
 [7,]    1    1    1    1
 [8,]    1    1    1   NA
 [9,]   NA    1    1    1
[10,]    1    1    1    1
>
> dat[is.na(dat)] <- 0
> dat
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1    0    1
 [3,]    0    1    0    0
 [4,]    1    1    0    1
 [5,]    1    1    1    0
 [6,]    1    1    1    0
 [7,]    1    1    1    1
 [8,]    1    1    1    0
 [9,]    0    1    1    1
[10,]    1    1    1    1
>
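If dat is a data frame rather than a matrix, the same assignment still
works, but is.na(dat) first builds a logical matrix as large as the whole
dataset (about 72 million elements for 835353 x 86), which may itself run
into the memory limit. A minimal sketch of a column-by-column variant is
given here; the small data frame is only a stand-in for the real data, and
whether this is enough to stay under the limit on your machine is an
assumption, not something tested on the full dataset.

```r
# Hypothetical small data frame standing in for the real 835353 x 86 dataset
set.seed(1)
dat <- as.data.frame(matrix(rbinom(40, 1, 0.75), ncol = 4, nrow = 10))
dat[dat == 0] <- NA

# Replace NA one column at a time: is.na() is evaluated on a single
# column, so the temporary logical vector has nrow(dat) elements
# instead of nrow(dat) * ncol(dat)
for (j in seq_along(dat)) {
  col <- dat[[j]]
  col[is.na(col)] <- 0
  dat[[j]] <- col
}

# No NA should remain
anyNA(dat)
```

The inner step is still the vectorised dat[is.na(dat)] <- 0 idea, applied
per column, so the loop runs only once per column rather than once per
cell.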
HTH,
Thierry
----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Denise Xifara
Sent: Monday 28 July 2008 15:15
To: r-help at r-project.org
Subject: [R] speeding up loop and dealing with memory problems
Dear All and Mark,
Given a dataset that I have called dat, I was hoping to speed up the
following loop:
for(i in 1:835353){
  for(j in 1:86){
    if (is.na(dat[i,j])==TRUE){dat[i,j]<-0 }
  }
}
Actually I am also having a memory problem. I get the following:
Error: cannot allocate vector of size 3.2 Mb
In addition: Warning messages:
1: In dat[i, j] <- 0 :
Reached total allocation of 1535Mb: see help(memory.size)
2: In dat[i, j] <- 0 :
Reached total allocation of 1535Mb: see help(memory.size)
3: In dat[i, j] <- 0 :
Reached total allocation of 1535Mb: see help(memory.size)
4: In dat[i, j] <- 0 :
Reached total allocation of 1535Mb: see help(memory.size)
If I apply the loop to just one column rather than the whole dataset, so
that I don't hit the memory problem, i.e.
for(i in 1:835353){
  if (is.na(dat[i,4])==TRUE){dat[i,4]<-0 }
}
it takes ridiculously long to process, so I was hoping that there would
be a quicker way to do this.
Thank you all very much for the help,
Denise