[R] Efficiency question: replacing all NAs with a zero

Gabor Grothendieck ggrothendieck at gmail.com
Tue Mar 30 02:27:29 CEST 2010


See if this works for you:

DF[is.na(DF)] <- 0
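
As a rough sketch of how that line behaves, and of a column-at-a-time alternative that may be gentler on memory for a wide frame: the example data and the names DF1, DF2, nm and col below are made up for illustration, and I have not measured memory use on a frame of the size you describe.

```r
# Small illustrative data frame (hypothetical data, not the poster's)
set.seed(1234)
DF <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
for (nm in names(DF)) {
  DF[[nm]][sample(100, 60)] <- NA   # 60 NAs per column
}

# Approach 1: the one-liner. is.na(DF) first builds a logical matrix
# with the same dimensions as the whole data frame.
DF1 <- DF
DF1[is.na(DF1)] <- 0

# Approach 2: one column at a time, so only a single column's
# logical mask exists at any moment. The result stays a data frame.
DF2 <- DF
for (nm in names(DF2)) {
  col <- DF2[[nm]]
  col[is.na(col)] <- 0
  DF2[[nm]] <- col
}
```

Both approaches should give the same result; which one fits in memory on a large frame is something to check with system.time() and gc() on the real data.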

On Mon, Mar 29, 2010 at 8:21 PM, Dimitri Liakhovitski <ld7631 at gmail.com> wrote:
> Dear R'ers,
>
> I have a very large data frame (over 4000 rows and 2,500 columns). My
> task is very simple - I have to replace all NAs with a zero. My code
> works fine on smaller data frames - but I have to deal with a huge one
> and there are many NAs in each column.
> R runs out of memory on me ("Reached total allocation of 1535Mb: see
> help(memory.size)"). Is there any other, more efficient way of doing
> it?
> Thanks a lot for any hints!
> Dimitri
>
>
> # Building an example frame:
> set.seed(1234)  # set the seed first so the random values are reproducible too
> frame<-data.frame(a=rnorm(100),b=rnorm(100),c=rnorm(100),d=rnorm(100),
>                   e=rnorm(100),f=rnorm(100),g=rnorm(100))
> for(i in names(frame)){
>        i.for.NA<-sample(1:100,60)
>        frame[[i]][i.for.NA]<-NA
> }
>
> # Replacing all NAs in "frame" with zeros - is of course fast in this
> # example, because this data frame is very small
> system.time({
> # frame[] <- keeps "frame" a data frame; plain frame <- lapply(...) returns a list
> frame[]<-lapply(frame,function(x){
>        x[is.na(x)]<-0
>        return(x)
> })})
>
>
> --
> Dimitri Liakhovitski
> Ninah.com
> Dimitri.Liakhovitski at ninah.com
>


