[R] replacing all NA's in a dataframe with zeros...

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Thu Mar 15 10:21:22 CET 2007


Gavin Simpson wrote:
> On Wed, 2007-03-14 at 20:16 -0700, Steven McKinney wrote:
>   
>> Since you can index a matrix or dataframe with
>> a matrix of logicals, you can use is.na()
>> to index all the NA locations and replace them
>> all with 0 in one command.
>>
>>     
>
> A quicker solution, that, IIRC,  was posted to the list by Peter
> Dalgaard several years ago is:
>
> sapply(mydata.df, function(x) {x[is.na(x)] <- 0; x})
>   
I hope your memory fails you, because it doesn't actually work:

> sapply(test.df, function(x) {x[is.na(x)] <- 0; x})
     x1 x2 x3
[1,]  0  1  1
[2,]  2  2  0
[3,]  3  3  0
[4,]  0  4  4

The result is a matrix, not a data frame.

Instead:

> test.df[] <- lapply(test.df, function(x) {x[is.na(x)] <- 0; x})
> test.df
  x1 x2 x3
1  0  1  1
2  2  2  0
3  3  3  0
4  0  4  4
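
A self-contained sketch for reproducing the above. The definition of
test.df is an assumption, chosen only so that it matches the output
shown (the original definition is not given in this thread), and
fixed.df and alt.df are names made up for illustration. It also
includes the logical-matrix indexing that Steven describes.

```r
# Assumed reconstruction of test.df (not taken from the thread)
test.df <- data.frame(x1 = c(NA, 2, 3, NA),
                      x2 = c(1, 2, 3, 4),
                      x3 = c(1, NA, NA, 4))

# Replace the NAs in one column vector with 0
zap <- function(x) {x[is.na(x)] <- 0; x}

# sapply() simplifies the list of equal-length columns to a matrix
m <- sapply(test.df, zap)
is.matrix(m)
is.data.frame(m)

# Assigning the lapply() result into fixed.df[] keeps the data frame
fixed.df <- test.df
fixed.df[] <- lapply(fixed.df, zap)
is.data.frame(fixed.df)

# Indexing with the logical matrix from is.na(), as Steven describes
alt.df <- test.df
alt.df[is.na(alt.df)] <- 0
is.data.frame(alt.df)
```

The empty brackets in fixed.df[] are what matter: they replace the
columns in place, whereas a plain fixed.df <- lapply(...) would leave
you with a list.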

Speedwise, sapply() is doing lapply() internally, and the assignment
overhead should be small, so I'd expect similar timings.
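
One rough way to check that expectation is with system.time(). This is
only a sketch: the data frame big.df, its size, the number of NAs and
the helper zap() are all arbitrary assumptions for illustration, and
the timings will vary by machine and R version.

```r
# Hypothetical test data: 10000 x 100 numeric values with some NAs
set.seed(1)
m <- matrix(rnorm(1e6), ncol = 100)
m[sample(length(m), 1e4)] <- NA
big.df <- as.data.frame(m)

# Replace the NAs in one column vector with 0
zap <- function(x) {x[is.na(x)] <- 0; x}

# sapply(): lapply() plus simplification to a matrix
system.time(res.s <- sapply(big.df, zap))

# lapply() plus assignment back into the data frame
system.time({res.l <- big.df; res.l[] <- lapply(big.df, zap)})
```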

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


