[R] How to globally convert NaN to NA in dataframe?

Luigi Marongiu m@rong|u@|u|g| @end|ng |rom gm@||@com
Fri Sep 3 09:59:12 CEST 2021


Fair enough, I'll check the actual data to see whether there are indeed any
NaN values (there should not be, since the data are categories, not
generated by math).
Thanks!
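
A minimal sketch of that check, assuming a small toy data frame (`dat`, `y` and `z` here are illustrative, not the real data). It counts genuine NaN values per column. It also suggests why `summary()` can still print `Mean :NaN` for a numeric column that holds only NA: once the NAs are dropped nothing is left, and the mean of an empty numeric vector is NaN.

```r
# Hypothetical toy data: 'y' is numeric but entirely NA, 'z' holds two NaN
dat <- data.frame(y = rep(NA_real_, 5), z = c(1, NaN, 3, NaN, 5))

# Count genuine NaN values per column; is.nan() is FALSE for a plain NA
nan_counts <- vapply(dat, function(col) sum(is.nan(col)), integer(1))
print(nan_counts)

# With every NA dropped, 'y' is empty, and the mean of an empty
# numeric vector is 0/0, i.e. NaN
empty_mean <- mean(dat$y, na.rm = TRUE)
print(is.nan(empty_mean))
```

If the per-column counts are all zero, the `NaN` shown by `summary()` is most likely this computed mean rather than a value stored in the data.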

On Fri, Sep 3, 2021 at 8:26 AM PIKAL Petr <petr.pikal using precheza.cz> wrote:
>
> Hi Luigi.
>
> Weird. But maybe it is the desired behaviour of summary when calculating
> the mean of a numeric column full of NAs.
>
> See example
>
> dat <- data.frame(x=rep(NA, 110), y=rep(1, 110), z= rnorm(110))
>
> # change all values in second column to NA
> dat[,2] <- NA
> # change some of them to NaN
> dat[5:6, 2:3] <- 0/0
>
> # see summary
> summary(dat)
>     x                 y             z
>  Mode:logical   Min.   : NA   Min.   :-1.9798
>  NA's:110       1st Qu.: NA   1st Qu.:-0.4729
>                 Median : NA   Median : 0.1745
>                 Mean   :NaN   Mean   : 0.1856
>                 3rd Qu.: NA   3rd Qu.: 0.8017
>                 Max.   : NA   Max.   : 2.5075
>                 NA's   :110   NA's   :2
>
> # change NaN values to NA
> dat[sapply(dat, is.nan)] <- NA
> *************************
>
> # summary is the same
> summary(dat)
>     x                 y             z
>  Mode:logical   Min.   : NA   Min.   :-1.9798
>  NA's:110       1st Qu.: NA   1st Qu.:-0.4729
>                 Median : NA   Median : 0.1745
>                 Mean   :NaN   Mean   : 0.1856
>                 3rd Qu.: NA   3rd Qu.: 0.8017
>                 Max.   : NA   Max.   : 2.5075
>                 NA's   :110   NA's   :2
>
> # but no NaN values in the data
> dat[1:10,]
>     x  y          z
> 1  NA NA -0.9148696
> 2  NA NA  0.7110570
> 3  NA NA -0.1901676
> 4  NA NA  0.5900650
> 5  NA NA         NA
> 6  NA NA         NA
> 7  NA NA  0.7987658
> 8  NA NA -0.5225229
> 9  NA NA  0.7673103
> 10 NA NA -0.5263897
>
> So my "nice compact command"
> dat[sapply(dat, is.nan)] <- NA
>
> works as expected, but summary still reports the mean as NaN.
>
> Cheers
> Petr
>
> > -----Original Message-----
> > From: R-help <r-help-bounces using r-project.org> On Behalf Of Luigi Marongiu
> > Sent: Thursday, September 2, 2021 3:46 PM
> > To: Andrew Simmons <akwsimmo using gmail.com>
> > Cc: r-help <r-help using r-project.org>
> > Subject: Re: [R] How to globally convert NaN to NA in dataframe?
> >
> > `data[sapply(data, is.nan)] <- NA` is a nice compact command, but I still
> > get NaN when using the summary function; for instance, one of the columns
> > gives:
> > ```
> > Min.   : NA
> > 1st Qu.: NA
> > Median : NA
> > Mean   :NaN
> > 3rd Qu.: NA
> > Max.   : NA
> > NA's   :110
> > ```
> > I tried to implement the second solution but:
> > ```
> > df <- lapply(x, function(xx) {
> >   xx[is.nan(xx)] <- NA
> > })
> > > str(df)
> > List of 1
> >  $ sd_ef_rash_loc___palm: logi NA
> > ```
> > What am I getting wrong?
> > Thanks
> >
> > On Thu, Sep 2, 2021 at 3:30 PM Andrew Simmons <akwsimmo using gmail.com>
> > wrote:
> > >
> > > Hello,
> > >
> > >
> > > I would use something like:
> > >
> > >
> > > x <- c(1:5, NaN) |> sample(100, replace = TRUE) |> matrix(10, 10) |>
> > >     as.data.frame()
> > > x[] <- lapply(x, function(xx) {
> > >     xx[is.nan(xx)] <- NA_real_
> > >     xx
> > > })
> > >
> > >
> > > This prevents attributes from being changed in 'x', but accomplishes the
> > > same thing as you have above. I hope this helps!
> > >
> > > On Thu, Sep 2, 2021 at 9:19 AM Luigi Marongiu <marongiu.luigi using gmail.com> wrote:
> > >>
> > >> Hello,
> > >> I have some NaN values in some elements of a dataframe that I would
> > >> like to convert to NA.
> > >> The command `df1$col[is.nan(df1$col)]<-NA` works column-wise.
> > >> Is there a way to modify all instances across the whole dataframe at
> > >> once?
> > >> I have seen from
> > >> https://stackoverflow.com/questions/18142117/how-to-replace-nan-value-with-zero-in-a-huge-data-frame/18143097#18143097
> > >> that one could use:
> > >> ```
> > >>
> > >> is.nan.data.frame <- function(x)
> > >> do.call(cbind, lapply(x, is.nan))
> > >>
> > >> data123[is.nan(data123)] <- 0
> > >> ```
> > >> replacing 0 with NA, but I got
> > >> ```
> > >> str(df)
> > >> > logi NA
> > >> ```
> > >> when modifying my dataframe df.
> > >> What would be the correct syntax?
> > >> Thank you
> > >>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >> Luigi
> > >>
> > >> ______________________________________________
> > >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> > Best regards,
> > Luigi
> >

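A possible explanation for the `logi NA` result of the `lapply` attempt quoted above, sketched on a hypothetical toy data frame (`x`, `a` and `b` are illustrative names): a function whose last expression is an assignment returns the assigned value, here NA, so each column collapses to a single NA. Ending the function with `xx` and assigning into `x[]` seems to avoid both problems.

```r
# Hypothetical toy data frame with a few NaN values
x <- data.frame(a = c(1, NaN, 3), b = c(NaN, 5, 6))

# Without a final 'xx', the function returns the value of the
# assignment (NA), so every column collapses to a single NA
broken <- lapply(x, function(xx) {
  xx[is.nan(xx)] <- NA
})
str(broken)

# Returning 'xx' and assigning into 'x[]' keeps the data frame structure
x[] <- lapply(x, function(xx) {
  xx[is.nan(xx)] <- NA
  xx
})
str(x)
```

Assigning to `x[]` rather than to `x` replaces the columns in place, so the result stays a data frame instead of becoming a plain list.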


-- 
Best regards,
Luigi