[R] Why missing values are not allowed in 'poly'?

Martin Maechler maechler at stat.math.ethz.ch
Thu Mar 24 12:54:14 CET 2016


>>>>> William Dunlap via R-help <r-help at r-project.org>
>>>>>     on Wed, 23 Mar 2016 13:56:35 -0700 writes:

    > I don't know what is in R's poly(), but if it is like S+'s or TERR's then
    > one could do

    > if (anyNA(x))  {
    > nax <- na.exclude(x)
    > px <- poly(x = nax, degree = degree, coefs = coefs, raw =
    > raw, simple = simple)
    > px <- structure(naresid(attr(nax, "na.action"), px), coefs
    > = attr(px, "coefs"), degree = attr(px, "degree"), class = attr(px, "class"))
    > return(px)
    > }

    > and get nice results in the usual raw=FALSE case as well.  Similar stuff
    > could be done in the multivariate cases.

I don't have too much time for that now,
and I know that Bill Dunlap cannot provide patches for R --- for 
good reasons, though it's a pity for us! ---
but you can, Liviu!
So, and  as you see at every startup of R :

   "R is a collaborative project with many contributors."

I'm willing to try "good-looking" patches.
(to the *sources*, *NOT* to a printout of the function in your R console!)

Martin Maechler
ETH Zurich and R Core Team.

    > Bill Dunlap
    > TIBCO Software
    > wdunlap tibco.com

    > On Wed, Mar 23, 2016 at 1:41 PM, Liviu Andronic <landronimirc at gmail.com>
    > wrote:

    >> On Wed, Mar 23, 2016 at 9:29 PM, William Dunlap <wdunlap at tibco.com> wrote:
    >> > I think the worst aspect of this restriction in poly() is that when
    >> > you use poly in the formula of a model-fitting function you cannot
    >> > have any missing values in the data, even if you supply
    >> > na.action=na.exclude.
    >> >
    >> >   > d <- transform(data.frame(y=c(-1,1:10)), x=log(y))
    >> >   Warning message:
    >> >   In log(y) : NaNs produced
    >> >   > fit <- lm(y ~ poly(x, 3), data=d, na.action=na.exclude)
    >> >   Error in poly(x, 3) : missing values are not allowed in 'poly'
    >> >
    >> > Thus people are pushed to using a less stable formulation like
    >> >   > fit <- lm(y ~ x + I(x^2) + I(x^3), data=d, na.action=na.exclude)
    >> >
    >> My difficulty precisely. What's more, I inspected the code for `poly`
    >> and at least for the simple case of raw=TRUE it seems trivial to
    >> support NAs. It suffices to change line 15 of the function:
    >> if (anyNA(x)) stop("missing values are not allowed in 'poly'")
    >> 
    >> to:
    >> if (!raw && anyNA(x)) stop("missing values are not allowed in 'poly'")
    >> 
    >> This way for raw polynomials estimation continues unimpeded. With the
    >> change above, I get this:
    >> > poly(x, degree = 2, raw=TRUE)
    >> 1 2
    >> [1,] NA NA
    >> [2,] 1 1
    >> [3,] 2 4
    >> [4,] 3 9
    >> [5,] 4 16
    >> [6,] 5 25
    >> [7,] 6 36
    >> [8,] 7 49
    >> [9,] 8 64
    >> [10,] 9 81
    >> [11,] 10 100
    >> attr(,"degree")
    >> [1] 1 2
    >> attr(,"class")
    >> [1] "poly" "matrix"
    >> 
    >> 
    >> Regards,
    >> Liviu
    >> 
    >> 
    >> >
    >> > Bill Dunlap
    >> > TIBCO Software
    >> > wdunlap tibco.com
    >> >
    >> > On Wed, Mar 23, 2016 at 12:59 PM, Liviu Andronic <landronimirc at gmail.com
    >> >
    >> > wrote:
    >> >>
    >> >> Dear all,
    >> >> I'm a bit surprised by this behavior in poly:
    >> >>
    >> >> x <- c(NA, 1:10)
    >> >> poly(x, degree = 2, raw=TRUE)
    >> >> ## Error in poly(x, degree = 2, raw = TRUE) :
    >> >> ##   missing values are not allowed in 'poly'
    >> >> x^2
    >> >> ## [1] NA 1 4 9 16 25 36 49 64 81 100
    >> >>
    >> >> As you can see, poly() will fail if the vector contains NAs, whereas
    >> >> it is perfectly possible to obtain the square of the vector manually.
    >> >>
    >> >> Is there a reason for this limitation in poly?
    >> >>
    >> >> Regards,
    >> >> Liviu
    >> >>
    >> >>
    >> >> --
    >> >> Do you think you know what math is?
    >> >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
    >> >> Or what it means to be intelligent?
    >> >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
    >> >> Think again:
    >> >> http://www.ideasroadshow.com/library
    >> >>
    >> >> ______________________________________________
    >> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    >> >> https://stat.ethz.ch/mailman/listinfo/r-help
    >> >> PLEASE do read the posting guide
    >> >> http://www.R-project.org/posting-guide.html
    >> >> and provide commented, minimal, self-contained, reproducible code.
    >> >
    >> >
    >> 
    >> 
    >> 
    >> --
    >> Do you think you know what math is?
    >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
    >> Or what it means to be intelligent?
    >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
    >> Think again:
    >> http://www.ideasroadshow.com/library
    >> 

    > [[alternative HTML version deleted]]

    > ______________________________________________
    > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list