[Rd] model.matrix.default() silently ignores bad contrasts.arg

Ben Bolker bbo|ker @end|ng |rom gm@||@com
Sat Feb 23 15:33:22 CET 2019


 thanks!

On 2019-02-23 5:42 a.m., Martin Maechler wrote:
>>>>>> Fox, John 
>>>>>>     on Fri, 22 Feb 2019 17:40:15 +0000 writes:
> 
>     > Dear Martin and Ben, I agree that a warning is a good idea
>     > (and perhaps that wasn't clear in my response to Ben's
>     > post).
> 
>     > Also, it would be nice to correct the omission in the help
>     > file, which as far as I could see doesn't mention that a
>     > contrast-generating function (as opposed to its quoted
>     > name) can be an element of the contrasts.arg list.
> 
>     > Best, John
> 
> Thank you John for the clarification and the reminder about
> filling the omission there!
> 
> Prepared to go (into the sources) now.
> Martin
> 
> 
>     >> -----Original Message----- From: Martin Maechler
>     >> [mailto:maechler using stat.math.ethz.ch] Sent: Friday,
>     >> February 22, 2019 11:50 AM To: Ben Bolker
>     >> <bbolker using gmail.com> Cc: Fox, John <jfox using mcmaster.ca>;
>     >> r-devel using r-project.org Subject: Re: [Rd]
>     >> model.matrix.default() silently ignores bad contrasts.arg
>     >> 
>     >> >>>>> Ben Bolker >>>>> on Thu, 21 Feb 2019 08:18:51 -0500
>     >> writes:
>     >> 
>     >> > On Thu, Feb 21, 2019 at 7:49 AM Fox, John
>     >> <jfox using mcmaster.ca> wrote:
>     >> >>
>     >> >> Dear Ben,
>     >> >>
>     >> >> Perhaps I'm missing the point, but contrasts.arg is
>     >> documented to be a list. From ?model.matrix:
>     >> "contrasts.arg: A list, whose entries are values (numeric
>     >> matrices or character strings naming functions) to be
>     >> used as replacement values for the contrasts replacement
>     >> function and whose names are the names of columns of data
>     >> containing factors."
>     >> 
>     >> > I absolutely agree that this is not a bug/behaves as
>     >> documented (I > could have said that more clearly).  It's
>     >> just that (for reasons I > attempted to explain) this is
>     >> a really easy mistake to make.
>     >> 
>     >> >> This isn't entirely accurate because a function also
>     >> works as a named element of the list (in addition to a
>     >> character string naming a function and a contrast
>     >> matrix), as your example demonstrates, but nowhere that
>     >> I'm aware of is it suggested that a non-list should work.
>     >> >>
>     >> >> It certainly would be an improvement if specifying
>     >> contrast.arg as a non- list generated an error or warning
>     >> message, and it at least arguably would be convenient to
>     >> allow a general contrast specification such as
>     >> contrasts.arg- "contr.sum", but I don't see a bug here.
>     >> 
>     >> > I agree.  That's what my patch does (throws a warning
>     >> message if > contrasts.arg is non-NULL and not a list).
>     >> 
>     >> I currently do think this is a good idea... "even though"
>     >> I'm 99% sure that this will make work for package
>     >> maintainers and others whose code may suddenly show
>     >> warnings.  I hope they would know better than
>     >> suppressWarnings(.) ...
>     >> 
>     >> I see a version of the patch using old style indentation
>     >> which makes the diff even "considerably" smaller -- no
>     >> need to submit this different, though -- and I plan to
>     >> test that a bit, and commit eventually to R-devel,
>     >> possibly in a 5 days or so.
>     >> 
>     >> Thank you Ben for the suggestion and patch !  Martin
>     >> 
>     >> > cheers > Ben Bolker
>     >> 
>     >> >> Best, >> John
>     >> >>
>     >> >> -------------------------------------------------
>     >> >> John Fox, Professor Emeritus >> McMaster University >>
>     >> Hamilton, Ontario, Canada >> Web:
>     >> http::/socserv.mcmaster.ca/jfox
>     >> >>
>     >> >> > On Feb 20, 2019, at 7:14 PM, Ben Bolker
>     >> <bbolker using gmail.com> wrote:
>     >> >> >
>     >> >> > An lme4 user pointed out
>     >> <https://github.com/lme4/lme4/issues/491> that >> >
>     >> passing contrasts as a string or symbol to [g]lmer (which
>     >> would work if >> > we were using `contrasts<-` to set
>     >> contrasts on a factor variable) is >> > *silently
>     >> ignored*. This goes back to model.matrix(), and seems bad
>     >> >> > (this is a very easy mistake to make, because of the
>     >> multitude of ways >> > to specify contrasts for factors
>     >> in R - e.g. options(contrasts=...); >> > setting
>     >> contrasts on the specific factors; passing contrasts as a
>     >> list >> > to the model function ... )
>     >> >> >
>     >> >> > The relevant code is here:
>     >> >> >
>     >> >> > https://github.com/wch/r-
>     >> source/blob/trunk/src/library/stats/R/models.R#L578-L603
>     >> >> >
>     >> >> > The following code shows the problem: a
>     >> plain-vanilla model.matrix() >> > call with no contrasts
>     >> argument, followed by two wrong contrasts >> > arguments,
>     >> followed by a correct contrasts argument.
>     >> >> >
>     >> >> > data(cbpp, package="lme4") >> > mf1 <-
>     >> model.matrix(~period, data=cbpp) >> > mf2 <-
>     >> model.matrix(~period, contrasts.arg="contr.sum",
>     >> data=cbpp) >> > all.equal(mf1,mf2) ## TRUE >> > mf3 <-
>     >> model.matrix(~period, contrasts.arg=contr.sum, data=cbpp)
>     >> >> > all.equal(mf1,mf3) ## TRUE >> > mf4 <-
>     >> model.matrix(~period,
>     >> contrasts.arg=list(period=contr.sum), >> > data=cbpp) >>
>     >> > isTRUE(all.equal(mf1,mf4)) ## FALSE
>     >> >> >
>     >> >> >
>     >> >> > I've attached a potential patch for this, which is
>     >> IMO the mildest >> > possible case (if contrasts.arg is
>     >> non-NULL and not a list, it produces >> > a warning).  I
>     >> haven't been able to test it because of some mysterious
>     >> >> > issues I'm having with re-making R properly ...
>     >> >> >
>     >> >> > Thoughts?  Should I submit this as a bug
>     >> report/patch?
>     >> >> >
>     >> >> > cheers >> > Ben Bolker
>     >> >>
>     >> >> >
>     >> <models.R.diff>______________________________________________
>



More information about the R-devel mailing list