[Rd] New vcov(*, complete=TRUE) etc -- coef(<lm>) vs coef(<aov>)

Fox, John jfox at mcmaster.ca
Tue Nov 7 23:09:03 CET 2017


Dear Martin,

I think that your plan makes sense. It's too bad that aov() behaved differently in this respect from lm(), and thus created more work, but it's not a bad thing that the difference is now explicit and documented.
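
For readers who haven't run into the difference, here is a minimal sketch of it. The data frame `d` and the variable names are made up for illustration; `x3` is an exact linear combination of `x1` and `x2`, so its coefficient is aliased.

```r
# Hypothetical toy data: x3 = x1 + x2, so the design matrix is singular.
set.seed(42)
d <- data.frame(x1 = rnorm(12), x2 = rnorm(12))
d$x3 <- d$x1 + d$x2
d$y  <- rnorm(12)

fit_lm  <- lm(y ~ x1 + x2 + x3, data = d)
fit_aov <- aov(y ~ x1 + x2 + x3, data = d)

coef(fit_lm)    # keeps the aliased coefficient x3 and reports it as NA
coef(fit_aov)   # coef.aov() drops the aliased coefficient altogether

# The two coefficient vectors therefore differ in length.
n_lm  <- length(coef(fit_lm))
n_aov <- length(coef(fit_aov))
```

The sketch deliberately avoids the new complete= argument so that it runs on both older and newer versions of R.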

I expect that other problems like this will surface, particularly with contributed packages (and I know that you're aware that this has already happened with the car package). That is, packages that made provision for aliased coefficients based on the old behaviour of coef() and vcov() will now have to adapt to the new, more consistent behaviour.
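
One way package code might adapt, sketched under assumptions rather than taken from any particular package: match coef() and vcov() by coefficient name instead of by position, so that the code works whether vcov() returns the reduced matrix (old behaviour) or the complete, NA-padded one (new behaviour). The data and object names below are hypothetical.

```r
# Hypothetical toy data with an exactly collinear predictor (x3 = x1 + x2).
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$x3 <- d$x1 + d$x2
d$y  <- 1 + d$x1 - d$x2 + rnorm(20)

fit <- lm(y ~ x1 + x2 + x3, data = d)

b  <- coef(fit)              # includes NA for the aliased coefficient x3
ok <- names(b)[!is.na(b)]    # names of the estimable coefficients

V <- vcov(fit)               # 3 x 3 or NA-padded 4 x 4, depending on the R version

# Subsetting by name gives the same result in either case.
b_ok <- b[ok]
V_ok <- V[ok, ok, drop = FALSE]
se   <- sqrt(diag(V_ok))
```

Indexing by name rather than by position is the design choice here: it does not depend on which default vcov() uses.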

Best,
 John

> -----Original Message-----
> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Martin
> Maechler
> Sent: Tuesday, November 7, 2017 4:48 PM
> To: r-devel at r-project.org
> Cc: Martin Maechler <maechler at stat.math.ethz.ch>
> Subject: [Rd] New vcov(*, complete=TRUE) etc -- coef(<lm>) vs coef(<aov>)
> 
> >>>>> Martin Maechler <maechler at stat.math.ethz.ch>
> >>>>>     on Thu, 2 Nov 2017 21:59:00 +0100 writes:
> 
> >>>>> Fox, John <jfox at mcmaster.ca>
> >>>>>     on Thu, 14 Sep 2017 13:46:44 +0000 writes:
> 
>     >> Dear Martin, I made three points which likely got lost
>     >> because of the way I presented them:
> 
>     >> (1) Singularity is an unusual situation and should be
>     >> made more prominent. It typically reflects a problem with
>     >> the data or the specification of the model. That's not to
>     >> say that it *never* makes sense to allow singular fits
>     >> (as in the situations you mention).
> 
>     >> I'd favour setting singular.ok=FALSE as the default, but
>     >> in the absence of that a warning or at least a note. A
>     >> compromise would be to have a singular.ok option() that
>     >> would be FALSE out of the box.
> 
>     >> Any changes would have to be made very carefully so as
>     >> not to create chaos.
> 
>     > I for one, am too reluctant to want to change the default
>     > there.
> 
>     >> That goes for the points below as well.
> 
>     >> (2) coef() and vcov() behave inconsistently, which can be
>     >> problematic because one often uses them together in code.
> 
>     > indeed; and I had agreed on that.  As of today, in R-devel
>     > only they now behave compatibly.  NEWS entry
> 
>     >     • The “default” ("lm" etc) methods of vcov() have
>     > gained new optional argument complete = TRUE which makes
>     > the vcov() methods more consistent with the coef() methods
>     > in the case of singular designs.  The former behavior is
>     > now achieved by vcov(*, complete=FALSE).
> 
> 
>     >> (3) As you noticed in your second message, lm() has a
>     >> singular.ok argument and glm() doesn't.
> 
>     > and that has been amended even earlier (a bit more than a
>     > month ago) in R-devel svn rev 73380 with NEWS entry
> 
>     >     • glm() and glm.fit get the same singular.ok=TRUE
>     > argument that lm() has had forever.  As a consequence, in
>     > glm(*, method = <your_own>), user specified methods need
>     > to accept a singular.ok argument as well.
> 
>     >> I'll take a look at the code for glm() with an eye
>     >> towards creating a patch, but I'm a bit reluctant to mess
>     >> with the code for something as important as glm().
> 
>     > and as a matter of fact you did send me +- the R code part
>     > of that change.
> 
>     > My current plan is to also add the 'complete = TRUE'
>     > option to the "basic" coef() methods, such that you also
>     > have consistent coef(*, complete=FALSE) and vcov(*,
>     > complete=FALSE) behaviors.
> 
> and indeed I had added the above a bit later.
> 
> However, to my surprise, I have now found that we have a
> coef.aov() method -- completely undocumented -- which behaves *differently*:
> 
> whereas the default coef() method, which is called for lm(..) results, gives *all*
> coefficients, and gives  NA  for "aliased" ones, the aov method *drops* the  NA
> coefficients  and has done so "forever"  (I've checked R version 1.1.1 of April 14,
> 2000).
> 
> vcov() on the other hand has not had a special "aov" method, but treats aov()
> and lm() results the same... which means that in R-devel the vcov() method for
> an aov() object  uses 'complete=TRUE' and gives NA rows and columns for the
> aliased coefficients, whereas  coef.aov()  removes all the NAs  and  gives only
> the "non-aliased" coefficients.   Consequently, in R-devel,
> vcov(<aov>) and coef(<aov>)  are *now* incoherent, whereas these two
> *were* coherent before the change.
> 
> I propose to
> 1. continue the strategy to keep coef() back-compatible,
> 2. *document* the "surprising" behavior of coef.aov(), and
> 3. introduce a  vcov.aov()  with  complete=FALSE  default
>    behavior which is compatible with the coef.aov() one [where I'd
>    also introduce the no-change  'complete=FALSE' argument].
> 
> Hmm... again, this has been more work and more implications than originally
> optimistically assumed..
> 
> Opinions, caveats, other feedback -- are very welcome!
> 
> Martin
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

