[Rd] New vcov(*, complete=TRUE) etc -- coef(<lm>) vs coef(<aov>)
Martin Maechler
maechler at stat.math.ethz.ch
Thu Nov 9 11:27:01 CET 2017
>>>>> Fox, John <jfox at mcmaster.ca>
>>>>> on Tue, 7 Nov 2017 22:09:03 +0000 writes:
> Dear Martin, I think that your plan makes sense. It's too
> bad that aov() behaved differently in this respect from
> lm(), and thus created more work, but it's not be a bad
> thing that the difference is now explicit and documented.
> I expect that that other problems like this will surface,
> particularly with contributed packages (and I know that
> you're aware that this has already happened with the car
> package). That is, packages that made provision for
> aliased coefficients based on the old behaviour of coef()
> and vcov() will now have to adapt to the new, more
> consistent behaviour.
> Best, John
Thank you John for the confirmation (and see below).
>> -----Original Message-----
>> >>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>> >>>>> on Thu, 2 Nov 2017 21:59:00 +0100 writes:
>>
>> >>>>> Fox, John <jfox at mcmaster.ca>
>> >>>>> on Thu, 14 Sep 2017 13:46:44 +0000 writes:
>>
>> >> Dear Martin, I made three points which likely got lost
>> >> because of the way I presented them:
>>
>> >> (1) Singularity is an unusual situation and should be
>> >> made more prominent. It typically reflects a problem with
>> >> the data or the specification of the model. That's not to
>> >> say that it *never* makes sense to allow singular fits
>> >> (as in the situations you mentions).
>>
>> >> I'd favour setting singular.ok=FALSE as the default, but
>> >> in the absence of that a warning or at least a note. A
>> >> compromise would be to have a singular.ok option() that
>> >> would be FALSE out of the box.
>>
>> >> Any changes would have to be made very carefully so as
>> >> not to create chaos.
>>
>> > I for one, am too reluctant to want to change the default
>> > there.
>>
>> >> That goes for the points below as well.
>>
>> >> (2) coef() and vcov() behave inconsistently, which can be
>> >> problematic because one often uses them together in code.
>>
>> > indeed; and I had agreed on that. As of today, in R-devel
>> > only they now behave compatibly. NEWS entry
>>
>> > • The “default” ("lm" etc) methods of vcov() have
>> > gained new optional argument complete = TRUE which makes
>> > the vcov() methods more consistent with the coef() methods
>> > in the case of singular designs. The former behavior is
>> > now achieved by vcov(*, complete=FALSE).
>>
>>
>> >> (3) As you noticed in your second message, lm() has a
>> >> singular.ok argument and glm() doesn't.
>>
>> > and that has been amended even earlier (a bit more than a
>> > month ago) in R-devel svn rev 73380 with NEWS entry
>>
>> > • glm() and glm.fit get the same singular.ok=TRUE
>> > argument that lm() has had forever. As a consequence, in
>> > glm(*, method = <your_own>), user specified methods need
>> > to accept a singular.ok argument as well.
>>
>> >> I'll take a look at the code for glm() with an eye
>> >> towards creating a patch, but I'm a bit reluctant to mess
>> >> with the code for something as important as glm().
>>
>> > and as a matter of fact you did send me +- the R code part
>> > of that change.
>>
>> > My current plan is to also add the 'complete = TRUE'
>> > option to the "basic" coef() methods, such that you also
>> > have consistent coef(*, complete=FALSE) and vcov(*,
>> > complete=FALSE) behaviors.
>>
>> and indeed I had added the above a bit later.
>>
>> However, to my surprise, I have now found that we have a
>> coef.aov() method -- completely undocumented which behaves *differently*:
>>
>> where as the default coef() method which is called for lm(..) results gives *all*
>> coefficients, and gives NA for "aliased" ones, the aov method *drops* the NA
>> coefficients and has done so "forever" (I've checked R version 1.1.1 of April 14,
>> 2000).
>>
>> vcov() on the other hand has not had a special "aov" method, but treats aov()
>> and lm() results the same... which means that in R-devel the vcov() method for
>> an aov() object uses 'complete=TRUE' and gives NA rows and columns for the
>> aliased coefficients, whereas coef.aov() removes all the NAs and gives only
>> the
>> "non-aliased" coefficients. Consequently, in R-devel,
>> vcov(<aov>) and coef(<aov>) are *now* incoherent, whereas these two
>> *were* coherent before the change.
>> I propose to
>> 1. continue the strategy to keep coef() back-compatible and
>> 2. to *document* the "surprising" behavior of coef.aov()
>> 3. introduce a vcov.aov() with complete=FALSE default
>> behavior which is compatile to the coef.aov() one [where I'd
>> also introduce the no-change 'complete=FALSE' argument].
I have now committed the above proposal to R-devel,
svn rev 73692.
This does revert vcov(<aov>) default behavior in R-devel to
the R <= 3.4.x behavior...
so an effect in package-space should rather be beneficial.
Martin Maechler
ETH Zurich
More information about the R-devel
mailing list