[R-sig-ME] lmer and p-values (variable selection)

Thu Apr 7 01:01:55 CEST 2011

On 11-04-06 06:40 PM, John Maindonald wrote:
> On 07/04/2011, at 7:18 AM, Douglas Bates wrote:
> 
>> On Mon, Mar 28, 2011 at 5:40 PM, Ben Bolker <bbolker at gmail.com> wrote:
>>> On 03/28/2011 06:15 PM, John Maindonald wrote:
>>>
>>>> Elimination of a term with a p-value greater than say 0.15 or 0.2 is
>>>> however likely to make little differences to estimates of other terms
>>>> in the model.  Thus, it may be a reasonable way to proceed.  For
>>>> this purpose, an anti-conservative (smaller than it should be)
>>>> p-value will usually serve the purpose.
>>>
>>> Note that naive likelihood ratio tests of random effects are likely to
>>> be conservative (in the simplest case, true p-values are twice the
>>> nominal value)
> 
> Just to be sure what is being said here, Ben, you meant "(in the simplest case, 
> true p-values are <<half>> the nominal value)"?

  Yes.   Oops.
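
  For concreteness, here is a minimal sketch of that correction in
Python (standard library only).  It assumes the simplest case: a single
variance component tested against zero, where the usual asymptotic
reference distribution is a 50:50 mixture of a point mass at zero and a
chi-square with 1 df.  The function names are made up for illustration;
nothing here comes from lme4.

```python
import math


def chisq1_sf(x):
    """Upper tail P(X > x) for a chi-square variable with 1 df.

    Uses P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalue(lrt):
    """P-value for one variance component tested on the boundary.

    Assumption: the null distribution is 0.5 * chisq(0) + 0.5 * chisq(1),
    so for a positive statistic the naive 1-df p-value is halved.
    """
    if lrt <= 0.0:
        return 1.0  # a statistic of zero is never evidence against the null
    return 0.5 * chisq1_sf(lrt)


lrt = 3.841459  # the usual 5% critical value for chi-square with 1 df
print("naive p-value:    ", chisq1_sf(lrt))
print("corrected p-value:", boundary_lrt_pvalue(lrt))
```

  So for an LRT statistic of about 3.84 the naive p-value is about 0.05
and the corrected one about 0.025.  With more than one parameter on the
boundary the mixture is different and simple halving no longer applies.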

  I'm not sure what Doug is saying here though.

  I think this has to do with the "levels of focus" issue.  In the case
of metrics that are trying to minimize prediction error in some sense
(e.g. DIC, AIC), it matters whether you are trying to predict (make
inferences?) at the level of the population ("marginal") or at the level
of the random-effects unit ("conditional").  My understanding is that in
the marginal case, the appropriate "model degrees of freedom" is
somewhat less than 1 (because of boundary issues); in the conditional
case, the appropriate df is between 1 and q (because of shrinkage).

  (I'm using 'df' in a vague sense of "how much information the fit
takes up" rather than suggesting that some null distribution is actually
t- or F-distributed)
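
  To illustrate the "between 1 and q" range in the conditional case,
here is a rough Python sketch (standard library only) that computes the
trace of the hat matrix for a balanced one-way random-intercept model.
It assumes q groups of n observations each and a *known* ratio of
between-group to residual variance; in that setting the fitted value is
the grand mean plus a shrunken group deviation.  The function names and
the setup are my own illustration, not anything lmer reports.

```python
def fitted(y, q, n, shrink):
    """Fitted values: grand mean plus shrunken group-mean deviations.

    Assumes a balanced layout, with y ordered group by group.
    """
    grand = sum(y) / (q * n)
    out = []
    for i in range(q):
        group_mean = sum(y[i * n:(i + 1) * n]) / n
        out.extend([grand + shrink * (group_mean - grand)] * n)
    return out


def hat_trace(q, n, var_ratio):
    """Trace of the hat matrix, found by fitting each unit vector.

    var_ratio is the (assumed known) between-group variance divided by
    the residual variance.
    """
    shrink = n * var_ratio / (1.0 + n * var_ratio)
    total = 0.0
    for k in range(q * n):
        unit = [0.0] * (q * n)
        unit[k] = 1.0
        # The fit is linear in y, so this picks out the k-th diagonal entry.
        total += fitted(unit, q, n, shrink)[k]
    return total


for ratio in (0.0, 0.2, 1e9):
    print(ratio, hat_trace(q=10, n=5, var_ratio=ratio))
```

  With q = 10 and n = 5 the trace is 1 when the variance ratio is 0
(complete shrinkage), 5.5 when it is 0.2, and approaches 10 as the
ratio grows, i.e. it works out to 1 + shrink * (q - 1).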
> 
> John Maindonald             email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
> http://www.maths.anu.edu.au/~johnm
> 
> 
>>> because of boundary issues and those of fixed effects are
>>> probably anticonservative because of finite-size effects (see PB 2000
>>> for examples of both cases.)
>>
>> Well B of PB isn't quite so sure anymore.  You can have situations
>> where adding a single, simple, random-effects term can introduce
>> millions of coefficients into the linear predictor (although the
>> estimates of those coefficients will be shrunk towards zero relative
>> to estimating fixed-effects for such a term).  If I understand the
>> argument behind DIC (Spiegelhalter, Best, Carlin  and van der Linde,
>> 2002) http://www.jstor.org/stable/3088806 properly they would count
>> the effective number of degrees of freedom according to the trace of
>> the hat matrix, which would be somewhere between 1 and the number of
>> levels of the factor.  In some ways that makes more sense to me but I
>> still do recognize the argument that we made in the 2000 book.  So I
>> remain confused - a not unusual state.
>>
>>>
>>> Ben
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>