[R-sig-eco] Question About Syntax For Complex ANOVA Design
Ben Bolker
bolker at ufl.edu
Mon Nov 10 21:02:27 CET 2008
hadley wickham wrote:
> On Mon, Nov 10, 2008 at 9:22 AM, Mike Dunbar <mdu at ceh.ac.uk> wrote:
>> (apologies - I should have written coast * MBL not ML)
>>
>> I'm not sure of my ground here, but surely you do lose something:
>> you wouldn't retain coast:MBL if it's not significant, as you lose
>> degrees of freedom, and this gets worse the more terms and the more
>> interactions you consider.
>
> But if you drop the term you are effectively spending your degrees of
> freedom twice - once to estimate the effect that you drop, and then
> again in the new model. Another way to see the problem is to think
> about the null distribution of the p-values - if you only include
> significant p values in your model, the standard null hypothesis is
> clearly not appropriate.
>
> I think there's a good discussion of this in Frank Harrell's
> regression modelling strategies, but unfortunately I don't have a copy
> on hand to point you to the exact location.
>
> Hadley
See e.g. sections 4.2 through 4.4 (pp. 56-60). The discussion
above does not mean that overfitted models are good, or that there
isn't a penalty to overspecifying models (or otherwise one would
always throw everything into the models), but that data-driven
model selection has some very fundamental problems ...
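To make the point about the null distribution of the p-values concrete,
here is a minimal simulation sketch (in Python, standard library only).
It is not taken from Harrell's book. It assumes k = 10 candidate terms
that are all pure noise, with independent standard-normal test
statistics; the function names and settings are made up for
illustration. In each run it keeps only the most "significant" term and
reports its naive p-value as if no selection had happened.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for two-sided p-values


def naive_p_after_selection(k, rng):
    """Draw k null z-statistics, keep the most extreme one, and return
    its two-sided p-value computed as if it had been chosen in advance."""
    z = [rng.gauss(0.0, 1.0) for _ in range(k)]
    z_best = max(z, key=abs)
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z_best)))


def false_positive_rate(k=10, n_sim=20000, alpha=0.05, seed=1):
    """Fraction of simulated null data sets in which the selected term
    looks significant at level alpha."""
    rng = random.Random(seed)
    hits = sum(naive_p_after_selection(k, rng) < alpha for _ in range(n_sim))
    return hits / n_sim


rate = false_positive_rate()
print(f"nominal alpha = 0.05, observed rate after selection = {rate:.3f}")
```

Under these assumptions the chance that the retained term looks
significant is 1 - 0.95^10, about 0.40, rather than the nominal 0.05;
with k = 1 (no selection) the rate goes back to roughly 0.05. Real
model terms are correlated and the variance is estimated, so the
numbers will differ, but the direction of the problem is the same.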
cheers
Ben Bolker