[R-sig-ME] P value value for a large number of degree of freedom in lmer

John Maindonald john.maindonald at anu.edu.au
Wed Nov 24 05:13:58 CET 2010


I need to redraft the final sentence of the first paragraph,
to read: "The consequence is that effects that are well within
the bounds of statistical variation may, according to the
usual rituals, appear statistically significant."
----------------------------------------------------------------------------

There are other considerations, which may often be more
serious.  In any observational dataset, there is almost
bound to be structure.  This arises in different areas in 
different ways, but some of the possibilities are:
1) a time element
2) a space element
3) a location or culture or group or family element
4) an effect from collection instrument or person.

So the observations are not iid, or even independent, something
we might be expected to know about on this list.  The
correlations will often be positive.  Even after multi-level
or spatial models have been used to take out what is
thought to be the structure, there will often be structure 
left.  The consequence is that effects that are well within
the bounds of statistical variation may, according to the
usual rituals, appear statistically significant.
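
A minimal simulation sketch of this point, under stated assumptions that
are not from any real dataset: two groups with no true difference, each
made up of 10 clusters of 20 observations, with members of a cluster
sharing a random effect (intraclass correlation 0.3).  The function
names, sizes and seed are all hypothetical.  The test ignores the
clustering and uses a normal approximation, so its nominal level is 5%.

```python
import math
import random


def mean_and_var(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


def simulate_group(n_clusters, cluster_size, icc, rng):
    """Unit-variance observations; members of a cluster share a random effect."""
    values = []
    for _ in range(n_clusters):
        shared = rng.gauss(0.0, math.sqrt(icc))
        for _ in range(cluster_size):
            values.append(shared + rng.gauss(0.0, math.sqrt(1.0 - icc)))
    return values


def naive_rejection_rate(icc, n_clusters=10, cluster_size=20, n_sims=400, seed=1):
    """Share of null simulations in which a test that assumes iid rejects at 5%."""
    rng = random.Random(seed)
    n = n_clusters * cluster_size
    rejections = 0
    for _ in range(n_sims):
        mean_a, var_a = mean_and_var(simulate_group(n_clusters, cluster_size, icc, rng))
        mean_b, var_b = mean_and_var(simulate_group(n_clusters, cluster_size, icc, rng))
        # Standard error computed as if all observations were independent.
        z = (mean_a - mean_b) / math.sqrt(var_a / n + var_b / n)
        if abs(z) > 1.96:
            rejections += 1
    return rejections / n_sims


print("independent data:", naive_rejection_rate(0.0))
print("clustered data  :", naive_rejection_rate(0.3))
```

With independent data the rejection rate should sit near the nominal 5%;
with the positive within-cluster correlation it should be several times
larger, even though there is no effect to find.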

There are other problems.  Some variables may be measured
very inaccurately.  When such a variable is used on its own, the
measurement error reduces the chances of finding a significant
effect, catastrophically if the error is of the same order of
magnitude as the SD of that variable.
If other accurately measured explanatory variables are included
in the same analysis, they may appear falsely significant.  This
sort of issue has been extensively canvassed in connection
with the use of food frequency questionnaire (FFQ) measuring
instruments in large-scale studies of the effect of diet on disease.
See for example:
Schatzkin, A.; Kipnis, V.; Carroll, R.; Midthune, D.; Subar, A.; Bingham, S.; Schoeller, D.; Troiano, R.; and Freedman, L., 2003. A comparison of a food frequency questionnaire with a 24-hour recall for use in an epidemiological cohort study: results from the biomarker-based Observing Protein and Energy Nutrition (OPEN) study. International Journal of Epidemiology, 32:1054–1062.
Here was an instrument that many thought adequately accurate.
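
A rough simulation sketch of the measurement-error effect, under
assumptions chosen purely for illustration: the response depends only on
a true exposure x, which is observed as w = x + error, while a second
variable z is measured accurately, is correlated with x (0.6), and has no
effect of its own.  All names and parameter values are hypothetical.

```python
import math
import random


def fit_two_predictors(y, x1, x2):
    """OLS of y on an intercept, x1 and x2; returns (b1, b2, t-statistic of b2)."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Centred sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    rss = sum((c - b0 - b1 * a - b2 * b) ** 2 for a, b, c in zip(x1, x2, y))
    se2 = math.sqrt(rss / (n - 3) * s11 / det)
    return b1, b2, b2 / se2


def simulate(error_sd, n=5000, seed=2):
    """Regress y on the error-prone w and the accurately measured z."""
    rng = random.Random(seed)
    y, w, z = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # true exposure, SD 1
        z.append(0.6 * x + 0.8 * rng.gauss(0.0, 1.0))  # correlated, no own effect
        y.append(x + rng.gauss(0.0, 1.0))        # response depends on x only
        w.append(x + rng.gauss(0.0, error_sd))   # x as actually measured
    return fit_two_predictors(y, w, z)


for sd in (0.0, 1.0):
    b_w, b_z, t_z = simulate(sd)
    print(f"error SD {sd}: coef(w)={b_w:.2f} coef(z)={b_z:.2f} t(z)={t_z:.1f}")
```

With no measurement error the coefficient of z should be close to zero.
With error of the same size as the SD of x, the coefficient of w is
attenuated and z picks up part of the effect, so it appears strongly
"significant" despite having no effect in the simulated model.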

These problems may of course affect all observational studies.
Deficiencies in the data and in the modeling (because some
structure is not accounted for) become more likely to show up
as the modeling becomes more sensitive to smallish, but
perhaps still consequential, effects.

In modest-sized experiments, careful design can largely
avoid such problems.  In experiments where the number
of subjects is very large, the same sorts of problems will
almost inevitably appear.  Minor deviations from the
protocol become almost impossible to avoid.
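
As a back-of-envelope sketch of why this matters at scale, suppose
(purely as an assumption for illustration) that a protocol deviation
shifts one group's mean by 0.02 SD.  The hypothetical function below
gives the two-sided p-value from a z-test with known unit SD if the
observed difference equalled that bias exactly.

```python
import math


def two_sided_p(delta_sd, n_per_group):
    """Two-sided z-test p-value for a mean difference of delta_sd (in SD
    units) between two groups of n_per_group each, with known unit SD."""
    z = delta_sd / math.sqrt(2.0 / n_per_group)
    return math.erfc(abs(z) / math.sqrt(2.0))


for n in (100, 10_000, 100_000):
    print(f"n per group = {n:>7}: p = {two_sided_p(0.02, n):.3g}")
```

The same small bias that is invisible with 100 subjects per group gives
a very small p-value with 100,000 per group.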

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 24/11/2010, at 11:25 AM, Rolf Turner wrote:

> 
> On 24/11/2010, at 1:09 PM, Jonathan Baron wrote:
> 
>> For the record, I have to register my disagreement.  In the
>> experimental sciences, the name of the game is to design a
>> well-controlled experiment, which means that the null hypothesis will
>> be true if the alternative hypothesis is false.  People who say what
>> is below, which includes almost everyone who responded to this post,
>> have something else in mind.  What they say is true in most
>> disciplines.  But when I hear this sort of thing, it is like someone
>> is telling me that my research career as an EXPERIMENTAL psychologist
>> has been some sort of delusion.
>> 
>> If you have a very large sample and you are doing a correlational
>> study, yes, everything will be significant.  But if you do the kind of
>> experiment we struggle to design, with perfect control conditions, you
>> won't get significant results (except by chance) if your hypothesis is
>> wrong.
>> 
> 
> 	I'll bet you don't work with samples of size 200,000. :-)
> 
> 	Also I'll bet that you don't ***really*** care if the
> 	difference between mu_T and mu_C is bigger than 0.000001 mm,
> 	say, whereas you might care if the difference were bigger than
> 	10 mm.
> 
> 	Also there's no such thing as ``perfect'' anything, let alone
> 	control conditions.
> 
> 		cheers,
> 
> 			Rolf Turner
> 
>> Jon
>> 
>> On 11/24/10 07:59, Rolf Turner wrote:
>>> 
>>> It is well known amongst statisticians that having a large enough data set will
>>> result in the rejection of *any* null hypothesis, i.e. will result in a small
>>> p-value.  There is no ``bias'' involved.
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



