[R-sig-ME] P value for a large number of degrees of freedom in lmer

Tue Nov 23 20:45:44 CET 2010

Dear Arnaud,

Having a large amount of data *is* exactly what increases confidence
in results.  A p-value is the probability of obtaining results at
least as extreme as yours given that the null hypothesis is true *in
the population*.  If you have a lot of data, you have a lot of the
population, and can more confidently say "this is what the population
is or is not like".  The p-value is serving its purpose exactly as it
was meant to; there is no need to "correct" or "alter" it.  The real
question is, does anyone care about your effect?  Effect sizes are
often a good way to get at whether the effect is meaningful: does it
have practical significance, and could an average person notice the
difference?
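
To make the distinction between statistical and practical significance
concrete, here is a minimal sketch in Python (standard library only, not
lme4, and ignoring the grouping structure of the original model).  The
standardized effect d = 0.02 and the sample sizes are hypothetical values
chosen for illustration, and the z statistic assumes a simple one-sample
test in which the observed standardized difference equals d exactly.
Under those assumptions the effect size stays fixed while the p-value
shrinks as n grows:

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def one_sample_z(effect_size, n):
    """z statistic when the observed mean sits `effect_size` standard
    deviations from the null value and there are n observations."""
    return effect_size * math.sqrt(n)


d = 0.02  # hypothetical standardized effect, tiny in practical terms
for n in (100, 1_000, 10_000, 200_000):
    z = one_sample_z(d, n)
    p = two_sided_p_from_z(z)
    print(f"n={n:>7}  d={d}  z={z:6.2f}  p={p:.3g}")
```

The effect is the same size in every row; only the certainty that it is
not exactly zero changes.  Whether an effect of that size matters is a
separate, substantive question.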

Cheers,

Josh

On Tue, Nov 23, 2010 at 11:25 AM, Arnaud Mosnier <a.mosnier at gmail.com> wrote:
> I agree, but how can I test that a significant result is due to a real
> effect and not to the amount of data?
> I thought about subsetting my dataset and rerunning the model X times to
> see whether the result still persists ... but you could also say that by
> doing so I will eventually find a (small enough) subset size at which I no
> longer detect the effect :-)
> I also agree that the term "bias" was not correctly used ... but is there a
> method to increase the confidence in those results?
>
> cheers,
>
> Arnaud
>
> 2010/11/23 Rolf Turner <r.turner at auckland.ac.nz>
>
>>
>> It is well known amongst statisticians that having a large enough data
>> set will result in the rejection of *any* null hypothesis, i.e. will
>> result in a small p-value.  There is no ``bias'' involved.
>>
>>        cheers,
>>
>>                Rolf Turner
>>
>> On 24/11/2010, at 4:06 AM, Arnaud Mosnier wrote:
>>
>> > Dear UseRs,
>> >
>> > I am using a database containing nearly 200 000 observations occurring
>> > in 33 groups.
>> > With a model of the form ( y ~ x + (1|group) ) in lmer, my number of
>> > degrees of freedom is really large.
>> > I am wondering if this large df has an impact on the p values, mainly if
>> > it could lead me to consider the effect of a variable as significant
>> > while it is not.
>> > ... and if that is the case, does a correction exist that I can apply to
>> > the results to take that bias into account?
>> >
>> > thanks !
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > _______________________________________________
>> > R-sig-mixed-models at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/