[R] p values greater than 1 from lme4
Ben Bolker
bbolker at gmail.com
Tue Sep 6 20:30:54 CEST 2011
On 09/06/2011 02:29 PM, Bogaso Christofer wrote:
> Hello Bolker, Hope you will make available at least the problem and
> reasoning to this list. I am also very much interested to see the
> problem
From an earlier off-list e-mail:
The problem (which was not at all trivial) is that one of the
species in the example has only one level of pMoist, so the slope
parameter is not identifiable (aliased). As a result, the coefficient
matrix produced by summary.lm() for that species has only a single row
rather than two, and summary.lmList gets confused when it tries to boil
down the coefficient tables from all of the different fits into a
single array. One normally doesn't notice this in the output of
summary() from a single lm fit because print.summary.lm does a little
bit of magic to replace the missing rows (i.e. slope estimate, std.
err, t statistic, p value) with NAs in the printed summary.
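
To see the single-row coefficient matrix for yourself, here is a minimal
sketch using made-up toy data (the numbers are invented for illustration
and are not from the original data set; only the column names follow the
original script):

```r
# Toy data (hypothetical): pMoist takes a single value, so the slope
# cannot be estimated separately from the intercept.
d <- data.frame(pMoist = rep(0.3, 5),
                InLn   = c(1.2, 0.8, 1.1, 0.9, 1.0))

fit <- lm(InLn ~ pMoist, data = d)

# coef() keeps both coefficients and marks the aliased slope as NA
coef(fit)

# The coefficient matrix stored in the summary object drops the
# aliased row, leaving only the intercept
cf <- coef(summary(fit))
rownames(cf)
nrow(cf)

# The summary object records which coefficients were aliased
summary(fit)$aliased
```

Printing summary(fit) still shows a pMoist row filled with NAs, which is
why the missing row is easy to overlook.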
Solutions: (1) e-mail me for the code to the hacked version of
summary.lmList; (2) remove units from your data that have this kind of
unidentifiability/aliasing problem; (3) wait for Doug Bates to implement
my fix in the next patched version of nlme (the summary.lmList function
lives in nlme, not lme4).
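
For solution (2), one possible approach is to drop every species with
fewer than two distinct pMoist values before fitting. This is only a
sketch on invented toy data; the species labels and numbers are
hypothetical, the column names follow the original script, and it only
guards against the constant-predictor case described above:

```r
# Toy data (hypothetical): species "spB" has a constant pMoist
x <- data.frame(
  IDtotsInLn = rep(c("spA", "spB", "spC"), each = 4),
  pMoist     = c(0.1, 0.2, 0.3, 0.4,
                 0.5, 0.5, 0.5, 0.5,
                 0.2, 0.4, 0.6, 0.8),
  InLn       = c(1.0, 1.3, 1.5, 1.9,
                 0.7, 0.9, 0.8, 0.6,
                 2.1, 1.8, 1.6, 1.1)
)

# Count the distinct pMoist values within each species
n_levels <- tapply(x$pMoist, x$IDtotsInLn,
                   function(v) length(unique(v)))

# Keep only species whose slope is identifiable
keep <- names(n_levels)[n_levels >= 2]
x_ok <- droplevels(x[x$IDtotsInLn %in% keep, ])

unique(x_ok$IDtotsInLn)
```

The filtered data frame x_ok can then be passed to lmList() in place of x.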
Ben Bolker
>
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Ben Bolker
> Sent: 06 September 2011 02:58
> To: r-help at stat.math.ethz.ch
> Subject: Re: [R] p values greater than 1 from lme4
>
> RTSlider <rob.t.slider <at> gmail.com> writes:
>
>>
>> Hello, I'm running linear regressions using the following script
>> where I have separated out species using the "IDtotsInLn"
>> identifier
>>
>> x<-read.csv('tbl02TOTSInLn_ENV.csv', header=T)
>> x
>> attach (x)
>> library(lme4)
>>
>> rInLn<-lmList(InLn~pMoist | IDtotsInLn, x, pool=F)
>> write.table(summary(rInLn)$coefficients, "rInLnPlots.csv")
>> write.table(summary(rInLn)$r.squared, append=T, "rInLnPlots.csv")
>> write.table(summary(rInLn)$df, append=T, "rInLnPlots.csv")
>>
>> The script seems to be working for most of the species, but for
>> some it is returning a p value of greater than 1 (e.g. 20). I
>> thought this might be for the few cases where the independent
>> variable remained constant, but found other species where this was
>> not the case and the p value was still much greater than 1. Any
>> help would be appreciated -RTS
>
> This is very interesting but practically impossible to solve because
> it's not reproducible; is there any chance that you can make the data
> available? You can send it directly to me (Ben Bolker -- my e-mail is
> pretty easy to find on the web) if you like.
>
> Ben Bolker