[R-sig-ME] Conflicting p-values from pvals.fnc
Levy, Roger
rlevy at ucsd.edu
Thu Mar 29 04:36:32 CEST 2012
Hi all,
This example has me pretty confused about how the "weights" argument to lmer() works. I had always assumed that setting a weight of k for an observation would make lmer act as if it had seen k replicates of that observation (i.e., that the observation's contribution to the likelihood would be raised to the k-th power). After reading this query, though, I found the following post stating that they are "precision weights, not sampling weights":
http://tolstoy.newcastle.edu.au/R/e17/help/12/01/2099.html
I'm not sure what that means -- was my interpretation the "sampling weights" one?
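To make the distinction concrete, here is a minimal sketch with ordinary lm() and toy data (both hypothetical, just for illustration): under a sampling-weights reading, a weight of k should behave like k verbatim replicates of the row, whereas under a precision-weights reading it means Var(e_i) = sigma^2 / k.

set.seed(10)
d     <- data.frame(x = 1:6, y = 1:6 + rnorm(6))
m.w   <- lm(y ~ x, data = d, weights = rep(4, 6))   ## "precision" weights
m.rep <- lm(y ~ x, data = d[rep(1:6, each = 4), ])  ## actual replicates
coef(m.w); coef(m.rep)        ## identical point estimates, but ...
summary(m.w)$coefficients     ## ... the standard errors differ:
summary(m.rep)$coefficients   ## replicating rows adds degrees of freedom

At least in lm() the two notions clearly come apart, so presumably something analogous is going on in lmer().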
Regardless, I'm seeing what looks like inconsistent behavior between how the weights argument affects the t statistic in the lmer() output and how it affects the output of pvals.fnc(). The example below has a fixed effect of "x" and a random intercept for "a": the higher I set the weights, the larger the t statistic for x becomes (which is what I'd originally expected, given my assumptions about the weights semantics), but also the broader the MCMC posterior on x becomes. That doesn't seem right, does it? (I also don't understand why the reported estimate of the random-intercept variance changes in the lmer output while the residual variance does not.)
> set.seed(1)
> dat <- data.frame(x=rep(c(0,1),each=4),a=factor(rep(c("a","b"),4)),w=1)
> dat$y <- with(dat,10*x+10*(a=="a") + rnorm(8))
> print(m1 <- lmer(y~x+(1|a),dat,weights=1*w))
Linear mixed model fit by REML
Formula: y ~ x + (1 | a)
   Data: dat
   AIC   BIC logLik deviance REMLdev
 32.32 32.64 -12.16    29.97   24.32
Random effects:
 Groups   Name        Variance Std.Dev.
 a        (Intercept) 44.09718 6.64057
 Residual              0.87763 0.93682
Number of obs: 8, groups: a, 2

Fixed effects:
            Estimate Std. Error t value
(Intercept)   5.0792     4.7186   1.076
x            10.1045     0.6624  15.254

Correlation of Fixed Effects:
  (Intr)
x -0.070
> pvals.fnc(m1)
$fixed
            Estimate MCMCmean HPD95lower HPD95upper  pMCMC Pr(>|t|)
(Intercept)    5.079    5.037     -1.553      11.32 0.1004   0.3231
x             10.104   10.110      4.917      15.28 0.0038   0.0000

$random
    Groups        Name Std.Dev. MCMCmedian MCMCmean HPD95lower HPD95upper
1        a (Intercept)   6.6406     2.6325   3.3750     0.0000     7.6330
2 Residual                0.9368     2.8649   3.2042     0.8809     6.0746
> print(m2 <- lmer(y~x+(1|a),dat,weights=100*w))
Linear mixed model fit by REML
Formula: y ~ x + (1 | a)
   Data: dat
   AIC   BIC logLik deviance REMLdev
 69.17 69.48 -30.58    66.81   61.17
Random effects:
 Groups   Name        Variance Std.Dev.
 a        (Intercept) 0.44097  0.66406
 Residual             0.87763  0.93682
Number of obs: 8, groups: a, 2

Fixed effects:
            Estimate Std. Error t value
(Intercept)  5.07921    0.47185   10.76
x           10.10449    0.06624  152.54

Correlation of Fixed Effects:
  (Intr)
x -0.070
> pvals.fnc(m2)
$fixed
            Estimate MCMCmean HPD95lower HPD95upper  pMCMC Pr(>|t|)
(Intercept)    5.079     5.38    -23.816      38.45 0.3434        0
x             10.104    10.11      4.115      16.06 0.0068        0

$random
    Groups        Name Std.Dev. MCMCmedian MCMCmean HPD95lower HPD95upper
1        a (Intercept)   0.6641     5.8216  15.7201     0.0000    61.7422
2 Residual                0.9368    23.8808  30.8847    5.2868    71.7694
> print(m3 <- lmer(y~x+(1|a),dat,weights=w/100))
Linear mixed model fit by REML
Formula: y ~ x + (1 | a)
   Data: dat
    AIC    BIC logLik deviance REMLdev
 -4.516 -4.199  6.258   -6.868  -12.52
Random effects:
 Groups   Name        Variance   Std.Dev.
 a        (Intercept) 4409.71731 66.40570
 Residual                0.87763  0.93682
Number of obs: 8, groups: a, 2

Fixed effects:
            Estimate Std. Error t value
(Intercept)    5.079     47.187   0.108
x             10.104      6.624   1.525

Correlation of Fixed Effects:
  (Intr)
x -0.070
> pvals.fnc(m3)
$fixed
            Estimate MCMCmean HPD95lower HPD95upper pMCMC Pr(>|t|)
(Intercept)    5.079    5.053    -0.8904      11.28 0.085   0.9178
x             10.104   10.107     5.3060      15.21 0.002   0.1780

$random
    Groups        Name Std.Dev. MCMCmedian MCMCmean HPD95lower HPD95upper
1        a (Intercept)  66.4057     2.6379   3.1098     1.1580     6.5406
2 Residual               0.9368     0.2828   0.3121     0.1214     0.5694
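In case it's useful for diagnosis, the pattern above looks perfectly systematic. Here is a minimal check (a sketch reusing dat from above; m.c1 and m.c100 are just names I made up): scaling every weight by c = 100 leaves the reported residual variance fixed, divides the random-intercept variance by c, and divides the fixed-effect standard errors by sqrt(c).

m.c1   <- lmer(y ~ x + (1 | a), dat, weights = rep(1,   nrow(dat)))
m.c100 <- lmer(y ~ x + (1 | a), dat, weights = rep(100, nrow(dat)))
VarCorr(m.c1)    ## intercept variance 44.097, residual 0.878
VarCorr(m.c100)  ## intercept variance  0.441, residual 0.878
sqrt(diag(vcov(m.c1))) / sqrt(diag(vcov(m.c100)))  ## both ratios equal 10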
I'd be glad to be enlightened...!
Best
Roger
On Mar 26, 2012, at 10:35 AM PDT, Geoff Brookshire wrote:
> Hi listers,
>
> If you get p-values from likelihood-ratio chi-square tests by comparing
> nested lmer fits with anova(), they look pretty close to the pMCMC
> output:
> fm.1 <- lmer(y ~ block * condition + (1 | as.factor(subject)),
>              weights = weights, REML = FALSE)
> fm.2 <- lmer(y ~ block + condition + (1 | as.factor(subject)),
>              weights = weights, REML = FALSE)
> anova(fm.1, fm.2)
>
> This gives p = .35 for the block*condition interaction. For comparison,
> pvals.fnc gave pMCMC = .3 and Pr(>|t|) < .001. So it looks like the
> p-values derived from t-tests are just way off.
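>
> As a sanity check, the chi-square p-value that anova() reports can be
> reproduced by hand (a sketch, assuming fm.1 and fm.2 were fit with
> REML = FALSE as above; the dropped interaction accounts for one df):
>
> lrt <- as.numeric(2 * (logLik(fm.1) - logLik(fm.2)))
> pchisq(lrt, df = 1, lower.tail = FALSE)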
>
> It seems to me like we should just ignore the Pr(>|t|) output entirely.
> Does anyone who knows more about how these values are computed think
> otherwise?
>
> -- geoff
>
>
> On Mon, Mar 19, 2012 at 3:25 PM, Tom Gijssels <tom.gijssels at gmail.com> wrote:
>
>> Dear R-listers,
>>
>> I'm trying to fit a mixed-effects model with the lmer() function and
>> have run into some issues interpreting the p-values generated by
>> pvals.fnc(). The design is between-subjects, with two fixed effects
>> (condition and block, each with two levels) and one random effect
>> (subject). Additionally, I have a set of weights that I want to include.
>>
>> Looking at the pvals.fnc() output, there appears to be a large
>> discrepancy between the pMCMC values and the t-statistic p-values.
>> Whereas one of the main effects and the interaction are far from
>> significant judging by the pMCMC values, they are highly significant
>> according to the t-statistic p-values (e.g. condition: pMCMC = 0.2294,
>> Pr(>|t|) = 0.0000; condition*block: pMCMC = 0.3296, Pr(>|t|) = 0.0000).
>> I have read that the t-statistic-based p-values are less conservative,
>> but the difference between these two values seems really extreme.
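>>
>> If I understand correctly (an assumption on my part, not something I
>> have verified in the languageR sources), Pr(>|t|) is computed from the
>> t value with df = (number of observations) - (number of fixed-effect
>> coefficients), which with 220 observations makes even modest t values
>> look highly significant. A sketch:
>>
>> t.val <- 5.2        ## hypothetical t statistic from a model summary
>> n <- 220; p <- 4    ## observations and fixed-effect coefficients here
>> 2 * pt(abs(t.val), df = n - p, lower.tail = FALSE)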
>>
>> Below is some code that simulates the design and the data. The original
>> data set has two particular characteristics that might influence the
>> results, so I tried to build them into the mock data: 1) there are fewer
>> observations in block A than in block B; and 2) the weights for
>> observations in block A are generally lower than those for block B.
>>
>> Running this code reproduces the original observation of conflicting
>> pMCMC and t-test p-values. However, when the weights argument is left
>> out of the lmer() call, the two sets of values seem to converge,
>> suggesting that the weights specification is what underlies the problem
>> (see the comparison fit after the code below).
>>
>> In short, my question is whether anyone knows why these values diverge and
>> what I could do to address this issue.
>>
>> Many thanks in advance!
>>
>> Tom
>>
>> library(lme4)
>> library(languageR)
>> set.seed(1)  ## fix the seed so the simulation is reproducible
>>
>> block <- as.factor(c(rep('a', times = 20), rep('b', times = 200)))
>> condition <- as.factor(c(rep(c('x', 'y'), each = 10),
>>                          rep(c('x', 'y'), each = 100)))
>> contrasts(block) <- c(-0.5, 0.5)
>> contrasts(condition) <- c(-0.5, 0.5)
>>
>> subject <- c(rep(1:4, each = 5), rep(1:4, each = 50))
>>
>> ## true effect sizes and error; block A gets low weights, block B high
>> intercept <- 100
>> block.me <- 20
>> condition.me <- 30
>> err <- rnorm(length(block), sd = 20)
>> weights <- c(rep(1, times = 20), rep(10, times = 200))
>>
>> y <- intercept + ifelse(block == 'a', block.me, 0) +
>>      ifelse(condition == 'x', condition.me, 0) +
>>      ifelse(block == 'a' & condition == 'x', 30, 0) +
>>      (subject * 10) + err
>>
>> fm.1 <- lmer(y ~ block * condition + (1 | as.factor(subject)),
>>              weights = weights, REML = FALSE)
>> fm.1.mcmc <- pvals.fnc(fm.1, addPlot = FALSE)
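>>
>> For comparison (a sketch; fm.0 is just a name I made up), the same
>> model without the weights argument, for which the pMCMC and Pr(>|t|)
>> values roughly agree:
>>
>> fm.0 <- lmer(y ~ block * condition + (1 | as.factor(subject)),
>>              REML = FALSE)
>> fm.0.mcmc <- pvals.fnc(fm.0, addPlot = FALSE)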
>>
--
Roger Levy Email: rlevy at ucsd.edu
Assistant Professor Phone: 858-534-7219
Department of Linguistics Fax: 858-534-4789
UC San Diego Web: http://idiom.ucsd.edu/~rlevy