[R-sig-ME] Inflated t-values when using weights with inadequate sample size in lme4?

Mon Mar 28 23:08:17 CEST 2011

Sorry for not mentioning that earlier -- yes, the lm summaries included the sample size weights (although the estimates do not differ much with or without weights).
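
A minimal sketch of the comparison, on simulated data rather than the real dataset. All names (`AD`, `Y`, `Age`, `X2`, `n_obs`, `House`, `w_scaled`) are placeholders, and the lme4 part only runs if the package is installed. It fits the weighted models twice, once with the raw sample-size weights and once with the same weights rescaled to mean 1. For lm the t-values cannot change under that rescaling, so a change in the mixed-model t-values would suggest the scale of the weights is involved, not just the small sample.

```r
# Simulated stand-in for the real data: 28 people in 15 households.
# Every variable name here is a placeholder, not the real column name.
set.seed(1)
house_id <- c(1:15, sample(1:15, 13, replace = TRUE))  # each house appears at least once
n <- length(house_id)
AD <- data.frame(
  House = factor(house_id),
  Age   = round(runif(n, 18, 70)),
  n_obs = sample(5:60, n, replace = TRUE)  # number of times X2 was recorded
)
AD$X2 <- rbinom(n, AD$n_obs, 0.4) / AD$n_obs  # observed proportion
house_eff <- rnorm(15, sd = 1)                # household-level effect
AD$Y <- 0.05 * AD$Age + 2 * AD$X2 + house_eff[house_id] + rnorm(n)

# Same weights, rescaled to have mean 1.
AD$w_scaled <- AD$n_obs / mean(AD$n_obs)

# Weighted OLS with raw and rescaled weights.
t_lm_raw    <- coef(summary(lm(Y ~ Age + X2, data = AD, weights = n_obs)))[, "t value"]
t_lm_scaled <- coef(summary(lm(Y ~ Age + X2, data = AD, weights = w_scaled)))[, "t value"]

# For lm, multiplying all weights by a constant leaves the t-values unchanged.
print(all.equal(t_lm_raw, t_lm_scaled))

# Same check for the mixed model, if lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  m_raw    <- lme4::lmer(Y ~ Age + X2 + (1 | House), data = AD, weights = n_obs)
  m_scaled <- lme4::lmer(Y ~ Age + X2 + (1 | House), data = AD, weights = w_scaled)
  print(cbind(raw    = coef(summary(m_raw))[, "t value"],
              scaled = coef(summary(m_scaled))[, "t value"]))
}
```

With so few households the mixed model may report a singular fit on some simulated datasets; that is a message, not an error, and the t-value comparison still prints.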

Thanks again,
Jeremy

--- On Mon, 3/28/11, Ista Zahn <izahn at psych.rochester.edu> wrote:

> From: Ista Zahn <izahn at psych.rochester.edu>
> Subject: Re: [R-sig-ME] Inflated t-values when using weights with inadequate sample size in lme4?
> To: "Jeremy Koster" <helixed2 at yahoo.com>
> Cc: r-sig-mixed-models at r-project.org
> Date: Monday, March 28, 2011, 2:12 PM
> Hi Jeremy,
> Just to clarify, are the lm summaries you are comparing to computed
> using the sample size weights as well?
> 
> On Mon, Mar 28, 2011 at 1:15 PM, Jeremy Koster <helixed2 at yahoo.com>
> wrote:
> > A colleague approached me with concerns that there is clustering in her
> > dataset that is unaddressed when using conventional OLS regression. She
> > had been using GEE, but she expressed a desire to replicate her analysis
> > using mixed-effects modeling, in part because her GEE package doesn't
> > generate AIC measures (I don't use GEE, so I couldn't comment).
> >
> > The tricky part is that her sample size is so small. She has 28 people
> > who are (nested) members of 15 households, and multiple households
> > therefore have only singletons as representatives. A scatterplot
> > nevertheless suggests that there are within-household correlations in
> > households with multiple members.
> >
> > So although I cautioned her that the estimated household-level variance
> > would be practically useless, I thought it might be worthwhile to see how
> > a mixed-effects model changed the estimates of fixed effects in
> > comparison to a conventional OLS model.
> >
> > An added consideration in her model is that the second explanatory
> > variable is a proportion, weighted by the number of times she was able to
> > record the data for that person.
> >
> > The response variable is continuous, and I therefore specified this
> > model:
> >
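> > # Continuous response, so lmer() is used here in place of glmer().
> > # X2_n is a placeholder name for the column holding the sample size of X2.
> > mixed.model <- lmer(Y ~ Age + X2 + (1 | House),
> >        data = AD, REML = TRUE, weights = X2_n, verbose = TRUE)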
> >
> > Strangely, the t-values produced by this model were all about 10 times
> > higher than I would have expected based on the lm summary. For example,
> > the t-value of "Age" went from 2.9 to 26.5. The same holds true if I
> > re-run the model using ML (i.e., REML = F).
> >
> > When I re-specify the model without weights, however, the t-values are
> > generally comparable to the lm estimates ... albeit a little more
> > conservative, as I would have expected. (I get similar results when
> > re-running the model in MLwiN.)
> >
> > Any idea why the specification of weights would lead to the apparent
> > inflation of t-values?
> >
> > As an aside, we both recognized that the sample size was generally
> > inadequate for mixed-effects modeling, but we're curious to what extent
> > the anomalous results are attributable to the sample size as compared to
> > other possible explanations.
> >
> > Many thanks,
> > Jeremy
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> 
> 
> 
> -- 
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>