[R-sig-ME] mixed effects model glmer
Ben Bolker
bbolker at gmail.com
Thu Sep 24 05:07:26 CEST 2015
[I'm taking the liberty of cc'ing r-sig-mixed-models on this; please keep the
mailing list in the cc: list when responding]
the main problem, which should be very easy to fix, is simply that
your predictor variable accidentally got turned into a factor!
Try
data1 <- transform(data1,
disease.prevalence=as.numeric(as.character(disease.prevalence)))
Additionally:
there's nothing necessarily wrong with means/year. Take a second look
at things after you have un-mangled your data ...
On Wed, Sep 23, 2015 at 6:26 PM, M West <westm490 at gmail.com> wrote:
> Hi Dr. Bolker,
>
> Thanks for getting back to me.
>
> Just to be clear, disease prevalence is a number in [0,1]?
>
> Not currently, it's just percentage infected out of total (e.g., 20/400).
> I can change it if I need to....
>
>
> Four years is not very many, so you might need to treat Year as a
> fixed effect (e.g. I would consider that option if the random effects
> variance
> is estimated as zero).
>
>
>
> How many sites? How many total observations?
> 15 sites total each year.
>
> Can do. We sample every two weeks for ~3-4 months per year. So, I collapsed
> the data down to
> means/year....maybe that's no good either.
>
>
> I have to admit that I'm stumped by your apparent model output (i.e.
> that there are multiple parameters for disease prevalence when there
> should only be one)
>
> Perhaps you could send the results of summary(data1) and/or
> str(data1) and summary() of your whole model?
>
> Sure, thy are all down below...I can also send you the data and code if
> that's easier.
>
> summary(data1)
>
> Site Site.1 Year Max_P variable_X
> disease prevalence Average.of...male Average.of.Total.Female
> Average.of.total.all
> :57 Min. : 1 :56 :57
> :57 :57 :74 Min. : 2.00 Min.
> :106.0
> Asite : 6 1st Qu.: 4 2009 :15 0 : 2 1.260095428: 2
> 0 : 3 0.633898824: 1 1st Qu.: 8.75 1st Qu.:295.8
> Bsite: 6 Median : 8 2010 :15 0.912379678: 2 10.90217665: 2
> 14.50737238: 2 0.770941362: 1 Median :21.50 Median :414.5
> BCsite : 6 Mean : 8 2011 :15 1.841225901: 2 14.44718559: 2
> 2.089865262: 2 0.853637644: 1 Mean :24.10 Mean :400.8
> Csite: 6 3rd Qu.:12 2014 :15 1.956818222: 2 18.02232207: 2
> 2.208160622: 2 1.008902322: 1 3rd Qu.:30.25 3rd Qu.:513.8
> Dsite : 6 Max. :15 11.20859993: 2 10.69047265: 2 18.5213911 : 2
> 2.643993383: 2 1.00895193 : 1 Max. :89.00 Max. :648.0
> (Other) :62 NA's :89 (Other) :31 (Other) :82
>
>
>
> summary(moda)
>
>
>
> Generalized linear mixed model fit by maximum likelihood (Laplace
> Approximation) ['glmerMod']
> Family: binomial ( logit )
> Formula: y ~ disease.prevalence + (1 | Lake) + (1 | Year)
> Data: males1
>
> AIC BIC logLik deviance df.resid
> 419.3 545.0 -149.7 299.3 0
>
> Scaled residuals:
> Min 1Q Median 3Q Max
> -3.470 0.000 0.000 0.000 1.989
>
> Random effects:
> Groups Name Variance Std.Dev.
> Site (Intercept) 0 0
> Year (Intercept) 0 0
> Number of obs: 60, groups: Lake, 15; Year, 4
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -1.60267 0.11618 -13.794 < 2e-16 ***
> disease.prevalence0 -0.40212 0.15557 -2.585 0.009745 **
> disease.prevalence0.035088231 -1.46452 0.22860 -6.406 1.49e-10 ***
> disease.prevalence0.064935065 -0.36344 0.30810 -1.180 0.238157
> disease.prevalence0.078507945 -2.57479 0.46537 -5.533 3.15e-08 ***
> disease.prevalence0.120039255 -3.30998 0.71915 -4.603 4.17e-06 ***
>
> this goes on forever and then there's this:
>
> Correlation matrix not shown by default, as p = 58 > 20.
> Use print(...., correlation=TRUE) or
> vcov(....) if you need it
>
>
>
>
> str(data1)
> 'data.frame': 149 obs. of 12 variables:
> $ Site : Factor w/ 18 levels "","Alake","BDlake",..: 2
> 3 4 5 6 7 8 9 10 12 ...
> $ Site1 : int 1 2 3 4 5 6 7 8 9 10 ...
> $ Year : Factor w/ 22 levels "","11.20859993",..: 7 7
> 7 7 7 7 7 7 7 7 ...
> $ disease.prevalence : Factor w/ 75 levels "","0","0.035088231",..:
> 8 35 5 69 6 56 11 51 50 26 ...
> $ Average.of...male : Factor w/ 76 levels "","0.633898824",..: 14
> 32 31 61 5 45 11 62 68 60 ...
> $ Average.of.Total.Female : int 7 14 5 31 2 22 6 21 14 24 ...
> $ Average.of.total.all : int 405 553 331 512 274 648 338 388 230 454
> ....
> $ Change.in.percent.females : num 1.581 2.466 2.535 5.183 0.843 ...
>
>
>
>
> On Wed, Sep 23, 2015 at 4:41 PM, Ben Bolker <bbolker at gmail.com> wrote:
>>
>> On Wed, Sep 23, 2015 at 2:36 PM, M West <westm490 at gmail.com> wrote:
>> > I am trying to fit a mixed effects model with repeated measures data.
>> >
>> > Data are:
>> >
>> > y variable = percentage (# females/total)
>> > x variable = percentage
>> >
>> > measured across multiple sites for 4 years.
>> >
>> > here's the model:
>> >
>> > y <- cbind(total females, (Total - total females)))
>> >
>> > mod1 <- with(data, glmer(y ~ disease prevalence + (1|Site) + (1|Year),
>> > family = binomial, data = data1))
>>
>> Just to be clear, disease prevalence is a number in [0,1]?
>> >
>> > 1) This model runs, but the summary(mod1) just generates a series of the
>> > following....which doesn't make any sense so something must be wrong
>> > with
>> > the model specification...I'm just not sure what.
>> >
>> > 2) Also, what is the default AR correlation on these models (i.e., do I
>> > need to specify it or is the temporal psuedoreplication taken care of)?
>>
>> AR models are not currently easy in lme4. My suggestion (=hack) would
>> be to get the residuals and use nlme::gls(resid~1,correlation=corAR1())
>> (or
>> something like that) to see if you should worry about autoregressive
>> structure.
>>
>> Four years is not very many, so you might need to treat Year as a
>> fixed effect (e.g. I would consider that option if the random effects
>> variance
>> is estimated as zero)
>>
>> How many sites? How many total observations?
>>
>> I have to admit that I'm stumped by your apparent model output (i.e.
>> that there are multiple parameters for disease prevalence when there
>> should only be one)
>>
>> Perhaps you could send the results of summary(data1) and/or
>> str(data1) and summary() of your whole model?
>>
>> >
>> > 3) Finally, do you suggest another form of the model that's better etc.?
>> >
>> > Fixed effects:
>> > Estimate Std. Error z
>> > value
>> > Pr(>|z|)
>> > (Intercept) -1.60267 0.11618 -13.794 <
>> > 2e-16 ***
>> > disease prevalence -0.40212 0.15557 -2.585 0.009745
>> > **
>> > disease prevalence 0.035088231 -1.46452 0.22860 -6.406
>> > 1.49e-10
>> > ***
>> > disease prevalence 0.064935065 -0.36344 0.30810 -1.180
>> > 0.238157
>> >
>> > disease prevalence 0.078507945 -2.57479 0.46537 -5.533
>> > 3.15e-08
>> > ***
>> > disease prevalence 0.120039255 -3.30998 0.71915 -4.603
>> > 4.17e-06
>> > ***
>> > disease prevalence 0.182623706 -0.14362 0.19899 -0.722
>> > 0.470438
>> >
>> >
>
>
More information about the R-sig-mixed-models
mailing list