[R-sig-ME] mixed effects model proportion data that are not 0 and 1's

John Sorkin jsorkin at grecc.umaryland.edu
Mon Oct 12 20:15:17 CEST 2015


Would a log-linear model (e.g. Poisson regression) with an offset (to allow for fractions rather than counts) do what you want to do?

John
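A minimal sketch of the offset idea, for illustration only. The data are simulated, and the names total, prevalence and females are hypothetical stand-ins, not the poster's variables. It models the count of females with log(total population) as an offset, so the coefficients describe the rate of females per individual rather than the raw count. This is a fixed-effects fit only; a random effect for site would need a mixed-model function such as glmer() from lme4, which is not shown here.

```r
# Simulated data: 75 site-by-month samples (an assumption for illustration)
set.seed(42)
n_obs      <- 75
total      <- sample(50:200, n_obs, replace = TRUE)  # total population per sample
prevalence <- runif(n_obs, 0, 0.6)                    # proportion infected

# Count of females, generated with a log-linear dependence on prevalence
females <- rpois(n_obs, lambda = total * exp(log(0.5) - 0.5 * prevalence))

# Poisson regression; offset(log(total)) fixes the coefficient of
# log(total) at 1, so the model is for females / total on the log scale
fit_pois <- glm(females ~ prevalence + offset(log(total)), family = poisson)

coef(fit_pois)
```

exp(coef(fit_pois)) gives the multiplicative change in the female fraction per unit change in prevalence.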



John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 

>>> M West <westm490 at gmail.com> 10/12/15 1:42 PM >>>
Hello,

I am trying to decide the best approach for analyzing a short time series.

The goal is to see if there is a significant relationship between the
percentage change in females (# females/total population during epidemic -
# females/total population pre-epidemic) and disease prevalence (continuous
variable: percentage infected).

Samples are across 15 sites and 5 months. So a very short time series.

Here's the model and weights that I've used thus far.
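library(nlme)

# Variance structure: a separate residual variance per month, combined with
# a variance that changes exponentially with the proportion infected
v3 <- varComb(varIdent(form = ~ 1 | Month),
              varExp(form = ~ Proportion_infected))
mod <- lme(Change_in_proportion_females ~ Proportion_infected + Month,
           random = ~ 1 | Site/Month, weights = v3)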

The main problems are:
1) If I plot the residuals vs. the fitted values, there is a strong
relationship demonstrating that the variance increases with the mean. How do
I account for this in a mixed effects model? I've tried the weights option
(with a couple of variations), however, I receive an error message:
"iteration limit reached without convergence"
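One thing that is sometimes tried when lme() stops with this message is a simpler variance function together with higher iteration limits. The sketch below is an illustration under assumptions, not a diagnosis of the model above: the data are simulated, the random effect is reduced to Site alone (assuming one observation per site and month, in which case a Month-within-Site effect cannot be separated from the residual), and only varExp() is used. lmeControl() and its maxIter, msMaxIter and opt arguments are part of nlme.

```r
library(nlme)

# Simulated data: 15 sites x 5 months, one observation per combination
set.seed(1)
dat <- expand.grid(Site = factor(1:15), Month = factor(1:5))
dat$Proportion_infected <- runif(nrow(dat), 0, 0.6)

site_eff <- rnorm(15, sd = 0.02)                       # random site intercepts
resid_sd <- 0.01 * exp(3 * dat$Proportion_infected)    # spread grows with prevalence
dat$Change_in_proportion_females <-
  0.1 * dat$Proportion_infected +
  site_eff[as.integer(dat$Site)] +
  rnorm(nrow(dat), sd = resid_sd)

# Raise the iteration limits and switch the optimizer from nlminb to optim
ctrl <- lmeControl(maxIter = 200, msMaxIter = 200, opt = "optim")

fit <- lme(Change_in_proportion_females ~ Proportion_infected + Month,
           random  = ~ 1 | Site,
           weights = varExp(form = ~ Proportion_infected),
           data    = dat,
           control = ctrl)

# Residuals standardized by the fitted variance function; plotted against
# the fitted values these should no longer fan out
std_res <- resid(fit, type = "normalized")
```

plot(fitted(fit), std_res) is the corresponding diagnostic plot.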

2) I also have a question about the way that R treats proportion data that
are not 0 and 1's when using glm (or glmer) with the binomial distribution.
Crawley, for example, suggests that you create a y variable where you
account for the total number of observations (e.g., y <- cbind(total
females, (total population - total females))) and then run a glm (or in my
case, a generalized linear mixed model) using the binomial distribution.

But what is actually going on under the hood here? Is R running some sort
of proportion test?  All the examples that I have found for proportion data
or non-normal residuals suggest using a logistic regression approach with
the binomial distribution. Is this also the best option for proportion data
that are not 0 and 1's (i.e., my data on the percentage change would not
match the typical logistic regression plot)?
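As a small illustration of what the cbind() form does in glm(): it fits an ordinary logistic regression in which each row contributes a binomial likelihood with its own number of trials. Supplying the observed proportion as the response with the totals as weights gives the same fit. The data below are simulated and the variable names are hypothetical. Note that this form needs counts of successes and failures; a difference of two proportions can be negative, so it cannot be entered this way directly.

```r
# Simulated counts: females out of a known total in each sample
set.seed(7)
n_obs      <- 75
total      <- sample(50:200, n_obs, replace = TRUE)
prevalence <- runif(n_obs, 0, 0.6)
females    <- rbinom(n_obs, size = total,
                     prob = plogis(0.2 - 1.5 * prevalence))

# Form 1: two-column response of successes and failures
fit_counts <- glm(cbind(females, total - females) ~ prevalence,
                  family = binomial)

# Form 2: proportion as the response, number of trials as weights
fit_props <- glm(females / total ~ prevalence,
                 weights = total, family = binomial)

# Compare the coefficient estimates from the two forms
all.equal(coef(fit_counts), coef(fit_props))
```

In both forms the totals determine how much each row counts, so a proportion based on 200 individuals carries more information than one based on 50.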

Many thanks in advance.
M.


_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models





