[R-sig-ME] Fwd: GLM-normal distribution

Ben Bolker bbolker at gmail.com
Tue Apr 4 23:47:19 CEST 2017

  This isn't really a mixed model question: it would be more appropriate
for a generic stats or stats-ecology forum (e.g.
r-sig-ecology at r-project.org, or CrossValidated).

   A couple of quick points:

- you don't need lme4 at all since you don't have a random effect in
your model
- a rule of thumb is that you shouldn't try to fit more than 1 model
parameter per 10-15 data points, so this model (4 parameters for 19 data
points) is pushing it a bit
- you should not assess normality based on the *marginal* distribution;
instead you should look at the residuals from the model (e.g. see
plot(M2) below)
- if you weight the linear model by number of species (as is probably
appropriate) you get a p-value of 0.052
- your data are slightly underdispersed (less variance than expected
from binomial); if you account for this by using family=quasibinomial
you get almost identical results to the linear model.
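
To make the dispersion point concrete, here is a minimal sketch on simulated data. The data frame `sim` and all values in it are made up; only the column names mirror the real data. It estimates the dispersion as the Pearson chi-square divided by the residual degrees of freedom, then refits with family=quasibinomial and with a weighted lm for comparison.

```r
## Simulated stand-in for the real data (all values are invented)
set.seed(101)
n <- 19
sim <- data.frame(
    anchom    = runif(n, 5, 50),
    tipdecamp = factor(rep(c("a", "b"), length.out = n)),
    exph500   = runif(n, 0, 1),
    tot       = rpois(n, 30) + 5          # total species per site, always >= 5
)
## binomial counts generated from a logit-linear effect of anchom
sim$exot  <- rbinom(n, size = sim$tot, prob = plogis(-1 + 0.02 * sim$anchom))
sim$nativ <- sim$tot - sim$exot

fit_bin <- glm(cbind(exot, nativ) ~ anchom + tipdecamp + exph500,
               data = sim, family = binomial)

## Pearson-based dispersion estimate: near 1 for binomial data,
## below 1 suggests underdispersion, above 1 overdispersion
phi <- sum(residuals(fit_bin, type = "pearson")^2) / df.residual(fit_bin)

## same mean model, but standard errors rescaled by the estimated dispersion
fit_quasi <- update(fit_bin, family = quasibinomial)

## weighted linear model on the proportions, for comparison
fit_lm <- lm(exot / (nativ + exot) ~ anchom + tipdecamp + exph500,
             data = sim, weights = nativ + exot)

phi
coef(summary(fit_quasi))["anchom", ]
coef(summary(fit_lm))["anchom", ]
```

The quasibinomial fit keeps the same coefficients as the binomial fit and multiplies the standard errors by sqrt(phi), so an estimate below 1 shrinks them.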

  Overall I would say you have *weak* evidence at best for an effect of

M1 <- glm(cbind(exot, nativ) ~ anchom + tipdecamp + exph500,
          data = mis.datos1, family = binomial) # the syntax of my model
M2 <- lm(exot/(nativ+exot) ~ anchom + tipdecamp + exph500,
         data = mis.datos1, weights = nativ+exot)
plot(M2)  ## residual diagnostics

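library(ggplot2); theme_set(theme_bw())
library(dplyr); library(tidyr)
## mutate()/gather() calls and plot layers reconstructed from context
d2 <- mis.datos %>%
    mutate(tot=nativ+exot,
           prop_exot=exot/tot) %>%
    select(prop_exot,tot,anchom,tipdecamp,exph500) %>%
    gather(var,value,-c(prop_exot,tot,tipdecamp))

ggplot(d2, aes(value,prop_exot,colour=tipdecamp))+
    geom_point(aes(size=tot))+
    facet_wrap(~var, scales="free_x")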

M3 <- update(M1, family =quasibinomial)

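## scale parameters (predictor list reconstructed from context)
d3 <- mis.datos %>%
    mutate(anchom=drop(scale(anchom)),
           exph500=drop(scale(exph500)))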

M4 <- update(M3, data=d3)
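
As a sketch of the residual check described above (again on simulated data; `sim`, `x` and `fit` are hypothetical names, not the real data set), one could look at the standardized residuals from the weighted fit rather than at the raw proportions:

```r
## Simulated proportions from binomial counts (all values are invented)
set.seed(1)
n <- 19
sim <- data.frame(x = runif(n), tot = rpois(n, 30) + 5)
sim$exot <- rbinom(n, size = sim$tot, prob = plogis(-1 + 2 * sim$x))
sim$prop <- sim$exot / sim$tot

## weighted linear model on the proportions
fit <- lm(prop ~ x, data = sim, weights = tot)

## standardized residuals take the weights and leverage into account
res <- rstandard(fit)

op <- par(mfrow = c(1, 2))
plot(fitted(fit), res, xlab = "fitted", ylab = "standardized residual")
abline(h = 0, lty = 2)
qqnorm(res); qqline(res)
par(op)

## a formal test on the residuals, if one is wanted
## (with only 19 points it has little power either way)
sw <- shapiro.test(res)
sw
```

The marginal distribution of the response mixes the effect of the predictors with the noise, so only the residuals speak to the model's distributional assumption.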

On 17-04-04 05:15 PM, Marcos Monasterolo wrote:
> Dear all. I am doing an analysis on proportion data resulting from counts.
> As I do have the count data available I am running a glm with binomial
> distribution. However, after realizing the response variable is normal
> (Anderson-Darling test did not reject normality of the calculated
> proportions) I am now having second thoughts as to whether it might also be
> possible to run a normal lm with proportion as the response variable. The
> thing is one of the explanatory variables ("ancho", which I am really
> interested in) is not significant in the binomial glm but significant in
> the lm. My understanding is that I should stick with the binomial GLM, but
> I wanted to have an expert opinion on this.
> I provide a working code below. Thanks in advance for your help.
> Marcos
> id <- "0B6X3EoqLHXG-dnZqTXpWSkRPYkE" # google file ID
> mis.datos <- read.table(sprintf("https://docs.google.com/uc?id=%s&export=download", id),
>                         header = TRUE, sep = ";", dec = ",")
> mis.datos1<-mis.datos[-c(3,6,7,8),] #these data points I don't need
> library(nortest)
> ad.test(mis.datos1$propexot)#evaluate normality
> hist(mis.datos1$propexot)
> library(lme4)
> M1 <- glm(cbind(exot, nativ) ~ anchom + tipdecamp + exph500, data =
> mis.datos1, family =binomial)# the syntax of my model
> summary(M1)
> ----
> Biól. Marcos Monasterolo
> Becario doctoral - Cátedra de Botánica General, Facultad de Agronomía, UBA
