[R-sig-ME] Fwd: GLM-normal distribution

Wed Apr 5 14:40:12 CEST 2017

Marcos,

In addition to Ben Bolker's very sound advice, I'd like to suggest that if you have the technical ability to fit and interpret a GLM, and the GLM is conceptually more appropriate than the lm given the nature of the data, you should just stick with it. The scientific objective should not be to fish for some way to make the effect of interest come out with a significant p-value by running lots of variations on a given analysis; it is to get an accurate and trustworthy answer about your hypothesis without any p-hacking. You may be at risk of introducing opportunistic bias into your research here. Take a look at the paper by DeCoster et al. (2015), which discusses this issue and ways to avoid it.

DeCoster, J., Sparks, E. A., Sparks, J. C., Sparks, G. G., & Sparks, C. W. (2015). Opportunistic biases: Their origins, effects, and an integrated solution. American Psychologist, 70(6), 499-514. doi:10.1037/a0039191

Best regards,

Steven J. Pierce, Ph.D.
Acting Director; Associate Director
Center for Statistical Training & Consulting (CSTAT)
Michigan State University

-----Original Message-----
From: Marcos Monasterolo [mailto:mmonasterolo at agro.uba.ar] 
Sent: Tuesday, April 04, 2017 5:15 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Fwd: GLM-normal distribution

Dear all. I am analysing proportion data derived from counts. Since I have
the count data available, I am running a glm with a binomial distribution.
However, after finding that the calculated proportions look normal (the
Anderson-Darling test did not reject normality), I am now having second
thoughts as to whether it might also be possible to run a normal lm with the
proportion as the response variable. The thing is, one of the explanatory
variables ("anchom", which I am really interested in) is not significant in
the binomial glm but is significant in the lm. My understanding is that I
should stick with the binomial GLM, but I wanted an expert opinion on this.
I provide working code below. Thanks in advance for your help.
Marcos

id <- "0B6X3EoqLHXG-dnZqTXpWSkRPYkE" # Google Drive file ID
# Build the download URL as a single string (a line break inside it breaks the URL)
url <- sprintf("https://docs.google.com/uc?id=%s&export=download", id)
mis.datos <- read.table(url, header = TRUE, sep = ";", dec = ",")
mis.datos1 <- mis.datos[-c(3, 6, 7, 8), ] # these data points I don't need
library(nortest) # provides ad.test()
ad.test(mis.datos1$propexot) # evaluate normality (Anderson-Darling test)
hist(mis.datos1$propexot)
library(lme4) # not required for glm(), which is in the base stats package
M1 <- glm(cbind(exot, nativ) ~ anchom + tipdecamp + exph500,
          data = mis.datos1, family = binomial) # the syntax of my model
summary(M1)
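
For readers who cannot download the file, here is a minimal sketch of the two candidate models fitted side by side on simulated data. It is not the analysis above: the data frame sim, the single predictor, the sample size and every simulated value are assumptions made purely for illustration. It only shows how the two models use the data. The binomial glm works from the counts, so sites with more individuals carry more information, whereas the lm treats every calculated proportion as equally precise.

```r
# Simulated stand-in for the real data; all names and values are hypothetical
set.seed(1)
n_sites <- 30
sim <- data.frame(
  anchom = runif(n_sites, 1, 20),                  # hypothetical predictor values
  total  = sample(5:60, n_sites, replace = TRUE)   # individuals counted per site
)
# Exotic counts drawn from a binomial whose probability depends on anchom
sim$exot     <- rbinom(n_sites, size = sim$total,
                       prob = plogis(-1 + 0.05 * sim$anchom))
sim$nativ    <- sim$total - sim$exot
sim$propexot <- sim$exot / sim$total

# Binomial GLM on the counts (logit link): uses the number of trials per site
m_glm <- glm(cbind(exot, nativ) ~ anchom, data = sim, family = binomial)

# Linear model on the proportions: every site gets equal weight
m_lm <- lm(propexot ~ anchom, data = sim)

# Coefficient tables; the estimates are on different scales (logit vs. proportion)
coef(summary(m_glm))
coef(summary(m_lm))
```

Note that the two coefficient tables are not directly comparable: the glm slope is a change in log-odds per unit of the predictor, while the lm slope is a change in the proportion itself.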

----
Biól. Marcos Monasterolo
Doctoral fellow - Cátedra de Botánica General, Facultad de Agronomía, UBA
