[R-sig-eco] deviance as a goodness of fit in GLM

Tue Jul 22 12:31:12 CEST 2014

Hi Samantha,

Following what was already said before, you can also build your models by
testing step-by-step which explanatory variables should be included, for
example by using a Likelihood Ratio Test to compare each univariate model to
the null model (intercept only). Having determined which explanatory
variables are significant, you can then build a main-effects model with
several explanatory variables (the significant ones only), and then start to
experiment with interactions, again comparing each model with interactions
to the main-effects model. By following these steps, you should get close to
the most parsimonious model. Along the way, and as already said by others,
you should check the residuals, and you can also use something such as AIC
(or BIC) to compare all the candidate models.
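
As a rough sketch of the steps above, here is how they could look in R. The
data are simulated and the variable names (temp, density, noise, count) and
the Poisson family are assumptions for illustration only, not your data or
model:

```r
# Simulated example data; all names and the Poisson family are hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(temp = rnorm(n), density = rnorm(n), noise = rnorm(n))
# 'noise' has no true effect on the response in this simulation.
dat$count <- rpois(n, lambda = exp(0.5 + 0.6 * dat$temp + 0.4 * dat$density))

# Null (intercept-only) model and one univariate model per candidate variable.
m_null  <- glm(count ~ 1,       family = poisson, data = dat)
m_temp  <- glm(count ~ temp,    family = poisson, data = dat)
m_dens  <- glm(count ~ density, family = poisson, data = dat)
m_noise <- glm(count ~ noise,   family = poisson, data = dat)

# Likelihood ratio tests: each univariate model against the null model.
lrt_temp  <- anova(m_null, m_temp,  test = "Chisq")
lrt_dens  <- anova(m_null, m_dens,  test = "Chisq")
lrt_noise <- anova(m_null, m_noise, test = "Chisq")
print(lrt_temp)

# Main-effects model with the retained variables, then add the interaction
# and test it against the main-effects model.
m_main <- glm(count ~ temp + density, family = poisson, data = dat)
m_int  <- glm(count ~ temp * density, family = poisson, data = dat)
lrt_int <- anova(m_main, m_int, test = "Chisq")
print(lrt_int)

# AIC table for all candidate models (lower is better).
aic_table <- AIC(m_null, m_temp, m_dens, m_noise, m_main, m_int)
print(aic_table)
```

The p-value of each test is in the "Pr(>Chi)" column of the anova table;
BIC() can be used in the same way as AIC().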

In addition to this, you can use a cross-validation technique to compare
candidate models (see for example the function cv.glm in the package boot).
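
A minimal sketch of such a comparison with cv.glm, again on simulated data
with hypothetical variable names and an assumed Poisson family. The choice
of K = 10 folds is also just an assumption, and the default cost function
of cv.glm (average squared error) is used:

```r
library(boot)

# Simulated example data; names and the Poisson family are hypothetical.
set.seed(2)
n <- 200
dat <- data.frame(temp = rnorm(n), density = rnorm(n))
dat$count <- rpois(n, lambda = exp(0.5 + 0.6 * dat$temp + 0.4 * dat$density))

# Two candidate models fitted to the same data frame.
m_temp <- glm(count ~ temp,           family = poisson, data = dat)
m_main <- glm(count ~ temp + density, family = poisson, data = dat)

# 10-fold cross-validation; delta[1] is the raw estimate of prediction
# error and delta[2] is the bias-adjusted estimate.
cv_temp <- cv.glm(dat, m_temp, K = 10)
cv_main <- cv.glm(dat, m_main, K = 10)

# Smaller cross-validated prediction error indicates better predictions.
cv_errors <- c(temp = cv_temp$delta[1], main = cv_main$delta[1])
print(cv_errors)
```

Because the folds are assigned at random, the estimates change a little
from run to run unless the seed is fixed.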

I hope this helps.

Best regards,
Rui

=========================
Rui Coelho, PhD

Centre of Marine Sciences of Algarve (CCMAR)
University of Algarve, Campus de Gambelas, FCT Ed.7 
8005-139 Faro, Portugal 
Web: http://www.ccmar.ualg.pt

Portuguese Institute for the Ocean and Atmosphere, I.P. (IPMA)
Av. 5 de Outubro s/n
8700-305 Olhão, Portugal
Web: http://www.ipma.pt

Message: 1
Date: Mon, 21 Jul 2014 17:52:28 +0000
From: Samantha PameLa <manthasa_26 at hotmail.com>
To: "r-sig-ecology at r-project.org" <r-sig-ecology at r-project.org>
Subject: [R-sig-eco] deviance as a goodness of fit in GLM
Message-ID: <BLU176-W3254B1DACC52EC58B21AFF99F00 at phx.gbl>
Content-Type: text/plain

Good day everybody, 

I'm a marine biology student working on my bachelor thesis, and I'm stuck
on a statistical question in the process; I hope someone here can help me.
My thesis aims to understand which biological and environmental factors
influence the aggression rate of male California sea lions. For that
purpose I'm using GLMs where the response variable is the male aggression
rate. Right now I am testing the goodness of fit of the global models, and
for that I'm using the deviances as a goodness-of-fit test. I calculated
pseudo-R2 (Zuur, 2009) in order to know the percentage explained by each
candidate model. However, I'm not sure how to choose the good models, since
I am not sure above which percentage of explanation a model can be said to
have a good fit. For my data I am working with three different scenarios,
and it seems that 20% is a good value to indicate the best models, but I'm
not sure how to choose the value.

I thank you in advance for your time and the help you can give me.

Best regards,

Samantha.
