[R] Can't find the error in a Binomial GLM I am doing, please help
lincoln
miseno77 at hotmail.com
Mon May 7 19:05:37 CEST 2012
Hi all,
I can't find the error in a binomial GLM I have fitted. I want to use a
binomial GLM because there is more than one explanatory variable (all
categorical) and a binary response variable.
This is what my data set looks like:
> str(data)
'data.frame':   1004 obs. of  5 variables:
 $ site  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ sex   : Factor w/ 2 levels "0","1": NA NA NA NA 1 NA NA NA NA NA ...
 $ age   : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ cohort: Factor w/ 11 levels "1996","2000",..: 11 11 11 11 11 11 11 11 11 11 ...
 $ birth : Factor w/ 3 levels "5","6","7": 3 3 2 2 2 2 2 2 2 2 ...
I know that there should be a strong effect of the variable "cohort" on the
variable "site", particularly for one level of "cohort" (the 2004 value), so
I ran a chi-squared test. It rejects the null hypothesis of independence,
i.e. the distribution of "cohort" differs between sites:
> (chisq.test(data$site,data$cohort))
Pearson's Chi-squared test
data: data$site and data$cohort
X-squared = 82.6016, df = 10, *p-value = 1.549e-13*
Warning message:
In chisq.test(data$site, data$cohort) :
Chi-squared approximation may be incorrect
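The warning above usually means that some expected cell counts are small (a common rule of thumb is below 5). A minimal sketch of how this could be checked, using made-up counts rather than the data above (the names `toy_site` and `toy_cohort` and all values are assumptions for illustration):

```r
# Hypothetical toy data, NOT the poster's data: one sparse cohort ("1996")
toy_cohort <- factor(rep(c("1996", "2004", "2010"), times = c(4, 98, 98)))
toy_site <- c(rep(0, 4),
              rep(c(1, 0), times = c(80, 18)),
              rep(c(1, 0), times = c(30, 68)))
tab <- table(toy_site, toy_cohort)

# Expected counts under independence; small values trigger the warning
expected <- suppressWarnings(chisq.test(tab))$expected
print(expected)

# Monte Carlo p-value, which does not rely on the chi-squared approximation
set.seed(1)
sim <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)
print(sim$p.value)
```

With the real data the same two calls would be applied to `table(data$site, data$cohort)`.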
After that, I tried a binomial GLM with all the explanatory variables, but
no variable came out significant, not even cohort. For this reason I tried
using cohort as the only predictor, and I get this:
> BinomialGlm <- glm(site ~ cohort, data=data,binomial)
> summary(BinomialGlm)
Call:
glm(formula = site ~ cohort, family = binomial, data = data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9239  -0.9365  -0.9365   1.3584   1.6651
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -12.57     324.74  -0.039    0.969
cohort2000     11.47     324.75   0.035    0.972
cohort2001     13.82     324.74   0.043    0.966
cohort2002     12.97     324.74   0.040    0.968
cohort2003     13.66     324.74   0.042    0.966
*cohort2004    14.25     324.74   0.044    0.965*
cohort2006     12.21     324.74   0.038    0.970
cohort2007     11.81     324.74   0.036    0.971
cohort2008     12.41     324.74   0.038    0.970
cohort2009     12.15     324.74   0.037    0.970
cohort2010     11.97     324.74   0.037    0.971
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1369.3 on 1003 degrees of freedom
Residual deviance: 1283.7 on 993 degrees of freedom
AIC: 1305.7
Number of Fisher Scoring iterations: 11
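The two deviances printed above allow a likelihood-ratio test of cohort as a whole, which does not depend on the individual Wald z values. A short worked calculation using only the numbers from the summary (the variable names are illustrative):

```r
# Deviances and degrees of freedom copied from the summary output above
null_dev  <- 1369.3   # null deviance, 1003 df
resid_dev <- 1283.7   # residual deviance, 993 df

lr_stat <- null_dev - resid_dev   # drop in deviance due to cohort
lr_df   <- 1003 - 993             # number of cohort coefficients

# Upper-tail chi-squared probability for the likelihood-ratio statistic
lr_p <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(statistic = lr_stat, df = lr_df, p.value = lr_p))
```

On the fitted object, the same comparison should be available through `anova(BinomialGlm, test = "Chisq")`.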
I tried a simple GLM (Gaussian family) and I get results that look more
sensible:
> GaussGlm <- glm(site ~ cohort, data=data)
> summary(GaussGlm)
Call:
glm(formula = site ~ cohort, data = data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.8429  -0.3550  -0.3550   0.6025   0.7500
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.740e-14  4.762e-01   0.000   1.0000
cohort2000  2.500e-01  5.324e-01   0.470   0.6388
cohort2001  7.778e-01  5.020e-01   1.549   0.1216
cohort2002  6.000e-01  4.880e-01   1.230   0.2192
cohort2003  7.500e-01  4.861e-01   1.543   0.1231
*cohort2004 8.429e-01  4.796e-01   1.757   0.0792 .*
cohort2006  4.118e-01  4.832e-01   0.852   0.3943
cohort2007  3.204e-01  4.785e-01   0.670   0.5033
cohort2008  4.600e-01  4.786e-01   0.961   0.3367
cohort2009  3.975e-01  4.772e-01   0.833   0.4051
cohort2010  3.550e-01  4.768e-01   0.745   0.4567
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for gaussian family taken to be 0.2267955)
Null deviance: 245.40 on 1003 degrees of freedom
Residual deviance: 225.21 on 993 degrees of freedom
AIC: 1372.5
Number of Fisher Scoring iterations: 2
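One pattern that would be consistent with both outputs: the Gaussian intercept is essentially 0 and its standard error (0.4762) matches the square root of the dispersion (sqrt(0.2267955) is about 0.4762), which is what a reference cohort (1996) with a single observation at site = 0 would give. In that case the binomial intercept drifts toward minus infinity and every coefficient, being a contrast against 1996, inherits a huge standard error. A sketch of this on made-up data (the data frame `toy`, the column `cohort_ref`, and all counts are assumptions, not the data above):

```r
# Hypothetical toy data: reference cohort "1996" has one observation, site = 0
toy <- data.frame(
  cohort = factor(rep(c("1996", "2004", "2010"), times = c(1, 100, 100))),
  site   = c(0,
             rep(c(1, 0), times = c(84, 16)),
             rep(c(1, 0), times = c(36, 64)))
)

# Cross-tabulation: look for an empty cell in the reference level
tab <- table(toy$site, toy$cohort)
print(tab)

# Default reference ("1996"): Wald tests are uninformative
fit_default <- glm(site ~ cohort, data = toy, family = binomial)
p_default <- summary(fit_default)$coefficients["cohort2004", "Pr(>|z|)"]

# Same model with a well-populated reference level
toy$cohort_ref <- relevel(toy$cohort, ref = "2010")
fit_releveled <- glm(site ~ cohort_ref, data = toy, family = binomial)
p_releveled <- summary(fit_releveled)$coefficients["cohort_ref2004", "Pr(>|z|)"]

print(c(default = p_default, releveled = p_releveled))
```

Both fits describe the same model with the same deviance; only the baseline for the contrasts changes. The coefficient for the sparse 1996 level would still have a very large standard error after releveling.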
What is going on? Any suggestions or comments?
--
View this message in context: http://r.789695.n4.nabble.com/Can-t-find-the-error-in-a-Binomial-GLM-I-am-doing-please-help-tp4615340.html
Sent from the R help mailing list archive at Nabble.com.