[R] Can't find the error in a Binomial GLM I am doing, please help

Bert Gunter gunter.berton at gene.com
Mon May 7 19:39:18 CEST 2012


1. As this is a statistical, rather than an R issue, you would do
better posting on a statistical help site like stats.stackexchange.com
(although some generous soul here may respond).

2. You would also probably do better consulting with a local
statistical resource if available, as it is difficult to explain such
issues remotely.

Cheers,
Bert

On Mon, May 7, 2012 at 10:05 AM, lincoln <miseno77 at hotmail.com> wrote:
> Hi all,
>
> I can't find the error in the binomial GLM I have fitted. I want to use it
> because there is more than one explanatory variable (all categorical) and
> a binary response variable.
> This is what my data set looks like:
>> str(data)
> 'data.frame':   1004 obs. of  5 variables:
>  $ site  : int  0 0 0 0 0 0 0 0 0 0 ...
>  $ sex   : Factor w/ 2 levels "0","1": NA NA NA NA 1 NA NA NA NA NA ...
>  $ age   : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
>  $ cohort: Factor w/ 11 levels "1996","2000",..: 11 11 11 11 11 11 11 11 11 11 ...
>  $ birth : Factor w/ 3 levels "5","6","7": 3 3 2 2 2 2 2 2 2 2 ...
>
> I know that, particularly for one level of the variable "cohort" (the 2004
> value), there should be a strong effect of "cohort" on "site", so I ran a
> chi-square test, which rejects the null hypothesis that "cohort" is
> distributed the same way across sites:
>
>> (chisq.test(data$site,data$cohort))
>
>        Pearson's Chi-squared test
>
> data:  data$site and data$cohort
> X-squared = 82.6016, df = 10, *p-value = 1.549e-13*
>
> Warning message:
> In chisq.test(data$site, data$cohort) :
>  Chi-squared approximation may be incorrect
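
That warning usually means that some expected cell counts are small, so
the large-sample chi-squared approximation is shaky. A minimal sketch on
made-up data of how one might inspect the expected counts and fall back on
a simulated p-value. The data frame `toy` and all of its values are
invented for illustration and are not the poster's data:

```r
# Toy data only: a binary 'site' and a factor 'cohort' with one sparse level.
set.seed(1)
toy <- data.frame(
  site   = c(rep(0, 4), rbinom(200, 1, 0.4)),
  cohort = factor(c(rep("1996", 4),
                    sample(c("2004", "2010"), 200, replace = TRUE)))
)

tab <- table(toy$cohort, toy$site)
tab                                  # observed counts per cohort and site

res <- suppressWarnings(chisq.test(tab))
res$expected                         # expected counts; small cells cause the warning

# Monte Carlo p-value, which does not rely on the large-sample approximation
chisq.test(tab, simulate.p.value = TRUE, B = 2000)
```

Looking at `table(data$cohort, data$site)` on the real data in the same way
would show which cohorts are thinly populated.
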
>
>
>
>
> After that, I tried a binomial GLM with all the explanatory variables, but
> none of them came out significant, not even cohort. For this reason I tried
> using cohort as the only predictor, and I get this:
>
>
>> BinomialGlm <- glm(site ~  cohort, data=data,binomial)
>> summary(BinomialGlm)
>
> Call:
> glm(formula = site ~ cohort, family = binomial, data = data)
>
> Deviance Residuals:
>    Min       1Q   Median       3Q      Max
> -1.9239  -0.9365  -0.9365   1.3584   1.6651
>
> Coefficients:
>            Estimate Std. Error z value Pr(>|z|)
> (Intercept)   -12.57     324.74  -0.039    0.969
> cohort2000     11.47     324.75   0.035    0.972
> cohort2001     13.82     324.74   0.043    0.966
> cohort2002     12.97     324.74   0.040    0.968
> cohort2003     13.66     324.74   0.042    0.966
> *cohort2004     14.25     324.74   0.044    0.965*
> cohort2006     12.21     324.74   0.038    0.970
> cohort2007     11.81     324.74   0.036    0.971
> cohort2008     12.41     324.74   0.038    0.970
> cohort2009     12.15     324.74   0.037    0.970
> cohort2010     11.97     324.74   0.037    0.971
>
> (Dispersion parameter for binomial family taken to be 1)
>
>    Null deviance: 1369.3  on 1003  degrees of freedom
> Residual deviance: 1283.7  on  993  degrees of freedom
> AIC: 1305.7
>
> Number of Fisher Scoring iterations: 11
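
The z-tests in that table each compare one cohort with the 1996 baseline,
so they all share its very large standard error. The overall effect of
cohort can instead be judged from the two deviances printed above. A short
sketch of that arithmetic, using only the numbers in the summary (the
variable names `lr_stat`, `lr_df` and `lr_p` are mine):

```r
# Likelihood-ratio test from the deviances printed in the summary output
lr_stat <- 1369.3 - 1283.7     # null deviance minus residual deviance
lr_df   <- 1003 - 993          # difference in degrees of freedom
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

c(statistic = lr_stat, df = lr_df, p.value = lr_p)
```

On the fitted object itself, `anova(BinomialGlm, test = "Chisq")` should
report the same comparison.
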
>
>
>
>
> I tried a simple GLM (gaussian family) and I get results that are more
> sensible:
>
>> GaussGlm <- glm(site ~  cohort, data=data)
>> summary(GaussGlm)
>
> Call:
> glm(formula = site ~ cohort, data = data)
>
> Deviance Residuals:
>    Min       1Q   Median       3Q      Max
> -0.8429  -0.3550  -0.3550   0.6025   0.7500
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept) 5.740e-14  4.762e-01   0.000   1.0000
> cohort2000  2.500e-01  5.324e-01   0.470   0.6388
> cohort2001  7.778e-01  5.020e-01   1.549   0.1216
> cohort2002  6.000e-01  4.880e-01   1.230   0.2192
> cohort2003  7.500e-01  4.861e-01   1.543   0.1231
> *cohort2004  8.429e-01  4.796e-01   1.757   0.0792 .*
> cohort2006  4.118e-01  4.832e-01   0.852   0.3943
> cohort2007  3.204e-01  4.785e-01   0.670   0.5033
> cohort2008  4.600e-01  4.786e-01   0.961   0.3367
> cohort2009  3.975e-01  4.772e-01   0.833   0.4051
> cohort2010  3.550e-01  4.768e-01   0.745   0.4567
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for gaussian family taken to be 0.2267955)
>
>    Null deviance: 245.40  on 1003  degrees of freedom
> Residual deviance: 225.21  on  993  degrees of freedom
> AIC: 1372.5
>
> Number of Fisher Scoring iterations: 2
>
>
>
> What is going on? Any suggestions or comments?
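
One possible reading, offered as a guess from the output alone: the
gaussian intercept is essentially zero, which suggests that every
observation in the baseline cohort (1996) has site == 0. In that case the
binomial fit pushes the baseline probability towards zero, and every
contrast against it inherits a huge standard error. A minimal sketch on
invented data (the data frame `toy`, the column `cohort2` and the model
names are hypothetical) of how the choice of reference level changes the
coefficient table:

```r
# Toy data only: one baseline observation with site == 0, two larger cohorts.
set.seed(42)
toy <- data.frame(
  cohort = factor(c("1996", rep("2004", 100), rep("2010", 100))),
  site   = c(0, rbinom(100, 1, 0.8), rbinom(100, 1, 0.35))
)

# Reference level "1996" holds a single observation with site == 0
fit_ref1996 <- glm(site ~ cohort, family = binomial, data = toy)

# Same model, but with a well-populated cohort as the reference level
toy$cohort2 <- relevel(toy$cohort, ref = "2010")
fit_ref2010 <- glm(site ~ cohort2, family = binomial, data = toy)

coef(summary(fit_ref1996))   # large standard errors on every row
coef(summary(fit_ref2010))   # only the sparse 1996 contrast is unstable

# The overall test of cohort does not depend on the reference level
anova(fit_ref1996, test = "Chisq")
```

Whether this is what happens in the real data can be checked with
`table(data$cohort, data$site)`.
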
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Can-t-find-the-error-in-a-Binomial-GLM-I-am-doing-please-help-tp4615340.html
> Sent from the R help mailing list archive at Nabble.com.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
