[R] Reporting binomial logistic regression from R results

Frodo Jedi frodojedi.mailinglist at gmail.com
Mon Nov 12 20:09:03 CET 2018


Dear Peter and Eik,
I am very grateful to you for your replies.
My current understanding is that from the GLM output I can indeed conclude
that the response for System A is significantly different from that for
System B, while the pairwise comparison A vs C is not significant. Of my
Wald tests, only the one for System B vs System C was specified correctly,
and it indicates that the B vs C comparison is significant. Am I correct?
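
To double-check my understanding of the contrast vectors, and assuming
treatment coding with System A as the baseline (so that the coefficients
are the intercept, B-A and C-A), I think the single-df Wald tests for the
pairwise comparisons and the 2-df overall test would look as follows;
please correct me if the contrast vectors are wrong:

> library(aod)
> # B vs A: tests the SystemB coefficient (B - A)
> wald.test(b = coef(fit), Sigma = vcov(fit), L = cbind(0, 1, 0))
> # C vs A: tests the SystemC coefficient (C - A)
> wald.test(b = coef(fit), Sigma = vcov(fit), L = cbind(0, 0, 1))
> # B vs C: difference of the dummy coefficients, (B - A) - (C - A) = B - C
> wald.test(b = coef(fit), Sigma = vcov(fit), L = cbind(0, 1, -1))
> # overall effect of System: both non-intercept coefficients, 2 df
> wald.test(b = coef(fit), Sigma = vcov(fit), Terms = 2:3)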

However, my current understanding is also that I should use contrasts
instead of the Wald test. The default treatment contrasts take System A as
the baseline, so I should refit the GLM with a different baseline. I tried
the "contrasts" argument of glm():

> fit1 <- glm(Response ~ System, data = scrd, family = "binomial",
+             contrasts = contr.treatment(3, base = 1, contrasts = TRUE))
> summary(fit1)

> fit2 <- glm(Response ~ System, data = scrd, family = "binomial",
+             contrasts = contr.treatment(3, base = 2, contrasts = TRUE))
> summary(fit2)

> fit3 <- glm(Response ~ System, data = scrd, family = "binomial",
+             contrasts = contr.treatment(3, base = 3, contrasts = TRUE))
> summary(fit3)

However, the outputs of these three summary() calls are identical. Why?
That option should have changed the baseline, but apparently this is not
the case.
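
For reference, this is what I plan to try next. My (possibly wrong)
assumption is that the "contrasts" argument of glm() expects a named list
with one entry per factor, and that relevel() is an alternative way to
change the reference level:

> # assumption: the contrast matrix must be supplied in a named list, per factor
> fitB <- glm(Response ~ System, data = scrd, family = "binomial",
+             contrasts = list(System = contr.treatment(3, base = 2)))
> summary(fitB)
> # alternative: change the reference level of the factor itself
> scrd$System <- relevel(factor(scrd$System), ref = "B")
> fitB2 <- glm(Response ~ System, data = scrd, family = "binomial")
> summary(fitB2)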


Another analysis I found online (at this link:
https://stats.stackexchange.com/questions/60352/comparing-levels-of-factors-after-a-glm-in-r)
for comparing the 3 levels is to use glht() with Tukey contrasts. I
performed the following:

> library(multcomp)
> summary(glht(fit, mcp(System="Tukey")))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: glm(formula = Response ~ System, family = "binomial", data = scrd)

Linear Hypotheses:
           Estimate Std. Error z value Pr(>|z|)
B - A == 0  -1.2715     0.3379  -3.763 0.000445 ***
C - A == 0   0.8588     0.4990   1.721 0.192472
C - B == 0   2.1303     0.4512   4.722  < 1e-04 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)


Is this Tukey analysis correct?
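
If this Tukey analysis is the right one, my plan for reporting would be to
quote odds ratios with simultaneous confidence intervals. This is only a
sketch of what I have in mind, assuming that exponentiating the glht
estimates gives odds ratios for the pairwise comparisons:

> # odds ratios and simultaneous 95% confidence intervals for the comparisons
> tuk <- glht(fit, mcp(System = "Tukey"))
> exp(coef(tuk))
> exp(confint(tuk)$confint)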


I am a bit confused about which analysis I should do. I am doing my very
best to study all the resources I can find, but I would really need some
help from experts, especially in using R.


Best wishes

FJ






On Mon, Nov 12, 2018 at 1:46 PM peter dalgaard <pdalgd using gmail.com> wrote:

> Yes, only one of the pairwise comparisons (B vs. C) is right. Also, the
> overall test has 3 degrees of freedom whereas a comparison of 3 groups
> should have 2. You (meaning Frodo) are testing that _all 3_ regression
> coefficients are zero, intercept included. That would imply that all three
> systems have response probabilities of 0.5, which is not likely what you
> want.
>
> This all suggests that you are struggling with the interpretation of the
> regression coefficients and their role in the linear predictor. This should
> be covered by any good book on logistic regression.
>
> -pd
>
> > On 12 Nov 2018, at 14:15 , Eik Vettorazzi <E.Vettorazzi using uke.de> wrote:
> >
> > Dear Jedi,
> > please use the source carefully. A and C are not statistically different
> > at the 5% level, which can be inferred from the glm output. Your last two
> > wald.test calls don't test what you want them to, since your model contains
> > an intercept term. You specified contrasts which test A vs B-A, i.e.
> > A - (B-A) == 0 <-> 2*A - B == 0, which is not what you intend, I think.
> > Have a look at ?contr.treatment and re-read your source doc to get an idea
> > of what dummy coding and indicator variables are about.
> >
> > Cheers
> >
> >
> > On 12.11.2018 at 02:07, Frodo Jedi wrote:
> >> Dear list members,
> >> I need some help in understanding whether I am correctly performing a
> >> binomial logistic regression and whether I am interpreting the results
> >> correctly. I would also appreciate advice on how to report the results
> >> from the R functions.
> >> I want to report the results of a binomial logistic regression in which I
> >> assess the difference between the 3 levels of a factor (called System) on
> >> the dependent variable (called Response), which takes two values, 0 and 1.
> >> My goal is to understand whether the 3 systems (A, B, C) affect Response
> >> differently in a significant way. I am basing my analysis on this URL:
> >> https://stats.idre.ucla.edu/r/dae/logit-regression/
> >> This is the result of my analysis:
> >>> fit <- glm(Response ~ System, data = scrd, family = "binomial")
> >>> summary(fit)
> >> Call:
> >> glm(formula = Response ~ System, family = "binomial", data = scrd)
> >> Deviance Residuals:
> >>     Min       1Q   Median       3Q      Max
> >> -2.8840   0.1775   0.2712   0.2712   0.5008
> >> Coefficients:
> >>             Estimate Std. Error z value Pr(>|z|)
> >> (Intercept)   3.2844     0.2825  11.626  < 2e-16 ***
> >> SystemB      -1.2715     0.3379  -3.763 0.000168 ***
> >> SystemC       0.8588     0.4990   1.721 0.085266 .
> >> ---
> >> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >> (Dispersion parameter for binomial family taken to be 1)
> >>     Null deviance: 411.26  on 1023  degrees of freedom
> >> Residual deviance: 376.76  on 1021  degrees of freedom
> >> AIC: 382.76
> >> Number of Fisher Scoring iterations: 6
> >> Following this analysis I perform the wald test in order to understand
> >> whether there is an overall effect of System:
> >> library(aod)
> >>> wald.test(b = coef(fit), Sigma = vcov(fit), Terms = 1:3)
> >> Wald test:
> >> ----------
> >> Chi-squared test:
> >> X2 = 354.6, df = 3, P(> X2) = 0.0
> >> The chi-squared test statistic of 354.6, with 3 degrees of freedom is
> >> associated with a p-value < 0.001 indicating that the overall effect of
> >> System is statistically significant.
> >> Now I check whether there are differences between the coefficients using
> >> again the wald test:
> >> # Here difference between system B and C:
> >>> l <- cbind(0, 1, -1)
> >>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
> >> Wald test:
> >> ----------
> >> Chi-squared test:
> >> X2 = 22.3, df = 1, P(> X2) = 2.3e-06
> >> # Here difference between system A and C:
> >>> l <- cbind(1, 0, -1)
> >>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
> >> Wald test:
> >> ----------
> >> Chi-squared test:
> >> X2 = 12.0, df = 1, P(> X2) = 0.00052
> >> # Here difference between system A and B:
> >>> l <- cbind(1, -1, 0)
> >>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
> >> Wald test:
> >> ----------
> >> Chi-squared test:
> >> X2 = 58.7, df = 1, P(> X2) = 1.8e-14
> >> My understanding is that from this analysis I can state that the three
> >> systems lead to a significantly different Response. Am I right? If so, how
> >> should I report the results of this analysis? What is the correct way?
> >> Thanks in advance
> >> Best wishes
> >> FJ
> >
> > --
> > Eik Vettorazzi
> >
> > Department of Medical Biometry and Epidemiology
> > University Medical Center Hamburg-Eppendorf
> >
> > Martinistrasse 52
> > building W 34
> > 20246 Hamburg
> >
> > Phone: +49 (0) 40 7410 - 58243
> > Fax:   +49 (0) 40 7410 - 57790
> > Web: www.uke.de/imbe
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
