[R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)
Peng, C
cpeng.usm at gmail.com
Sun Sep 5 15:32:55 CEST 2010
Calum-4 wrote:
>
> Hi I know asking which test to use is frowned upon on this list... so
> please do read on for at least a couple of sentences...
>
> I have some multivariate data split as follows
>
> Tumour Site (one of 5 categories) #
> Chemo Schedule (one of 3 cats) ##
> Cycle (one of 3 cats*) ##
> Dose (one of 3 cats*) #
>
> *These are actually integers but for all our other analyses so far we
> have grouped them into logical bands of categories.
>
> The dependent variable is "Reaction" or "No Reaction"
>
> I have individually analysed each of the independent variables against
> Reaction/No Reaction using ChiSq and Fisher Tests. Those marked ##
> produced p values less than 0.05, and those marked # produce p values
> close to 0.05.
>
> We believe that Cycle is the crucial piece of data - the others just
> appear to be different because there are more early cycles in certain
> groups than others.
>
> SO - I believe what I need to do is a Linear Logistic Regression on the
> 4 independent variables. And I'm expecting it to show that the tumour
> site, schedule and dose don't matter, only the cycle matters. Done a lot
> of reading and I'm clueless!!
>
> I think I want to do something like:
>
> glm (reaction ~ site + sched + cycle + dose, data=mydata, family=poisson)
> =========================
> Comment 1: If you stick to Linear Logistic Regression, the family should
> be "binomial" assuming that reaction has only two values (Yes/No).
> "family=poisson" should be used when the response is a frequency count
> such as the number of tumors.
> =========================
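A minimal sketch of the corrected call follows. The data frame is simulated purely for illustration: the variable names are taken from the formula above, but the level labels, sample size, and reaction probabilities are all assumptions, not the poster's data.

```r
# Simulated stand-in for the poster's data; every value here is made up.
set.seed(1)
n <- 300
mydata <- data.frame(
  site  = factor(sample(paste0("site", 1:5), n, replace = TRUE)),
  sched = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  cycle = factor(sample(c("early", "mid", "late"), n, replace = TRUE),
                 levels = c("early", "mid", "late")),
  dose  = factor(sample(c("low", "medium", "high"), n, replace = TRUE),
                 levels = c("low", "medium", "high"))
)

# Assumption for the simulation only: reactions are more likely in early cycles.
p <- ifelse(mydata$cycle == "early", 0.5, 0.2)
mydata$reaction <- factor(ifelse(runif(n) < p, "Reaction", "No Reaction"),
                          levels = c("No Reaction", "Reaction"))

# Binary response, so family = binomial (logit link by default).
# With a factor response, the first level ("No Reaction") counts as failure.
fit <- glm(reaction ~ site + sched + cycle + dose,
           data = mydata, family = binomial)
summary(fit)
```

If reaction is stored as 0/1 rather than as a factor, the same call works unchanged, with 1 treated as the event being modelled.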
>
> I am then expecting to see some very long output with lots of numbers...
> ...my question is TWO fold -
>
> 1. is glm the right thing to use before I waste my time
>
> and 2. how do I interpret the result! (I kind of expect a lecture here
> as I'm really looking for a nice snappy 'p<0.05 means this variable is
> the one having the influence' type answer and I suspect I'm going to be
> told that's not possible...!)
> ================================================================
> Comment 2: The regression coefficients in a binary logistic regression
> model are log odds ratios; exponentiating a coefficient gives the odds
> ratio relative to the reference category. Interpreting odds ratios can be
> tricky, but the p-value is interpreted in the usual way.
> ================================================================
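As a rough illustration of reading such a fit, the sketch below uses a small simulated data set (two hypothetical three-level factors, with all values invented). Note that a three-level factor produces two coefficients, so the per-coefficient p-values in summary() do not directly answer "does this variable matter"; one common option, shown here as a suggestion rather than the only approach, is a likelihood-ratio test per variable via drop1().

```r
# Simulated data for illustration only; names and effect sizes are assumptions.
set.seed(2)
n <- 300
d <- data.frame(
  cycle = factor(sample(c("early", "mid", "late"), n, replace = TRUE),
                 levels = c("early", "mid", "late")),
  dose  = factor(sample(c("low", "medium", "high"), n, replace = TRUE),
                 levels = c("low", "medium", "high"))
)
d$reaction <- rbinom(n, 1, ifelse(d$cycle == "early", 0.5, 0.2))

fit <- glm(reaction ~ cycle + dose, data = d, family = binomial)

# Coefficients are on the log-odds scale; exp() turns them into odds ratios
# relative to the reference level of each factor ("early" and "low" here).
odds_ratios <- exp(coef(fit))
round(odds_ratios, 2)

# Wald p-values for individual coefficients (one per non-reference level).
coef(summary(fit))[, "Pr(>|z|)"]

# One likelihood-ratio test per variable: does dropping the whole factor
# make the fit significantly worse?
lr_tests <- drop1(fit, test = "Chisq")
lr_tests
```

An odds ratio below 1 for a level means lower odds of a reaction than in the reference level, with the other variables held fixed.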
> To be clear the example given in the docs is:
>
>> library(MASS)
>> data(anorexia)
>> anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
>>                 family = gaussian, data = anorexia)
>
> ===================================
> Comment 3: Here Postwt is a continuous variable. The specification "family
> = gaussian" assumes that Postwt is normally distributed, so the
> fitted model is the ordinary normal linear regression model.
> ===================================
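To check this equivalence directly, the sketch below fits the same formula with glm() and lm() on the anorexia data from MASS and compares the coefficients. The object names (glm_fit, lm_fit, gain_fit) are just illustrative. It also shows what the offset does: offset(Prewt) fixes the coefficient of Prewt at 1, which amounts to modelling the weight change Postwt - Prewt.

```r
library(MASS)  # a recommended package shipped with R; provides anorexia

glm_fit <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
               family = gaussian, data = anorexia)
lm_fit  <- lm(Postwt ~ Prewt + Treat + offset(Prewt), data = anorexia)

# The gaussian glm and the ordinary linear model give the same estimates.
all.equal(coef(glm_fit), coef(lm_fit))

# The offset moves Prewt to the left-hand side, so this is the same model
# written as a regression of weight change on Prewt and Treat.
gain_fit <- lm(I(Postwt - Prewt) ~ Prewt + Treat, data = anorexia)
all.equal(coef(glm_fit), coef(gain_fit))
```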
>
> The output of anorex.1 is:
>
> Call:  glm(formula = Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian,
>     data = anorexia)
>
> Coefficients:
> (Intercept)        Prewt    TreatCont      TreatFT
>     49.7711      -0.5655      -4.0971       4.5631
>
> Degrees of Freedom: 71 Total (i.e. Null);  68 Residual
> Null Deviance:      4525
> Residual Deviance: 3311      AIC: 490
>
> and the output of summary(anorex.1) is:
>
> Call:
> glm(formula = Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian,
>     data = anorexia)
>
> Deviance Residuals:
>      Min        1Q    Median        3Q       Max
> -14.1083   -4.2773   -0.5484    5.4838   15.2922
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  49.7711    13.3910   3.717 0.000410 ***
> Prewt        -0.5655     0.1612  -3.509 0.000803 ***
> TreatCont    -4.0971     1.8935  -2.164 0.033999 *
> TreatFT       4.5631     2.1333   2.139 0.036035 *
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> (Dispersion parameter for gaussian family taken to be 48.69504)
>
>     Null deviance: 4525.4  on 71  degrees of freedom
> Residual deviance: 3311.3  on 68  degrees of freedom
> AIC: 489.97
>
> Number of Fisher Scoring iterations: 2
>
>
>
> ---
> Can someone either point me to a decent place that would explain what
> this means or provide me some pointers? i.e. which of the variables has
> the influence on the outcome in the anorexia data?
>
> Please don't shout!! Happy to be pointed to a reference but would prefer
> one in common English, not some stats mumbo jumbo!
>
> Calum
>