[R] Dependent Variable in Logistic Regression
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Sat Aug 1 21:48:12 CEST 2020
Hello,
Inline.
At 20:01 on 01/08/2020, John Fox wrote:
> Dear Paul,
>
> I think that this thread has gotten unnecessarily complicated. The
> answer, as is easily demonstrated, is that a binary response for a
> binomial GLM in glm() may be a factor, a numeric variable, or a
> logical variable, with identical results; for example:
>
> --------------- snip -------------
>
> > set.seed(123)
>
> > head(x <- rnorm(100))
> [1] -0.56047565 -0.23017749 1.55870831 0.07050839 0.12928774 1.71506499
>
> > head(y <- rbinom(100, 1, 1/(1 + exp(-x))))
> [1] 0 1 1 1 1 0
>
> > head(yf <- as.factor(y))
> [1] 0 1 1 1 1 0
> Levels: 0 1
>
> > head(yl <- y == 1)
> [1] FALSE TRUE TRUE TRUE TRUE FALSE
>
> > glm(y ~ x, family=binomial)
>
> Call: glm(formula = y ~ x, family = binomial)
>
> Coefficients:
> (Intercept) x
> 0.3995 1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
> Null Deviance: 134.6
> Residual Deviance: 114.9 AIC: 118.9
>
> > glm(yf ~ x, family=binomial)
>
> Call: glm(formula = yf ~ x, family = binomial)
>
> Coefficients:
> (Intercept) x
> 0.3995 1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
> Null Deviance: 134.6
> Residual Deviance: 114.9 AIC: 118.9
>
> > glm(yl ~ x, family=binomial)
>
> Call: glm(formula = yl ~ x, family = binomial)
>
> Coefficients:
> (Intercept) x
> 0.3995 1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
> Null Deviance: 134.6
> Residual Deviance: 114.9 AIC: 118.9
>
> --------------- snip -------------
>
> The original poster claimed to have encountered an error with a 0/1
> numeric response, but didn't show any data or even a command. I
> suspect that the response was a character variable, but of course
> can't really know that.
So continuing with your example:
> head(yc <- as.character(y))
[1] "0" "1" "1" "1" "1" "0"
> glm(yc ~ x, family=binomial)
Error in weights * y : non-numeric argument to binary operator
But the OP says that
[...] R complains that I should make the dependent variable a factor.
That is not what this error message says; it "asks" for a numeric
argument to the '*' operator.
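A minimal sketch of the situation described above, assuming the simulated x and y from the quoted example. It captures whatever error glm() signals for a character response and then refits after converting the response with as.numeric() or factor(). The object names (msg, fit_num, fit_fac, fit_ref, same_num, same_fac) are made up for illustration, and the wording of the message may differ between R versions.

```r
# Sketch only: reuses the simulated data from the quoted example.
set.seed(123)
x <- rnorm(100)
y <- rbinom(100, 1, 1 / (1 + exp(-x)))
yc <- as.character(y)            # character response, as in the failing call

# Capture the error message, if one is signalled, instead of stopping.
msg <- tryCatch(
  { glm(yc ~ x, family = binomial); NA_character_ },
  error = function(e) conditionMessage(e)
)
print(msg)                       # wording may differ between R versions

# Either conversion gives glm() a response type it accepts.
fit_num <- glm(as.numeric(yc) ~ x, family = binomial)
fit_fac <- glm(factor(yc) ~ x, family = binomial)
fit_ref <- glm(y ~ x, family = binomial)   # original numeric 0/1 response

# Compare coefficients against the reference fit, ignoring names.
same_num <- isTRUE(all.equal(unname(coef(fit_num)), unname(coef(fit_ref))))
same_fac <- isTRUE(all.equal(unname(coef(fit_fac)), unname(coef(fit_ref))))
```

With factor(yc) the first level ("0") is treated as failure, which matches the numeric 0/1 coding, so both refits should agree with the reference fit.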
We haven't seen the exact R message yet so, as others have said, the
OP should post it along with the code that produced it.
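As a rough illustration of what such a post could contain, here is a sketch that inspects the response and records the exact message instead of paraphrasing it. The data frame `dat` and its columns `covid` and `age` are hypothetical stand-ins, not the OP's data.

```r
# Hypothetical stand-in for the OP's imported spreadsheet; `dat`, `covid`
# and `age` are illustrative names, not taken from the thread.
dat <- data.frame(
  covid = c(0, 1, 1, 0, 1, 0, 0, 1, 1, 0),
  age   = c(34, 61, 58, 25, 70, 41, 66, 49, 30, 38)
)

str(dat)                           # storage type of every column
resp_class <- class(dat$covid)     # e.g. "numeric", "factor", "character"
table(dat$covid, useNA = "ifany")  # distinct response values and any NAs

# Record the exact message, if any, rather than describing it from memory.
result <- tryCatch(
  glm(covid ~ age, family = binomial, data = dat),
  error   = function(e) conditionMessage(e),
  warning = function(w) conditionMessage(w)
)
print(result)
```

Posting the output of str() together with the verbatim message lets readers see at once whether the response was read in as numeric, factor, or character.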
Hope this helps,
Rui Barradas
>
> Best,
> John
>
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
> On 2020-08-01 2:25 p.m., Paul Bernal wrote:
>> Dear friend,
>>
>> I am aware that I have a binomial dependent variable, which is covid
>> status (1 if covid positive, and 0 otherwise).
>>
>> My question was whether R requires a binomial response variable to be
>> turned into a factor or not, that's all.
>>
>> Cheers,
>>
>> Paul
>>
>> On Sat, Aug 1, 2020 at 1:22 PM, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>>
>>> ... yes, but so does lm() for a categorical **INdependent** variable
>>> with more than 2 numerically labeled levels. n levels = (n-1) df for a
>>> categorical covariate, but 1 for a continuous one (unless more complex
>>> models are explicitly specified of course). As I said, the OP seems
>>> confused about whether he is referring to the response or covariates.
>>> Or maybe he just made the same typo I did.
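The degrees-of-freedom point above can be sketched with lm() on made-up data. The variables `g` and `resp` and the fitted-object names are assumptions for illustration only. A predictor with three numerically labeled levels gets one slope when left numeric and two contrast coefficients when wrapped in factor().

```r
# Simulated data; all names here are hypothetical.
set.seed(1)
g <- rep(1:3, each = 20)               # three groups labeled 1, 2, 3
resp <- rnorm(60, mean = c(0, 2, 1)[g])

fit_numeric <- lm(resp ~ g)            # g as continuous: a single slope
fit_factor  <- lm(resp ~ factor(g))    # g as categorical: 3 - 1 contrasts

n_coef_numeric <- length(coef(fit_numeric))  # intercept + 1 slope
n_coef_factor  <- length(coef(fit_factor))   # intercept + 2 contrasts
print(c(numeric = n_coef_numeric, factor = n_coef_factor))
```

With only two levels the two codings use the same number of coefficients, which is why the distinction matters only for predictors with more than two levels.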
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming
>>> along and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>>>
>>>
>>> On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative) <
>>> malone using malonequantitative.com> wrote:
>>>
>>>> No, R does not. glm() does in order to do logistic regression.
>>>>
>>>> On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal <paulbernal07 using gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Bert,
>>>>>
>>>>> Thank you for the kind reply.
>>>>>
>>>>> But what if I don't turn the variable into a factor? Let's say that in
>>>>> Excel I just coded the variable as 1s and 0s, imported the dataset
>>>>> into R, and fitted the logistic regression without turning any
>>>>> categorical variable or dummy variable into a factor?
>>>>>
>>>>> Does R require every dummy variable to be treated as a factor?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Paul
>>>>>
>>>>> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <
>>>>> bgunter.4567 using gmail.com> wrote:
>>>>>
>>>>>> x <- factor(0:1)
>>>>>> x <- factor(c("no", "yes"))
>>>>>>
>>>>>> will produce identical results up to labeling.
>>>>>>
>>>>>>
>>>>>> Bert Gunter
>>>>>>
>>>>>> "The trouble with having an open mind is that people keep coming
>>>>>> along and sticking things into it."
>>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>>>>>>
>>>>>>
>>>>>> On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal <paulbernal07 using gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Dear friends,
>>>>>>>
>>>>>>> Hope you are doing great. I want to fit a logistic regression in R,
>>>>>>> where the dependent variable is the covid status (I used 1 for covid
>>>>>>> positives, and 0 for covid negatives), but when I ran the glm, R
>>>>>>> complains that I should make the dependent variable a factor.
>>>>>>>
>>>>>>> What would be more advisable, to keep the dependent variable with 1s
>>>>>>> and 0s, or code it as yes/no and then make it a factor?
>>>>>>>
>>>>>>> Any guidance will be greatly appreciated,
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Paul
>>>>>>>
>>>>>>> [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Patrick S. Malone, Ph.D., Malone Quantitative
>>>> NEW Service Models: http://malonequantitative.com
>>>>
>>>> He/Him/His
>>>>
>>>
>>
>