[R] Coding of categorical variables for logistic regression?

David Winsemius dwinsemius at comcast.net
Sun Mar 28 19:30:29 CEST 2010


On Mar 28, 2010, at 1:15 PM, Ravi Kulkarni wrote:

>
> Hello,
>  I am trying to do a logistic regression and have one predictor variable
> (x) that is ratio and two predictor variables (y and z) that are
> categorical. These have three levels each which I have called "High",
> "Medium" and "Low".
>  My question: do I need to use a numerical coding scheme for the
> categorical variables as required by some statistical software packages,
> with some sort of numeric dummy-variable coding?

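No. If you have constructed those variables as factors, the regression
functions in R will interpret them correctly: glm() expands each factor
into dummy (indicator) columns for you, by default using treatment
contrasts with the first level as the reference. Levels are ordered
alphabetically unless you say otherwise ("High", "Low", "Medium" here),
so set the order explicitly if you want a different reference level.
If you have not constructed them as factors, you should do so now.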

?factor
?levels
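
A minimal sketch of what that looks like, using made-up data. The data
frame `d`, the simulated values, and the choice of "Low" as the reference
level are all assumptions for illustration, not anything from the
original post:

```r
# Hypothetical data: one numeric predictor, two three-level categoricals.
set.seed(1)
n <- 200
d <- data.frame(
  x = rnorm(n),
  y = sample(c("Low", "Medium", "High"), n, replace = TRUE),
  z = sample(c("Low", "Medium", "High"), n, replace = TRUE)
)
d$binvar <- rbinom(n, 1, plogis(0.5 * d$x))

# Make the categorical variables factors with an explicit level order.
# The first level listed ("Low") becomes the reference category.
d$y <- factor(d$y, levels = c("Low", "Medium", "High"))
d$z <- factor(d$z, levels = c("Low", "Medium", "High"))

# Same model formula as in the question, with the data frame supplied.
fit <- glm(binvar ~ x + y + z, family = binomial(link = "logit"), data = d)

# The coefficient names show which dummy columns R created:
# one per non-reference level of each factor.
names(coef(fit))

# The design matrix itself can be inspected with:
# head(model.matrix(fit))
```

Each factor contributes two coefficients (for "Medium" and "High"), each
read as a log-odds difference relative to "Low". relevel() is another
way to change the reference level of an existing factor.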


>
>  I am using:
>                 glm(binvar~x+y+z, family=binomial(link="logit"))
>
>  Thanks,
>
>    Ravi Kulkarni
> -- 


David Winsemius, MD
West Hartford, CT


