[R] Appropriate regression model for categorical variables
Robert A LaBudde
ral at lcfltd.com
Tue Jun 12 20:08:37 CEST 2007
At 01:45 PM 6/12/2007, Tirtha wrote:
>Dear users,
>In my psychometric test i have applied logistic regression on my data. My
>data consists of 50 predictors (22 continuous and 28 categorical) plus a
>binary response.
>
>Using glm(), stepAIC() i didn't get satisfactory result as misclassification
>rate is too high. I think categorical variables are responsible for this
>debacle. Some of them have more than 6 level (one has 10 level).
>
>Please suggest some better regression model for this situation. If possible
>you can suggest some article.
1. Usually, if a factor has many levels, there is a natural order to the
levels. If so, consider fitting the factor as an ordered factor.
2. Break the factor levels into 2 or 3 groups that have some rational
connection. Then fit the factor with a smaller number of levels.
E.g., "race" might have the levels "white", "black", "asian", "pacific",
"Spanish surname" and "other". Consider collapsing these to "white" and
"nonwhite".
================================================================
Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: ral at lcfltd.com
Least Cost Formulations, Ltd. URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239 Fax: 757-467-2947
"Vere scire est per causas scire"