[R] Collinearity? Cannot get logisticRidge{ridge} to work
Kengo Inagaki
kengoing.gj at gmail.com
Wed May 27 19:10:29 CEST 2015
I am currently working on a health-care-related project in R, and I am
learning R as I work through the data analysis.
Below is the part of the data where I am running into a problem.
Case#  Sex     Therapy1  Therapy2  Outcome
1      male    no        no        Alive
2      female  no        no        Death
3      male    no        no        Alive
4      female  no        no        Death
5      male    no        no        Death
6      male    no        no        Alive
7      male    yes       no        Alive
8      female  no        no        Death
9      male    no        yes       Alive
10     female  no        no        Death
11     female  yes       yes       Death
12     female  yes       no        Death
13     female  yes       no        Death
14     female  yes       no        Alive
15     male    yes       no        Alive
16     male    yes       no        Alive
17     male    no        yes       Death
18     male    no        yes       Death
19     male    yes       no        Alive
20     female  no        yes       Death
21     female  yes       no        Alive
22     female  no        yes       Death
23     male    yes       no        Alive
24     female  yes       no        Alive
25     female  yes       no        Alive
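
To make the example reproducible, the table above could be read into R
along these lines. This is only a minimal sketch: the object name "a"
follows the glm() call later in this message, while the column name "Case"
(in place of "Case#") and the use of read.table() are assumptions.

```r
# Read the 25 cases into a data frame named "a".
# stringsAsFactors = TRUE stores Sex, Therapy1, Therapy2 and Outcome as factors;
# levels are sorted alphabetically, so "Alive" is the first level of Outcome.
a <- read.table(header = TRUE, stringsAsFactors = TRUE, text = "
Case Sex    Therapy1 Therapy2 Outcome
1    male   no       no       Alive
2    female no       no       Death
3    male   no       no       Alive
4    female no       no       Death
5    male   no       no       Death
6    male   no       no       Alive
7    male   yes      no       Alive
8    female no       no       Death
9    male   no       yes      Alive
10   female no       no       Death
11   female yes      yes      Death
12   female yes      no       Death
13   female yes      no       Death
14   female yes      no       Alive
15   male   yes      no       Alive
16   male   yes      no       Alive
17   male   no       yes      Death
18   male   no       yes      Death
19   male   yes      no       Alive
20   female no       yes      Death
21   female yes      no       Alive
22   female no       yes      Death
23   male   yes      no       Alive
24   female yes      no       Alive
25   female yes      no       Alive
")

# Quick check of column types and factor levels.
str(a)
```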
"Outcome" is the response variable and "Sex", "Therapy1", "Therapy2" are
predictor variables.
All of the predictors are significantly associated with the outcome in
univariate analysis.
Logistic regression runs fine with most of the predictors, as long as "Sex"
and "Therapy1" are not included at the same time. (The table above is cut
out of a larger table for ease of presentation; I tested more predictors
than are shown here.)
However, when "Sex" and "Therapy1" are both included in the logistic
regression model, the standard errors inflate and the p-values get close to 1.
The call used is (the table above is stored in the object "a"):
> Model <- glm(Outcome ~ Sex + Therapy1, data = a, family = binomial)
After doing some reading, I suspect this might be collinearity, as the VIF
values (from the vif() function in the car package) were sky high
(8,875,841 for both "Sex" and "Therapy1").
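
One way to probe the collinearity suspicion is to cross-tabulate the two
predictors against each other and against the outcome. The sketch below
rebuilds the 25 cases from the table above (one character per case, in case
order; the compact encoding and the helper decode() are only for brevity and
are not part of any package). It is meant as a diagnostic, not a conclusion:
if some Sex/Therapy1 combinations contain only one outcome, inflated
standard errors may point to (quasi-)separation and not to collinearity.

```r
# Hypothetical helper: turn a string of one-letter codes into a factor.
decode <- function(s, map) {
  codes <- strsplit(s, "")[[1]]
  factor(unname(map[codes]), levels = unname(map))
}

# The 25 cases from the table above, one character per case.
a <- data.frame(
  Sex      = decode("mfmfmmmfmfffffmmmmmfffmff", c(m = "male", f = "female")),
  Therapy1 = decode("nnnnnnynnnyyyyyynnynynyyy", c(n = "no", y = "yes")),
  Outcome  = decode("ADADDAADADDDDAAADDADADAAA", c(A = "Alive", D = "Death"))
)

# Are the two predictors themselves strongly associated?
print(table(Sex = a$Sex, Therapy1 = a$Therapy1))

# Outcome counts within each Sex x Therapy1 combination.
cells <- table(Sex = a$Sex, Therapy1 = a$Therapy1, Outcome = a$Outcome)
print(ftable(cells))

# Flag combinations in which exactly one outcome occurs.
one_sided <- apply(cells, c(1, 2), function(n) sum(n > 0) == 1)
print(one_sided)
```

For the 25 rows shown, the male/yes and female/no combinations each contain
a single outcome, while the two predictors are spread fairly evenly across
each other.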
Having learned that ridge regression may be a solution, I tried
logisticRidge {ridge} with the following call, but I get the
accompanying error message.
>logisticRidge(a$Outcome~a$Sex+a$Therapy1)
Error in ifelse(y, log(p), log(1 - p)) :
invalid to change the storage mode of a factor
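
The error message suggests that the response is being used as a logical or
numeric value while "Outcome" is stored as a factor. A possible workaround,
sketched below under that assumption, is to recode the outcome as 0/1 before
fitting. The name Outcome01, the choice of "Death" as 1, and the
commented-out call at the end are illustrative assumptions and have not been
checked against the ridge documentation.

```r
# Outcome for the 25 cases above, one character per case (A = Alive, D = Death).
codes   <- strsplit("ADADDAADADDDDAAADDADADAAA", "")[[1]]
outcome <- factor(unname(c(A = "Alive", D = "Death")[codes]))

# Recode the factor as an integer 0/1 variable (1 = Death, 0 = Alive).
outcome01 <- as.integer(outcome == "Death")

# Confirm that the recoding lines up with the original labels.
print(table(outcome, outcome01))

# Hypothetical follow-up, not run here: with the data frame "a" holding the
# table, the recoded response could be passed through a formula and data=.
# a$Outcome01 <- as.integer(a$Outcome == "Death")
# logisticRidge(Outcome01 ~ Sex + Therapy1, data = a)
```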
At this point I do not know how to solve this and would like to
ask for help.
I would really appreciate your input!