[R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction
khosoda at med.kobe-u.ac.jp
Fri Aug 19 11:00:59 CEST 2011
Dear Mark,
Thank you very much for your kind advice.
Actually, I had already performed penalized logistic regression with
pentrace and lrm in package "rms".
I wanted to reduce the dimensionality of those 9 variables because,
according to subject-matter knowledge, they are not very important, and
because I wanted to avoid the events-per-variable problem.
Your answer about dudi.mix$l1 helped me a lot.
I was finally able to fit a penalized logistic regression to data
consisting of the 4 important variables and x18.dudi.mix$l1[, 1]. Thanks a
lot again.
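A minimal sketch of the step described above, on simulated data. Everything in it is an assumption for illustration: the variable names are hypothetical, prcomp() stands in for dudi.mix(), and plain glm() stands in for the penalized lrm() fit. It only shows that a component score and its rescaled version give the same unpenalized fit; with a penalty, whether the scaling matters depends on how the penalty treats predictor scale.

```r
# Simulated stand-in data; all names here are hypothetical.
set.seed(42)
n <- 200
important <- data.frame(imp1 = rnorm(n), imp2 = rnorm(n),
                        imp3 = rnorm(n), imp4 = rnorm(n))
minor <- matrix(rnorm(n * 9), nrow = n,
                dimnames = list(NULL, paste0("minor", 1:9)))
y <- rbinom(n, 1, plogis(0.8 * important$imp1 - 0.5 * important$imp2))

# prcomp() is used instead of dudi.mix(); this is an assumption.
pca <- prcomp(minor, center = TRUE, scale. = TRUE)
score_li <- pca$x[, 1]               # row coordinates (analogous to $li[, 1])
score_l1 <- score_li / pca$sdev[1]   # rescaled to unit variance (analogous to $l1[, 1])

dat <- cbind(important, y = y, score_li = score_li, score_l1 = score_l1)

# Four important predictors plus one summary score for the nine minor ones
fit_li <- glm(y ~ imp1 + imp2 + imp3 + imp4 + score_li,
              family = binomial, data = dat)
fit_l1 <- glm(y ~ imp1 + imp2 + imp3 + imp4 + score_l1,
              family = binomial, data = dat)

# The two scores differ only by a constant factor, so the fitted
# probabilities agree; only the coefficient of the score is rescaled.
same_fit   <- isTRUE(all.equal(fitted(fit_li), fitted(fit_l1), tolerance = 1e-6))
coef_ratio <- unname(coef(fit_l1)["score_l1"] / coef(fit_li)["score_li"])
```

Here same_fit reports whether the fitted probabilities match, and coef_ratio is the factor by which the score's coefficient changes (pca$sdev[1] in this sketch).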
One more question: I also investigated the homals package and found that
it has an "ndim" option.
My data look like this:
> head(x10homals.df)
  age sex      symptom       HT       DM      IHD  smoking hyperlipidemia   Statin Response
1  62   M asymptomatic positive negative negative positive       positive positive negative
2  82   M  symptomatic positive negative negative negative       positive positive negative
3  64   M asymptomatic negative positive negative negative       positive positive negative
4  55   M  symptomatic positive positive positive negative       positive positive negative
5  67   M  symptomatic positive negative negative negative       negative positive negative
6  79   M asymptomatic positive positive negative negative       positive positive negative
age is a continuous variable, and Response should not be active in the
computation, so:
x10.homals4 <- homals(x10homals.df,
                      active = c(rep(TRUE, 9), FALSE),
                      level = c("numerical", rep("nominal", 9)),
                      ndim = 4)
I ran this with ndim from 2 to 9 and compared the classification rate of
Response given by predict(x10.homals). The result for ndim=4 was:
> p.x10.homals4
Classification rate:
         Variable Cl. Rate %Cl. Rate
1             age   0.4712     47.12
2             sex   0.9808     98.08
3         symptom   0.8269     82.69
4              HT   0.9135     91.35
5              DM   0.8558     85.58
6             IHD   0.8750     87.50
7         smoking   0.9423     94.23
8  hyperlipidemia   0.9519     95.19
9          Statin   0.8942     89.42
10       Response   0.6154     61.54
This gave the best classification rate for Response, so I selected ndim=4.
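As a reading aid for the rates above, here is a small sketch of how such a rate can be computed, assuming that "classification rate" means the share of observations whose predicted category matches the observed one. The observed and predicted vectors are made up for illustration and are not taken from the data in this post. The share of the most frequent category is computed as well, since it may be a useful baseline to compare a rate against.

```r
# Hypothetical observed and predicted categories for one binary variable.
observed  <- factor(c("negative", "negative", "positive", "negative",
                      "positive", "negative", "positive", "negative"))
predicted <- factor(c("negative", "positive", "positive", "negative",
                      "negative", "negative", "positive", "negative"),
                    levels = levels(observed))

# Classification table: rows are observed, columns are predicted categories.
cl_table <- table(observed, predicted)

# Classification rate: share of observations on the diagonal.
cl_rate <- sum(diag(cl_table)) / sum(cl_table)

# Baseline: rate obtained by always predicting the most frequent category.
baseline <- max(prop.table(table(observed)))

cl_rate    # 0.75
baseline   # 0.625
```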
Then I looked at the object scores (objscores):
> head(x10.homals4$objscores)
            D1           D2           D3          D4
1 -0.002395321 -0.034032230 -0.008140378  0.02369123
2  0.036788626 -0.010308707  0.005725984 -0.02751958
3  0.014363031  0.049594466 -0.025627467  0.06254055
4  0.083092285  0.065147519  0.045903394 -0.03751551
5 -0.013692504  0.005106661 -0.007656776 -0.04107009
6  0.002320747  0.024375393 -0.017785415 -0.01752556
I used x10.homals4$objscores[, 1] as a predictor in the logistic
regression, in the same way as PC1 from PCA.
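To make this step concrete, here is a rough base-R sketch under stated assumptions. It does not call homals(): it swaps in plain multiple correspondence analysis, computed by an SVD of the indicator matrix, for categorical variables only (a numerical variable such as age is not handled). The data are simulated and all names are hypothetical. Scores obtained this way may differ from objscores in scaling, and the sign of a dimension is arbitrary.

```r
# Simulated categorical predictors; names are illustrative only.
set.seed(7)
n <- 150
cats <- data.frame(
  HT      = factor(sample(c("negative", "positive"), n, replace = TRUE)),
  DM      = factor(sample(c("negative", "positive"), n, replace = TRUE)),
  smoking = factor(sample(c("negative", "positive"), n, replace = TRUE)),
  symptom = factor(sample(c("asymptomatic", "symptomatic"), n, replace = TRUE))
)
response <- rbinom(n, 1, 0.4)

# Indicator matrix: one 0/1 column per category of each variable.
Z <- do.call(cbind, lapply(cats, function(f) model.matrix(~ f - 1)))

# Correspondence analysis of the indicator matrix.
P      <- Z / sum(Z)          # correspondence matrix
r_mass <- rowSums(P)          # row masses (all equal to 1/n here)
c_mass <- colSums(P)          # column masses
expctd <- outer(r_mass, c_mass)
S      <- (P - expctd) / sqrt(expctd)   # standardized residuals
sv     <- svd(S)

# Standard row coordinates on dimension 1 (mean 0, mean square 1).
dim1 <- sv$u[, 1] / sqrt(r_mass)

# Dimension 1 used as a single predictor, as PC1 would be.
fit <- glm(response ~ dim1, family = binomial)
```

The summary of fit can then be inspected in the usual way; because the sign of dim1 is arbitrary, the sign of its coefficient carries no meaning on its own.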
Am I going the right way?
Thanks a lot for your help in advance.
Best regards
--
Kohkichi Hosoda
(11/08/19 4:21), Mark Difford wrote:
> On Aug 18, 2011 khosoda wrote:
>
>> I'm trying to do model reduction for logistic regression.
>
> Hi Kohkichi,
>
> My general advice to you would be to do this by fitting a penalized logistic
> model (see lrm in package rms and glmnet in package glmnet; there are
> several others).
>
> Other points are that the amounts of variance explained by mixed PCA and MCA
> are not comparable. Furthermore, homals() is a much better choice than MCA
> because it handles different types of variables, whereas MCA is for
> categorical variables.
>
> On the more specific question of whether you should use dudi.mix$l1 or
> dudi.mix$li, it doesn't matter: the former is a scaled version of the
> latter. Same for dudi.acm. To see this do the following:
>
> ##
> plot(x18.dudi.mix$li[, 1], x18.dudi.mix$l1[, 1])
>
> Regards, Mark.
>
> -----
> Mark Difford (Ph.D.)
> Research Associate
> Botany Department
> Nelson Mandela Metropolitan University
> Port Elizabeth, South Africa
--
*************************************************
Department of Neurosurgery, Kobe University Graduate School of Medicine
Kohkichi Hosoda
7-5-1 Kusunoki-cho, Chuo-ku, Kobe 650-0017, Japan
Phone: 078-382-5966
Fax : 078-382-5979
E-mail address
Office: khosoda at med.kobe-u.ac.jp
Home : khosoda at venus.dti.ne.jp