[R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

khosoda at med.kobe-u.ac.jp khosoda at med.kobe-u.ac.jp
Fri Aug 19 11:00:59 CEST 2011


Dear Mark,

Thank you very much for your kind advice.

Actually, I have already fitted a penalized logistic regression using
pentrace and lrm in package "rms".

The reason I wanted to reduce the dimensionality of those 9 variables
was that, according to subject-matter knowledge, they are not very
important, and I wanted to avoid the events-per-variable problem.

Your answer about dudi.mix$l1 helped me a lot.
I was finally able to fit a penalized logistic regression to data
consisting of 4 important variables and x18.dudi.mix$l1[, 1]. Thanks a
lot again.
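
The point that $l1 is only a rescaled $li can be checked with a small
base-R sketch. This is not the ade4/rms workflow described above:
prcomp() and glm() are stand-ins for dudi.mix() and lrm(), the fit is
unpenalized, and all data and names (minor, age, y) are simulated purely
for illustration.

```r
# Sketch with simulated data; prcomp()/glm() stand in for dudi.mix()/lrm().
set.seed(1)
n <- 100

# Nine hypothetical "minor" covariates that share one latent factor
latent <- rnorm(n)
minor <- sapply(1:9, function(j) latent + rnorm(n))
colnames(minor) <- paste0("m", 1:9)

# One hypothetical important covariate and a simulated binary outcome
age <- rnorm(n, mean = 65, sd = 8)
y <- rbinom(n, 1, plogis(-0.5 + 0.03 * (age - 65) + 0.4 * latent))

pca <- prcomp(minor, center = TRUE, scale. = TRUE)
pc1 <- pca$x[, 1]               # raw scores (analogous to $li)
pc1_unit <- pc1 / pca$sdev[1]   # unit-variance scores (analogous to $l1)

fit_raw  <- glm(y ~ age + pc1, family = binomial)
fit_unit <- glm(y ~ age + pc1_unit, family = binomial)

# Rescaling changes the coefficient of the score,
# but not the fitted probabilities
coef(fit_raw)["pc1"]
coef(fit_unit)["pc1_unit"]
all.equal(fitted(fit_raw), fitted(fit_unit))
```

In this unpenalized sketch the two fits differ only in the scale of the
score's coefficient, so either version of the score can be used.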

One more question: I also investigated the homals package and found
that it has an "ndim" option.

My data look like this:

 > head(x10homals.df)
  age sex      symptom       HT       DM      IHD  smoking hyperlipidemia   Statin Response
1  62   M asymptomatic positive negative negative positive       positive positive negative
2  82   M  symptomatic positive negative negative negative       positive positive negative
3  64   M asymptomatic negative positive negative negative       positive positive negative
4  55   M  symptomatic positive positive positive negative       positive positive negative
5  67   M  symptomatic positive negative negative negative       negative positive negative
6  79   M asymptomatic positive positive negative negative       positive positive negative

age is a continuous variable, and Response should not be active in the
computation, so:

x10.homals4 <- homals(x10homals.df, active = c(rep(TRUE, 9), FALSE), 
level=c("numerical", rep("nominal", 9)), ndim=4)

I did this with ndim from 2 to 9 and compared the classification rate
of Response given by predict(x10.homals).
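
The comparison over ndim could be scripted roughly as follows. This is
only a sketch: it assumes the homals package is installed, that
x10homals.df exists as shown above, and that predict() on a homals fit
returns the classification rates in a component named cr.vec, ordered
like the columns of the data (so Response is element 10). The helper
name response_rate_by_ndim is hypothetical.

```r
# Sketch: refit homals() for several ndim values and collect the
# classification rate of Response (column 10).
# Assumption: predict() on a homals fit returns a list with element
# cr.vec, ordered like the columns of the data frame.
response_rate_by_ndim <- function(df, dims = 2:9) {
  rates <- sapply(dims, function(k) {
    fit <- homals::homals(df,
                          active = c(rep(TRUE, 9), FALSE),
                          level  = c("numerical", rep("nominal", 9)),
                          ndim   = k)
    unname(predict(fit)$cr.vec[10])
  })
  names(rates) <- paste0("ndim", dims)
  rates
}

# Runs only if the package and the data are actually available
if (requireNamespace("homals", quietly = TRUE) && exists("x10homals.df")) {
  print(round(response_rate_by_ndim(x10homals.df), 4))
}
```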

 > p.x10.homals4

Classification rate:
          Variable Cl. Rate %Cl. Rate
1             age   0.4712     47.12
2             sex   0.9808     98.08
3         symptom   0.8269     82.69
4              HT   0.9135     91.35
5              DM   0.8558     85.58
6             IHD   0.8750     87.50
7         smoking   0.9423     94.23
8  hyperlipidemia   0.9519     95.19
9          Statin   0.8942     89.42
10       Response   0.6154     61.54

This gave the best classification of Response, so I selected ndim=4.
Then I found objscores.

 > head(x10.homals4$objscores)
             D1           D2           D3          D4
1 -0.002395321 -0.034032230 -0.008140378  0.02369123
2  0.036788626 -0.010308707  0.005725984 -0.02751958
3  0.014363031  0.049594466 -0.025627467  0.06254055
4  0.083092285  0.065147519  0.045903394 -0.03751551
5 -0.013692504  0.005106661 -0.007656776 -0.04107009
6  0.002320747  0.024375393 -0.017785415 -0.01752556

I used x10.homals4$objscores[, 1] as a predictor in the logistic
regression, in the same way as PC1 from PCA.

Am I going the right way?

Thanks a lot for your help in advance.

Best regards

--
Kohkichi Hosoda


(11/08/19 4:21), Mark Difford wrote:
> On Aug 18, 2011 khosoda wrote:
>
>> I'm trying to do model reduction for logistic regression.
>
> Hi Kohkichi,
>
> My general advice to you would be to do this by fitting a penalized logistic
> model (see lrm in package rms and glmnet in package glmnet; there are
> several others).
>
> Other points are that the amounts of variance explained by mixed PCA and MCA
> are not comparable. Furthermore, homals() is a much better choice than MCA
> because it handles different types of variables whereas MCA is for
> categorical variables.
>
> On the more specific question of whether you should use dudi.mix$l1 or
> dudi.mix$li, it doesn't matter: the former is a scaled version of the
> latter. Same for dudi.acm. To see this do the following:
>
> ##
> plot(x18.dudi.mix$li[, 1], x18.dudi.mix$l1[, 1])
>
> Regards, Mark.
>
> -----
> Mark Difford (Ph.D.)
> Research Associate
> Botany Department
> Nelson Mandela Metropolitan University
> Port Elizabeth, South Africa


-- 
*************************************************
 Department of Neurosurgery,
 Kobe University Graduate School of Medicine
 Kohkichi Hosoda
 
 7-5-1 Kusunoki-cho, Chuo-ku, Kobe 650-0017, Japan
     Phone: 078-382-5966
     Fax  : 078-382-5979
     E-mail address
         Office: khosoda at med.kobe-u.ac.jp
         Home  : khosoda at venus.dti.ne.jp


