[R] PLS method

Sat Jul 8 21:12:30 CEST 2006

	  I believe that regression coefficients can change sign in partial 
least squares (PLS) or in the related structural equation modeling (SEM) 
for roughly the same reasons they can change sign in ordinary least 
squares (OLS):  when the X's are correlated, each coefficient describes 
one X with the others held fixed, so adding or dropping a correlated X 
can flip a sign.  Both PLS and SEM essentially assume that the 
'independent' variables (X's) in the model are linear combinations of 
unobserved 'structural' variables plus noise.  Then the response 
variable(s) are linear combinations of these unobserved structural 
variables plus error.
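
	  To make the OLS half of that comparison concrete, here is a minimal 
sketch in base R.  The data are simulated for illustration only (x1, x2 
and y are made-up names, not the variables from the question):  x2 is 
nearly a copy of x1, and its true coefficient is negative once x1 is in 
the model.

```r
# Simulated data (an assumption for illustration, not from the original post):
# two strongly correlated predictors.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + 0.3 * rnorm(n)                    # x2 is nearly a copy of x1
y  <- 2 * x1 - 1 * x2 + rnorm(n, sd = 0.5)   # true coefficient of x2 is -1

full    <- lm(y ~ x1 + x2)   # coefficient of x2 with x1 held fixed
reduced <- lm(y ~ x2)        # x2 alone also stands in for the omitted x1

coef(full)["x2"]      # should come out negative
coef(reduced)["x2"]   # should come out positive
```

Dropping x1 changes what the coefficient of x2 has to account for, which 
is the same thing that happens to 'o' when 'w' is dropped in the question.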

	  I have used neither PLS nor SEM, so I can't go beyond this.  If it 
were my problem and I wanted to understand it better, I might cut the 
example down still further, e.g., to 4 observations with only 2 X's, and 
then try to program the entire thing in Excel using the 'solver'.  Or 
make local copies of functions like 'mvr' and use 'debug' to walk 
through the code line by line, looking carefully at what it does.
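
	  As a rough sketch of such a cut-down example, the code below takes 
the first 4 observations and 2 of the X's (w and d) from the data quoted 
below and fits them with a textbook PLS1 (NIPALS) routine written in base 
R.  The helper 'pls1_coef' and the object 'small' are hypothetical names 
introduced here; this is not the SIMPLS code inside the pls package, only 
something short enough to step through with 'debug'.

```r
# First 4 observations and 2 of the X's (w, d) from the data quoted below.
small <- data.frame(
  w = c(219.580, 245.806, 219.850, 226.904),
  d = c(102.742,  97.798,  93.764, 110.080),
  q = c(11, 12, 16, 7)
)

# Textbook PLS1 (NIPALS) for a single response; a hypothetical helper,
# not the implementation used by pls::mvr.
pls1_coef <- function(X, y, ncomp) {
  E <- sweep(X, 2, colMeans(X))        # centred predictors
  f <- y - mean(y)                     # centred response
  W <- P <- matrix(0, ncol(X), ncomp)  # weights and X loadings
  qload <- numeric(ncomp)              # y loadings
  for (a in seq_len(ncomp)) {
    w <- drop(crossprod(E, f))
    w <- w / sqrt(sum(w^2))            # weight vector, unit length
    s <- drop(E %*% w)                 # scores for component a
    ss <- sum(s^2)
    p <- drop(crossprod(E, s)) / ss    # X loadings for component a
    qload[a] <- sum(f * s) / ss        # y loading for component a
    E <- E - outer(s, p)               # deflate X
    f <- f - s * qload[a]              # deflate y
    W[, a] <- w
    P[, a] <- p
  }
  b <- W %*% solve(crossprod(P, W), qload)  # coefficients on the original X scale
  setNames(drop(b), colnames(X))
}

X <- as.matrix(small[, c("w", "d")])
b_pls1 <- pls1_coef(X, small$q, ncomp = 1)
b_pls2 <- pls1_coef(X, small$q, ncomp = 2)
b_ols  <- coef(lm(q ~ w + d, data = small))[-1]   # drop the intercept

rbind(pls_1comp = b_pls1, pls_2comp = b_pls2, ols = b_ols)
```

With one component the coefficients will generally differ from OLS.  With 
as many components as X's (here 2), PLS1 should agree with lm() up to 
rounding, and that is also the situation in the fits quoted below (5 
components for 5 X's, then 4 components for 4 X's).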

	  Before you do that, however, if you aren't clear on the similarities 
and differences between PLS and SEM, I suggest you explore that, e.g., 
using Google.  Just now, I found the following PLS / SEM comparison: 
"http://www2.gsu.edu/~mkteer/relmeth.html".

	  Hope this helps.
	  Spencer Graves

Sun Jia wrote:
> Dear all,
> 
> I am a newcomer to R and statistics, and I am a little confused about the
> pls package.
> 
> I have to use 5 components (predictor variables) to form a model. Some of
> the components are strongly related, which changes the signs of some
> coefficients; this is of course unwanted with ordinary regression. So I
> chose PLS, which is supposed to handle this kind of problem well.
> 
> In my work, q is the response and w, c, d, r, o are the 5 components.
> 
>          w     c       d      r      o  q
> 1  219.580 0.880 102.742 12.988 0.9380 11
> 2  245.806 0.900  97.798 11.764 1.0080 12
> 3  219.850 0.910  93.764  5.608 1.1006 16
> 4  226.904 0.842 110.080 14.614 0.8398  7
> 5  250.792 0.868 108.212 14.714 0.8990 10
> 6  225.264 0.930  96.748  6.906 1.1784 16
> 7  229.562 0.856 103.204 12.900 0.8730 12
> 8  239.560 0.880 101.036 11.766 0.9452 12
> 9  199.008 0.920  91.338  3.918 1.1234 17
> 10 220.458 0.910  88.322  9.868 1.0746 13
> 11 201.228 0.910  89.202 10.328 1.0514 14
> 12 199.160 0.920  90.126  2.088 1.1326 15
> 13 135.540 0.786 121.506 19.140 0.6934  2
> 14 296.272 0.864 130.896 22.614 0.9104  6
> 15 190.766 0.840 108.050  7.336 0.8210  8
> 
> I used the following call:
> 
> > b.pls <- mvr(q ~ w + c + d + r + o, data = b, method = "simpls")
> > coef.mvr(b.pls)
> , , 5 comps
> 
>             q
> w  0.01993749
> c 12.42713250
> d -0.12050551
> r -0.20287088
> o  9.63670488
> 
> I have found that the signs of the coefficients still cannot be explained
> by reality. For instance, the coefficient of w should be negative rather
> than positive.
> 
> > b.pls <- mvr(q ~ c + d + r + o, data = b, method = "simpls")
> > coef.mvr(b.pls)
> , , 4 comps
> 
>             q
> c 76.39196611
> d -0.06512864
> r -0.18272329
> o -3.02212146
> 
> When I drop one of the components, w, the coefficient of one of the
> remaining components, o, also changes sign.
> 
> As far as I understand, this should only happen with ordinary
> regression, not with PLS regression.
> 
> Is there anything wrong with my understanding?
> Why does this problem happen?
> 
> I would appreciate your help and answers.
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html