[R] PLS method
Spencer Graves
spencer.graves at pdf.com
Sat Jul 8 21:12:30 CEST 2006
I believe that regression coefficients can change sign in partial
least squares (PLS) or in the related structural equation modeling (SEM)
for roughly the same reasons they can change sign in ordinary least
squares (OLS): when the predictors are correlated, each coefficient
depends on which other predictors are in the model. Both PLS and SEM
essentially assume that the 'independent' variables (X's) in the model
are linear combinations of unobserved 'structural' variables plus noise.
Then the response variable(s) are linear combinations of these
unobserved structural variables plus error.
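
To make the OLS side of this concrete, here is a minimal sketch in base R. The data are invented for illustration (they are not from the original post): x1 and x2 both follow the same underlying variable, and y is built so that the coefficient of x2 is negative when x1 is also in the model. The names x1, x2, y, full and reduced are hypothetical.

```r
# Toy data invented for illustration (not from the original post).
# x1 and x2 both track the same underlying variable; x2 adds a small wiggle.
x1 <- 1:6
x2 <- x1 + c(0.1, -0.1, 0.1, -0.1, 0.1, -0.1)

# The response is constructed so that, given x1, x2 has a negative coefficient.
y <- 2 * x1 - x2

# OLS with both predictors recovers the construction (x1: 2, x2: -1).
full <- lm(y ~ x1 + x2)

# OLS with x2 alone: x2 now also stands in for x1, so its slope turns positive.
reduced <- lm(y ~ x2)

print(coef(full))
print(coef(reduced))
```

Dropping x1 flips the sign of the x2 coefficient even though the data have not changed, which is the same kind of behaviour reported below for 'o' after 'w' is removed.
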
I have used neither PLS nor SEM, so I can't go beyond this. If it
were my problem and I wanted to understand it better, I might cut the
example down still further, e.g., to 4 observations with only 2 X's, and
then try to program the entire thing in Excel using the Solver add-in. Or
make local copies of functions like 'mvr' and use 'debug' to walk
through the code line by line, looking carefully at what it does.
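
As a possible starting point for that kind of walk-through, the sketch below enters the posted data in base R and fits plain OLS with and without 'w'. It assumes the data frame is called b, as in the original call; the names ols_full, ols_no_w and shared are hypothetical. With as many components as predictors, a PLS fit is generally expected to coincide with OLS, so these coefficients may be worth comparing with the 'simpls' output in the original post; that is something to check, not a result established here.

```r
# Data copied from the original post (15 observations).
b <- data.frame(
  w = c(219.580, 245.806, 219.850, 226.904, 250.792, 225.264, 229.562, 239.560,
        199.008, 220.458, 201.228, 199.160, 135.540, 296.272, 190.766),
  c = c(0.880, 0.900, 0.910, 0.842, 0.868, 0.930, 0.856, 0.880,
        0.920, 0.910, 0.910, 0.920, 0.786, 0.864, 0.840),
  d = c(102.742, 97.798, 93.764, 110.080, 108.212, 96.748, 103.204, 101.036,
        91.338, 88.322, 89.202, 90.126, 121.506, 130.896, 108.050),
  r = c(12.988, 11.764, 5.608, 14.614, 14.714, 6.906, 12.900, 11.766,
        3.918, 9.868, 10.328, 2.088, 19.140, 22.614, 7.336),
  o = c(0.9380, 1.0080, 1.1006, 0.8398, 0.8990, 1.1784, 0.8730, 0.9452,
        1.1234, 1.0746, 1.0514, 1.1326, 0.6934, 0.9104, 0.8210),
  q = c(11, 12, 16, 7, 10, 16, 12, 12, 17, 13, 14, 15, 2, 6, 8)
)

# Ordinary least squares with all five predictors, then without w.
ols_full <- lm(q ~ w + c + d + r + o, data = b)
ols_no_w <- lm(q ~ c + d + r + o, data = b)

# Signs of the coefficients shared by both fits, side by side.
shared <- c("c", "d", "r", "o")
print(rbind(full = sign(coef(ols_full)[shared]),
            no_w = sign(coef(ols_no_w)[shared])))

# Pairwise correlations among the predictors show how strongly they overlap.
print(round(cor(b[, c("w", "c", "d", "r", "o")]), 2))
```
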
Before you do that, however, if you aren't clear on the similarities
and differences between PLS and SEM, I suggest you explore that, e.g.,
using Google. Just now, I found the following PLS / SEM comparison:
"http://www2.gsu.edu/~mkteer/relmeth.html".
Hope this helps.
Spencer Graves
Sun Jia wrote:
> Dear all,
>
> I am a newcomer to R and statistics, and I am a little confused about the
> pls package.
>
> I have to use 5 components to form a model. Some of the components are
> strongly related to each other, which changes the signs of the
> coefficients; of course this is unwanted when using ordinary regression.
> So I chose PLS, which is good at solving this kind of problem.
>
> In my work,
>
> q is the response and w, c, d, r, o are the 5 components.
>
>          w     c       d      r      o  q
> 1  219.580 0.880 102.742 12.988 0.9380 11
> 2  245.806 0.900  97.798 11.764 1.0080 12
> 3  219.850 0.910  93.764  5.608 1.1006 16
> 4  226.904 0.842 110.080 14.614 0.8398  7
> 5  250.792 0.868 108.212 14.714 0.8990 10
> 6  225.264 0.930  96.748  6.906 1.1784 16
> 7  229.562 0.856 103.204 12.900 0.8730 12
> 8  239.560 0.880 101.036 11.766 0.9452 12
> 9  199.008 0.920  91.338  3.918 1.1234 17
> 10 220.458 0.910  88.322  9.868 1.0746 13
> 11 201.228 0.910  89.202 10.328 1.0514 14
> 12 199.160 0.920  90.126  2.088 1.1326 15
> 13 135.540 0.786 121.506 19.140 0.6934  2
> 14 296.272 0.864 130.896 22.614 0.9104  6
> 15 190.766 0.840 108.050  7.336 0.8210  8
>
> I used the following commands:
>
> > b.pls <- mvr(q ~ w + c + d + r + o, data = b, method = "simpls")
> > coef.mvr(b.pls)
> , , 5 comps
>
>             q
> w  0.01993749
> c 12.42713250
> d -0.12050551
> r -0.20287088
> o  9.63670488
>
> I have found that the signs of the coefficients still cannot be explained
> by reality. For instance, the sign of w should be negative rather than
> positive.
>
> > b.pls <- mvr(q ~ c + d + r + o, data = b, method = "simpls")
> > coef.mvr(b.pls)
> , , 4 comps
>
>             q
> c 76.39196611
> d -0.06512864
> r -0.18272329
> o -3.02212146
>
> When I delete one of the components, w, the coefficient of one of the
> remaining components, o, also changes sign.
>
> As far as I understand, this kind of situation should only happen with
> ordinary regression, not with PLS regression.
>
> Is there anything wrong with my understanding?
>
> Why does this problem happen?
>
> I would appreciate your help and answers.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html