<!doctype html public "-//w3c//dtd html 4.0 transitional//en">

<html>

Hello R people,

<p>I'm trying to implement the Partial Least Squares algorithm called SAMPLS

from "J. Comp.-Aided Molecular Design", 7 (1993), 587-619. It's faster than

the classical PLS algorithm for fat matrices (m>>n).
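<p>As a rough illustration of why working with XX<sup>T</sup> can pay off, here is a small R sketch. It assumes that the rows of X are the samples (few) and the columns are the descriptors (many), whichever letter is used for each dimension; the sizes 5 and 1000 are arbitrary.

<pre>
```r
# Assumption: rows of X are samples (few), columns are descriptors (many).
set.seed(1)
X <- matrix(rnorm(5 * 1000), nrow = 5, ncol = 1000)

C <- tcrossprod(X)   # same as X %*% t(X): one row and column per sample
dim(C)               # 5 5
dim(crossprod(X))    # t(X) %*% X: 1000 1000, one row and column per descriptor
```
</pre>

<p>Every later step of the algorithm only touches the small sample-by-sample matrix, which is where the saving would come from under this assumption.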

<p>Here's the algorithm from the article by Bush B. L. and Nachbar R. B.:

<br>&nbsp;&nbsp;&nbsp; X is the matrix of explanatory properties (m*n),

y the matrix of responses, h the number of latent variables extracted

<br>&nbsp;&nbsp;&nbsp; X<sup>T</sup> is the transpose of X

<br>&nbsp;&nbsp;&nbsp; x* holds the quantities for one sample (y* is the

response predicted from the derived model; I used one sample to test my R translation

against the R pls package)

<br>&nbsp;

<p>&nbsp;&nbsp; Calculate the covariance matrix C=XX<sup>T</sup>

and c*=Xx* for prediction

<br>&nbsp;&nbsp;&nbsp; y is centered and becomes y<sub>1</sub>

<br>&nbsp;&nbsp;&nbsp; y*<sub>1</sub>=0

<br>&nbsp;

<p>&nbsp;&nbsp;&nbsp; For h = 1, 2, 3, ..., hmax

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; s=Cy<sub>h</sub>

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; center s

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; working scalar for prediction

sample s*=c*<sup>T</sup>y<sub>h</sub>

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; orthogonalize s to previous

t: for g=1,...,(h-1), s=s-(t<sub>g</sub><sup>T</sup>s/t<sub>g</sub><sup>T</sup>t<sub>g</sub>)t<sub>g</sub>

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; orthogonalize s* to previous

t*: for g=1,...,(h-1), s*=s*-(t<sub>g</sub><sup>T</sup>s/t<sub>g</sub><sup>T</sup>t<sub>g</sub>)t*<sub>g</sub>

<br>&nbsp;

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; t*<sub>h</sub>=s*

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; t<sub>h</sub>=s

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; t<sub>h</sub><sup>2</sup>=t<sub>h</sub><sup>T</sup>t<sub>h</sub>

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; beta<sub>h</sub>=(t<sub>h</sub><sup>T</sup>y<sub>h</sub>)/t<sub>h</sub><sup>2</sup>

<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; update y<sub>h+1</sub>=y<sub>h</sub>-beta<sub>h</sub>t<sub>h</sub>

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; build

up prediction y*<sub>h+1</sub>=y*<sub>h</sub>+beta<sub>h</sub>t*<sub>h</sub>

<p>&nbsp;&nbsp;&nbsp; end of cycle
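<p>The cycle above can be sketched in R for any hmax as follows. This is only one reading of the pseudocode, not a verified transcription of the article: it assumes that X has one row per sample, that y is a single response, and that the projection coefficient t<sub>g</sub><sup>T</sup>s/t<sub>g</sub><sup>T</sup>t<sub>g</sub> is computed once, before s is modified, and then reused for s*. The names sampls_predict, Tmat and tstar are hypothetical.

<pre>
```r
# Sketch of a SAMPLS-style prediction for one sample (hypothetical helper).
# Assumptions: one row of X per sample, single response y,
# xstar is a numeric vector of length ncol(X).
sampls_predict <- function(X, y, xstar, hmax) {
  Xc <- scale(X, scale = FALSE)               # column-centred X
  xs <- xstar - attr(Xc, "scaled:center")     # centre x* with the training means
  yh <- as.numeric(y) - mean(y)               # y_1: centred response
  C <- tcrossprod(Xc)                         # C = X X^T
  cstar <- Xc %*% xs                          # c* = X x*
  Tmat <- matrix(0, nrow(Xc), hmax)           # previous t_g, one per column
  tstar <- numeric(hmax)                      # previous t*_g
  ystar <- 0                                  # y*_1 = 0
  for (h in seq_len(hmax)) {
    s <- as.numeric(C %*% yh)
    s <- s - mean(s)                          # centre s
    sstar <- sum(cstar * yh)                  # s* = c*^T y_h
    for (g in seq_len(h - 1)) {
      tg <- Tmat[, g]
      coef <- sum(tg * s) / sum(tg * tg)      # computed before s is changed
      s <- s - coef * tg                      # orthogonalise s to t_g
      sstar <- sstar - coef * tstar[g]        # same coefficient for s*
    }
    Tmat[, h] <- s                            # t_h = s
    tstar[h] <- sstar                         # t*_h = s*
    beta <- sum(s * yh) / sum(s * s)          # beta_h
    yh <- yh - beta * s                       # y_(h+1)
    ystar <- ystar + beta * sstar             # y*_(h+1)
  }
  ystar + mean(y)                             # undo the centring of y
}

# Example call on made-up data
set.seed(1)
X <- matrix(rnorm(10 * 50), 10, 50)
y <- rnorm(10)
sampls_predict(X, y, X[1, ], hmax = 2)
```
</pre>

<p>Storing every previous t<sub>g</sub> as a column of Tmat is one way to avoid a special case for h=2; whether this matches the article in every detail is not checked here.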

<br>----------------------------------- R-code

<br>## xe and ye are the explanatory and response matrices;

xtest and ytestsampls are the values for one sample

<p>x2&lt;-scale(xe,scale=FALSE)

<br>y2&lt;-scale(ye,scale=FALSE)

<p>lv&lt;-1

<br>xtest&lt;-as.matrix(x2[1,])

<br>t&lt;-matrix(0,nrow(ye),1)

<br>c&lt;-xe%*%t(xe)

<br>yh&lt;-y2

<br>ytestsampls&lt;-0

<br>ctest&lt;-xe%*%xtest

<br>&nbsp;

<p>for (h in 1:lv) {

<br>&nbsp;s&lt;-c%*%yh

<br>&nbsp;s&lt;-scale(s,scale=FALSE)

<br>stest&lt;-t(ctest)%*%yh

<p>## what follows works only for h=1 and 2, I know

<p>&nbsp;if (h>1) {

<br>&nbsp; s&lt;-s-( as.numeric( (t(t)%*%s) / (t(t)%*%t) ) *t )

<br>&nbsp; stest&lt;-stest-( as.numeric( (t(t)%*%s) / (t(t)%*%t) ) *ttest )

<br>&nbsp; }

<br>ttest&lt;-stest

<br>&nbsp;t&lt;-s

<br>&nbsp;t2&lt;-t(t)%*%t

<br>&nbsp;beta&lt;-t(t)%*%yh

<br>&nbsp;beta&lt;-as.numeric(beta/t2)

<br>&nbsp;

<br>ytestsampls&lt;-ytestsampls + as.numeric(beta)*(ttest)

<br>&nbsp;yh&lt;-yh-(beta*t)

<br>}

<p>ytestsampls2&lt;-ytestsampls+mean(ye)

<br>&nbsp;

<br>&nbsp;

<p>-------------------

<p>When lv (the number of latent variables extracted) is 1, there is no problem: the predicted y

(ytestsampls2) is the same as the one from the R pls package (library(pls)).

But with lv=2 there is a difference, so there is an error in my code, which

must come from the update steps.
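<p>For checking the lv=2 case without depending on a package, one option is a textbook PLS1 (NIPALS) predictor written directly in R. The sketch below is such a reference, not the code used inside library(pls); the function name pls1_predict and its arguments are hypothetical, and it assumes a single response and one row per sample.

<pre>
```r
# Textbook PLS1 (NIPALS) prediction for one sample; hypothetical helper,
# not the implementation inside library(pls).
pls1_predict <- function(X, y, xstar, ncomp) {
  mu <- colMeans(X)
  Xh <- sweep(X, 2, mu)                       # centred X, deflated in the loop
  xs <- xstar - mu                            # centred test sample, deflated too
  yh <- as.numeric(y) - mean(y)               # centred response
  pred <- mean(y)
  for (h in seq_len(ncomp)) {
    w <- as.numeric(crossprod(Xh, yh))        # weights X^T y (not normalised)
    sc <- as.numeric(Xh %*% w)                # score vector t_h
    ss <- sum(sc * sc)
    p <- as.numeric(crossprod(Xh, sc)) / ss   # loading vector
    q <- sum(sc * yh) / ss                    # coefficient of y on t_h
    sc_star <- sum(xs * w)                    # score of the test sample
    pred <- pred + q * sc_star
    Xh <- Xh - outer(sc, p)                   # deflate X
    yh <- yh - q * sc                         # deflate y
    xs <- xs - sc_star * p                    # deflate the test sample
  }
  pred
}
```
</pre>

<p>With the variables of the code above, pls1_predict(xe, ye, xe[1, ], ncomp = 2) would give a value to set against ytestsampls2 (the function centers its inputs itself, so the raw xe row is passed).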

<p>Does it come from the original algorithm or from my translation?

<p>Thanks in advance,

<p>Sorry for the length of this e-mail, and thanks for reading it to the end,

<p>--

<br>Nicolas Baurin

<p>PhD student

<br>Institut de Chimie Organique et Analytique, UPRES-A 6005

<br>Universit&eacute; d'Orl&eacute;ans, BP 6759

<br>45067 ORLEANS Cedex 2, France

<br>Tel: (33+) 2 38 49 45 77

<br>&nbsp;</html>