[R] getting p-value and standard error in PLS
Bjørn-Helge Mevik
b.h.mevik at usit.uio.no
Wed Oct 19 09:45:18 CEST 2011
arunkumar1111 <akpbond007 at gmail.com> writes:
> How to get p-value and the standard error in PLS
There is (to my knowledge) no theory for calculating p-values for the
regression coefficients in PLS regression. Most practitioners use
cross-validation to estimate the Root Mean Squared Error of Prediction
(RMSEP) and use that as a measure of the quality of the fit. PLS
regression is typically used when you have many (hundreds, thousands,
tens of thousands) predictors, where individual p-values are not very
useful.
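As a minimal sketch of the cross-validation approach, assuming the pls
package is installed and using its bundled yarn data set (chosen here
only for illustration, it is not the poster's data), one could estimate
RMSEP like this:

```r
library(pls)

# Example data shipped with the pls package: 28 samples, 268 NIR predictors.
data(yarn)

# Fit a PLS regression with leave-one-out cross-validation.
# ncomp = 10 is an arbitrary upper limit chosen for this illustration.
fit_cv <- plsr(density ~ NIR, ncomp = 10, data = yarn, validation = "LOO")

# Cross-validated RMSEP for 0 (intercept only) up to 10 components.
rmsep_cv <- RMSEP(fit_cv)
print(rmsep_cv)

# Plot RMSEP against the number of components to help choose a model size.
plot(rmsep_cv, legendpos = "topright")
```

The number of components is then usually chosen where the
cross-validated RMSEP curve levels off.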
The pls package does implement the jackknife to estimate the
variance/standard error of the regression coefficients. There is even a
function to calculate p-values from that, but please _do_ read the
warning in the documentation: the distribution of the "t values" used in
the test is _unknown_. See the example in ?jack.test for how to use the
jackknife.
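A rough sketch of the jackknife workflow, again assuming the pls
package and its bundled yarn data rather than the poster's data; the
choice of 3 components is arbitrary and only for illustration:

```r
library(pls)
data(yarn)

# The jackknife needs the cross-validation segment models, so the fit
# must request both validation and jackknife = TRUE.
fit_jack <- plsr(density ~ NIR, ncomp = 6, data = yarn,
                 validation = "LOO", jackknife = TRUE)

# Approximate t tests for the coefficients of the 3-component model.
# Caveat (see ?jack.test): the distribution of these "t values" is
# unknown, so the p-values are only a rough guide.
jt <- jack.test(fit_jack, ncomp = 3)

# Jackknife standard errors and p-values for the first few predictors.
head(jt$sd)
head(jt$pvalues)
```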
> I have used the following function to calculate PLS
>
> fit1 <- mvr(formula=Y~X1+X2+X3+X4, data=Dataset, comp=4)
From a previous message on this list, I see that each of these predictor
terms (X1, ...) is a vector. Thus you have only 4 predictor variables,
so it would probably be better to use Ordinary Least Squares (OLS)
regression (the lm() function in R). There you get p-values automatically.
Furthermore, a PLS regression with the same number of components as
predictor variables is equivalent to OLS, so there seems no reason to
use PLS at all in your case.
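To illustrate both points, here is a small sketch on simulated data.
The data frame Dataset below is made up for this example (the column
names merely mirror the formula in the question), so only the pattern,
not the numbers, carries over to real data:

```r
library(pls)

# Simulated stand-in for the poster's data: 4 predictors, 1 response.
set.seed(1)
n <- 50
Dataset <- data.frame(X1 = rnorm(n), X2 = rnorm(n),
                      X3 = rnorm(n), X4 = rnorm(n))
Dataset$Y <- with(Dataset, 1 + 2 * X1 - X2 + 0.5 * X3 + rnorm(n))

# Ordinary least squares: standard errors and p-values come for free.
ols <- lm(Y ~ X1 + X2 + X3 + X4, data = Dataset)
print(summary(ols)$coefficients)

# PLS with as many components as predictors.
pls_fit <- plsr(Y ~ X1 + X2 + X3 + X4, ncomp = 4, data = Dataset)

# Compare the slope coefficients (intercept excluded) of the two fits.
b_ols <- unname(coef(ols)[-1])
b_pls <- unname(drop(coef(pls_fit, ncomp = 4)))
same_fit <- isTRUE(all.equal(b_ols, b_pls, tolerance = 1e-6))
print(same_fit)
```

If the predictors have full rank, the two sets of coefficients should
agree up to numerical precision.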
--
Cheers,
Bjørn-Helge Mevik