[R] How to formulate quadratic function with interaction terms for the PLS fitting model?

Sun Jul 16 17:47:04 CEST 2017

??
If I haven't misunderstood, they are completely different!

1) NIR must be a matrix, or poly(NIR, ...) will fail.
2) Due to the previously identified bug in poly, degree must be
explicitly named, as in poly(NIR, degree = 2, raw = TRUE).
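
A minimal sketch of points 1) and 2), assuming a small random matrix X
stands in for NIR (X is hypothetical and is not the gasoline data). It
only shows which columns poly() produces for a matrix argument when
degree is named explicitly:

```r
# Hypothetical stand-in for NIR: a 20 x 3 numeric matrix.
set.seed(42)
X <- matrix(runif(60), ncol = 3)

# Name 'degree' explicitly rather than passing it positionally.
P <- poly(X, degree = 2, raw = TRUE)

ncol(P)      # 9 columns: 3 linear, 3 pure quadratic, 3 pairwise products
colnames(P)  # each name encodes the exponents applied to the columns of X

# With raw = TRUE, the "1.1.0" column is the plain product of columns 1 and 2.
all.equal(unname(P[, "1.1.0"]), X[, 1] * X[, 2])
```

With raw = FALSE the columns would instead be orthogonal polynomials, so
the last comparison would not be expected to hold.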

Now consider the following example:

> df <-matrix(runif(60),ncol=3)
> y <- runif(20)
> mdl1 <-lm(y~df*I(df^2))
> mdl2 <-lm(y~df*poly(df,degree=2,raw=TRUE))
> length(coef(mdl1))
[1] 16
> length(coef(mdl2))
[1] 40

Explanation:
In mdl1, I(df^2) gives the squared values of the 3 columns of df. The
formula df*I(df^2) gives the 3 (linear) terms of df, the 3 pure
quadratics of I(df^2), the 9 cubic terms obtained by crossing these,
and the constant coefficient: 3 + 3 + 9 + 1 = 16 coefs.

In mdl2, the poly() expression gives 9 variables: 3 linear, 3 pure
quadratic, and 3 interactions (1.2, 1.3, 2.3) of these. The df*poly()
term then gives the 3 linear terms of df, the 9 terms of poly(),
the 27 crossings between these, and the constant coef:
3 + 9 + 27 + 1 = 40 coefs. Many of these will be NA since terms are
repeated (e.g. the 3 linear terms of poly() and df) and therefore
cannot be estimated.
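
The redundancy can be checked directly. The following is a sketch only,
assuming a hypothetical random matrix X with 200 rows (more rows than
coefficients, so that any NA comes from repeated terms rather than from
too few observations). The 40 columns of the second model are all
monomials of degree at most 3 in three variables, of which only 20 are
distinct (counting the constant), so roughly half of the coefficients
would be expected to come back NA:

```r
# Hypothetical data: 200 observations on 3 predictors held in a matrix.
set.seed(1)
X <- matrix(runif(600), ncol = 3)
y <- runif(200)

mdl1 <- lm(y ~ X * I(X^2))
mdl2 <- lm(y ~ X * poly(X, degree = 2, raw = TRUE))

length(coef(mdl1))        # 16, as in the example above
length(coef(mdl2))        # 40, as in the example above

# Coefficients lm() could not estimate because their columns repeat others.
sum(is.na(coef(mdl1)))
sum(is.na(coef(mdl2)))
names(coef(mdl2))[is.na(coef(mdl2))]
```

If only a full quadratic (linear, squared and pairwise interaction
terms) is wanted, y ~ poly(X, degree = 2, raw = TRUE) on its own gives
those 9 terms plus the intercept without the repeated columns.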

Have I totally misunderstood what you meant or committed some other blunder?

Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sun, Jul 16, 2017 at 7:36 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
>> On Jul 13, 2017, at 7:43 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>>
>> Below.
>>
>> -- Bert
>> Bert Gunter
>>
>>
>>
>> On Thu, Jul 13, 2017 at 3:07 AM, Luigi Biagini <luigi.biagini at gmail.com> wrote:
>>> I have two ideas about it.
>>>
>>> 1-
>>> i) Entering variables in quadratic form is done with the command
>>> I(variable^2):
>>> plsr(octane ~ NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation =
>>> "LOO")
>>> You could also use a new variable NIR_sq <- NIR^2
>>>
>>> ii) To insert a squared variable, use the syntax I(x^2). It is very
>>> important to put I before the parentheses.
>>
>> True, but better I believe: see ?poly.
>> e.g. poly(cbind(x1,x2,x3), degree = 2, raw = TRUE) is a full quadratic
>> polynomial in x1,x2,x3 .
>>
>
> Is there any real difference between
>
> octane ~ NIR * I(NIR^2)
> octane ~ NIR * poly(NIR, degree=2, raw=TRUE)
>
> ?
> (I thought that adding raw = TRUE prevented the beneficial process of centering the second degree terms.)
> __
> David
>>
>>>
>>> iii) If you want the interaction between x and x^2, use the
>>> operator ":" -> x:I(x^2)
>>>
>>> iv) For the main effects plus all interactions between x and x^2, use
>>> the operator "*" -> x*I(x^2)
>>>
>>> i) plsr(octane ~ NIR + NIR_sq, ncomp = 10, data = gasTrain, validation =
>>> "LOO")
>>> ii) plsr(octane ~ NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation
>>> = "LOO")
>>> iii) plsr(octane ~ NIR:I(NIR^2), ncomp = 10, data = gasTrain, validation
>>> = "LOO")
>>> iv) plsr(octane ~ NIR*I(NIR^2), ncomp = 10, data = gasTrain, validation
>>> = "LOO")
>>>
>>> 2 - For your regression, did you plan to use MARS instead of PLS?
>>>
>>>
>>>
>>>
>>> Dear all,
>>>> I am using the pls package of R to perform partial least square on a set of
>>>> multivariate data.  Instead of fitting a linear model, I want to fit my
>>>> data with a quadratic function with interaction terms.  But I am not sure
>>>> how.  I will use an example to illustrate my problem:
>>>> Following the example in the PLS manual:
>>>> ## Read data
>>>> data(gasoline)
>>>> gasTrain <- gasoline[1:50,]
>>>> ## Perform PLS
>>>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO")
>>>> where octane ~ NIR is the model that this example fits.
>>>> NIR is a collection of variables, i.e. the NIR spectra consist of 401
>>>> diffuse reflectance measurements from 900 to 1700 nm.
>>>> Instead of fitting with octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ...
>>>> I want to fit the data with:
>>>> octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... +
>>>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ...
>>>> i.e. quadratic with interaction terms.
>>>> But I don't know how to formulate this.
>>>> May I have some help please?
>>>> Thanks,
>>>> Kelvin
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> David Winsemius
> Alameda, CA, USA
>