[R] Earth (MARS) package with categorical predictors
Chris Wilkinson
kinsham at verizon.net
Mon Nov 11 17:58:38 CET 2013
Steve, thanks for your reply. Here is what I get.
pkg is a 4-level categorical vector.
> is.factor(pkg)
[1] TRUE
> summary(pkg)
BGA PGA QCC QFP
225 36 19 178
>
> dat <- earth(lifetime ~ pkg+pins+volts+temp+doi+logspd, degree=3) ## The other vars are continuous.
> s <- 243
> pr <- c(pkg[s],pins[s],volts[s],temp[s],doi[s],logspd[s])
> pkg[s]
[1] BGA
Levels: BGA PGA QCC QFP
> pr
[1] 1.000000 256.000000 3.300000 125.000000 2002.258105 4.890349
> pred <- predict(dat, newdata=pr)
Error : variable 'pkg' was fitted with type "factor" but type "numeric" was
supplied
Forging on regardless, first few rows of x are
pkg pins volts temp doi logspd
1 1 256 3.3 125 2002.258 4.890349
Error: get.earth.x from model.matrix.earth from predict.earth: the number 6
of columns of x
(after factor expansion) does not match the number 8 of columns of the earth
object
expanded x: pkg pins volts temp doi logspd
object$dirs: pkgPGA pkgQCC pkgQFP pins volts temp doi logspd
Possible remedy: check factors in the input data
>
pkg is being passed as the numeric value 1. I'm unsure how to specify pkg
correctly for predict. In the example you gave, does the data include a
categorical variable?
Chris
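
A note on the error above: c() cannot hold a factor alongside numbers, so pkg[s] is reduced to its integer code (1 for BGA) and predict() receives a plain numeric vector. What follows is a minimal sketch of one common workaround, assuming the predictors are collected in a data frame so that newdata keeps pkg as a factor. The simulated data, the data frame name d, and the reduced set of predictors are made up for illustration; only earth() and predict() come from the thread.

```r
library(earth)

set.seed(1)
n <- 200
# Made-up stand-in for the real data: one 4-level factor, two continuous predictors
d <- data.frame(
  pkg  = factor(sample(c("BGA", "PGA", "QCC", "QFP"), n, replace = TRUE)),
  pins = sample(c(64, 128, 256), n, replace = TRUE),
  temp = runif(n, 25, 125)
)
d$lifetime <- 100 - 0.3 * d$temp + 5 * (d$pkg == "QFP") + rnorm(n)

# c() drops the factor class: the level is replaced by its integer code
pr_bad <- c(d$pkg[1], d$pins[1], d$temp[1])
is.numeric(pr_bad)

# Fit with data= so the predictors are looked up by column name
fit <- earth(lifetime ~ pkg + pins + temp, data = d, degree = 2)

# A one-row data frame taken from the training data keeps pkg as a factor
pr_good <- d[1, c("pkg", "pins", "temp")]
is.factor(pr_good$pkg)

pred <- predict(fit, newdata = pr_good)
pred
```

Subsetting the original data frame is the least error-prone route, since the factor levels are carried over unchanged.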
-----Original Message-----
From: Stephen Milborrow [mailto:milbo at sonic.net]
Sent: Monday, November 11, 2013 7:21 AM
To: kinsham at verizon.net
Subject: [R] Earth (MARS) package with categorical predictors
See if you can provide a simple reproducible example. It's not clear
exactly what the issue is from your question. The following simple example
gives the correct response:
data(etitanic)
a <- earth(survived~., data=etitanic)
predict(a, newdata=etitanic[1,])
Regards,
Steve
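
For reference, etitanic does contain categorical predictors (pclass and sex are factors), so the example above exercises the same code path. A small sketch, assuming one wants to predict for a row built by hand rather than taken from the training data; the name new_passenger and its values are hypothetical. The key point is that each factor is created with the same levels as in the training data.

```r
library(earth)

data(etitanic)
a <- earth(survived ~ ., data = etitanic)

# Which columns are factors?
sapply(etitanic, is.factor)

# Hand-built row: factors must carry the training levels, not integer codes
new_passenger <- data.frame(
  pclass = factor("2nd", levels = levels(etitanic$pclass)),
  sex    = factor("female", levels = levels(etitanic$sex)),
  age    = 30,
  sibsp  = 0L,
  parch  = 0L
)

p <- predict(a, newdata = new_passenger)
p
```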
Date: Thu, 07 Nov 2013 23:16:18 -0500
From: Chris Wilkinson <kinsham at verizon.net>
To: r-help at r-project.org, Chris Wilkinson <kinsham at verizon.net>
Subject: [R] Earth (MARS) package with categorical predictors
It appears to be legitimate to include multi-level categorical and
continuous variables when defining the model for earth (e.g. y ~ cat +
cont1 + cont2), but is it then also possible to use categoricals in the
predict method on the earth result? I tried, but it returns an error
that is not very informative.
Thanks
Chris