[R] Using rms Package's Predict function with transformed variables

Marc Burgess mburge@@z@ @end|ng |rom gm@||@com
Mon Jul 1 13:24:36 CEST 2019


Hello,

I would like to use the rms Package's Glm and Predict functions in
order to use some of the additional details that they provide. I'm
struggling to get them to work in cases where the formula contains
multiple transformed versions of the same variable however. For
example Y ~ x + x^2 + log(x+1).

I have tried to do a fair bit of experimentation and research on this,
but closest I have found is a stackoverflow question for which no
complete answer was ever posted
(https://stackoverflow.com/questions/42168371/rms-package-glm-how-to-add-indicators-in-the-formula).
As the post indicated, it's possible to do the fit by creating
explicit transformed variables, but then I can't come up with a way to
get them to vary together correctly when using the Predict function.

Any help would be appreciated
---
Marc Burgess

Example code:

# Generate test data:
library(rms)

set.seed(100)

testdata <- rnorm(10000, 50, 10)
ltestdata <- log(testdata+1)
testdata2 <- testdata^2

yvals <- rpois(10000, 10+ltestdata+testdata+testdata2)

# Can fit using standard glm as follows:
testfit1 <- glm(yvals ~ I(log(testdata+1)) + testdata + I(testdata^2),
family=poisson())

# Trying different methods fail for rms Glm that don't work:
ddist = datadist(testdata, ltestdata, testdata2)
options(datadist="ddist")
testfit2 <- Glm(yvals ~ I(log(testdata+1)) + testdata + I(testdata^2),
family=poisson())
testfit3 <- Glm(yvals ~ asis(log(testdata+1)) + testdata +
asis(testdata^2), family=poisson())
testfit3 <- Glm(yvals ~ asis(log(testdata+1)) + pol(testdata,2),
family=poisson())

# It does work to use the explicit transformed variables:
testfit4 <- Glm(yvals ~ ltestdata + testdata +  testdata2, family=poisson())

# We can't then use Predict with testfit4 because it doesn't make
sense to independently vary the 3 predictors
plot(Predict(testfit4, testdata=30:50, fun="exp")) # ltestdata and
testdata2 are being held constant
plot(Predict(testfit4, testdata=30:50, ltestdata=log(30:50+1),
fun="exp")) # Creates predictions for all combinations, most of which
are impossible.



More information about the R-help mailing list