[Rd] [R] predict()

peter dalgaard pdalgd at gmail.com
Fri Apr 15 09:10:25 CEST 2011


On Apr 14, 2011, at 16:52 , Ivo Shterev wrote:

> Dear Dr. Therneau,
> 
> Thank you for your response. Just to point out that we didn't experience any problems with the lm() function under R version 2.12.2 (2011-02-25):

I couldn't reproduce it from Terry's description either, but there _is_ an issue which parallels the coxph one: In model.frame.lm, we have

function (formula, ...) 
{
    dots <- list(...)
    nargs <- dots[match(c("data", "na.action", "subset"), names(dots), 
        0)]
    if (length(nargs) || is.null(formula$model)) {
        fcall <- formula$call
        fcall$method <- "model.frame"
        fcall[[1L]] <- as.name("lm")
        fcall[names(nargs)] <- nargs
        env <- environment(formula$terms)
        if (is.null(env)) 
            env <- parent.frame()
        eval(fcall, env, parent.frame())
    }
    else formula$model
}
<environment: namespace:stats>

So under some circumstances, we evaluate formula$call (where "formula" is actually an "lm" object --- and age-old design infelicity) with modified arguments. This evaluation takes place in the environment of the model formula, nested in the parent frame, but neither necessarily contains the arguments to formula$call:

> t3
function(dat,fm)
 {
   model.frame(lm(fm,data=dat),subset=1:5)
 }
> testdat
        otime event          x
1  0.84345726     0 -0.4456620
2  0.57661027     0  1.2240818
3  1.32905487     0  0.3598138
4  0.03157736     0  0.4007715
5  0.05621098     0  0.1106827
6  0.31650122     1 -0.5558411
7  0.31422729     1  1.7869131
8  0.14526680     1  0.4978505
9  2.72623646     1 -1.9666172
10 0.02915345     1  0.7013559
> fm2
otime ~ x
> t3(testdat,fm2)
Error in model.frame(formula = fm, data = dat, subset = 1:5, drop.unused.levels = TRUE) : 
  object 'fm' not found

It looks more than a bit iffy to try and set this right. One suggestion could be to explicitly replace the formula part of fcall with the formula contained in the model object. Or, the original evaluation frame should be kept with the model object and used rather than parent.frame(). 

This should probably move to r-devel.



> 
>> set.seed(123)
>> testdat=data.frame(y=rexp(10),event=rep(0:1,each=5),x=rnorm(10))
>> testfm=as.formula('y~x')
>> 
>> testfun=function(dat,fm)
> +   {
> +     predict(lm(fm,data=dat),newdata=dat)
> +   }
>> testfun(testdat,testfm)
>          1           2           3           4           5           6 
> 1.00414971  0.07061335  0.55381658  0.53091759  0.69310319  1.06574974 
>          7           8           9          10 
> -0.24405980  0.47664172  1.85449993  0.36286396 
>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: x86_64-pc-linux-gnu (64-bit)
> 
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
> [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
> [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8   
> [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
> [9] LC_ADDRESS=C               LC_TELEPHONE=C            
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
> 
> loaded via a namespace (and not attached):
> [1] tools_2.12.2
>> 
> 
> 
> Regards
> Ivo
> 
> 
> 
> --- On Thu, 4/14/11, Terry Therneau <therneau at mayo.edu> wrote:
> 
>> From: Terry Therneau <therneau at mayo.edu>
>> Subject: Re: predict()
>> To: "Ivo Shterev" <idc318 at yahoo.com>
>> Cc: r-help at r-project.org
>> Date: Thursday, April 14, 2011, 2:59 PM
>> --- begin included message ---
>> I am experimenting with the function predict() in two
>> versions of R and
>> the R extension package "survival".
>> 
>> library(survival)
>> 
>> set.seed(123)
>> testdat=data.frame(otime=rexp(10),event=rep(0:1,each=5),x=rnorm(10))
>> testfm=as.formula('Surv(otime,event)~x')
>> 
>> testfun=function(dat,fm)
>>   {
>>    
>> predict(coxph(fm,data=dat),type='lp',newdata=dat)
>>   }
>> 
>> -- end inclusion ----
>> 
>> The question was: this works under survival 2.35-8, but
>> not survival
>> 2.36-5
>> 
>> Answer: The predict and underlying model.frame functions
>> for coxph were
>> brought into line with lm and other models.  The major
>> advantage is that
>> I now deal with factors and data dependent predictors like
>> ns()
>> correctly.
>>   You've shown a disadvantage of which I was not
>> aware.  Using your
>> example but replacing coxph() by lm() with otime ~x as the
>> model I get a
>> similar failure.  I'd like to ask a wider audience of
>> R-devel since it
>> is bigger than coxph.
>> 
>> Terry Therneau
>> 
>> 
>> 
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



More information about the R-devel mailing list