[Rd] predict (PR#2685)
ripley at stats.ox.ac.uk
Wed Mar 26 07:08:41 MET 2003
You forgot to give the R version, platform etc.
This is already fixed in R-devel, and your example works there provided a
valid assignment operator is used.
It is the same as PR#2206, which is marked as fixed in R-bugs.
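For reference, a sketch of the reported example with `<-` in place of the `_` assignment. The `set.seed()` call and the names `p.ok` and `p.bug` are additions for illustration and are not in the original report. On a version of R that includes the fix, both `predict()` calls should return ten fitted values without error.

```r
# Reported example rewritten with a valid assignment operator.
# set.seed() is an assumption added here so that the run is repeatable.
set.seed(1)
scrunge.data <- data.frame(contin = 1:10,
                           discrete = factor(rep(c("cat", "dog"), 5)),
                           resp = runif(10))

# Main effect written first, matching the order that terms() keeps.
lm.ok <- lm(resp ~ discrete + contin %in% discrete, data = scrunge.data)
p.ok <- predict(lm.ok, scrunge.data)

# Higher-order term written first, so terms() re-orders the formula.
lm.bug <- lm(resp ~ contin %in% discrete + discrete, data = scrunge.data)
p.bug <- predict(lm.bug, scrunge.data)
```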
On Wed, 26 Mar 2003 Mark.Bravington at csiro.au wrote:
> There is a bug in `predict' whereby the order of variables sometimes gets
> re-arranged compared to the original fit, and then disaster results.
> Specifically, the 'variables' and 'predvars' attributes of a 'terms' object
> get out of synch. This only happens when the terms in the original formula
> get re-ordered during fitting:
>
> test> scrunge.data_ data.frame( contin=1:10, discrete=factor( rep( c( 'cat', 'dog'), 5)), resp=runif( 10))
> test> lm.ok_ lm( resp ~ discrete + contin %in% discrete, data=scrunge.data)
> test> predict( lm.ok, scrunge.data) # no problemo
>          1          2          3          4          5          6          7
> 0.29663793 0.04572655 0.42661779 0.31668732 0.55659764 0.58764809 0.68657750
>          8          9         10
> 0.85860886 0.81655736 1.12956963
>
> test> lm.bug_ lm( resp ~ contin %in% discrete + discrete, data=scrunge.data)
> # terms will be re-ordered
> test> predict( lm.bug, scrunge.data)
> Error in "contrasts<-"(*tmp*, value = "contr.treatment") :
> contrasts apply only to factors
> In addition: Warning message:
> variable discrete is not a factor in: model.frame.default(object, data, xlev = xlev)
>
> This actually turns out to be a bug in `model.frame.default', to do with an
> inconsistency between `predvars' and `vars' when `model.frame.default' is
> called inside `predict'. AFAICS it can be fixed by including the commented
> line below in `model.frame.default':
>
> <<...>>
>     vars <- attr(formula, "variables")
>     predvars <- attr(formula, "predvars")
>     if (is.null(predvars))
>         predvars <- vars
>     varnames <- as.character(predvars[-1]) # MVB: was vars[-1] not predvars[-1]
>     variables <- eval(predvars, data, env)
> <<...>>
>
> This has the side-effect that there are some ugly column names in the
> model.frame if e.g. a `poly' term is used, but doesn't actually seem to hurt
> the prediction.
>
> However, that doesn't fix it all. There is still a bug in `predict', even
> after replacing `model.frame.default' with the above:
>
> test> predict( lm.bug, scrunge.data)
> Error in drop(X[, piv, drop = FALSE] %*% beta[piv]) :
> subscript out of bounds
> test> # wot???
>
> This time, the bug is in `delete.response', which calls `terms' to set most
> of the attributes including `variables', but adjusts the original `predvars'
> by hand. Because `terms' returns the variables in a different order when
> it's called by `predict.lm' to when it was originally called by `lm', things
> get out of synch.
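The re-ordering described above can be inspected directly. This is a minimal sketch, assuming only base R: it builds the terms object for the second formula and shows that the term labels are sorted by degree while the "variables" attribute keeps the order in which names appear in the formula. The names `term.order`, `term.labels` and `var.names` are illustrative only.

```r
# Inspect how terms() orders terms versus variables for the second formula.
tt <- terms(resp ~ contin %in% discrete + discrete)

term.order  <- attr(tt, "order")        # degree of each term, after sorting
term.labels <- attr(tt, "term.labels")  # main effect now precedes the %in% term
var.names   <- sapply(attr(tt, "variables"), deparse)

print(term.labels)
print(var.names)  # variables stay in order of appearance in the formula
```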
>
> This is a slightly tricky bug to fix, because `predvars' and `variables' can
> look a bit different e.g. if there are `poly' terms, but I think the
> following change near the end of `delete.response' does the trick:
>
> <<...>>
>   if (length(formula(termobj)) == 3) {
>     # Old code, reliant on maintaining the order of terms:
>     #   attr(tt, "predvars") <- attr(termobj, "predvars")[-2]
>     reorder <- match( sapply( attr( tt, 'variables'), deparse),
>                       sapply( attr( termobj, 'variables'), deparse)) # MVB
>     attr( tt, 'predvars') <- attr( termobj, 'predvars')[ reorder] # MVB
>   }
> <<...>>
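The match-on-deparsed-names idea in the proposed change can be tried on its own, outside `delete.response'. The sketch below is an illustration under stated assumptions, not the code that went into R: `reformulate()` on the sorted term labels stands in for the response-free terms object that the old code rebuilt, and the names `orig`, `new.vars` and `new.predvars` are hypothetical.

```r
# Variables as recorded at fit time (response included).
orig <- terms(resp ~ contin %in% discrete + discrete)
orig.vars <- attr(orig, "variables")   # list(resp, contin, discrete)

# Response-free terms rebuilt from the sorted term labels; this stands in
# for the rebuilt terms object and is an assumption of this sketch.
tt <- terms(reformulate(attr(orig, "term.labels")))
new.vars <- attr(tt, "variables")      # list(discrete, contin)

# Position of each rebuilt variable in the original call; element 1 is `list`.
reorder <- match(sapply(new.vars, deparse), sapply(orig.vars, deparse))

# With no "predvars" attribute, fall back to "variables", as in the
# model.frame.default excerpt quoted earlier.
orig.predvars <- attr(orig, "predvars")
if (is.null(orig.predvars)) orig.predvars <- orig.vars
new.predvars <- orig.predvars[reorder]
```

Because the index is computed by name, the response is dropped and the remaining entries follow the order of the rebuilt terms object, whatever order the original formula used.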
>
> cheers
> Mark
>
> *******************************
>
> Mark Bravington
> CSIRO (CMIS)
> PO Box 1538
> Castray Esplanade
> Hobart
> TAS 7001
>
> phone (61) 3 6232 5118
> fax (61) 3 6232 5012
> Mark.Bravington at csiro.au
>
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595