[Rd] predict (PR#2685)
Mark.Bravington at csiro.au
Mark.Bravington at csiro.au
Thu Mar 27 11:42:27 MET 2003
( summary: predict now gets round problems caused by re-ordering of terms )
Sorry-- glad it's fixed & surprised I left the info off-- it's in
yesterday's other bug reports tho' :).
I know that I searched the bug tracking page yesterday to see whether the
bugs were already reported before sending them. Didn't find anything
yesterday, but there they are today in black & white. Must have done
something daft.
cheers
Mark
*******************************
Mark Bravington
CSIRO (CMIS)
PO Box 1538
Castray Esplanade
Hobart
TAS 7001
phone (61) 3 6232 5118
fax (61) 3 6232 5012
Mark.Bravington at csiro.au
#-----Original Message-----
#From: ripley at stats.ox.ac.uk [mailto:ripley at stats.ox.ac.uk]
#Sent: Wednesday, 26 March 2003 6:09 PM
#To: Mark.Bravington at csiro.au
#Cc: r-devel at stat.math.ethz.ch; R-bugs at biostat.ku.dk
#Subject: Re: [Rd] predict (PR#2685)
#
#
#You forgot to give the R version, platform etc.
#
#This is already fixed in R-devel, and your example works there
#provided a
#valid assignment operator is used.
#
#It is the same PR#2206, and is marked as fixed in R-bugs.
#
#On Wed, 26 Mar 2003 Mark.Bravington at csiro.au wrote:
#
#> There is a bug in `predict' whereby the order of variables
#sometimes gets
#> re-arranged compared to the original fit, and then disaster results.
#> Specifically, the 'variables' and 'predvars' attributes of a
#'terms' object
#> get out of synch. This only happens when the terms in the
#original formula
#> get re-ordered during fitting:
#>
#> test> scrunge.data_ data.frame( contin=1:10,
#discrete=factor( rep( c( 'cat',
#> 'dog'), 5)), resp=runif( 10))
#> test> lm.ok_ lm( resp ~ discrete + contin %in% discrete,
#data=scrunge.data)
#> test> predict( lm.ok, scrunge.data) # no problemo
#> 1 2 3 4 5
# 6 7
#> 8 9 10
#> 0.29663793 0.04572655 0.42661779 0.31668732 0.55659764
#0.58764809 0.68657750
#> 0.85860886 0.81655736 1.12956963
#>
#> test> lm.bug_ lm( resp ~ contin %in% discrete + discrete,
#data=scrunge.data)
#> # terms will be re-ordered
#> test> predict( lm.bug, scrunge.data)
#> Error in "contrasts<-"(*tmp*, value = "contr.treatment") :
#> contrasts apply only to factors
#> In addition: Warning message:
#> variable discrete is not a factor in:
#model.frame.default(object, data, xlev
#> = xlev)
#>
#> This actually turns out to be a bug in
#`model.frame.default', to do with an
#> inconsistency between `predvars' and `vars' when
#`model.frame.default' is
#> called inside `predict'. AFAICS it can be fixed by including
#the commented
#> line below in `model.frame.default':
#>
#> <<...>>
#> vars <- attr(formula, "variables")
#> predvars <- attr(formula, "predvars")
#> if (is.null(predvars))
#> predvars <- vars
#> varnames <- as.character(predvars[-1]) # MVB: was vars[-1] not
#> predvars[-1]
#> variables <- eval(predvars, data, env)
#> <<...>>
#>
#> This has the side-effect that there are some ugly column names in the
#> model.frame if e.g. a `poly' term is used, but doesn't
#actually seem to hurt
#> the prediction.
#>
#> However, that doesn't fix it all. There is still a bug in
#`predict', even
#> after replacing `model.frame.default' with the above:
#>
#> test> predict( lm.bug, scrunge.data)
#> Error in drop(X[, piv, drop = FALSE] %*% beta[piv]) :
#> subscript out of bounds
#> test> # wot???
#>
#> This time, the bug is in `delete.response', which call
#`terms' to set most
#> of the attributes including `variables', but adjusts the
#original `predvars'
#> by hand. Because `terms' returns the variables in a
#different order when
#> it's called by `predict.lm' to when it was originally called
#by `lm', things
#> get out of synch.
#>
#> This is a slightly tricky bug to fix, because `predvars' and
#`variables' can
#> look a bit different e.g. if there are `poly' terms, but I think the
#> following change near the end of `delete.response' does the trick:
#>
#> <<...>>
#> if (length(formula(termobj)) == 3) {
#> # Old code, reliant on maintaining the order of terms: attr(tt,
#> "predvars") <- attr(termobj, "predvars")[-2]
#> reorder <- match( sapply( attr( tt, 'variables'),
#deparse), sapply(
#> attr( termobj, 'variables'), deparse)) # MVB
#> attr( tt, 'predvars') <- attr( termobj, 'predvars')[
#reorder] # MVB
#> }
#> <<...>>
#>
#> cheers
#> Mark
#>
#> *******************************
#>
#> Mark Bravington
#> CSIRO (CMIS)
#> PO Box 1538
#> Castray Esplanade
#> Hobart
#> TAS 7001
#>
#> phone (61) 3 6232 5118
#> fax (61) 3 6232 5012
#> Mark.Bravington at csiro.au
#>
#> ______________________________________________
#> R-devel at stat.math.ethz.ch mailing list
#> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
#>
#
#--
#Brian D. Ripley, ripley at stats.ox.ac.uk
#Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
#University of Oxford, Tel: +44 1865 272861 (self)
#1 South Parks Road, +44 1865 272866 (PA)
#Oxford OX1 3TG, UK Fax: +44 1865 272595
#
More information about the R-devel
mailing list