[Rd] predict (PR#2686)
ripley at stats.ox.ac.uk
Thu Mar 27 08:23:02 MET 2003
On Thu, 27 Mar 2003 Mark.Bravington at csiro.au wrote:
> <Bravington wrote:>
> #> `predict' complains about new factor levels, even if the "new" levels
> #> are merely levels in the original that didn't occur in the original fit
> #> and were sensibly dropped, and that don't occur in the prediction data
> #> either.
>
> <Ripley replied:>
> #This is intentional. The coding for factors is based on the full set of
> #levels, and should be comparable for different prediction sets.
> #
> #If you are using factors with fictitious levels the fix is obvious:
> #improve the design.
>
> There is still an inconsistency bug between `lm' and `predict.lm', though.
> `lm' intentionally overlooks inactive levels of a factor, but `predict.lm'

Only if an argument is set, and originally lm did not do so.

> doesn't, even when it legitimately could. In particular, it is a bit odd to
> have no problem predicting without a `newdata' argument even when the
> original data had inactive factor levels, but then to get an error if
> `newdata=<<original data>>' is supplied explicitly! (See example.)

Read again. predict.lm is consistent across its inputs: unlike lm it can
take variable `newdata'. As I said the intention is to be consistent
across *prediction sets*. Omitting newdata is not giving a prediction
set.
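
For readers following the thread, here is a minimal sketch of the situation under discussion. It is not the example from the bug report: the data frame `d`, its values, and the names `fit` and `nd` are hypothetical, and `droplevels()` is only available in later versions of R. Whether passing the original data as `newdata` raises an error depends on the R version, so the sketch sidesteps that by building `newdata` with exactly the levels the fit recorded.

```r
# Hypothetical data: factor `f` declares level "c" but never uses it.
d <- data.frame(
  y = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  f = factor(c("a", "a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
)

levels(d$f)              # every declared level, including the unused "c"
levels(droplevels(d$f))  # only the levels that actually occur in the data

fit <- lm(y ~ f, data = d)

# No prediction set is supplied here, so no factor coding has to be matched.
predict(fit)

# A prediction set whose factor is coded on the levels stored with the fit.
nd <- data.frame(f = factor(c("a", "b"), levels = fit$xlevels$f))
p <- predict(fit, newdata = nd)
p
```

Declaring only the levels that are really part of the design, or dropping unused ones before fitting, keeps the coding of the factor the same for the fit and for every prediction set.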
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595