[Rd] (PR#8877) predict.lm does not have a weights argument for
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed May 24 11:36:11 CEST 2006
On Wed, 24 May 2006, Peter Dalgaard wrote:
> ripley at stats.ox.ac.uk writes:
>
>> (a) case weights: w_i = 3 means `I have three observations like (y, x)'
>>
>> (b) inverse-variance weights, most often an indication that w_i = 3
>> means that y_i is actually the average of 3 observations at x_i.
>>
>> (c) multiple imputation, where a case with missing values in x is split
>> into say 5 parts, with case weights less than and summing to one.
>>
>> (d) Heteroscedasticity, where the model is rather
>>
>> y = x\beta + \epsilon, \epsilon \sim N(0, \sigma^2(x))
>>
>> And there may well be other scenarios, but those are the most common (in
>> decreasing order) in my experience.
>
> I'd have (d) higher on the list, but never mind. There's also

I find that if you detect heteroscedasticity, then one of the following
applies:

- a transformation of y would be beneficial;
- a non-normal model, e.g. a Poisson regression, is more appropriate;
- the error variance really depends on y or E(y) rather than x, as in most
  measurement-error scenarios (and in the example in ?nls and the example
  in the addendum to the bug report);
- in analytical chemistry, as in the example in the addendum to the bug
  report, there are errors in both y and x to consider, and a functional
  relationship model is better.

So I very rarely get as far as predicting from a heteroscedastic
regression model.

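The first two alternatives above can be sketched roughly as follows. This is only an illustration under stated assumptions: the simulated counts, the +1 offset inside the log, and the names fit_log, fit_pois and new are hypothetical and do not come from the bug report.

```r
# Hypothetical simulated counts whose variance grows with the mean.
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- rpois(length(x), lambda = exp(0.3 * x))

# Alternative 1: transform y. The +1 guards against log(0) and is an
# assumption made for this sketch, not a recommendation.
fit_log <- lm(log(y + 1) ~ x)

# Alternative 2: model the counts directly with a Poisson regression.
fit_pois <- glm(y ~ x, family = poisson)

# Predictions on the response scale at two new x values.
new <- data.frame(x = c(2.5, 7.5))
pred_pois <- predict(fit_pois, newdata = new, type = "response")
pred_pois
```

Neither fit needs a weights argument: the unequal variance is handled by the transformation or by the Poisson variance function rather than by weighting.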
> (e) Inverse probability weights: Knowing that part of the population
> is undersampled and wanting results that are compatible with what you
> would have gotten in a balanced sample. Prototypically: You sample X,
> taking only a third of those with X > c; find population mean of X,
> (or univariate regression on some other variable, which is only
> recorded in the subsample).

I would call this an example of case weights: you are just weighting cases
and saying `I have 1/p like this'. In rlm there is a difference between
(a) and (b), and you would want to use wt.method="case" for (e).
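
A minimal sketch of both points, on made-up data. The vectors pop, samp and ipw, the data frame d, and the fitted-object names are all hypothetical. The first part treats (e) as case weights of 1/p; the second shows that lm() with integer weights gives the same coefficients as repeating each row w times, but not the same residual degrees of freedom, which is where (a) and (b) part ways.

```r
# (e) Inverse probability weights: the population is X = 1..12, and only
# a third of the cases with X > 6 are sampled.
pop  <- 1:12
samp <- c(1:6, 8, 11)             # 8 and 11 stand in for 7..12
ipw  <- ifelse(samp > 6, 3, 1)    # weight 1/p, with p = 1/3 above the cut
mean(samp)                        # unweighted mean is pulled downwards
weighted.mean(samp, ipw)          # agrees with mean(pop)

# (a) versus (b): lm() reads 'weights' as inverse-variance weights, while
# case weights behave like repeating each row w times.
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(1.1, 1.9, 3.2, 3.8, 5.3),
                w = c(3, 1, 2, 1, 3))
fit_invvar <- lm(y ~ x, data = d, weights = w)
d_rep      <- d[rep(seq_len(nrow(d)), times = d$w), ]
fit_case   <- lm(y ~ x, data = d_rep)

all.equal(coef(fit_invvar), coef(fit_case))   # same coefficients
df.residual(fit_invvar)                       # n - 2
df.residual(fit_case)                         # sum(w) - 2
```

With MASS loaded, rlm(y ~ x, data = d, weights = w, wt.method = "case") requests the case-weight interpretation directly; the default there is wt.method = "inv.var".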

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595