[Rd] predict.lm(); zero weights; follow-on to PR#3043
John Maindonald
john.maindonald at anu.edu.au
Wed May 21 10:46:26 MEST 2003
I have now looked at implications for zero weights
The code (lines 45-47)
if (type != "terms") {
if (missing(newdata))
XRinv <- qr.Q(object$qr)[, p1, drop = FALSE]
requires the more extensive change:
if (missing(newdata)&(is.null(w)|all(w>0))){
XRinv <- if(is.null(w)) qr.Q(object$qr)[, p1, drop =
FALSE]
else qr.Q(object$qr)[, p1, drop = FALSE]/sqrt(w)}
The issue here is that the object returned by qr.Q(object$qr)
has only as many rows as there are non-zero values of w.
So the easy way to fix the matter is that taken above, to revert
to the more straightforward but probably lengthier calculation
Rinv <- qr.solve(qr.R(object$qr)[p1, p1])
XRinv <- X[, piv] %*% Rinv
I was aware that this was probably an issue when I
submitted the earlier bug report, but did not not want to
delay reporting that issue.
SEs of prediction values (i.e., predicted value for a new
observation) seem to me an ambiguous notion when there
are weights. As it stands the code effectively assumes a
weight of one for SEs of prediction values. I guess this is
reasonable.
Here is code that exercises this patch:
"xy0" <-
structure(list(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
y = c(1.036, 1.883, 3.11, 3.148, 4.226,
6.291, 7.312, 7.796, 9.563, 9.563),
w = c(0.067, 0, 1.588, 0.239, 0.015,
0.938, 0.041, 0.473, 0.483, 0.044)),
.Names = c("x", "y", "w"), class = "data.frame",
row.names = c("1", "2", "3", "4", "5", "6", "7",
"8", "9", "10"))
> xy0.lm <- lm(y ~ x, weights=w, data=xy0)
> my.predict(xy0.lm, se=T)$se
1 2 3 4 5 6 7
8
0.2485517 0.2052760 0.1666100 0.1365279 0.1215779 0.1272119 0.1511453
0.1864598
9 10
0.2279254 0.2727509
John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
More information about the R-devel
mailing list