[R] locfit weights not working as expected

Fri Dec 10 21:51:49 CET 2010

Hello! I am having a problem understanding what the weights option in
the locfit command of the locfit package is doing. I
have written a sample program which illustrates the issue (below). The
example involves using bootstrap however, that is not my main
goal but it illustrates where my problem lies.

As you know, to compute a bootstrap estimate of a particular quantity
using a sample size of size N, I would sample a multinomial vector
which should be of size N with possible values 1:N. I should get the
exact same answer whether I a) estimate the quantity using this
multinomial vector as the indices i.e. literally make a new sample by
choosing each element according to the number of time in the
corresponding index or b) estimate the quantity using the original
data and specify this multinomial vector as the "weights". In all
commands that I have tried, the weights functions works such that this
is always true. This is not true for the locfit command. The models
are different, the coefficients are different, the predictions are
different. I have tried several combinations of things including
specifying different options for the evaluation points, different
kernels, different datasets to predict and I cannot find any way to
make these equal. For my particular simulation, I need to understand
exactly what the weights option is doing since it is clearly not doing
what I expected it to. I would gratefully appreciate any advice or
help that anyone can give on this issue and I appreciate your time very much.

library(MASS)
library(splines)
library(sm)
library(quantreg)
library(locfit)

set.seed(20)
Zi = rnorm(5000,0, 2)
ei = mvrnorm(5000, c(.6,2), matrix(c(.7,0,0,.9),2,2))
X1i = exp(-.5*Zi + ei[,1])
X2i = exp(-.5*(Zi*log(T1i) -log(T1i) + Zi) + ei[,2])
X1i = log(X1i)

boot.weight = rmultinom(1,5000, prob=rep(1/5000, 5000))
boot.subset = rep(1:5000, boot.weight)

loc.model.1 = locfit(1*(X2i[boot.subset]< 10) ~ lp(X1i[boot.subset],
Zi[boot.subset], style = c      ("n","cpar"), h=0.4, deg = 1, nn=0), family
= "binomial", link = "logit")

loc.model.2 = locfit(1*(X2i< 10) ~ lp(X1i, Zi, style = c("n","cpar"),
h=0.4, deg = 1,nn=0), family = "binomial", link = "logit",
weights=boot.weight)
print(loc.model.1)
print(loc.model.2)

t.vector = log(seq(0.5,10, length=300))
p.0.1 = predict(loc.model.1, newdata=cbind(t.vector, rep(0,length(t.vector))))
p.0.2 = predict(loc.model.2, newdata=cbind(t.vector, rep(0,length(t.vector))))

#quantity below should be vector of zeros but it is not.
p.0.1-p.0.2