[R] package 'np' and point estimation with multiple predictors

Wed Nov 3 01:38:02 CET 2010

(disclaimer: I'm in physics, not stats... )

I have a multivariate problem: one response variable, call it R1, and 
three "predictor" variables, P1, P2, P3.
My goal is to take a load of training data (I know R1,P1,P2,P3 for about 
700 total points), and then predict R1 for a new set of data for which I 
have all the predictors.  Simple, no?

I understand how to calculate bandwidths, and I have a kind of 
bastardized way of getting the conditional distribution, i.e.,

f(R1|P1=0.8,P2=0.2,P3=2)

using

fitted(npudens(bw=bw,edat=newdata))

evaluating over a vector of R1.

I have then been using this "density" to get a maximum likelihood 
estimator of R1. I have no idea if that is really valid, and if anyone 
wants to yell at me, go ahead: I want to do this the correct way, and 
I'm sure I'm making it harder than it is.
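
For concreteness, here is a minimal sketch of what I think the 
conditional-density route looks like with npcdensbw/npcdens instead of 
npudens. Everything about the data is an assumption for illustration: 
the simulated training set, the sample size, and the grid are made up, 
and I use bwmethod="normal-reference" only so the example runs quickly 
(cross-validated bandwidths are the package default). The "point 
estimate" here is simply the R1 at which the estimated conditional 
density peaks, which may or may not be the right estimator.

```r
library(np)

# Hypothetical training data standing in for the real 700 points.
set.seed(1)
n <- 200
X <- data.frame(P1 = runif(n), P2 = runif(n), P3 = runif(n, 0, 4))
y <- X$P1 + 2 * X$P2 - 0.5 * X$P3 + rnorm(n, sd = 0.3)

# Bandwidths for the conditional density f(R1 | P1, P2, P3).
# Rule-of-thumb bandwidths are used here purely for speed.
bw <- npcdensbw(xdat = X, ydat = y, bwmethod = "normal-reference")

# Evaluate the density over a grid of R1 with the predictors held fixed
# at P1 = 0.8, P2 = 0.2, P3 = 2.
r1.grid <- seq(min(y), max(y), length.out = 200)
ex <- data.frame(P1 = rep(0.8, length(r1.grid)),
                 P2 = rep(0.2, length(r1.grid)),
                 P3 = rep(2,   length(r1.grid)))
dens <- fitted(npcdens(bws = bw, exdat = ex, eydat = r1.grid))

# Grid value of R1 at which the estimated conditional density is largest.
r1.mode <- r1.grid[which.max(dens)]
```

The resolution of r1.mode is limited by the grid spacing, so a finer 
grid (or an optimizer over R1) would be needed for a sharper estimate.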

Moving past that, the technical problem I am facing is getting a 
prediction interval from this.

There's npqreg, and I get how it works when you have one predictor, but 
what happens when you have many?

What I want to do is get the 0.05 and 0.95 quantiles for a given 
P1,P2,P3 to use as my prediction interval.
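
To make the request concrete, this is roughly the call pattern I have 
in mind, written as a sketch under assumptions rather than something I 
know to be correct. The data are simulated, the new points are made 
up, and the bandwidths come from npcdistbw with 
bwmethod="normal-reference" only to keep it fast. Which bandwidth 
object npqreg expects has differed between np releases (conditional 
density versus conditional distribution bandwidths), so the npcdistbw 
step is an assumption about the installed version. Note that the 0.05 
and 0.95 quantiles together give a 90% interval.

```r
library(np)

# Hypothetical training data standing in for the real 700 points.
set.seed(1)
n <- 200
X <- data.frame(P1 = runif(n), P2 = runif(n), P3 = runif(n, 0, 4))
y <- X$P1 + 2 * X$P2 - 0.5 * X$P3 + rnorm(n, sd = 0.3)

# Bandwidths for the conditional distribution F(R1 | P1, P2, P3).
bw.cdf <- npcdistbw(xdat = X, ydat = y, bwmethod = "normal-reference")

# New points for which all three predictors are known (made-up values).
new.x <- data.frame(P1 = c(0.8, 0.3), P2 = c(0.2, 0.6), P3 = c(2, 1))

# Conditional quantiles at each new point: one npqreg call per tau,
# with all predictors supplied together through exdat.
q05 <- npqreg(bws = bw.cdf, tau = 0.05, exdat = new.x)$quantile
q95 <- npqreg(bws = bw.cdf, tau = 0.95, exdat = new.x)$quantile

# One row per new point: predictors plus lower/upper interval bounds.
interval <- cbind(new.x, lower = q05, upper = q95)
```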

Thanks,
EM