[R] boostrap astronomy problem
Gregory Ruchti
gruchti at pha.jhu.edu
Mon Jan 2 18:29:01 CET 2006
Hi,
I am an astronomer and somewhat new to boostrap statistics. I understand
the basic idea of bootstrap resampling, but am uncertain if it would be
useful in my case or not. My problem consists of maximizing a likelihood
function based on the velocities of a number of stars. My assumed
distribution of velocities of these stars is:
---------------------------------------------------------------------------
dmy=function(x,v,k,t)(k+1)/(v-t)^(k+1)*(v-x)^k
---------------------------------------------------------------------------
where x would be my stellar velocities. (Essentially it is a beta
distribution.)
---------------------------------------------------------------------------
My likelihood function looks something like this:
lm<-function(x){
e<-x[1]
k<-x[2]
log(e) - n*log(k+1) + (k+1)*n*log(e-t)-k*sum(log(e-vg))
}
--------------------------------------------------------------------------
The quantities n and t are known, and vg is my velocity data. I am
minimizing this function using the function "optim" (BFGS option) to find
k and e that minimize this. Also, my data set is small, only about 50
stars. Therefore, I was thinking that I could use the boot function to
resample my data and solve the minimization for each resample. This way I
believe I'll get better estimates for standard errors and confidence
intervals. Is it safe to assume that the distributions for k and e are
approximately Normal, therefore making the bootstrap useful? I have
actually used the boot function with this set up:
--------------------------------------------------------------------------
mystat=function(s,b){
#Negative Log Likelihood Function
lm<-function(x){
e<-x[1]
k<-x[2]
log(e) - n*log(k+1) + (k+1)*n*log(e-t) -
k*sum(log(e-s[b]))
}
#Gradient of Negative Likelihood Function
glm=function(x){
e<-x[1]
k<-x[2]
c(1/e + (k+1)*(n/(e-t)) - k*sum(1/(e-s[b])),-n/(k+1) +
n*log(e-t) - sum(log(e-s[b])))
}
optim(c(480.,2.),lm,glm,method="BFGS",control=list(maxit=10000000))$par
}
#Compute Bootstrap replicates of escape velocity and kr
m2B2=boot(vg,mystat,5000)
------------------------------------------------------------------------
Does this appear to be correct for what I'd like to achieve? I have
looked at the distribution and it appears to be about Normal, but can I
say that this is true for the sampling distribution as well? Also, the
bootstrap distribution is fairly biased, should I be using "bca" or tilted
bootstrap confidence intervals? If so, I am having some trouble getting
the tilted bootstrap to work. Specifically, it is having trouble finding
"multipliers".
Also, should I be in some way taking into account my velocity distribution
when resampling? Any suggestions would be very helpful, thanks.
Thank you for your time.
Greg Ruchti
--
Gregory Ruchti
Bloomberg Center for Physics and Astronomy
Johns Hopkins University
3400 N. Charles St.
Baltimore, MD 21218-1216
gruchti at pha.jhu.edu
Tel: (410)516-8520
More information about the R-help
mailing list