[R] Robust variance estimation with rq (failure of the bootstrap?)

James Shaw shawjw at gmail.com
Tue Mar 1 12:35:41 CET 2011


Matt:

Thanks for your prompt reply.

The disparity between the bootstrap and sandwich variance estimates
derived when modeling the highly skewed outcome suggests that either
(A) the empirical robust variance estimator is underestimating the
variance or (B) the bootstrap is breaking down.  The bootstrap
variance estimate of a robust location estimate is not necessarily
robust; see Statistics & Probability Letters 50 (2000) 49-53.  Since
submitting my earlier post, I have noticed that the robust kernel
variance estimate is similar to the bootstrap estimate.  Under what
conditions would one expect Koenker and Machado's sandwich variance
estimator, which uses a local estimate of the sparsity, to fail?
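
To make the comparison above concrete, here is a minimal sketch for the
simplest case, an intercept-only median "regression" (the sample median),
where the sandwich form should reduce to tau(1-tau) * s^2 / n with the
sparsity s estimated by a difference quotient of empirical quantiles.
It is written in plain Python rather than with quantreg, and everything
in it is an assumption made for illustration: the simulated lognormal
outcome, the bandwidth rule, and the helper names empirical_quantile,
sparsity_se and bootstrap_se. It is not the estimator that summary.rq
actually computes.

```python
import math
import random
import statistics


def empirical_quantile(sorted_xs, p):
    """Linearly interpolated empirical quantile of pre-sorted data."""
    pos = p * (len(sorted_xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])


def sparsity_se(xs, tau=0.5, h=None):
    """Asymptotic SE of the tau-th sample quantile from a local sparsity estimate."""
    n = len(xs)
    if h is None:
        h = 0.5 * n ** (-1 / 3)  # illustrative bandwidth only (an assumption)
    s = sorted(xs)
    lo_p, hi_p = max(tau - h, 0.0), min(tau + h, 1.0)
    # Difference quotient of the quantile function approximates 1 / density.
    sparsity = (empirical_quantile(s, hi_p) - empirical_quantile(s, lo_p)) / (hi_p - lo_p)
    return math.sqrt(tau * (1 - tau) / n) * sparsity


def bootstrap_se(xs, reps=2000, rng=None):
    """Naive nonparametric bootstrap SE of the sample median."""
    rng = rng or random.Random(0)
    n = len(xs)
    medians = [statistics.median(rng.choices(xs, k=n)) for _ in range(reps)]
    return statistics.stdev(medians)


# Simulated, heavily right-skewed outcome with n = 124 (hypothetical data).
rng = random.Random(2011)
y = [rng.lognormvariate(0.0, 1.5) for _ in range(124)]

se_sandwich = sparsity_se(y)
se_boot = bootstrap_se(y, rng=rng)
print(f"sparsity-based SE: {se_sandwich:.3f}   bootstrap SE: {se_boot:.3f}")
```

Varying h in sparsity_se shows how sensitive the sparsity-based
estimate is to the width of the local window, which is one place to
look when it disagrees with the bootstrap.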

--
Jim



On Mon, Feb 28, 2011 at 8:59 PM, Matt Shotwell <matt at biostatmatt.com> wrote:
> Jim,
>
> If repeated measurements on patients are correlated, then resampling all
> measurements independently induces an incorrect sampling distribution
> (=> incorrect variance) on a statistic of these data. One solution, as
> you mention, is the block or cluster bootstrap, which preserves the
> correlation among repeated observations in resamples. I don't
> immediately see why the cluster bootstrap is unsuitable.
>
> Beyond this, I would be concerned about *any* variance estimates that
> are blind to correlated observations.
>
> The bootstrap variance estimate may be larger than the asymptotic
> variance estimate, but that alone isn't evidence to favor one over the
> other.
>
> Also, I can't justify (to myself) why skew would hamper the quality of
> bootstrap variance estimates. I wonder how it affects the sandwich
> variance estimate...
>
> Best,
> Matt
>
> On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote:
>> I am fitting quantile regression models using data collected from a
>> sample of 124 patients.  When modeling cross-sectional associations, I
>> have noticed that nonparametric bootstrap estimates of the variances
>> of parameter estimates are much greater in magnitude than the
>> empirical Huber estimates derived using summary.rq's "nid" option.
>> The outcome variable is severely skewed, and I am afraid that this may
>> be affecting the consistency of the bootstrap variance estimates.  I
>> have read that the m out of n bootstrap can be used to overcome this
>> problem.  However, this procedure requires both the original sample
>> (n) and the subsample (m) sizes to be large.  The version implemented
>> in boot.rq does not appear to provide any improvement over the naive
>> bootstrap.  Ultimately, I am interested in using median regression to
>> model changes in the outcome variable over time.  summary.rq's robust
>> variance estimator is not applicable to repeated-measures data.  I
>> question whether the block (cluster) bootstrap variance estimator,
>> which can accommodate intraclass correlation, would perform well.  Can
>> anyone suggest alternatives for variance estimation in this situation?
>> Regards,
>>
>> Jim
>>
>>
>> James W. Shaw, Ph.D., Pharm.D., M.P.H.
>> Assistant Professor
>> Department of Pharmacy Administration
>> College of Pharmacy
>> University of Illinois at Chicago
>> 833 South Wood Street, M/C 871, Room 266
>> Chicago, IL 60612
>> Tel.: 312-355-5666
>> Fax: 312-996-0868
>> Mobile Tel.: 215-852-3045
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
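
The cluster (block) bootstrap discussed in the quoted messages can be
sketched as follows: resample whole patients with replacement, keep all
of each sampled patient's repeated measurements together, and recompute
the statistic on the pooled resample. This is a plain Python
illustration for the pooled median, not quantreg code. The simulated
data (124 patients, 4 visits each, a shared patient-level effect) and
the function names cluster_bootstrap_se and naive_bootstrap_se are
assumptions made for the example.

```python
import math
import random
import statistics


def cluster_bootstrap_se(clusters, stat=statistics.median, reps=1000, rng=None):
    """Bootstrap SE of `stat`, resampling whole clusters with replacement."""
    rng = rng or random.Random(0)
    ids = list(clusters)
    estimates = []
    for _ in range(reps):
        drawn = rng.choices(ids, k=len(ids))
        # Keep each drawn patient's measurements together to preserve
        # the within-patient correlation in the resample.
        pooled = [v for i in drawn for v in clusters[i]]
        estimates.append(stat(pooled))
    return statistics.stdev(estimates)


def naive_bootstrap_se(clusters, stat=statistics.median, reps=1000, rng=None):
    """Bootstrap SE that (incorrectly) treats all measurements as independent."""
    rng = rng or random.Random(0)
    flat = [v for vals in clusters.values() for v in vals]
    estimates = [stat(rng.choices(flat, k=len(flat))) for _ in range(reps)]
    return statistics.stdev(estimates)


# Hypothetical repeated-measures data: a shared patient effect induces
# strong correlation among the 4 measurements on each patient.
rng = random.Random(42)
patients = {}
for pid in range(124):
    effect = rng.gauss(0.0, 1.0)
    patients[pid] = [math.exp(effect + rng.gauss(0.0, 0.3)) for _ in range(4)]

se_cluster = cluster_bootstrap_se(patients, rng=rng)
se_naive = naive_bootstrap_se(patients, rng=rng)
print(f"cluster bootstrap SE: {se_cluster:.3f}   naive bootstrap SE: {se_naive:.3f}")
```

With positively correlated measurements, the naive resampling scheme
would be expected to give the smaller standard error, because it treats
the 496 measurements as if they were independent observations.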





