[RsR] minimum sample size for the robust counterpart of the t-test #3

Richard Friedman friedman using cancercenter.columbia.edu
Wed Jun 22 00:56:12 CEST 2011


Dear Manuel (and list),

	Thank you for doing so much work to answer my question.
Much of the theory is beyond me at present, although I plan on learning
more about robust methods with time, along with the many other fields
I have to learn. I would like to rephrase the question: your response
deals mainly with power. What I need to know more about is Type I
error. Do robust methods ever increase Type I error, i.e., lead to a
greater number of false positives, for n=5 than the classical t-test does?
In particular, does the Huber method with k=1.345? I have a particular
reason for returning to this method even though it is apparently
no longer the robust method of choice. A collaborator used it to find
significance where the classical t-test did not, and I am wondering
which test to believe.
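
One way to explore this question empirically is a small Monte Carlo
simulation under the null hypothesis. The sketch below is only an
illustration under stated assumptions: it uses a one-sample (intercept-only)
setting with n = 5 and standard normal errors, a hand-written Huber location
estimate with k = 1.345 and the scale fixed at the normalized MAD, and a
simplified plug-in standard error that is not claimed to match what rlm
reports. Both statistics are compared with the same t critical value on
4 degrees of freedom, which is also an assumption. The function names
huber_location and type1_rates are hypothetical.

```python
import math
import random
import statistics

K = 1.345           # Huber tuning constant from the question
N = 5               # sample size from the question
T_CRIT_DF4 = 2.776  # two-sided 5% critical value of Student's t, 4 df


def huber_location(x, k=K, tol=1e-8, max_iter=100):
    """Huber M-estimate of location with scale fixed at the normalized MAD.

    Returns (estimate, simplified plug-in standard error). The standard
    error formula is an assumption of this sketch, not rlm's formula.
    """
    n = len(x)
    med = statistics.median(x)
    s = statistics.median([abs(v - med) for v in x]) / 0.6745
    if s == 0:
        # Degenerate sample (more than half the values tied): no test possible.
        return med, float("inf")
    mu = med
    for _ in range(max_iter):
        # IRLS weights: 1 inside [-k*s, k*s], k*s/|residual| outside.
        w = [1.0 if abs(v - mu) <= k * s else k * s / abs(v - mu) for v in x]
        new_mu = sum(wi * v for wi, v in zip(w, x)) / sum(w)
        converged = abs(new_mu - mu) < tol * s
        mu = new_mu
        if converged:
            break
    r = [(v - mu) / s for v in x]
    psi_sq = sum(max(-k, min(k, ri)) ** 2 for ri in r) / (n - 1)
    dpsi = sum(1.0 for ri in r if abs(ri) <= k) / n
    if dpsi == 0:
        # Every residual clipped: the plug-in standard error is undefined.
        return mu, float("inf")
    return mu, s * math.sqrt(psi_sq) / (dpsi * math.sqrt(n))


def type1_rates(reps=5000, seed=1):
    """Estimate rejection rates of both tests when the true mean is 0."""
    rng = random.Random(seed)
    rej_t = rej_h = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(N)]
        t_stat = statistics.mean(x) / (statistics.stdev(x) / math.sqrt(N))
        mu, se = huber_location(x)
        rej_t += abs(t_stat) > T_CRIT_DF4
        rej_h += abs(mu / se) > T_CRIT_DF4
    return rej_t / reps, rej_h / reps


if __name__ == "__main__":
    t_rate, h_rate = type1_rates()
    print(f"classical t: {t_rate:.3f}  Huber sketch: {h_rate:.3f}")
```

A rejection rate clearly above 0.05 for the Huber-based statistic would
indicate an inflated Type I error under these particular assumptions.
Replacing rng.gauss with a heavy-tailed or skewed generator would show how
the comparison changes away from normality.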

Thanks and best wishes,
Rich



On Jun 17, 2011, at 11:55 AM, Manuel Koller wrote:

> Dear Richard,
>
> Since we did not quite cover your specific case, I ran another small
> simulation. See the attached file. It is basically the simulation
> study of our paper, but for models having only an intercept. I hope I
> did not overlook anything when I did this. There was a lot going on
> today... I apologize for the overloaded plots. I guess Figures 4, 7
> and 8 are the most interesting figures.
>
> As Rand already stated, the asymmetric error distributions are a
> problem: all the methods perform quite badly. Otherwise, the levels of
> the tests are pretty much ok (even for OLS, i.e., t-test). But of
> course, the power will be pretty bad. In numbers, for n = 5 you will
> have approximately the correct level (+/- 2%), but a power of about
> 40% only for an effect size of 1 (10% for an effect size of 0.4). And
> this does not really depend on which method you are using.
>
> To conclude, I would recommend using lmrob from robustbase with the
> argument setting="KS2011".
>
> I hope this helps,
>
> Manuel
>
> On Thu, Jun 16, 2011 at 8:19 PM, Richard Friedman
> <friedman using cancercenter.columbia.edu> wrote:
>> Rand,
>>
>>        Thanks, I know very little about robust methods. I am interested in
>> whether rlm can be used in its default state or whether I have to learn
>> much more to use the methods correctly.
>>
>> Best wishes,
>> Rich
>>
>> On Jun 16, 2011, at 2:14 PM, Rand Wilcox wrote:
>>
>>> When dealing with M-estimators and the goal is to compute confidence
>>> intervals, one thing you have to be careful about is skewed distributions.
>>> Have not encountered any non-bootstrap method that performs well in
>>> simulations where the confidence interval is based on an estimate of the
>>> standard error. Just how symmetric the distribution must be seems unclear.
>>> What works better is a percentile bootstrap method, even with fairly small
>>> sample sizes. This is why the methods in my book focus on bootstrap
>>> techniques when dealing with M-estimators.
>>>
>>>
>>> However, have not yet seen the Koller and Stahel paper. Maybe this
>>> problem has been addressed.
>>>
>>> Rand
>>>
>>> Rand Wilcox
>>> Professor
>>> Dept of Psychology
>>> USC
>>> Los Angeles, CA 90089-1061
>>>
>>> FAX: 213-746-9082
>>> For information about statistics books and software, see
>>> http://www-rcf.usc.edu/~rwilcox/
>>> as well as
>>> http://college.usc.edu/labs/rwilcox/home
>>>
>>> ----- Original Message -----
>>> From: Richard Friedman <friedman using cancercenter.columbia.edu>
>>> Date: Thursday, June 16, 2011 9:02 am
>>> Subject: Re: [RsR] minimum sample size for the robust counterpart of the t-test #2
>>> To: Rand Wilcox <rwilcox using usc.edu>, r-sig-robust using r-project.org
>>>
>>>> Dear Rand (and List),
>>>>
>>>>        I read the relevant sections of your book and, while informative,
>>>> it did not answer my question directly as best I can see. I will restate
>>>> the question more explicitly:
>>>> A robust analog of the two-sample t-test is performed with the rlm
>>>> function with the default parameters of the Huber method with k=1.345.
>>>> Is there a minimum sample size for which it should be trusted?
>>>> Are 5 samples enough? 10 samples?
>>>>
>>>> If this question does not have a simple answer please let me know.
>>>>
>>>> Thanks and best wishes,
>>>> Rich
>>>>
>>>>
>>>> On Jun 15, 2011, at 3:19 PM, Rand Wilcox wrote:
>>>>
>>>>> There is general information about sample sizes and p-values, when
>>>>> using robust analogs of t, in my 2005 book (Introduction to Robust
>>>>> Estimation and Hypothesis Testing, Academic Press).
>>>>>
>>>>> (A third edition will be out early in 2012.)
>>>>>
>>>>> Hope this helps.
>>>>>
>>>>> Rand
>>>>>
>>>>> Rand Wilcox
>>>>> Professor
>>>>> Dept of Psychology
>>>>> USC
>>>>> Los Angeles, CA 90089-1061
>>>>>
>>>>> FAX: 213-746-9082
>>>>> For information about statistics books and software, see
>>>>> http://www-rcf.usc.edu/~rwilcox/
>>>>> as well as
>>>>> http://college.usc.edu/labs/rwilcox/home
>>>>>
>>>>> ----- Original Message -----
>>>>> From: Richard Friedman <friedman using cancercenter.columbia.edu>
>>>>> Date: Wednesday, June 15, 2011 12:11 pm
>>>>> Subject: [RsR] minimum sample size for the robust counterpart of the t-test
>>>>> To: r-sig-robust using r-project.org
>>>>>
>>>>>> Dear List,
>>>>>>
>>>>>>        I am a beginner in the use of robust methods. Is there a minimum
>>>>>> sample size for which the robust analog of a two-sample t-test using
>>>>>> rlm with default parameters and categorical explanatory variables may
>>>>>> be trusted to yield reliable p-values?
>>>>>> If so, can you please point me to a reference which treats this
>>>>>> problem?
>>>>>> Thanks and best wishes,
>>>>>> Rich
>>>>>> ------------------------------------------------------------
>>>>>> Richard A. Friedman, PhD
>>>>>> Associate Research Scientist,
>>>>>> Biomedical Informatics Shared Resource
>>>>>> Herbert Irving Comprehensive Cancer Center (HICCC)
>>>>>> Lecturer,
>>>>>> Department of Biomedical Informatics (DBMI)
>>>>>> Educational Coordinator,
>>>>>> Center for Computational Biology and Bioinformatics (C2B2)/
>>>>>> National Center for Multiscale Analysis of Genomic Networks (MAGNet)
>>>>>> Room 824
>>>>>> Irving Cancer Research Center
>>>>>> Columbia University
>>>>>> 1130 St. Nicholas Ave
>>>>>> New York, NY 10032
>>>>>> (212)851-4765 (voice)
>>>>>> friedman using cancercenter.columbia.edu
>>>>>> http://cancercenter.columbia.edu/~friedman/
>>>>>>
>>>>>> I am a Bayesian. When I see a multiple-choice question on a test
>>>>>> and I don't
>>>>>> know the answer I say "eeney-meaney-miney-moe".
>>>>>>
>>>>>> Rose Friedman, Age 14
>>>>>>
>>>>>> _______________________________________________
>>>>>> R-SIG-Robust using r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-robust
>>>>>>
>>>>
>>>>
>>
>
>
>
> -- 
> Manuel Koller <koller using stat.math.ethz.ch>
> Seminar für Statistik, HG G 18, Rämistrasse 101
> ETH Zürich  8092 Zürich  SWITZERLAND
> phone: +41 44 632-4673 fax: ...-1228
> http://stat.ethz.ch/people/kollerma/
> <intercept_only.pdf>



