[R] gamma distribution

pantd@unlv.nevada.edu pantd at unlv.nevada.edu
Fri Jul 29 04:20:13 CEST 2005


Hi Christoph and Uwe. Thanks for your time and guidance.
I deeply appreciate it.


-dev


Quoting Christoph Buser <buser at stat.math.ethz.ch>:

> Hi
>
> As Uwe mentioned, be careful about the difference between the
> significance level alpha and the power of a test.
>
> To do power calculations you should specify an alternative
> hypothesis H_A. For example, suppose you have two populations
> you want to compare and we assume that they are normally
> distributed (with equal but unknown variance, for simplicity).
> We are interested in whether there is a difference in the means
> and want to use t.test.
> Our null hypothesis H_0: there is no difference in the means.
>
> To do a power calculation for our test, we first have to specify
> an alternative H_A: the mean difference is 1 (unit).
> Now, for a fixed number of observations, we can calculate the
> power of our test, which in that case is the probability that
> our test is significant when the true unknown difference is 1
> (i.e. H_A is true). In other words, if I repeat the test many
> times (always taking samples with a mean difference of 1), the
> number of significant tests divided by the total number of tests
> is an estimate of the power.
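>
> (A minimal R sketch of such a power simulation; the sample size
> of 20 per group, sd = 1, and alpha = 0.05 are assumed for
> illustration only:)
>
> nsim  <- 1000
> alpha <- 0.05
> pval  <- numeric(nsim)
> for (i in 1:nsim) {
>   x <- rnorm(20, mean = 0, sd = 1)
>   y <- rnorm(20, mean = 1, sd = 1)   # H_A: true mean difference of 1
>   pval[i] <- t.test(x, y, var.equal = TRUE)$p.value
> }
> mean(pval < alpha)   # proportion of rejections = estimated power
> ## exact value for comparison:
> power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05)$power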
>
>
> In your case the situation is a little bit more complicated. You
> need to specify an alternative hypothesis.
> In one of your first examples you draw samples from two gamma
> distributions with different shape parameters and the same
> scale. But by varying the shape parameter, the two distributions
> differ not only in their mean but also in their shape.
>
> I got an email from Prof. Ripley in which he explained in detail
> and very precisely some examples of tests and what they are
> testing. It was a follow-up to the first posts about the t test
> and the Wilcoxon test.
> I have attached the email below and recommend reading it
> carefully. It might be helpful for you, too.
>
> Regards,
>
> Christoph Buser
>
> --------------------------------------------------------------
> Christoph Buser <buser at stat.math.ethz.ch>
> Seminar fuer Statistik, LEO C13
> ETH (Federal Inst. Technology)	8092 Zurich	 SWITZERLAND
> phone: x-41-44-632-4673		fax: 632-1228
> http://stat.ethz.ch/~buser/
> --------------------------------------------------------------
>
> ________________________________________________________________________
>
> From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
> To: Christoph Buser <buser at stat.math.ethz.ch>
> cc: "Liaw, Andy" <andy_liaw at merck.com>
> Subject: Re: [R] Alternatives to t-tests (was Code Verification)
> Date: Thu, 21 Jul 2005 10:33:28 +0100 (BST)
>
> I believe there is rather more to this than Christoph's account.  The
> Wilcoxon test is not testing the same null hypothesis as the t-test, and
> that may very well matter in practice; it does in the example given.
>
> The (default in R) Welch t-test tests a difference in means between two
> samples, not necessarily of the same variance or shape.  A difference in
> means is simple to understand, and is unambiguously defined at least if
> the distributions have means, even for real-life long-tailed
> distributions.  Inference from the t-test is quite accurate even a long
> way from normality and from equality of the shapes of the two
> distributions, except in very small sample sizes.  (I point my beginning
> students at the simulation study in `The Statistical Sleuth' by Ramsey and
> Schafer, stressing that the unequal-variance t-test ought to be the
> default choice as it is in R.  So I get them to redo the simulations.)
>
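> (A small R illustration of this default; t.test() performs the
> Welch test unless var.equal = TRUE is supplied.  The data below
> are arbitrary:)
>
> set.seed(1)                        # arbitrary seed, for reproducibility
> x <- rnorm(30, mean = 0,   sd = 1)
> y <- rnorm(30, mean = 0.5, sd = 2)
> t.test(x, y)                       # Welch (unequal-variance) t-test, the default
> t.test(x, y, var.equal = TRUE)     # classical pooled-variance t-test
>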
> The Wilcoxon test tests a shift in location between two samples from
> distributions of the same shape differing only by location.  Having the
> same shape is part of the null hypothesis, and so is an assumption that
> needs to be verified if you want to conclude there is a difference in
> location (e.g. in means).  Even if you assume symmetric distributions (so
> the location is unambiguously defined) the level of the test depends on
> the shapes, tending to reject equality of location in the presence of
> difference of shape.  So you really are testing equality of distribution,
> both location and shape, with power concentrated on location-shift
> alternatives.
>
> Given samples from gamma(shape=2) and gamma(shape=20) distributions, we
> know what the t-test is testing (equality of means).  What is the Wilcoxon
> test testing?  Something hard to describe and less interesting, I believe.
>
> BTW, I don't see the value of the gamma simulation as this
> simultaneously changes mean and shape between the samples.  How about
> a check that holds the mean the same:
>
> n <- 1000
> z1 <- z2 <- numeric(n)
> for (i in 1:n) {
>    x <- rgamma(40, 2.5, 0.1)
>    y <- rgamma(40, 10, 0.1*10/2.5)
>    z1[i] <- t.test(x, y)$p.value
>    z2[i] <- wilcox.test(x, y)$p.value
> }
> ## Level
> 1 - sum(z1>0.05)/1000  ## 0.049
> 1 - sum(z2>0.05)/1000  ## 0.15
>
> Here the Wilcoxon test is shown to be a poor test of equality of means.
> Christoph's simulation shows that it is able to use difference in shape as
> well as location in the test of these two distributions, whereas the
> t-test is designed only to use the difference in means.  Why compare the
> power of two tests testing different null hypotheses?
>
> I would say a very good reason to use a t-test is if you are actually
> interested in the hypothesis it tests ....
>
>
>
>
>
> pantd at unlv.nevada.edu writes:
>  > thanks for your response. btw i am calculating the power of the
>  > wilcoxon test. i divide the total no. of rejections by the no. of
>  > simulations. so for 1000 simulations, at the 0.05 level of
>  > significance, if the no. of rejections is 50 then the power will
>  > be 50/1000 = 0.05. that's why i'm importing the p values into
>  > excel.
>  >
>  > is my approach correct??
>  >
>  > thanks n regards
>  > -dev
>
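
The rejection-counting approach described in the quoted message can be
done entirely in R, without exporting the p values to Excel; a minimal
sketch (the gamma parameters, sample size, and alpha below are assumed
for illustration only):

nsim  <- 1000
alpha <- 0.05
pval  <- numeric(nsim)
for (i in 1:nsim) {
  x <- rgamma(25, shape = 2, scale = 1)
  y <- rgamma(25, shape = 4, scale = 1)  # differs in shape, hence also in mean
  pval[i] <- wilcox.test(x, y)$p.value
}
mean(pval < alpha)   # rejections / simulations = estimated power

Note, as Prof. Ripley points out above, that with different shape
parameters this is the power against an alternative that differs in
both mean and shape, not against a pure mean difference.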



