[R] significance testing for the difference in the ratio of means

Sat Jun 15 01:13:42 CEST 2013

Sigh...

(Again!) These are primarily statistical, not R, issues.  I would urge
that you seek local statistical help. You appear to be approaching
this with a good deal of semi-informed adhoc-ery. Standard methodology
should be applicable, but it would be presumptuous and ill-advised of
me to offer specifics remotely without understanding in detail the
goals of your research, the nature of your design (e.g. protocols,
randomization?), and the behavior of your data (what do appropriate
plots tell you??).

Others may be bolder. Proceed at your own risk.

Cheers,
Bert

On Fri, Jun 14, 2013 at 2:07 PM, Rahul Mahajan <mahajanr at vcu.edu> wrote:
> I have a question regarding significance testing for the difference in the
> ratio of means.
> The data consists of a control and a test group, each with and without
> treatment.  I am interested in testing if the treatment has a significantly
> different effect (say, in terms of fold-activation) on the test group
> compared to the control.
>
> The form of the data, with arbitrary n and not assuming equal variance:
>
> m1 = mean of (control group), n = 7
> m2 = mean of (control group w/ treatment), n = 10
> m3 = mean of (test group), n = 8
> m4 = mean of (test group w/ treatment), n = 9
>
> H0: m2/m1 = m4/m3
> restated,
> H0: m2/m1 - m4/m3 = 0
>
> Method 1: Fieller's intervals
> Use Fieller's theorem, available in R as part of the mratios package.  This
> is a promising way to compute standard errors/confidence intervals for each
> of the two ratios, but it will not yield p-values for significance testing.
> Declaring significance by non-overlap of confidence intervals is too
> stringent a test and will lead to frequent type II errors.
>
> Method 2: Bootstrap
> Abandoning an analytical solution, we try a numerical one.  I can
> repeatedly (1000 or 10,000 times) draw samples with replacement of size
> 7, 10, 8, 9 from the four groups underlying m1, m2, m3, m4 respectively.
> In each iteration, I can compute the ratios m2/m1 and m4/m3 as well as
> their difference.  The standard deviations of the m2/m1 and m4/m3
> bootstrap distributions give me standard errors for these two ratios.
> Then I can see where "0" falls on the third distribution, the distribution
> of the difference of the ratios.  If 0 falls in one of the tails, beyond
> the 2.5th or 97.5th percentile, I can declare a significant difference
> between the two ratios.  My question here is whether I can correctly
> report the percentile location of "0" as the p-value.
>
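The resampling described in Method 2 could be sketched roughly as follows. This is only an illustration in Python using the standard library, not a recommendation of the method: the four data vectors are made-up placeholders with the stated group sizes, and the function name `bootstrap_ratio_difference` and its arguments are hypothetical. It reports the fraction of bootstrap differences below 0 (the "percentile location of 0"); doubling the smaller tail is one common convention for a two-sided figure, and whether that figure is a valid p-value for this design is exactly the open question.

```python
import random
import statistics

# Made-up placeholder data with the group sizes from the post (7, 10, 8, 9).
ctrl = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0]
ctrl_trt = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.5, 2.3, 2.0, 1.9]
test = [1.1, 0.9, 1.0, 1.2, 1.4, 0.7, 1.0, 1.1]
test_trt = [2.3, 2.6, 2.2, 2.5, 2.4, 2.8, 2.6, 2.1, 2.5]


def bootstrap_ratio_difference(ctrl, ctrl_trt, test, test_trt,
                               n_boot=2000, seed=1):
    """Bootstrap m2/m1, m4/m3 and their difference, resampling within groups."""
    rng = random.Random(seed)

    def resampled_mean(values):
        # Draw len(values) observations with replacement and average them.
        return statistics.fmean(rng.choices(values, k=len(values)))

    ratios_ctrl, ratios_test, diffs = [], [], []
    for _ in range(n_boot):
        r_ctrl = resampled_mean(ctrl_trt) / resampled_mean(ctrl)  # m2/m1
        r_test = resampled_mean(test_trt) / resampled_mean(test)  # m4/m3
        ratios_ctrl.append(r_ctrl)
        ratios_test.append(r_test)
        diffs.append(r_ctrl - r_test)

    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]        # approx. 2.5th percentile
    upper = diffs[int(0.975 * n_boot) - 1]    # approx. 97.5th percentile
    frac_below_zero = sum(d < 0 for d in diffs) / n_boot
    return {
        "se_ctrl_ratio": statistics.stdev(ratios_ctrl),
        "se_test_ratio": statistics.stdev(ratios_test),
        "ci_diff": (lower, upper),
        "frac_below_zero": frac_below_zero,
        # Assumed convention: double the smaller tail, capped at 1.
        "two_sided": min(1.0, 2 * min(frac_below_zero, 1 - frac_below_zero)),
    }


result = bootstrap_ratio_difference(ctrl, ctrl_trt, test, test_trt)
print(result)
```

Each group is resampled separately, so the unequal group sizes and variances carry through to the bootstrap distributions.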
> Method 3: Permutation test
> I understand the best way to obtain a p-value for the significance test
> would be to resample under the null hypothesis.  However, as I am comparing
> ratios of means, I do not have individual observations of each ratio to
> randomize between the groups.  The best I can think to do is create an
> exhaustive list of all (7x10) = 70 possible pairwise ratios for m2/m1 from
> the data, then create a similar list of all (8x9) = 72 possible pairwise
> ratios for m4/m3.  Pool all (70+72) = 142 values and repeatedly randomly
> assign them to two groups of size 70 and 72 to represent the two ratios,
> and compute the difference in means.  This distribution could represent the
> distribution under the null hypothesis, and I could then measure where my
> observed value falls to compute the p-value.  This, however, makes me
> uncomfortable, as it seems to treat the data as a "mean of ratios" rather
> than a "ratio of means".
>
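The "mean of ratios" versus "ratio of means" concern in Method 3 can be checked numerically. The sketch below uses made-up placeholder data (sizes 7 and 10) and hypothetical helper names. It builds the all-pairs list of 70 ratios and compares its mean with the ratio of the two group means. For strictly positive data, the mean of the pairwise ratios equals mean(numerator) times mean(1/denominator), which is never smaller than the ratio of means, so the two quantities generally differ.

```python
import statistics
from itertools import product

# Made-up placeholder data: control (n = 7) and control w/ treatment (n = 10).
ctrl = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0]
ctrl_trt = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.5, 2.3, 2.0, 1.9]


def ratio_of_means(numerator, denominator):
    """m2/m1: divide the two group means."""
    return statistics.fmean(numerator) / statistics.fmean(denominator)


def pairwise_ratios(numerator, denominator):
    """All len(numerator) * len(denominator) ratios of single observations."""
    return [n / d for n, d in product(numerator, denominator)]


pairs = pairwise_ratios(ctrl_trt, ctrl)      # 10 x 7 = 70 values
mean_of_ratios = statistics.fmean(pairs)
rom = ratio_of_means(ctrl_trt, ctrl)

print(len(pairs), mean_of_ratios, rom)
```

Note also that the 70 pairwise values are built from only 17 observations, so they are not independent of one another, which is an additional reason to be cautious about permuting them as if they were.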
> Method 4: Combination of bootstrap and permutation test
> Draw samples with replacement of size 7, 10, 8, 9 from the four groups
> underlying m1, m2, m3, m4 respectively, as in method 2 above.  Calculate
> the two ratios for these 4 samples (m2/m1 and m4/m3).  Record these two
> ratios in a list.  Repeat this process an arbitrary (B) number of times and
> record the two ratios in the growing list each time.  Hence if B = 10, we
> will have 20 observations of the ratios.  Then proceed with permutation
> testing with these 20 ratio observations by repeatedly randomizing them
> into two equal groups of 10 and computing the difference in means of the
> two groups as we did in method 3 above.  This could potentially yield a
> distribution under the null hypothesis, and p-values could be obtained by
> localizing the observed value on this distribution.  I am unsure of
> appropriate values for B or if this method is valid at all.
>
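One way to probe the question about B in Method 4 is to run the procedure for a small and a large B and compare. The sketch below is an assumption-laden illustration, not a validated test: the data are made-up placeholders, `method4_p_value` is a hypothetical name, the "observed value" is taken to be the absolute difference of the two ratios computed from the original data, and the p-value uses the common (hits + 1) / (n_perm + 1) convention. Because the permutation step compares two sets of B bootstrap replicates, its result can depend on B rather than only on the 7, 10, 8 and 9 real observations, which is worth checking before relying on it.

```python
import random
import statistics

# Made-up placeholder data with the group sizes from the post (7, 10, 8, 9).
ctrl = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0]
ctrl_trt = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.5, 2.3, 2.0, 1.9]
test = [1.1, 0.9, 1.0, 1.2, 1.4, 0.7, 1.0, 1.1]
test_trt = [2.3, 2.6, 2.2, 2.5, 2.4, 2.8, 2.6, 2.1, 2.5]


def method4_p_value(ctrl, ctrl_trt, test, test_trt, B, n_perm=2000, seed=1):
    """Permutation test applied to B bootstrap replicates of each ratio."""
    rng = random.Random(seed)

    def resampled_mean(values):
        # Mean of a with-replacement resample of the same size.
        return statistics.fmean(rng.choices(values, k=len(values)))

    # B bootstrap replicates of m2/m1 and of m4/m3.
    r_ctrl = [resampled_mean(ctrl_trt) / resampled_mean(ctrl) for _ in range(B)]
    r_test = [resampled_mean(test_trt) / resampled_mean(test) for _ in range(B)]

    # Assumed "observed value": difference of ratios in the original data.
    observed = abs(statistics.fmean(ctrl_trt) / statistics.fmean(ctrl)
                   - statistics.fmean(test_trt) / statistics.fmean(test))

    pooled = r_ctrl + r_test
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment into two groups of size B
        diff = abs(statistics.fmean(pooled[:B]) - statistics.fmean(pooled[B:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


p_small_B = method4_p_value(ctrl, ctrl_trt, test, test_trt, B=5)
p_large_B = method4_p_value(ctrl, ctrl_trt, test, test_trt, B=200)
print(p_small_B, p_large_B)
```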
> Another complication would be the concern for multiple comparisons if I
> wished to include additional test groups (m5 = testgroup2; m6 = testgroup2
> w/ treatment; m7 = testgroup3; m8 = testgroup3 w/ treatment; etc.) and how
> that might be appropriately handled.
>
> Method 2 seems the most intuitive to me.  Bootstrapping this way will
> likely yield appropriate standard errors for the two ratios.  However, I
> am very much interested in appropriate p-values for the comparison, and I
> am not sure if localizing "0" on the bootstrap distribution of the
> difference of the ratios is appropriate.
>
> Thank you in advance for your suggestions.
>
> -Rahul

-- 

Bert Gunter
Genentech Nonclinical Biostatistics
