[R] Comparing multiple distributions
Ravi Varadhan
rvaradhan at jhmi.edu
Thu May 31 18:09:33 CEST 2007
Your data are "compositional data". The R package "compositions" might be
useful. You might also want to consult the book by J. Aitchison, "The
Statistical Analysis of Compositional Data".
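As an illustration of what treating the data as compositional can mean in practice, here is a minimal sketch of the centred log-ratio (clr) transform, one of the standard log-ratio transforms from Aitchison's framework. It is written in plain Python rather than R so that it stands alone. The function names and the abundance values are made up for the example, and it assumes every part is strictly positive (zero counts would need to be replaced or handled separately before taking logs).

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1 (a composition)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(composition):
    """Centred log-ratio transform: log of each part minus the mean log,
    i.e. log(part / geometric mean). Assumes all parts are > 0."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical abundances in 5 depth bins at one station (made-up numbers).
abundance = [12.0, 30.0, 45.0, 10.0, 3.0]
proportions = closure(abundance)   # the vector of 5 proportions
coords = clr(proportions)          # unconstrained coordinates, summing to 0

print(proportions)
print(coords)
```

After the transform the five values are no longer forced to lie between 0 and 1, although they still sum to zero, so one dimension remains redundant. The result is the same whether clr is applied to the raw abundances or to the proportions.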
Ravi.
-----------------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvaradhan at jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
-----------------------------------------------------------------------------------
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of jiho
Sent: Thursday, May 31, 2007 11:37 AM
To: R-help
Subject: Re: [R] Comparing multiple distributions
Nobody answered my first request. I am sorry if I did not explain my
problem clearly. English is not my native language and statistical
English is even more difficult. I'll try to summarize my issue in
more appropriate statistical terms:
Each of my observations is not a single number but a vector of 5
proportions (which add up to 1 for each observation). I want to
compare the "shape" of those vectors between two treatments (i.e. how
the quantities are distributed between the 5 values in treatment A
with respect to treatment B).
I was pointed to Hotelling's T-squared test. Does it seem appropriate? Are
there other possibilities (I read many discussions about Hotelling
vs. MANOVA but I could not see how any of those related to my
particular case)?
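One other possibility, sketched here only as an illustration and not as a replacement for Hotelling's test, is a permutation test on log-ratio transformed proportions. The statistic used below is the Euclidean distance between the two groups' mean clr vectors, which is the Aitchison distance between the group centres. The example is plain Python using only the standard library; all function names and data values are hypothetical, and it assumes every proportion is strictly positive.

```python
import math
import random

def clr(composition):
    """Centred log-ratio transform (assumes all parts are > 0)."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def centre_distance(group_a, group_b):
    """Euclidean distance between the mean clr vectors of two groups."""
    return math.dist(mean_vector(group_a), mean_vector(group_b))

def permutation_test(comps_a, comps_b, n_perm=2000, seed=1):
    """Return (observed distance, permutation p-value) for two groups
    of compositions, shuffling group labels n_perm times."""
    rng = random.Random(seed)
    a = [clr(c) for c in comps_a]
    b = [clr(c) for c in comps_b]
    observed = centre_distance(a, b)
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if centre_distance(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly 0.
    return observed, (hits + 1) / (n_perm + 1)

# Made-up proportions over 5 depth bins, 4 stations per treatment.
treatment_a = [[0.50, 0.25, 0.15, 0.07, 0.03],
               [0.45, 0.30, 0.15, 0.06, 0.04],
               [0.55, 0.20, 0.15, 0.06, 0.04],
               [0.48, 0.27, 0.14, 0.08, 0.03]]
treatment_b = [[0.05, 0.10, 0.20, 0.30, 0.35],
               [0.04, 0.08, 0.18, 0.32, 0.38],
               [0.06, 0.12, 0.22, 0.28, 0.32],
               [0.03, 0.09, 0.20, 0.33, 0.35]]

distance, p_value = permutation_test(treatment_a, treatment_b)
print(distance, p_value)
```

The permutation approach avoids the multivariate normality assumption behind Hotelling's test, at the cost of a p-value whose resolution is limited by the number of stations per group.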
Thank you very much in advance for your insights. See below for my
earlier, more detailed, e-mail.
On 2007-May-21 , at 19:26 , jiho wrote:
> I am studying the vertical distribution of plankton and want to
> study how it varies relative to several factors (time of day,
> species, water column structure etc.). So my data is special in
> that, at each sampling site (each observation), I don't have *one*
> number, I have *several* numbers (abundance of organisms in each
> depth bin, I sample 5 depth bins) which describe a vertical
> distribution.
>
> Then, let's say I want to compare speciesA with speciesB: I would end
> up trying to compare a group of several distributions with another
> group of several distributions (where a "distribution" is a vector
> of 5 numbers: an abundance for each depth bin). Does anyone know
> how I could do this (with R obviously ;) )?
>
> Currently I work around the problem in two ways:
> - compute the mean abundance per depth bin within each group and
> compare the two mean distributions with ks.test, but this
> obviously diminishes the power of the test (I only compare 5*2
> "observations")
> - restrict the information at each sampling site to the mean depth
> weighted by the abundance of the species of interest. This way I
> have one observation per station, but I reduce the information to
> the mean depth, while the actual distribution over depth is also
> important.
>
> I know this is probably not directly R related but I have already
> searched around for solutions and asked my local statistics
> expert... to no avail. So I hope that the stats experts on this
> list will help me.
>
> Thank you very much in advance.
JiHO
---
http://jo.irisson.free.fr/