[R] compare histograms

Michael Bedward michael.bedward at gmail.com
Wed Oct 13 13:58:31 CEST 2010


Ah, that's interesting. I'll have a look because it's bound to be
better than my effort.

Many thanks Dennis.

Michael

On 13 October 2010 22:36, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> This recent thread revealed that a package for calculating the earth
> mover's distance is available on R-Forge:
>
> http://r.789695.n4.nabble.com/Measure-Difference-Between-Two-Distributions-td2712281.html#a2713505
>
> HTH,
> Dennis
>
> On Tue, Oct 12, 2010 at 7:39 PM, Michael Bedward <michael.bedward at gmail.com>
> wrote:
>>
>> Just to add to Greg's comments: I've previously used the 'Earth Mover's
>> Distance' to compare histograms. Note that this is a distance metric
>> rather than a parametric statistic (i.e. not a test), but it at least
>> provides a consistent way of quantifying similarity.
>>
>> It's relatively easy to implement the metric in R (formulating it as a
>> linear programming problem). Happy to dig out the code if needed.
>>
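A rough sketch of the idea, not the code mentioned above. For two histograms on the same equally spaced bins the transport problem has a well-known one-dimensional shortcut: sum the absolute differences of the cumulative proportions and multiply by the bin width, so no LP solver is needed. The helper name emd_1d and the simulated data are made up for illustration; histograms with different bins or more than one dimension would need the full LP formulation.

```r
# Earth mover's distance between two histograms that share the same,
# equally spaced bins (hypothetical helper, not from any package).
emd_1d <- function(x, y, breaks) {
  # Bin both samples on the common breaks and convert counts to proportions
  p <- hist(x, breaks = breaks, plot = FALSE)$counts / length(x)
  q <- hist(y, breaks = breaks, plot = FALSE)$counts / length(y)
  bin_width <- diff(breaks)[1]
  # Mass that must cross each bin boundary is the difference of running totals
  sum(abs(cumsum(p - q))) * bin_width
}

set.seed(1)
x <- rnorm(500)             # simulated stand-ins for the two data sets
y <- rnorm(500, mean = 1)
breaks <- seq(floor(min(x, y)), ceiling(max(x, y)), by = 0.5)

emd_1d(x, x, breaks)   # 0 for identical samples
emd_1d(x, y, breaks)   # larger when the distributions differ
```

The result is in the units of the data, so it is only comparable across pairs of histograms built on the same scale.
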
>> Michael
>>
>> On 13 October 2010 02:44, Greg Snow <Greg.Snow at imail.org> wrote:
>> > That depends a lot on what you mean by the histograms being equivalent.
>> >
>> > You could just plot them and compare visually.  It may be easier to
>> > compare them if you plot density estimates rather than histograms.  Even
>> > better would be to do a qqplot comparing the 2 sets of data rather than the
>> > histograms.
>> >
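A minimal sketch of the plotting suggestions above, using base R only. The vectors a and b are simulated placeholders for the two data sets, not anything from the thread.

```r
set.seed(42)
a <- rnorm(300)                         # simulated stand-ins for the two data sets
b <- rnorm(300, mean = 0.3, sd = 1.2)

# Overlaid kernel density estimates on shared axes
da <- density(a)
db <- density(b)
plot(da, xlim = range(da$x, db$x), ylim = c(0, max(da$y, db$y)),
     main = "Density estimates", xlab = "value")
lines(db, lty = 2)
legend("topright", legend = c("a", "b"), lty = 1:2)

# Quantile-quantile plot of one sample against the other;
# points near the line y = x suggest similar distributions
qq <- qqplot(a, b, xlab = "quantiles of a", ylab = "quantiles of b")
abline(0, 1)
```

Unlike histograms, neither plot depends on a choice of bins, which is probably why they are easier to compare by eye.
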
>> > If you want a formal test then the ks.test function can compare 2
>> > datasets.  Note that the null hypothesis is that they come from the same
>> > distribution.  A significant result means that they are likely different
>> > (but the difference may not be of practical importance).  A
>> > non-significant result could mean they are the same, or that you just do
>> > not have enough power to find the difference (or that the difference is
>> > hard for the KS test to see).  You could also use a chi-squared test to
>> > compare them in this way.
>> >
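A sketch of both tests, assuming two numeric vectors a and b (simulated here). Running the chi-squared test on counts in shared bins is one reading of the suggestion above, and the decile bins are an arbitrary choice made for this example.

```r
set.seed(7)
a <- rnorm(200)               # simulated stand-ins for the two data sets
b <- rnorm(200, mean = 0.5)

# Two-sample Kolmogorov-Smirnov test on the raw values (no binning needed)
ks <- ks.test(a, b)
ks$p.value

# Chi-squared test of homogeneity on counts in shared bins.
# Decile breaks of the pooled data keep every bin populated.
breaks <- quantile(c(a, b), probs = seq(0, 1, by = 0.1))
counts <- rbind(a = table(cut(a, breaks, include.lowest = TRUE)),
                b = table(cut(b, breaks, include.lowest = TRUE)))
chi <- chisq.test(counts)
chi$p.value
```

The chi-squared result depends on how the bins are chosen, whereas ks.test works on the unbinned values.
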
>> > Another approach would be to use the vis.test function from the
>> > TeachingDemos package.  Write a small function that will either plot your 2
>> > histograms (density plots), or permute the data between the 2 groups and
>> > plot the equivalent histograms.  The vis.test function then presents you
>> > with an array of plots, one of which is the original data and the rest based
>> > on permutations.  If there is a clear meaningful difference in the groups
>> > you will be able to spot the plot that does not match the rest; otherwise
>> > you will just be guessing.  (It might be best to have someone who has not
>> > seen the data before try to pick out the real plot.)
>> >
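To illustrate the permutation idea without relying on the exact vis.test interface, here is a hand-rolled lineup in base R. It is not the TeachingDemos function: lineup_density is a hypothetical helper, the data are simulated, and it draws density curves rather than histograms.

```r
# One panel shows the real grouping; the others show the same values with
# the group labels shuffled. Invisibly returns the position of the real panel.
lineup_density <- function(values, group, nrow = 3, ncol = 3) {
  group <- factor(group)
  stopifnot(nlevels(group) == 2)
  n_panels <- nrow * ncol
  real_pos <- sample(n_panels, 1)
  old_par <- par(mfrow = c(nrow, ncol), mar = c(2, 2, 2, 1))
  on.exit(par(old_par))
  for (i in seq_len(n_panels)) {
    # Keep the true labels in one panel, shuffle them in all the others
    g <- if (i == real_pos) group else sample(group)
    d1 <- density(values[g == levels(group)[1]])
    d2 <- density(values[g == levels(group)[2]])
    plot(d1, xlim = range(d1$x, d2$x), ylim = c(0, max(d1$y, d2$y)),
         main = paste("Panel", i), xlab = "")
    lines(d2, lty = 2)
  }
  invisible(real_pos)
}

set.seed(3)
values <- c(rnorm(100), rnorm(100, mean = 1))
group  <- rep(c("a", "b"), each = 100)
real_pos <- lineup_density(values, group)
# Ask a fresh observer to pick the odd panel out, then compare with real_pos
```

With nine panels a pure guess picks the real one with probability 1/9, so a correct pick is some evidence of a visible difference.
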
>> > --
>> > Gregory (Greg) L. Snow Ph.D.
>> > Statistical Data Center
>> > Intermountain Healthcare
>> > greg.snow at imail.org
>> > 801.408.8111
>> >
>> >
>> >> -----Original Message-----
>> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> >> On Behalf Of solafah bh
>> >> Sent: Monday, October 11, 2010 4:02 PM
>> >> To: R help mailing list
>> >> Subject: [R] compare histograms
>> >>
>> >> Hello
>> >> How can I compare two statistical histograms? How can I know whether
>> >> these histograms are equivalent or not?
>> >>
>> >> Regards
>> >>
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>
>


