# [R] distribution of daily rainfall values in binned categories

Francisco J. Zagmutt gerifalte28 at hotmail.com
Wed Jun 28 18:34:49 CEST 2006

```
Hi Martin

I agree with all your previous concerns.  I was just answering her question
about visualizing frequencies for a continuous variable that is artificially
categorized.  However, she did mention the word *distribution* (a part that
I overlooked), so hist() is indeed more appropriate. I am surprised nobody
else jumped in with the usual discussion
about violin plots and their friends   ;-)

Cheers

Francisco

Dr. Francisco J. Zagmutt
College of Veterinary Medicine and Biomedical Sciences

>From: Martin Maechler <maechler at stat.math.ethz.ch>
>Reply-To: Martin Maechler <maechler at stat.math.ethz.ch>
>To: "Francisco J. Zagmutt" <gerifalte28 at hotmail.com>
>CC: etiennesky at yahoo.com, r-help at stat.math.ethz.ch
>Subject: Re: [R] distribution of daily rainfall values in binned categories
>Date: Wed, 28 Jun 2006 10:39:58 +0200
>
> >>>>> "FJZ" == Francisco J Zagmutt <gerifalte28 at hotmail.com>
> >>>>>     on Wed, 28 Jun 2006 03:51:31 +0000 writes:
>
>     FJZ> Hi Etienne,
>     FJZ> Somebody asked a somewhat related question recently.
>     FJZ> http://tolstoy.newcastle.edu.au/R/help/06/06/29485.html
>
>     FJZ> Take a look at ?cut, ?table and ?barplot
>     FJZ> i.e.
>
>       # Creates fake data from uniform(0,30)
>       set.seed(1) ## <<- added by MM
>       x=runif(50, 0,30)
>
>       # Creates categories
>       rain=cut(x,breaks=c( 0, 1,2.5,5, 10, 20, Inf))
>
>       # Creates contingency table of categories
>       tab=table(rain)
>
>       # Plots frequencies of rainfall
>       barplot(tab)
>
>
>No, no, no!  Do not confuse histograms with bar plots!
>
>-  barplot() is {one possibility} for visualizing discrete
>    ("categorical", "factor") data,
>-  hist() is for visualizing *continuous* data  (*)
>
>As Jim Porzak replied, do use hist(): the example really is a matter
>of visualization of a continuous distribution which should *not*
>be done by a barplot.  Instead, e.g.,
>
>   hist(x, breaks = c(0, 1,2.5,5, 10,20, max(pretty(max(x)))),
>        freq = TRUE, col = "gray")
>
>will give a graphic similar to the above --- BUT also
>warns you about the hidden deception (aka silliness) of *both* graphics:
>Namely, the above hist() call warns you with
>
> >> Warning message:
> >> the AREAS in the plot are wrong -- rather use freq=FALSE in: ....
>
>and finally,
>
>   hist(x, breaks = c(0, 1,2.5,5, 10,20, max(pretty(max(x)))), col="gray")
>
>gives you a more honest graphic --- which -- for the runif()
>example -- may finally lead you to realize that using unequal
>breaks may really not be such a good idea.
>Note however that for the OP's rainfall data, that may well be different
>and if I look at rainfall data, I find I would rather view
>
>    hist(log10( <rainfall> ))
>or then
>    plot(density( log10( <rainfall> ) ))
>
>Martin Maechler, ETH Zurich
>
>(*) From a statistical point of view, histograms are just density estimators,
>     and -- as known for a while -- have quite some drawbacks.
>     Hence they should nowadays often be replaced by
>         plot(density(.), ..)
>
>
>     >> From: etienne <etiennesky at yahoo.com>
>     >> To: r-help at stat.math.ethz.ch
>     >> Subject: [R] distribution of daily rainfall values in binned categories
>     >> Date: Tue, 27 Jun 2006 11:28:59 -0700 (PDT)
>     >>
>     >> Hi,
>     >>
>     >> I'm a newbie in using R and I would like to have a few
>     >> clues as to how I could compute and plot a
>     >> distribution of daily rainfall intensity in different
>     >> categories.  I have daily values (mm/day) for several
>     >> years and I need to show the frequency of 0-1, 1-2.5,
>     >> 2.5-5, 5-10, 10-20, 20+ mm/day.  Can this be done
>     >> easily?
>     >>
>     >> Thanks,
>     >> Etienne
>     >>
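
# Editor's sketch (not part of the original thread): one possible way to put
# the cut()/table() counts and the hist() views discussed above side by side.
# Assumptions: `rainfall` is hypothetical simulated log-normal data standing
# in for the OP's daily values (mm/day); the parameters are arbitrary.
set.seed(1)
rainfall <- rlnorm(365, meanlog = 1, sdlog = 1)

# Category limits requested by the OP: 0-1, 1-2.5, 2.5-5, 5-10, 10-20, 20+
breaks <- c(0, 1, 2.5, 5, 10, 20, Inf)

# Frequencies per category (the numbers the OP asked for)
rain_cat <- cut(rainfall, breaks = breaks, include.lowest = TRUE)
freq <- table(rain_cat)
print(freq)
print(round(prop.table(freq), 3))

# Density-scaled histogram on the same unequal bins. hist() needs a finite
# upper break, so Inf is replaced by a value above the largest observation.
upper <- max(25, ceiling(max(rainfall)))
h <- hist(rainfall, breaks = c(breaks[-length(breaks)], upper),
          freq = FALSE, col = "gray",
          main = "Simulated daily rainfall", xlab = "mm/day")

# Log-scale views suggested in the thread. These only work for strictly
# positive values; dry days (0 mm) would have to be handled separately.
hist(log10(rainfall), col = "gray")
plot(density(log10(rainfall)))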