[R] distribution of daily rainfall values in binned categories
Martin Maechler
maechler at stat.math.ethz.ch
Wed Jun 28 10:39:58 CEST 2006
>>>>> "FJZ" == Francisco J Zagmutt <gerifalte28 at hotmail.com>
>>>>> on Wed, 28 Jun 2006 03:51:31 +0000 writes:
FJZ> Hi Etienne,
FJZ> Somebody asked a somewhat related question recently.
FJZ> http://tolstoy.newcastle.edu.au/R/help/06/06/29485.html
FJZ> Take a look at ?cut, ?table and ?barplot
FJZ> i.e.
# Creates fake data from uniform(0,30)
set.seed(1) ## <<- added by MM
x=runif(50, 0,30)
# Creates categories
rain=cut(x,breaks=c( 0, 1,2.5,5, 10, 20, Inf))
# Creates contingency table of categories
tab=table(rain)
# Plots frequencies of rainfall
barplot(tab)
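A minimal sketch of the tabulation step on its own, using the same fake uniform data rather than real rainfall. It assumes that real daily records contain exact zeros (dry days): with the default right-closed intervals, cut() labels the first class (0,1] and turns a value of exactly 0 into NA, so include.lowest = TRUE is added here. prop.table() converts the counts into relative frequencies.

```r
set.seed(1)
x <- runif(50, 0, 30)                  # fake data, not real rainfall
breaks <- c(0, 1, 2.5, 5, 10, 20, Inf)

# include.lowest = TRUE makes the first class [0,1], so an exact 0 is kept
rain <- cut(x, breaks = breaks, include.lowest = TRUE)

tab <- table(rain)                     # counts per class
rel <- prop.table(tab)                 # relative frequencies, summing to 1
print(tab)
print(round(rel, 2))

# An exact zero falls into the first class instead of becoming NA
print(cut(0, breaks = breaks, include.lowest = TRUE))
```

The table answers the "frequency per category" part of the question; how to draw it is a separate matter.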
No, no, no! Do not confuse histograms with bar plots!
- barplot() is {one possibility} for visualizing discrete
("categorical", "factor") data,
- hist() is for visualizing *continuous* data (*)
As Jim Porzak replied, do use hist(): the example really is a matter
of visualization of a continuous distribution which should *not*
be done by a barplot. Instead, e.g.,
hist(x, breaks = c(0, 1,2.5,5, 10,20, max(pretty(max(x)))),
freq = TRUE, col = "gray")
will give a graphic similar to the above --- BUT also
warns you about the hidden deception (aka silliness) of *both* graphics:
Namely, the above hist() call warns you with
>> Warning message:
>> the AREAS in the plot are wrong -- rather use freq=FALSE in: ....
and finally,
hist(x, breaks = c(0, 1,2.5,5, 10,20, max(pretty(max(x)))), col="gray")
gives you a more honest graphic --- which -- for the runif()
example -- may finally lead you to realize that using unequal
breaks may really not be such a good idea.
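The difference between the two hist() calls can also be checked numerically. The sketch below is only an illustration: it fixes the upper break at 30 (an assumption that holds for this fake data), captures the warning text instead of printing it, and compares the bar areas drawn under freq = TRUE with the density heights used under freq = FALSE.

```r
set.seed(1)
x <- runif(50, 0, 30)
brk <- c(0, 1, 2.5, 5, 10, 20, 30)   # upper limit 30 is an assumption for this fake data

pdf(NULL)  # null device so this also runs non-interactively; drop it to see the plot
msg <- NULL
h <- withCallingHandlers(
  hist(x, breaks = brk, freq = TRUE, col = "gray"),
  warning = function(w) {
    msg <<- conditionMessage(w)      # keep the warning text
    invokeRestart("muffleWarning")
  }
)
invisible(dev.off())
print(msg)

width <- diff(h$breaks)
print(data.frame(width,
                 count     = h$counts,
                 area_freq = h$counts * width,     # bar area drawn when freq = TRUE
                 density   = round(h$density, 3))) # bar height when freq = FALSE
print(sum(h$density * width))        # total area on the density scale
```

With freq = TRUE the height of a bar is its count, so a wide class is inflated by its width. On the density scale each height is count / (n * width) and the areas add up to 1. For uniform(0, 30) data the true density is 1/30 in every class, so the heights only scatter around that value, and the narrow classes, holding few points, scatter most.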
Note however that for the OP's rainfall data, that may well be different
and if I look at rainfall data, I find I would rather view
hist(log10( <rainfall> ))
or then
plot(density( log10( <rainfall> ) ))
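A sketch of the log-scale view, with simulated values standing in for <rainfall>. Everything about the simulation (the dry-day share, the lognormal wet-day amounts, the name `rainfall`) is invented for illustration. One practical point it assumes: dry days are recorded as 0, and log10(0) is -Inf, so the logs are taken over wet days only and the dry-day share is reported separately.

```r
set.seed(42)
# Hypothetical daily rainfall (mm/day): roughly 60% dry days, lognormal wet days.
# These parameters are made up and not fitted to any real data.
n <- 1000
wet_day  <- rbinom(n, 1, 0.4) == 1
rainfall <- ifelse(wet_day, rlnorm(n, meanlog = 1, sdlog = 1.2), 0)

# log10(0) is -Inf, so keep wet days only before taking logs
wet <- rainfall[rainfall > 0]
lr  <- log10(wet)

print(mean(rainfall == 0))       # share of dry days, reported separately
h <- hist(lr, plot = FALSE)      # equal-width bins on the log10 scale
d <- density(lr)                 # kernel density estimate on the same scale

if (interactive()) {
  plot(h, freq = FALSE, col = "gray",
       main = "log10 of wet-day rainfall", xlab = "log10(mm/day)")
  lines(d)
}
```

On the log10 scale the OP's classes 1, 10 and 100 mm/day sit at 0, 1 and 2, so equal-width bins there correspond to classes that widen multiplicatively on the original scale.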
Martin Maechler, ETH Zurich
(*) From a statistical point of view, histograms are just density estimators,
and -- as known for a while -- have quite some drawbacks.
Hence they should nowadays often be replaced by
plot(density(.), ..)
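One commonly cited drawback, sketched here as an illustration and not necessarily the one meant above, is that a histogram depends on where the bins start. The example keeps the bin width at 5 and shifts the origin by half a bin; the break vectors are arbitrary choices for this fake data.

```r
set.seed(1)
x <- runif(50, 0, 30)

# Same bin width (5), two different bin origins
h1 <- hist(x, breaks = seq(0, 30, by = 5), plot = FALSE)
h2 <- hist(x, breaks = seq(-2.5, 32.5, by = 5), plot = FALSE)
print(round(h1$density, 3))
print(round(h2$density, 3))

# A kernel density estimate has no bin origin, only a bandwidth
d <- density(x)
print(d$bw)

if (interactive()) {
  plot(d, main = "Kernel density estimate of x")
  rug(x)
}
```

The two density vectors describe the same 50 points yet differ in shape. The kernel estimate trades that choice for a bandwidth, which still has to be chosen with some care.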
>> From: etienne <etiennesky at yahoo.com>
>> To: r-help at stat.math.ethz.ch
>> Subject: [R] distribution of daily rainfall values in binned categories
>> Date: Tue, 27 Jun 2006 11:28:59 -0700 (PDT)
>>
>> Hi,
>>
>> I'm a newbie in using R and I would like to have a few
>> clues as to how I could compute and plot a
>> distribution of daily rainfall intensity in different
>> categories. I have daily values (mm/day) for several
>> years and I need to show the frequency of 0-1, 1-2.5,
>> 2.5-5, 5-10, 10-20, 20+ mm/day. Can this be done
>> easily?
>>
>> Thanks,
>> Etienne
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide!
>> http://www.R-project.org/posting-guide.html