[R] "hist" combines two lowest categories -- is there a workaround?

Dieter Menne dieter.menne at menne-biomed.de
Thu Jan 31 08:58:42 CET 2008


Ben Fairbank <BEN <at> SSANET.COM> writes:

> 
> When preparing a series of histograms I found that hist was combining
> the two lowest categories or bins, 1 and 2.  Specifying breaks, as
> illustrated below, resulted in the correct histogram:
> 
> values <- sample(10,500,replace=TRUE)
> 
> hist(values)
> 
> hist(values,breaks = 0:10)
> 
> Apparently, the number of values strictly less than 1 is shown in the
> first bin (and since none is less than 1, the value is 0), while the
> other bins appear to show the number of values less than or equal to the
> bin's upper bound.  Is there a setting that will show the number of
> values less than or equal to the first bin's upper bound?
> 
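
What you see follows from the defaults of hist rather than from a special rule for small values: with right = TRUE and include.lowest = TRUE every bin is (lower, upper], except the first, which also includes its lower break. For integers from 1 to 10 the default breaks typically come out as 1, 2, ..., 10, so the first bin is [1, 2] and holds both the 1s and the 2s. A minimal sketch that checks this without plotting; the seed and the half-integer breaks are my own choices for illustration and were not in the original post:

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
values <- sample(10, 500, replace = TRUE)

# Default call, plot suppressed so only the bin structure is returned
h_default <- hist(values, plot = FALSE)
b <- h_default$breaks
# The first bin is closed on both sides: it counts b[1] <= x <= b[2]
first_bin <- sum(values >= b[1] & values <= b[2])

# One bin per integer: put the breaks halfway between the integers
h_int <- hist(values, breaks = seq(0.5, 10.5, by = 1), plot = FALSE)
per_value <- tabulate(values, nbins = 10)   # counts of 1, 2, ..., 10
```

Here h_default$counts[1] equals first_bin, and h_int$counts matches per_value. Your breaks = 0:10 gives the same counts, because each bin (k-1, k] then contains exactly one integer.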

For irregularly spaced bins, it is best to build the factor first, for example
with cut, and then plot it with histogram (lattice), which is more flexible
than hist. Below is an example I use for age groups:

Dieter
-----------------------

library(lattice)
set.seed(4711)
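age = floor(rnorm(100, 50, 15))
# round down to the decade, then cut into left-closed groups;
# include.lowest=TRUE together with right=FALSE closes the last group at 100
ageg = cut(age %/% 10 * 10, c(0, seq(20, 70, 10), 100), include.lowest=TRUE,
  right=FALSE, ordered_result=TRUE)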
# default plot, with the interval labels generated by cut
histogram(~ageg)
# if you really need readable labels, rename the levels:
levels(ageg) = c("<20","20-29","30-39","40-49","50-59","60-69","70+")
histogram(~ageg)
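
To check where boundary values end up, here is a small sketch with hand-picked ages (the probe values and the names brk, probe, g and g_open are made up for illustration). With right = FALSE the groups are closed on the left, and include.lowest = TRUE, as the argument of cut is spelled, additionally closes the last group on the right:

```r
brk   <- c(0, seq(20, 70, 10), 100)
probe <- c(0, 19, 20, 69, 70, 100)         # made-up boundary ages

# Left-closed groups, last group closed on both sides
g <- cut(probe, brk, include.lowest = TRUE, right = FALSE)
as.character(g)

# Without include.lowest the top break falls outside every group
g_open <- cut(probe, brk, right = FALSE)
is.na(g_open)
```

In the first call 100 lands in the group [70,100]; in the second it becomes NA and would silently drop out of the histogram.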


