hist {graphics}  R Documentation 
Histograms
Description
The generic function hist computes a histogram of the given data values. If plot = TRUE, the resulting object of class "histogram" is plotted by plot.histogram, before it is returned.
Usage
hist(x, ...)
## Default S3 method:
hist(x, breaks = "Sturges",
     freq = NULL, probability = !freq,
     include.lowest = TRUE, right = TRUE, fuzz = 1e-7,
     density = NULL, angle = 45, col = "lightgray", border = NULL,
     main = paste("Histogram of", xname),
     xlim = range(breaks), ylim = NULL,
     xlab = xname, ylab,
     axes = TRUE, plot = TRUE, labels = FALSE,
     nclass = NULL, warn.unused = TRUE, ...)
Arguments
x 
a vector of values for which the histogram is desired. 
breaks 
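one of:
a vector giving the breakpoints between histogram cells,
a function to compute the vector of breakpoints,
a single number giving the number of cells for the histogram,
a character string naming an algorithm to compute the number of cells (see 'Details'),
a function to compute the number of cells.
In the last three cases the number is a suggestion only; as the breakpoints will be set to pretty values, the number is limited to 1e6 (with a warning if it was larger). If breaks is a function, the x vector is supplied to it as the only argument.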
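freq 
logical; if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted (so that the histogram has a total area of one). Defaults to TRUE if and only if breaks are equidistant (and probability is not specified).
probability 
an alias for !freq, for S compatibility.
include.lowest 
logical; if TRUE, an x[i] equal to the breaks value will be included in the first (or last, for right = FALSE) bar. This will be ignored (with a warning) unless breaks is a vector.
right 
logical; if TRUE, the histogram cells are right-closed (left-open) intervals.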
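fuzz 
non-negative number, for the case when the data is "pretty" and some observations x[.] are close but not exactly on a break. For counting, fuzzy breaks proportional to fuzz are used; see 'Details'.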
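density 
the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines.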
angle 
the slope of shading lines, given as an angle in degrees (counterclockwise). 
col 
a colour to be used to fill the bars. 
border 
the colour of the border around the bars. The default is to use the standard foreground colour. 
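main, xlab, ylab 
main title and axis labels: these arguments to title() get "smart" defaults here, e.g., the default ylab is "Frequency" if and only if freq is true.
xlim, ylim 
the range of x and y values with sensible defaults. Note that xlim is not used to define the histogram (breaks), but only for plotting (when plot = TRUE).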
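axes 
logical. If TRUE (default), axes are drawn if the plot is drawn.
plot 
logical. If TRUE (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is given if (typically graphical) arguments are specified that only apply to the plot = TRUE case.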
labels 
logical or character string. Additionally draw labels on top of bars, if not FALSE; see plot.histogram.
nclass 
numeric (integer). For S(-PLUS) compatibility only, nclass is equivalent to breaks for a scalar or character argument.
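warn.unused 
logical. If plot = FALSE and warn.unused = TRUE, a warning will be issued when graphical parameters are passed to hist.default().
... 
further arguments and graphical parameters passed to plot.histogram and thence to title and axis (if plot = TRUE).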
Details
The definition of histogram differs by source (with country-specific biases). R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks. Thus the height of a rectangle is proportional to the number of points falling into the cell, as is the area provided the breaks are equally-spaced.
The default with non-equi-spaced breaks is to give a plot of area one, in which the area of the rectangles is the fraction of the data points falling in the cells.
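As a small illustration of the area-one convention, the sketch below uses made-up data and made-up unequal breaks (both are assumptions chosen for the example) and checks that the reported density equals the counts divided by the number of observations times the cell width.

```r
# Hypothetical toy data and unequal-width breaks, chosen only for illustration
x <- c(1, 2, 2, 3, 7, 8, 15)
b <- c(0, 2, 4, 16)

h <- hist(x, breaks = b, plot = FALSE)
widths <- diff(h$breaks)

# Density per cell recomputed by hand: counts / (n * cell width)
manual_density <- h$counts / (length(x) * widths)

h$equidist                        # FALSE, since the cell widths differ
all.equal(manual_density, h$density)
sum(h$density * widths)           # total area of the rectangles
```

With unequal widths, plotting raw counts as heights would exaggerate the wide cells, which is why the density scale is the default in that case.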
If right = TRUE (default), the histogram cells are intervals of the form (a, b], i.e., they include their right-hand endpoint, but not their left one, with the exception of the first cell when include.lowest is TRUE.
For right = FALSE, the intervals are of the form [a, b), and include.lowest means ‘include highest’.
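The effect of right is easiest to see when observations fall exactly on a break. The following sketch uses a small made-up vector and integer breaks (both assumptions for illustration) and compares the counts under the two conventions.

```r
# Hypothetical data with values lying exactly on the breakpoints
x <- c(1, 2, 2, 3, 4)
b <- c(1, 2, 3, 4)

# right = TRUE (default): cells [1,2], (2,3], (3,4]
# the first cell is closed on the left because include.lowest = TRUE
h_right <- hist(x, breaks = b, plot = FALSE)

# right = FALSE: cells [1,2), [2,3), [3,4]
# here include.lowest = TRUE closes the last cell on the right
h_left <- hist(x, breaks = b, right = FALSE, plot = FALSE)

h_right$counts   # 3 1 1
h_left$counts    # 1 2 2
```

The two observations equal to 2 move from the first cell to the second when right is switched to FALSE.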
A numerical tolerance of 10^{-7} times the median bin size (for more than four bins, otherwise the median is substituted) is applied when counting entries on the edges of bins. This is not included in the reported breaks nor in the calculation of density.
The default for breaks is "Sturges": see nclass.Sturges. Other names for which algorithms are supplied are "Scott" and "FD" / "Freedman-Diaconis" (with corresponding functions nclass.scott and nclass.FD). Case is ignored and partial matching is used. Alternatively, a function can be supplied which will compute the intended number of breaks or the actual breakpoints as a function of x.
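The different ways of specifying breaks can be compared side by side. In the sketch below, sqrt_rule and five_cells are hypothetical helper functions written for this example; they are not part of R.

```r
x <- datasets::islands

# Number of cells suggested by Sturges' rule
n_sturges <- grDevices::nclass.Sturges(x)

# Algorithm named by a character string; case is ignored
h_name  <- hist(x, breaks = "Sturges", plot = FALSE)
h_lower <- hist(x, breaks = "sturges", plot = FALSE)
identical(h_name$breaks, h_lower$breaks)

# Hypothetical function returning a suggested number of cells
sqrt_rule <- function(v) ceiling(sqrt(length(v)))
h_count <- hist(x, breaks = sqrt_rule, plot = FALSE)

# Hypothetical function returning the actual breakpoints (five equal cells)
five_cells <- function(v) seq(min(v), max(v), length.out = 6)
h_pts <- hist(x, breaks = five_cells, plot = FALSE)
length(h_pts$breaks)   # 6 breakpoints, used as given
```

When the function returns a single number it is treated as a suggestion, so the number of cells in h_count may differ from the value returned by sqrt_rule; a vector of breakpoints is used as supplied.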
Value
an object of class "histogram"
which is a list with components:
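breaks 
the n+1 cell boundaries (= breaks if that was a vector). These are the nominal breaks, not with the boundary fuzz.
counts 
n integers; for each cell, the number of x[] inside.
density 
values f^(x[i]), as estimated density values. If all(diff(breaks) == 1), they are the relative frequencies counts/n and in general satisfy sum over i of f^(x[i]) * (b[i+1] - b[i]) = 1, where b[i] = breaks[i].
mids 
the n cell midpoints.
xname 
a character string with the actual x argument name.
equidist 
logical, indicating if the distances between breaks are all the same.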
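A minimal sketch of inspecting the returned object without drawing anything, using the islands data set that also appears in the Examples; the variable names are chosen for this illustration only.

```r
h <- hist(islands, plot = FALSE)

class(h)    # "histogram"
names(h)    # component names of the list

nb <- length(h$breaks)

# One count per cell, i.e. one fewer than the number of breaks
length(h$counts) == nb - 1

# Midpoints lie halfway between consecutive breaks
all.equal(h$mids, (h$breaks[-nb] + h$breaks[-1]) / 2)

# Every observation is counted exactly once
sum(h$counts) == length(islands)

h$equidist  # TRUE for the default, equally spaced breaks
```

The object can later be drawn with plot(h), which dispatches to plot.histogram.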
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
See Also
nclass.Sturges, stem, density, truehist in package MASS.
Typical plots with vertical bars are not histograms. Consider barplot or plot(*, type = "h") for such bar plots.
Examples
op <- par(mfrow = c(2, 2))
hist(islands)
utils::str(hist(islands, col = "gray", labels = TRUE))
hist(sqrt(islands), breaks = 12, col = "lightblue", border = "pink")
## For non-equidistant breaks, counts should NOT be graphed unscaled:
r <- hist(sqrt(islands), breaks = c(4*0:5, 10*3:5, 70, 100, 140),
          col = "blue1")
text(r$mids, r$density, r$counts, adj = c(.5, -.5), col = "blue3")
sapply(r[2:3], sum)
sum(r$density * diff(r$breaks)) # == 1
lines(r, lty = 3, border = "purple") # -> lines.histogram(*)
par(op)
require(utils) # for str
str(hist(islands, breaks = 12, plot = FALSE)) #-> 10 (~= 12) breaks
str(hist(islands, breaks = c(12,20,36,80,200,1000,17000), plot = FALSE))
hist(islands, breaks = c(12,20,36,80,200,1000,17000), freq = TRUE,
main = "WRONG histogram") # and warning
## Extreme outliers; the "FD" rule would take a very large number of 'breaks':
XXL <- c(1:9, c(-1,1)*1e300)
hh <- hist(XXL, "FD") # did not work in R <= 3.4.1; now gives warning
## pretty() determines how many counts are used (platform dependently!):
length(hh$breaks) ## typically 1 million -- though 1e6 was "a suggestion only"
## R >= 4.2.0: no "*.5" labels on y-axis:
hist(c(2,3,3,5,5,6,6,6,7))
require(stats)
set.seed(14)
x <- rchisq(100, df = 4)
## Histogram with custom x-axis:
hist(x, xaxt = "n")
axis(1, at = 0:17)
## Comparing data with a model distribution should be done with qqplot()!
qqplot(x, qchisq(ppoints(x), df = 4)); abline(0, 1, col = 2, lty = 2)
## if you really insist on using hist() ... :
hist(x, freq = FALSE, ylim = c(0, 0.2))
curve(dchisq(x, df = 4), col = 2, lty = 2, lwd = 2, add = TRUE)