[R] Drawing a histogram from a massive dataset

Paul Smith phhs80 at gmail.com
Mon Jul 18 23:08:32 CEST 2011


On Mon, Jul 18, 2011 at 9:11 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
>> [snip] I guess that I must have a data frame to plot a histogram.
>
> Not at all!
>
> ## a *vector* of 100 million observations
> x <- rnorm(10^8)
> ## a histogram for it (see attached for the result from my system)
> hist(x)
>
> No data frame required.  I would not try this straight in anything but
> traditional graphics for a 100 million observation vector, but if you
> wanted it made in ggplot2 or something, you could prebin the data and
> THEN plot bars corresponding to the bins.

Thanks, Joshua, for your answer.

True: a vector is enough to supply data to hist(). But my point is:
can a histogram be drawn without holding all the data in memory at
once? You partially answer this question by suggesting that I prebin
the data. Can this prebinning be done transparently, chunk by chunk,
behind the scenes?
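
For concreteness, here is a rough sketch of what such chunk-by-chunk
prebinning might look like in base R. It is only an illustration under
some assumptions that are not part of this thread: the data live in a
text file of whitespace-separated numbers, the breaks are fixed in
advance (so the range must be roughly known), and the names
chunked_hist and chunk_size are hypothetical. Bins are left-closed
here, unlike the right-closed default of hist(), and values outside
the breaks are silently dropped.

```r
## Sketch only: accumulate bin counts one chunk at a time, so that at most
## chunk_size values are held in memory at once.
chunked_hist <- function(path, breaks, chunk_size = 1e6) {
  nbins  <- length(breaks) - 1
  counts <- numeric(nbins)
  con <- file(path, open = "r")
  on.exit(close(con))
  repeat {
    ## scan() continues from the current position of an open connection
    x <- scan(con, what = numeric(), nmax = chunk_size, quiet = TRUE)
    if (length(x) == 0) break
    x <- x[!is.na(x)]
    ## bin index per value; 0 (below range) and nbins + 1 (above range)
    ## are ignored by tabulate()
    idx <- findInterval(x, breaks, rightmost.closed = TRUE)
    counts <- counts + tabulate(idx, nbins = nbins)
  }
  ## assemble an object that plot() draws like the result of hist()
  structure(list(breaks   = breaks,
                 counts   = counts,
                 density  = counts / (sum(counts) * diff(breaks)),
                 mids     = (head(breaks, -1) + tail(breaks, -1)) / 2,
                 xname    = basename(path),
                 equidist = TRUE),
            class = "histogram")
}

## Small demonstration on a temporary file (hypothetical data)
set.seed(1)
path <- tempfile(fileext = ".txt")
write(rnorm(1e5), file = path, ncolumns = 1)
breaks <- seq(-6, 6, by = 0.25)
h <- chunked_hist(path, breaks, chunk_size = 1e4)
plot(h, main = "Histogram built from chunks")
```

The same counts could instead be handed to barplot() or to ggplot2 as
precomputed bars; the point is only that each chunk can be discarded
once its counts have been added.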

Paul


