[R] Drawing a histogram from a massive dataset

Dennis Murphy djmuser at gmail.com
Fri Jul 15 17:15:46 CEST 2011


Hi:

I would suggest that you avoid the histogram and make a density plot
instead. It would be more informative and would probably take a lot
less time and ink. If you're married to the histogram concept, try
taking a random sample of about 10,000 observations and drawing a
histogram of that instead. The result shouldn't differ much from the
histogram of the full data. To test this, take several random samples
of size 10,000 and compare their histograms. If they're not much
different in shape, the histogram of the full data is likely close to
the same. If there are noticeable differences, try 50,000 or 100,000
instead (rinse and repeat).
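
Here is a minimal sketch of both ideas in base R. The vector x is
simulated and only stands in for the real data, and the names n_sub
and breaks are my own choices for illustration. Adjust the number of
bins and the subsample size to taste.

```r
# Hypothetical stand-in for the massive dataset: one numeric vector.
set.seed(1)
x <- rnorm(1e6)

# Option 1: a kernel density estimate instead of a histogram.
plot(density(x), main = "Density estimate of x")

# Option 2: histograms of several random subsamples.
# Common breaks spanning the full data make the panels comparable.
n_sub  <- 10000
breaks <- pretty(range(x), n = 50)

op <- par(mfrow = c(2, 2))          # 2 x 2 grid of plots
for (i in 1:4) {
  s <- sample(x, n_sub)             # simple random sample, no replacement
  hist(s, breaks = breaks, freq = FALSE,
       main = paste("Subsample", i), xlab = "x")
}
par(op)                             # restore the previous plot settings
```

Using the same breaks and freq = FALSE in every panel puts the
subsamples on a common scale, so differences in shape are easy to
spot. If the panels disagree, raise n_sub and run it again.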

HTH,
Dennis

On Fri, Jul 15, 2011 at 4:21 AM, Paul Smith <phhs80 at gmail.com> wrote:
> Dear All,
>
> I have a massive dataset from which I would like to draw a histogram.
> Any ideas on how to accomplish this?
>
> Thanks in advance,
>
> Paul


