[R] Drawing a histogram from a massive dataset

Paul Smith phhs80 at gmail.com
Mon Jul 18 19:57:05 CEST 2011


Thanks, Dennis, for your suggestions. I was thinking about the package
'sqldf', but I guess that I must have a data frame to plot a
histogram.

Paul
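
A note on the data frame question: hist() works on a plain numeric vector, so a data frame is not strictly required, and a column pulled out of a query result (df$column) can be passed directly. The following is only a rough sketch, assuming the values can be read in pieces. The simulated vector x, the chunk size, and the bin width are all made up for illustration. It fixes the breaks once and adds up the bin counts chunk by chunk, so the whole dataset never has to be binned in one call.

```r
# Hypothetical stand-in for the massive dataset. In practice each chunk
# would come from a file or a database query rather than from rnorm().
set.seed(1)
x <- rnorm(1e6)

# Fix the breaks up front so every chunk is binned identically.
# The overall min and max are assumed to be known (e.g. from a prior pass).
breaks <- seq(floor(min(x)), ceiling(max(x)), by = 0.25)

# Split into chunks of 100000 values (illustrative chunk size).
chunks <- split(x, ceiling(seq_along(x) / 1e5))

# hist() takes a plain numeric vector; plot = FALSE returns the counts only.
counts <- Reduce(`+`, lapply(chunks, function(chunk) {
  hist(chunk, breaks = breaks, plot = FALSE)$counts
}))

# Draw the accumulated counts as adjacent bars, labelled by bin midpoint.
mids <- head(breaks, -1) + diff(breaks) / 2
barplot(counts, space = 0, names.arg = mids,
        xlab = "x", ylab = "Frequency")
```

Because the breaks are shared, the summed counts are the same as those from a single hist() call on the full vector with those breaks.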


On Fri, Jul 15, 2011 at 4:15 PM, Dennis Murphy <djmuser at gmail.com> wrote:
> I would suggest that you avoid the histogram and make a density plot
> instead. It would be more informative and probably require a lot less
> time and ink. If you're married to the histogram concept, try taking a
> random sample of about 10000 observations and get a histogram of that
> instead. The result shouldn't be much different from that of the entire
> dataset. To test this hypothesis, take several random samples of size
> 10000 and compare the histograms. If they're not much different in
> shape, the histogram of the full dataset is likely to look much the
> same. If there are noticeable differences, try 50000 or 100000 instead
> (rinse and repeat).
>
> HTH,
> Dennis
>
> On Fri, Jul 15, 2011 at 4:21 AM, Paul Smith <phhs80 at gmail.com> wrote:
>> Dear All,
>>
>> I have a massive dataset from which I would like to draw a histogram.
>> Any ideas on how to accomplish this?
>>
>> Thanks in advance,
>>
>> Paul
>>
>
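
Dennis's suggestion quoted above (compare several random samples, or use a density plot) could be tried along these lines. This is a minimal sketch under stated assumptions: the simulated vector x stands in for the real data, and the sample size of 10000, the three replicates, and the bin width of 0.5 are illustrative choices, not values from the thread beyond the sample size.

```r
# Hypothetical stand-in for the full dataset.
set.seed(42)
x <- rexp(2e6, rate = 0.5)

n <- 10000                                   # sample size suggested in the thread
breaks <- seq(0, ceiling(max(x)), by = 0.5)  # common bins for all samples

# Three independent random samples, each binned on the same breaks.
samples <- replicate(3, sample(x, n), simplify = FALSE)
dens <- sapply(samples, function(s) {
  hist(s, breaks = breaks, plot = FALSE)$density
})

# Largest per-bin disagreement in density between the samples.
# A small value suggests the histogram shape is stable at this sample size.
max_gap <- max(apply(dens, 1, function(d) diff(range(d))))
print(max_gap)

# Histogram of one sample, plus a kernel density estimate as the alternative.
hist(samples[[1]], breaks = breaks, freq = FALSE,
     main = "Histogram of a random sample", xlab = "x")
plot(density(samples[[1]]), main = "Kernel density estimate of a random sample")
```

If max_gap looks large relative to the bin densities, the same comparison can be repeated with n set to 50000 or 100000.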


