[R] plotting huge data
Martin Maechler
maechler at stat.math.ethz.ch
Fri Aug 7 16:07:40 CEST 2009
>>>>> "FEH" == Frank E Harrell <f.harrell at vanderbilt.edu>
>>>>> on Fri, 07 Aug 2009 07:19:16 -0500 writes:
FEH> gauravbhatti wrote:
>> I have a data frame with 25000 rows containing two columns Time and Distance.
That's "large" by some standards, but definitely not "huge" ...
>> When I plot a simple distance versus time plot, the plot is very confusing,
>> showing no general trend because of the large amount of data. Is there any way
>> I can improve the plot by, let's say, using a moving average as in Excel? Please
>> also suggest some other methods to make the graph smoother and better looking.
>> Gaurav
FEH> I recommend using the quantreg package to fit a quantile regression
FEH> model using a spline function of Time. Draw the estimated curves for
FEH> selected quantiles such as 0.1, 0.25, 0.5, 0.75 and 0.9. A new function Rq in
FEH> the Design package makes this easier, but you can do it with just quantreg.
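(A minimal sketch of the quantile-regression idea Frank describes; the data
frame name 'dat' and the spline degrees of freedom are illustrative choices,
not from his message:)

  library(quantreg)   # rq() for quantile regression
  library(splines)    # bs() for a spline basis of Time

  taus <- c(0.1, 0.25, 0.5, 0.75, 0.9)
  fit  <- rq(Distance ~ bs(Time, df = 5), tau = taus, data = dat)

  ## raw points plus the five estimated quantile curves
  plot(Distance ~ Time, data = dat, pch = ".", col = "grey")
  tt   <- seq(min(dat$Time), max(dat$Time), length.out = 200)
  pred <- predict(fit, newdata = data.frame(Time = tt))
  matlines(tt, pred, lty = 1, col = seq_along(taus))
  legend("topleft", legend = paste("tau =", taus),
         lty = 1, col = seq_along(taus))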
Yes, modelling (with quantreg or also lowess(), runmed() ...) is
certainly a good idea for such a "Y ~ X" situation.
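For instance, with base R alone (again assuming a data frame 'dat' with the
Time and Distance columns; the smoothing parameters are only illustrative):

  ord <- order(dat$Time)
  plot(Distance ~ Time, data = dat, pch = ".", col = "grey")
  ## lowess(): locally weighted regression of Distance on Time
  lines(lowess(dat$Time, dat$Distance, f = 0.1), col = 2, lwd = 2)
  ## runmed(): running median over a window of 101 observations
  lines(dat$Time[ord], runmed(dat$Distance[ord], k = 101), col = 4, lwd = 2)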
But to answer the original question:
Please note that R has had, for a while now, the very nice and useful
smoothScatter()
function, written exactly for such cases, but also for ones
that are closer to "huge": e.g., it still works fast for n <- 1e6 points.
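A small sketch along those lines (the simulated x and y are purely for
illustration; for the 25000-row data it would simply be
smoothScatter(dat$Time, dat$Distance)):

  n <- 1e6
  x <- rnorm(n)
  y <- x + rnorm(n) / 3
  ## smoothScatter() plots a 2-D kernel density estimate of (x, y)
  ## as a colour gradient, so individual points no longer overplot.
  smoothScatter(x, y)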
Martin Maechler, ETH Zurich