Plot optimization [Was: Re: [R] Plotting Speed: R vs Octave]

ripley@stats.ox.ac.uk ripley at stats.ox.ac.uk
Wed Dec 4 08:35:15 CET 2002


On Tue, 3 Dec 2002, Matej Cepl wrote:

> On Tue, 3 Dec 2002, Ben Bolker wrote:
> >   It seems to make more sense to put a tiny bit of effort into
> > thinning the points at your end rather than building code into
> > R's postscript driver to deal with this case.
>
> I will have to study your code, but it seems to me that such a
> problem can occur quite often, so optimizing away this
> ... well, I believe, bug ... could help a lot.

Plotting the points you asked for is not a bug: please read the FAQ
about BUGS.  It is not often that users are careless enough to plot
duplicated points many times.  The issue would apply to all drivers, not
just postscript().

It's also not easy to do in the driver: suppose the points were of
different colours?  PostScript uses an opaque painting model, so the last
one wins.  It would be necessary to cache all the points, delete the
earlier duplicates and then plot what remains.  Why don't you write an R
program to post-process the .ps output if this bothers you?
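
The "last one wins" rule can also be applied in R before the points reach
any driver.  A minimal sketch, assuming exact (x, y) duplicates and one
colour per point; dedupe_points() is a hypothetical helper, not part of R:

```r
# Hypothetical helper: drop exact duplicate (x, y) pairs, keeping the LAST
# occurrence so the colour that would have been painted on top survives.
dedupe_points <- function(x, y, col = rep("black", length(x))) {
  stopifnot(length(x) == length(y), length(col) == length(x))
  # fromLast = TRUE marks the earlier copies as duplicates, not the final one.
  keep <- !duplicated(data.frame(x = x, y = y), fromLast = TRUE)
  data.frame(x = x[keep], y = y[keep], col = col[keep],
             stringsAsFactors = FALSE)
}

# Made-up example data: five points, two of which repeat an earlier location.
x   <- c(1, 2, 1, 3, 2)
y   <- c(1, 2, 1, 3, 2)
col <- c("red", "red", "blue", "green", "black")
pts <- dedupe_points(x, y, col)
# plot(pts$x, pts$y, col = pts$col, pch = 16)
```

The surviving points keep their original relative order, so any remaining
partial overlaps are still painted in the order the user asked for.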

Recently one of my students produced a scatterplot with 1.4 million
points.  I couldn't print the PDF (a 135MB file) from Linux, although he
could from Windows (on the same printer, which has 96MB of memory).  We
didn't report this as a bug in R (and there were no exact duplicates), but
rather came up with a statistical way of thinning the points (by forming a
density estimate and thinning the points back to a maximum density).
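
A rough sketch of how such thinning might look.  The message does not say
which density estimate was used, so this assumes a 2-D histogram on a
regular grid as a stand-in; thin_points(), nbins and max_per_cell are
hypothetical names chosen for illustration:

```r
# Hypothetical helper: thin a scatterplot so that no grid cell holds more
# than `max_per_cell` points.  The per-cell count acts as a crude density
# estimate (an assumption; a kernel estimate could be used instead).
thin_points <- function(x, y, nbins = 100, max_per_cell = 5) {
  stopifnot(length(x) == length(y), nbins >= 2, max_per_cell >= 1)
  # Assign each point to a cell of an nbins x nbins grid over the data range.
  ix <- cut(x, breaks = nbins, labels = FALSE)
  iy <- cut(y, breaks = nbins, labels = FALSE)
  cell <- (ix - 1) * nbins + iy
  # Visit the points in random order and number them within their cell,
  # so the points kept in a crowded cell are a random subset of it.
  ord <- sample.int(length(x))
  rank_in_cell <- ave(seq_along(ord), cell[ord], FUN = seq_along)
  keep <- sort(ord[rank_in_cell <= max_per_cell])
  data.frame(x = x[keep], y = y[keep])
}

# Simulated example data (not the student's): 20000 bivariate normal points.
set.seed(1)
x <- rnorm(20000)
y <- rnorm(20000)
thinned <- thin_points(x, y, nbins = 50, max_per_cell = 3)
# plot(thinned$x, thinned$y, pch = ".")
```

Sparse regions are left untouched and only the crowded cells lose points,
so outliers and the overall shape of the cloud remain visible.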

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



