[R] scatterplot of 100000 points and pdf file format

Patrick Connolly p.connolly at hortresearch.co.nz
Thu Nov 25 00:35:34 CET 2004


On Wed, 24-Nov-2004 at 10:22AM -0600, Marc Schwartz wrote:

|> On Wed, 2004-11-24 at 16:34 +0100, Witold Eryk Wolski wrote:
|> > Hi,
|> > 
|> > I want to draw a scatter plot with 1M or more points and save it as pdf.
|> > This makes the pdf file large.
|> > So I tried to save the file first as png and then convert it to pdf. 
|> > This looks OK if printed, but if viewed e.g. with acrobat as a document 
|> > figure the quality is bad.
|> > 
|> > Does anyone know a way to reduce the size but keep the quality?
|> 
|> Hi Eryk!
|> 
|> Part of the problem is that in a pdf file, the vector based instructions
|> will need to be defined for each of your 10 ^ 6 points in order to draw
|> them.
|> 
|> When trying to create a simple example:
|> 
|> pdf()
|> plot(rnorm(1000000), rnorm(1000000))
|> dev.off()
|> 
|> The pdf file is 55 Mb in size.
|> 
|> One immediate thought was to try a ps file and using the above plot, the
|> ps file was "only" 23 Mb in size. So note that ps can be more efficient.
|> 
|> Going to a bitmap might result in a much smaller file, but as you note,
|> the quality does degrade as compared to a vector based image.
|> 
|> I tried the above to a png, then converted to a pdf (using 'convert')
|> and as expected, the image both viewed and printed was "pixelated",
|> since the pdf instructions are presumably drawing pixels and not vector
|> based objects.

Using bitmap( ... , res = 300), I get a bitmap file of 56 Kb.
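
For anyone wanting to try this, here is a rough sketch of that kind of
call.  The file name, the 6.3 inch size (about 160mm), the png16m type
and the seed are my assumptions, not the exact settings used above, and
it uses 100000 points as in the subject line to keep the run short.
bitmap() needs Ghostscript, so the call is skipped if gs is not found.

```r
# Sketch only: bitmap() passes the plot through Ghostscript (gs).
set.seed(1)                                  # assumed seed, for reproducibility
n <- 1e5                                     # 100000 points
x <- rnorm(n)
y <- rnorm(n)

out <- file.path(tempdir(), "scatter.png")   # hypothetical output file
gs  <- Sys.getenv("R_GSCMD", "gs")           # Ghostscript command R will try

if (nzchar(Sys.which(gs))) {
  # width and height are in inches by default; res is dots per inch
  bitmap(out, type = "png16m", width = 6.3, height = 6.3, res = 300)
  plot(x, y)
  invisible(dev.off())                       # gs does the conversion here
  print(file.info(out)$size)                 # file size in bytes
} else {
  message("Ghostscript not found; skipping the bitmap() call")
}
```

The size reported will depend on the data and the settings, so do not
expect it to match the 56 Kb figure exactly.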

It's rather slow; I suspect most of the time is taken up by gs
converting the vector image.  It would be much quicker if, say, a
circle of diameter 4 in the middle were left unplotted, and others
have mentioned other ways of reducing redundant points.
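
A minimal sketch of leaving out the middle, assuming standard normal
data centred on the origin as in Marc's example (the variable names and
output file are made up for illustration).  For such data only about
exp(-2), roughly 13.5%, of the points lie outside a circle of radius 2,
so far fewer points reach the device.

```r
set.seed(1)                       # assumed seed, for reproducibility
n <- 1e5
x <- rnorm(n)
y <- rnorm(n)

# Keep only points outside a circle of diameter 4 (radius 2) at the origin.
radius <- 2
keep <- x^2 + y^2 > radius^2

x_thin <- x[keep]
y_thin <- y[keep]
frac_kept <- mean(keep)           # proportion of points still plotted
print(frac_kept)

out <- file.path(tempdir(), "scatter_thinned.pdf")   # hypothetical file name
pdf(out)
plot(x_thin, y_thin)
invisible(dev.off())
```

Plotted like this the middle is simply blank, so this only pays off if
the dense centre is filled in some cheaper way or is of no interest.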

A pdf file only slightly larger than the png file can be made directly
from OpenOffice by importing the png into a document and exporting
that as pdf.  For a plot 160mm square, this pdf printed unpixelated.

Depending on what size (dimensions) you need to finish up with, you
might find you could get away with a lower resolution than 300 dpi,
but I usually find 200 too ragged.
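
To see what those resolutions mean in pixels for the 160mm plot, the
arithmetic is just size in inches times dpi.  A small sketch (variable
names are mine):

```r
mm_per_inch <- 25.4
side_mm <- 160                     # side of the square plot, in mm
res_dpi <- c(200, 300)             # the two resolutions compared above

side_in <- side_mm / mm_per_inch   # about 6.3 inches
pixels  <- round(side_in * res_dpi)
names(pixels) <- paste0(res_dpi, "dpi")
print(pixels)                      # 1260 at 200 dpi, 1890 at 300 dpi
```

So a smaller finished plot needs fewer pixels at the same dpi, which is
where the saving in file size comes from.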

HTH

-- 
Patrick Connolly
HortResearch
Mt Albert
Auckland
New Zealand 
Ph: +64-9 815 4200 x 7188
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~
I have the world's largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you've seen it.  ---Steven Wright 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~



