[R] scatterplot of 100000 points and pdf file format
andy_liaw at merck.com
Wed Nov 24 17:37:29 CET 2004
I have no experience with it, but I believe the hexbin package in BioC was
written for this purpose: to avoid heavy over-plotting when there are lots
of points. You might want to look into that, if you have not done so yet.
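
A minimal sketch of what that could look like, assuming the hexbin package
is installed (the bin count of 50 and the temporary file name are arbitrary
choices for illustration, not something from this thread). The points are
binned into hexagonal cells, so the pdf only has to describe the non-empty
cells rather than every point:

```r
# Sketch only: runs the hexbin part only if the package is available.
set.seed(1)
x <- rnorm(1e5)
y <- rnorm(1e5)
pdf_file <- tempfile(fileext = ".pdf")  # illustrative output location

if (requireNamespace("hexbin", quietly = TRUE)) {
  library(hexbin)
  hb <- hexbin(x, y, xbins = 50)  # bin the points into hexagonal cells
  pdf(pdf_file)
  plot(hb)                        # one hexagon per non-empty cell
  dev.off()
  print(file.info(pdf_file)$size) # file size in bytes
}
```

The file size then depends on the number of cells, not on the number of
points.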
> From: Marc Schwartz
> On Wed, 2004-11-24 at 16:34 +0100, Witold Eryk Wolski wrote:
> > Hi,
> > I want to draw a scatter plot with 1M and more points and save it as pdf.
> > This makes the pdf file large.
> > So I tried to save the file first as png and then convert it to pdf.
> > This looks OK if printed, but if viewed e.g. with Acrobat as a
> > figure the quality is bad.
> > Does anyone know a way to reduce the size but keep the quality?
> Hi Eryk!
> Part of the problem is that in a pdf file, the vector based
> instructions will need to be defined for each of your 10 ^ 6 points in
> order to draw them.
> When trying to create a simple example:
> plot(rnorm(1000000), rnorm(1000000))
> The pdf file is 55 Mb in size.
> One immediate thought was to try a ps file and using the above plot, the
> ps file was "only" 23 Mb in size. So note that ps can be more efficient.
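
The comparison can be repeated along these lines. This is only a sketch:
it uses far fewer points than the 10 ^ 6 above so that it runs quickly, the
file names are temporary placeholders, and current versions of pdf()
compress their output by default, so compression is switched off here to
stay closer to the uncompressed files discussed in this thread. The sizes
you get will differ from the figures quoted.

```r
set.seed(1)
n <- 1e4          # illustrative; the thread used 10^6 points
x <- rnorm(n)
y <- rnorm(n)

pdf_file <- tempfile(fileext = ".pdf")
ps_file  <- tempfile(fileext = ".ps")

pdf(pdf_file, compress = FALSE)  # uncompressed vector output
plot(x, y)
dev.off()

postscript(ps_file)
plot(x, y)
dev.off()

# File sizes in bytes, labelled by format
sizes <- file.info(c(pdf_file, ps_file))$size
names(sizes) <- c("pdf", "ps")
print(sizes)
```

Both files grow roughly in proportion to the number of points, since each
point needs its own drawing instruction.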
> Going to a bitmap might result in a much smaller file, but as you note,
> the quality does degrade as compared to a vector based image.
> I tried the above with a png, then converted it to a pdf (using 'convert')
> and as expected, the image both viewed and printed was "pixelated",
> since the pdf instructions are presumably drawing pixels and not vector
> based objects.
> Depending upon what you plan to do with the image, you may have to
> choose among several options, resulting in tradeoffs between image
> quality and file size.
> If you can create the bitmap file explicitly in the size that you
> require for printing or incorporating in a document, that is one way to
> go and will preserve, to an extent, the overall fixed size image
> quality, while keeping file size small.
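
As a sketch of that approach: the png() device accepts the physical size
and the resolution directly. The 6 x 6 inch size, the 300 dpi resolution
and the file name below are assumptions chosen for illustration; pick the
values your document or printer actually needs.

```r
set.seed(1)
x <- rnorm(1e5)
y <- rnorm(1e5)

png_file <- tempfile(fileext = ".png")  # illustrative output location

# Fix the final size up front: 6 x 6 inches at 300 dpi = 1800 x 1800 pixels
png(png_file, width = 6, height = 6, units = "in", res = 300)
plot(x, y, pch = ".")  # small plotting symbol to limit over-plotting
dev.off()

print(file.info(png_file)$size)  # file size in bytes
```

The file size is bounded by the pixel dimensions, so it no longer grows with
the number of points, but the image will still pixelate if it is scaled up
beyond the size it was created for.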
> Another option to consider for the pdf approach, if it does not
> compromise the integrity of your plot, is to remove any duplicate data
> points if any exist. Thus, you will not need what are in effect
> redundant instructions in the pdf file. This may not be possible
> depending upon the nature of your data (i.e., doubles) without
> some tolerance level for "equivalence".
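
One simple way to impose such a tolerance is to round the coordinates
before removing duplicates. This is a sketch under that assumption, not the
only way to define "equivalence"; the choice of 2 decimal places is
hypothetical and should be matched to the resolution at which the plot will
actually be viewed.

```r
set.seed(1)
x <- rnorm(1e5)
y <- rnorm(1e5)

digits <- 2  # hypothetical tolerance: points agreeing to 2 decimals count as equal

# Round both coordinates, then keep one row per distinct (x, y) pair
pts <- unique(data.frame(x = round(x, digits), y = round(y, digits)))
print(nrow(pts))  # number of points left to draw

pdf_file <- tempfile(fileext = ".pdf")  # illustrative output location
pdf(pdf_file)
plot(pts$x, pts$y, pch = ".")
dev.off()
```

Note that this discards the information about how many points fell on each
location, so it suits plots showing where the data lie rather than how
dense they are.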
> Perhaps others will have additional ideas.
> Marc Schwartz
> R-help at stat.math.ethz.ch mailing list