[R] scatterplot of 100000 points and pdf file format

Matt Nelson MNelson at sequenom.com
Wed Nov 24 17:53:01 CET 2004


I have found that plotting more than a few thousand data points at a time
quickly becomes a losing proposition.  That is, the dense overlap of data
points tends to obscure the patterns of interest, with only outliers
distinctly visible.  I typically deal with this in two ways.

The most straightforward is to select a much smaller subset of data points
to plot, say on the order of 100-1000, depending on the nature of the data
and the features you want to illustrate.  How you sample depends on the
structure of your data set.  E.g. you may want to sample fixed proportions
within subgroups.  You can then add loess lines or confidence ellipses
estimated from the complete data, so the figure still reflects all of it.
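
A rough sketch of the subsampling idea, on simulated data.  The names d, g
and sub, the two subgroups and the 0.5% sampling fraction are all made up
for illustration.  lowess() stands in for loess() here only because it is
cheaper on 100000 points; it is not the same fit.

```r
set.seed(1)
n <- 100000

# Simulated data: two hypothetical subgroups "a" and "b"
d <- data.frame(x = rnorm(n),
                g = sample(c("a", "b"), n, replace = TRUE))
d$y <- sin(d$x) + rnorm(n, sd = 0.5)

# Sample a fixed proportion (0.5%) of the rows within each subgroup
frac <- 0.005
idx <- unlist(lapply(split(seq_len(n), d$g), function(i) {
  i[sample.int(length(i), ceiling(frac * length(i)))]
}), use.names = FALSE)
sub <- d[idx, ]

# Smooth curve estimated from the complete data, not from the subset
fit <- lowess(d$x, d$y, f = 0.2)

# Plot only the subset, then overlay the full-data smooth
out <- file.path(tempdir(), "scatter_subset.pdf")
pdf(out)
plot(sub$x, sub$y, pch = 20, col = as.integer(factor(sub$g)),
     xlab = "x", ylab = "y")
lines(fit, lwd = 2)
dev.off()
```

The pdf then holds only a few hundred points plus one line, so its size no
longer grows with the size of the full data set.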

Another approach is to estimate the two-dimensional density using kde2d()
(MASS package) and represent the result with a contour or image plot.  See
?kde2d for an example.
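
A minimal sketch of the density approach, again on simulated data.  The
50 x 50 grid and the default bandwidths are arbitrary choices, not
recommendations.

```r
library(MASS)  # provides kde2d()

set.seed(1)
n <- 100000
x <- rnorm(n)
y <- x + rnorm(n)

# Estimate the 2D density on a 50 x 50 grid (default bandwidths)
dens <- kde2d(x, y, n = 50)

# Draw the density as an image with contour lines on top
out <- file.path(tempdir(), "density.pdf")
pdf(out)
image(dens, xlab = "x", ylab = "y")
contour(dens, add = TRUE)
dev.off()
```

The file now stores a 50 x 50 grid rather than every point, so its size
depends on the grid resolution and not on the number of observations.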

Both of these will result in much more manageable (and likely more
informative) figures.


Matthew R. Nelson, Ph.D.
Director, Biostatistics
Sequenom, Inc.

> -----Original Message-----
> From: Witold Eryk Wolski [mailto:wolski at molgen.mpg.de]
> Sent: Wednesday, November 24, 2004 7:35 AM
> To: R Help Mailing List
> Subject: [R] scatterplot of 100000 points and pdf file format
> Hi,
> I want to draw a scatter plot with 1M  and more points and 
> save it as pdf.
> This makes the pdf file large.
> So I tried to save the file first as png and then convert it to pdf. 
> This looks OK if printed but if viewed e.g. with acrobat as document 
> figure the quality is bad.
> Anyone knows a way to reduce the size but keep the quality?
> /E
> -- 
> Dipl. bio-chem. Witold Eryk Wolski
> MPI-Moleculare Genetic
> Ihnestrasse 63-73 14195 Berlin
> tel: 0049-30-83875219                 __("<    _
> http://www.molgen.mpg.de/~wolski      \__/    'v'
> http://r4proteomics.sourceforge.net    ||    /   \
> mail: witek96 at users.sourceforge.net    ^^     m m
>       wolski at molgen.mpg.de
