[R] Scatterplot Showing All Points

S Ellison S.Ellison at lgc.co.uk
Tue Dec 18 14:29:27 CET 2007



>> Antony Unwin <unwin at math.uni-augsburg.de> >>
>I must admit to being very surprised that jittering and sunflower
>plots have been suggested for a dataset of 5000 points.  Do those who
>mentioned these methods have examples on that scale where they are
>effective?

You have a point.
But check the microarray literature; scatterplots have been used -
often - to display microarray data with 10000 observations at a time.
And in their defence, even on screen, a 600x600 pixel plot window holds
360000 pixels - 5000 points is under 1.5% of that. Jittering has
visible effects on data at that resolution. Compare the two plots
produced by:

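library(MASS)   # for mvrnorm()
Sigma <- matrix(c(10, 4, 4, 2), 2, 2)
# 5000 correlated bivariate normal points, rounded to 1 decimal place
xy <- round(mvrnorm(n = 5000, rep(0, 2), Sigma), 1)
plot(xy, pch = ".")                       # rounded data: points overlap
plot(jitter(xy, factor = 2), pch = ".")   # jittered: overlaps spread out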
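
As a rough numerical companion to the two plots, the exact overlaps
that jittering breaks up can be counted directly. This is only a
sketch and not part of the original example: the fixed seed and the
variable names n_hidden, n_distinct and pixel_fraction are assumptions
added for illustration, and the counts will change with the seed.

```r
library(MASS)   # for mvrnorm()

set.seed(1)     # assumption: fixed seed so the counts are repeatable
Sigma <- matrix(c(10, 4, 4, 2), 2, 2)
xy <- round(mvrnorm(n = 5000, mu = rep(0, 2), Sigma = Sigma), 1)

# A row that repeats an earlier (x, y) pair is drawn exactly on top of it
n_hidden   <- sum(duplicated(xy))
n_distinct <- nrow(unique(xy))

# Upper bound on the share of a 600x600 window that 5000 one-pixel
# points could occupy
pixel_fraction <- 5000 / (600 * 600)

cat("distinct positions:           ", n_distinct, "\n")
cat("points hidden by exact overlap:", n_hidden, "\n")
cat("max fraction of pixels used:   ", round(pixel_fraction, 4), "\n")
```

The larger n_hidden is relative to 5000, the more the unjittered plot
understates the density in the middle of the cloud.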

But you're of course right to question how sensible this is. The best
you can get is a visual impression of the 'shape' of the data, with a
greater perceived density where multiple observations would otherwise
have overlapped exactly.

S.


