[R] Clean up a scatterplot with too much data

Dennis Murphy djmuser at gmail.com
Tue Aug 2 15:07:21 CEST 2011


In addition to the other responses (all of which I liked), a couple of
other alternatives to consider are 2D density plots (see ?kde2d in the
MASS package, for example) or geom_tile() in the ggplot2 package,
which you can think of as a 3D histogram projected to 2D with color
corresponding to (relative) frequency, as suggested by Paul Hiemstra.
A tile plot of binned counts is the rectangular-grid counterpart of a
hexbin plot, but I would start with the hexbin myself. I echo KOH's
comment: make sure
you remove the outliers first, especially that one in the upper left
corner :)
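
To make the first two suggestions concrete, here is a minimal sketch. It
uses simulated data, since I don't have the original; the variable names
(minutes, kills, kpm) and the distributions are assumptions for
illustration only. It computes a 2D kernel density estimate with
MASS::kde2d() and shades it with base graphics, then, if ggplot2 happens
to be installed, draws rectangular binned counts.

```r
library(MASS)  # provides kde2d(); MASS ships with standard R installations

set.seed(1)
# Simulated stand-in for the poster's data (hypothetical names and values)
n <- 20000
minutes <- runif(n, min = 5, max = 600)      # minutes played
kills   <- rpois(n, lambda = 0.8 * minutes)  # total kills
kpm     <- kills / minutes                   # kills per minute

# 2D kernel density estimate evaluated on a 100 x 100 grid
dens <- kde2d(minutes, kpm, n = 100)

# Shaded density surface with contour lines drawn on top
image(dens, col = rev(heat.colors(50)),
      xlab = "Minutes played", ylab = "Kills per minute")
contour(dens, add = TRUE, drawlabels = FALSE)

# Optional: rectangular bins coloured by count, only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  d <- data.frame(minutes = minutes, kpm = kpm)
  p <- ggplot(d, aes(x = minutes, y = kpm)) +
    geom_bin2d(bins = 50)
  print(p)
}
```

Swapping geom_bin2d() for geom_hex() should give the hexagonal version,
provided the hexbin package is installed as well.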

After looking at your plot, here's my question: why would you plot
kills/minute vs. minutes played? Doesn't the first variable render the
second one moot? Wouldn't kills vs. minutes played be a more relevant
(scatter)plot? If you have information on the skill level of the
players, you could incorporate that information into the plot as well.
There are several nice ways to go if this is the case.
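
As one possible way to do that, the sketch below plots kills against
minutes played and colors the points by skill level. The data are
simulated, and the skill factor, its three levels and the kill rates are
all made up for illustration; substitute whatever you actually have.

```r
set.seed(2)
# Simulated data; 'skill' and the per-level kill rates are hypothetical
n <- 5000
skill <- factor(sample(c("novice", "intermediate", "expert"), n, replace = TRUE),
                levels = c("novice", "intermediate", "expert"))
rate    <- c(novice = 0.3, intermediate = 0.7, expert = 1.2)[as.character(skill)]
minutes <- runif(n, min = 5, max = 600)
kills   <- rpois(n, lambda = rate * minutes)

# One color per skill level; transparency reduces overplotting
cols <- c("steelblue", "darkorange", "firebrick")
plot(minutes, kills, pch = 16, cex = 0.5,
     col = adjustcolor(cols[as.integer(skill)], alpha.f = 0.3),
     xlab = "Minutes played", ylab = "Kills")
legend("topleft", legend = levels(skill), col = cols, pch = 16, bty = "n")
```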

If kills/minute is the more appropriate measure, a univariate density
plot would make sense, or a histogram.
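
A minimal sketch of that, again on simulated data (the names and
distributions are assumptions, not the poster's data): a histogram on the
density scale with a kernel density estimate overlaid.

```r
set.seed(3)
# Simulated stand-in for kills per minute (hypothetical values)
n <- 5000
minutes <- runif(n, min = 5, max = 600)
kills   <- rpois(n, lambda = 0.8 * minutes)
kpm     <- kills / minutes

# Histogram scaled to density so the kernel estimate can share the axis
hist(kpm, breaks = 50, freq = FALSE, col = "grey85", border = "white",
     main = "", xlab = "Kills per minute")
lines(density(kpm), lwd = 2)
```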

HTH,
Dennis

On Mon, Aug 1, 2011 at 10:26 PM, DimmestLemming <NICOADAMS000 at gmail.com> wrote:
> I'm working with a lot of data right now, but I'm new to R, and not very good
> with it, hence my request for help. What type of graph could I use to
> straighten out things like...
>
> http://r.789695.n4.nabble.com/file/n3711389/Untitled.png
>
> ...this?
>
> I want to see general frequencies. Should I use something like a 3D
> histogram, or is there an easier way like, say, shading? I'm sure these are
> both possible, but I don't know which is easiest or how to implement either
> of them.
>
> Thanks!
