[R] perception of graphical data

hadley wickham h.wickham at gmail.com
Sat Aug 25 00:42:09 CEST 2007

Hi Richard,

> I apologize that this is off-topic.  I am seeking information on
> perception of graphical data, in an effort to improve the plots I
> produce.  Would anyone point me to literature reviews in this area?  (Or
> keywords to try on google?)  Is this located somewhere near cognitive
> science, psychology, human factors research?

Probably the best place to start on these general issues is a couple
of papers by Cleveland:

        Author = {Cleveland, William and McGill, Robert},
        Journal = {Journal of the Royal Statistical Society. Series A},
        Number = {3},
        Pages = {192-229},
        Title = {Graphical Perception: The Visual Decoding of Quantitative
Information on Graphical Displays of Data},
        Volume = {150},
        Year = {1987}}

        Author = {Cleveland, William S. and McGill, Robert},
        Journal = {Journal of the American Statistical Association},
        Number = 387,
        Pages = {531-554},
        Title = {Graphical Perception: Theory, Experimentation and
Application to the Development of Graphical Methods.},
        Volume = 79,
        Year = 1984}

For colour in particular, I like Ross Ihaka's introduction to the subject:

        Author = {Ihaka, Ross},
        Booktitle = {Proceedings of the 3rd International Workshop on
Distributed Statistical Computing (DSC 2003)},
        Title = {Colour for Presentation Graphics},
        Year = {2003}}

and also see colorbrewer.org

> Scatter plots of microarray data often attempt to represent thousands or
> tens of thousands of points, but all I read from them are density and
> distribution --- the gene names cannot be shown.  At what point, would a
> sunflowerplot-like display or a smooth gradient be better?  When two
> data points drawn as 50% gray disks are small and tangent, are they
> perceptually equivalent to a single, 100% black disk?  Or a 50% gray
> disk with twice the area?  What problems are known about plotting with
> disks --- do viewers use the area or the diameter (or neither) to gauge
> weight?

I think many of these are still research topics.  Two (of many) places
to start are:

        Author = {Huang, Chisheng and McDonald, John Alan and Stuetzle, Werner},
        Journal = {Journal of Computational and Graphical Statistics},
        Pages = {383--396},
        Title = {Variable resolution bivariate plots},
        Volume = {6},
        Year = {1997}}

        Author = {Carr, D. B. and Littlefield, R. J. and Nicholson, W. L. and
Littlefield, J. S.},
        Journal = {Journal of the American Statistical Association},
        Number = {398},
        Pages = {424-436},
        Title = {Scatterplot Matrix Techniques for Large N},
        Volume = {82},
        Year = {1987}}
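
As a rough illustration of the general idea behind plotting density
instead of individual points when N is large, here is a minimal Python
sketch. It uses plain rectangular bins, which is a simplification and
not the hexagonal binning of Carr et al. or the variable resolution
method of Huang et al. The function names (bin_counts, gray_levels),
the grid size and the simulated data are all assumptions made for the
example.

```python
import random
from collections import Counter


def bin_counts(xs, ys, nbins=20):
    """Count points per cell on an nbins x nbins rectangular grid."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Fall back to a span of 1 if all values are equal, so the division is safe.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    counts = Counter()
    for x, y in zip(xs, ys):
        # Map each coordinate to a cell index; the maximum goes in the last cell.
        i = min(int((x - x_min) / x_span * nbins), nbins - 1)
        j = min(int((y - y_min) / y_span * nbins), nbins - 1)
        counts[(i, j)] += 1
    return counts


def gray_levels(counts):
    """Scale counts to gray levels in (0, 1], with the fullest cell at 1."""
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}


if __name__ == "__main__":
    # Simulated data standing in for a large scatter plot (hypothetical).
    rng = random.Random(1)
    xs = [rng.gauss(0, 1) for _ in range(20000)]
    ys = [x + rng.gauss(0, 0.5) for x in xs]
    counts = bin_counts(xs, ys)
    shades = gray_levels(counts)
    print(len(xs), "points summarised by", len(counts), "non-empty cells")
```

Each non-empty cell can then be drawn once, shaded by its gray level,
so the number of marks on the page depends on the grid and not on N.
How the shading should scale with the count (linear, as here, or
something like a log scale) is one of the open perceptual questions.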

> As you can tell, I'm a non-expert, mixing issues of data interpretation,
> visual perception, graphic representation.  Previously, I didn't have
> the flexibility of R's graphics, so I didn't need to think so much.
> I've read some of Edward S. Tufte's books, but found them more
> qualitative than quantitative.

More quantitative approaches are Cleveland's, Bertin's and Wilkinson's:

        Author = {Cleveland, William},
        Publisher = {Hobart Press},
        Title = {Visualizing data},
        Year = {1993}}

        Author = {Cleveland, William},
        Publisher = {Hobart Press},
        Title = {The Elements of Graphing Data},
        Year = {1994}}

        Author = {Chambers, John and Cleveland, William and Kleiner, Beat and
Tukey, Paul},
        Publisher = {Wadsworth},
        Title = {Graphical methods for data analysis},
        Year = {1983}}

        Address = {Madison, WI},
        Author = {Bertin, Jacques},
        Publisher = {University of Wisconsin Press},
        Title = {Semiology of Graphics},
        Year = {1983}}

        Author = {Wilkinson, Leland},
        Publisher = {Springer},
        Series = {Statistics and Computing},
        Title = {The Grammar of Graphics},
        Year = {2005}}

Hope this gets you started!


