[R] perception of graphical data
hadley wickham
h.wickham at gmail.com
Sat Aug 25 00:42:09 CEST 2007
Hi Richard,
> I apologize that this is off-topic. I am seeking information on
> perception of graphical data, in an effort to improve the plots I
> produce. Would anyone point me to literature reviews in this area? (Or
> keywords to try on google?) Is this located somewhere near cognitive
> science, psychology, human factors research?
Probably the best place to start on these general issues is a couple
of papers by Cleveland:
@article{cleveland:1987,
Author = {Cleveland, William S. and McGill, Robert},
Journal = {Journal of the Royal Statistical Society. Series A
(General)},
Number = {3},
Pages = {192--229},
Title = {Graphical Perception: The Visual Decoding of Quantitative
Information on Graphical Displays of Data},
Volume = {150},
Year = {1987}}
@article{cleveland:1984,
Author = {Cleveland, William S. and McGill, Robert},
Journal = {Journal of the American Statistical Association},
Number = {387},
Pages = {531--554},
Title = {Graphical Perception: Theory, Experimentation, and
Application to the Development of Graphical Methods},
Volume = {79},
Year = {1984}}
For colour in particular, I like Ross Ihaka's introduction to the subject:
@inproceedings{ihaka:2003,
Author = {Ihaka, Ross},
Booktitle = {Proceedings of the 3rd International Workshop on
Distributed Statistical Computing (DSC 2003)},
Title = {Colour for Presentation Graphics},
Year = {2003}}
and also see colorbrewer.org
> Scatter plots of microarray data often attempt to represent thousands or
> tens of thousands of points, but all I read from them are density and
> distribution --- the gene names cannot be shown. At what point, would a
> sunflowerplot-like display or a smooth gradient be better? When two
> data points drawn as 50% gray disks are small and tangent, are they
> perceptually equivalent to a single, 100% black disk? Or a 50% gray
> disk with twice the area? What problems are known about plotting with
> disks --- do viewers use the area or the diameter (or neither) to gauge
> weight?
I think many of these are still research topics. Two (of many) places
to start are:
@article{huang:1997,
Author = {Huang, Chisheng and McDonald, John Alan and Stuetzle, Werner},
Journal = {Journal of Computational and Graphical Statistics},
Pages = {383--396},
Title = {Variable resolution bivariate plots},
Volume = {6},
Year = {1997}}
@article{carr:1987,
Author = {Carr, D. B. and Littlefield, R. J. and Nicholson, W. L. and
Littlefield, J. S.},
Journal = {Journal of the American Statistical Association},
Number = {398},
Pages = {424--436},
Title = {Scatterplot Matrix Techniques for Large N},
Volume = {82},
Year = {1987}}
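As a rough illustration of the general idea behind displays for large N
(summarise the points by counts per cell and draw the counts, rather than
drawing every point), here is a minimal sketch in Python. It is not the
method of either paper: it assumes square cells for brevity, whereas Carr
et al. bin into hexagons, and the linear mapping from count to gray level
is only one of many possible choices. The names bin_counts and gray_level
are hypothetical, made up for this example.

```python
from collections import Counter
import math


def bin_counts(points, bin_width):
    """Count how many (x, y) points fall in each square cell.

    Cells are identified by integer (column, row) indices; a point at
    x belongs to column floor(x / bin_width), and likewise for y.
    """
    counts = Counter()
    for x, y in points:
        cell = (math.floor(x / bin_width), math.floor(y / bin_width))
        counts[cell] += 1
    return counts


def gray_level(count, max_count):
    """Map a cell count to a darkness in [0, 1] (assumed linear scale)."""
    return count / max_count


# Four illustrative points; two of them share the cell (0, 0).
points = [(0.1, 0.2), (0.4, 0.9), (1.5, 0.5), (-0.1, 0.0)]
counts = bin_counts(points, bin_width=1.0)
darkest = max(counts.values())
for cell, n in sorted(counts.items()):
    print(cell, n, gray_level(n, darkest))
```

With tens of thousands of points, each cell is drawn once with a darkness
(or symbol size) that depends on its count, so overplotting no longer hides
the density. Whether a linear, square-root or log scale for the counts reads
best is exactly the kind of perceptual question raised above.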
> As you can tell, I'm a non-expert, mixing issues of data interpretation,
> visual perception, graphic representation. Previously, I didn't have
> the flexibility of R's graphics, so I didn't need to think so much.
> I've read some of Edward S. Tufte's books, but found them more
> qualitative than quantitative.
More quantitative approaches are Cleveland's, Bertin's and Wilkinson's:
@book{cleveland:1993,
Author = {Cleveland, William},
Publisher = {Hobart Press},
Title = {Visualizing data},
Year = {1993}}
@book{cleveland:1994,
Author = {Cleveland, William},
Publisher = {Hobart Press},
Title = {The Elements of Graphing Data},
Year = {1994}}
@book{chambers:1983,
Author = {Chambers, John and Cleveland, William and Kleiner, Beat and
Tukey, Paul},
Publisher = {Wadsworth},
Title = {Graphical methods for data analysis},
Year = {1983}}
@book{bertin:1983,
Address = {Madison, WI},
Author = {Bertin, Jacques},
Publisher = {University of Wisconsin Press},
Title = {Semiology of Graphics},
Year = {1983}}
@book{wilkinson:2006,
Author = {Wilkinson, Leland},
Publisher = {Springer},
Series = {Statistics and Computing},
Title = {The Grammar of Graphics},
Year = {2005}}
Hope this gets you started!
Hadley
--
http://had.co.nz/