[R] Re. When is *interactive* data visualization useful to use?
Antony Unwin
unwin at math.uni-augsburg.de
Fri Feb 11 14:55:22 CET 2011
Hello Tal,
You asked *When is it helpful to use interactive plots? Either for data exploration (for ourselves) and data presentation (for a "client")?*
My answer: It's helpful for checking data quality, for exploration with and without "clients", for checking results, and for presenting data.
Notes:
(1) It's difficult to explain interactive data visualization in print; demonstrations are so much more effective.
(2) Interactive data visualization is fun, both for the analyst and, more importantly, for the dataset owners. You not only get better interaction with the data, you get better interaction with the scientists you cooperate with. They are prepared to contribute because they can understand what is going on. That is not always the case with statistical models.
(3) The key is not "animation" but "direct manipulation". The aim is to be able to directly interact with all statistical objects in a graphic: querying, linking, reordering, reformatting, zooming, whatever.
(4) You write of point-based graphics, but what about area-based graphics like histograms, barcharts and mosaicplots? For categorical data, the ability to select groups and look at spineplots of other variables to compare proportions is very effective. (And don't forget linking to maps for spatial data.)
(5) You mention outliers. How do you decide what is an outlier? Interactive parallel coordinate plots are extremely useful, either for identifying outliers or for checking ones found with an analytic approach.
(6) Interactive data visualization is not in competition with other approaches; it complements them. Results found with models should be checked graphically, and results found graphically should be checked analytically. Your comment about data dredging is important, though why people think this only happens with graphics and not with modelling approaches always puzzles me!
(7) There are often interesting features of a dataset (not just errors and outlier groups) that can be found graphically that would be difficult or impossible to find analytically.
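To make point (4) a little more concrete, here is a minimal sketch of the numbers a linked spineplot displays when a group is selected: for each category of one variable, the proportion of cases that belong to the selection. It is written in Python purely for illustration. The function name `highlighted_proportions`, the field names and the toy records are all hypothetical, and no real software is being described.

```python
from collections import Counter

def highlighted_proportions(rows, selected, by):
    """For each category of `by`, return the share of rows that are selected."""
    totals = Counter(row[by] for row in rows)
    hits = Counter(row[by] for row in rows if selected(row))
    # Counter returns 0 for categories with no selected rows.
    return {cat: hits[cat] / totals[cat] for cat in totals}

# Hypothetical toy records, invented for illustration only.
rows = [
    {"group": "A", "flag": True},
    {"group": "A", "flag": True},
    {"group": "A", "flag": False},
    {"group": "B", "flag": True},
    {"group": "B", "flag": False},
    {"group": "B", "flag": False},
    {"group": "B", "flag": False},
]

# "Selecting" the flagged cases and comparing proportions across groups.
props = highlighted_proportions(rows, lambda r: r["flag"], by="group")
for cat, p in sorted(props.items()):
    print(f"{cat}: {p:.2f}")
```

In an interactive tool the selection is made with the mouse and the proportions update at once in every linked plot; the sketch only shows the comparison that the highlighting makes visible.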
Have a look at Interactive Graphics for Data Analysis: Principles and Examples by Martin Theus and Simon Urbanek (Chapman & Hall). There are some excellent explanations and case studies there.
I could go on (and on), but what you really need is a good demo.
Best regards
Antony
PS Have you reported the bugs you found in GGobi and Mondrian to the software authors?
Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute,
University of Augsburg,
86135 Augsburg, Germany