[R] : unusual combinations of categorical data
Michael Friendly
friendly at yorku.ca
Tue Nov 9 14:53:21 CET 2010
On 11/8/2010 5:25 PM, Alan Chalk wrote:
> Regarding unusual combinations of factors in categorical data.
> Are there any R packages that can be used to identify the outliers i.e.
> unusual combinations in categorical datasets ?
"Unusual combinations" of factors are those cells that have large residuals
in some loglinear model (or a glm with family = poisson and its default log
link): positive if the observed frequency exceeds the expected frequency,
negative otherwise.
The most basic 'null' loglinear model is that of mutual independence.
However, if some of the factors are predictors, it makes sense to
include their highest-order interaction in the null model.
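As a rough sketch of the difference between the two null models, the code below fits both to the built-in HairEyeColor table. Treating Sex as the response and Hair and Eye as predictors is an assumption made purely for illustration; the dataset and the object names are not from the original question.

```r
# Example data only: the built-in Hair x Eye x Sex table as a data frame
df <- as.data.frame(HairEyeColor)   # columns: Hair, Eye, Sex, Freq

# Null model 1: mutual independence of all three factors
m_indep <- glm(Freq ~ Hair + Eye + Sex, family = poisson, data = df)

# Null model 2: Hair and Eye treated as predictors (illustrative assumption),
# so their highest-order interaction Hair:Eye is kept in the null model
m_joint <- glm(Freq ~ Hair * Eye + Sex, family = poisson, data = df)

# Compare residual deviances and degrees of freedom of the two null models
print(c(independence = deviance(m_indep), joint = deviance(m_joint)))
print(c(independence = df.residual(m_indep), joint = df.residual(m_joint)))
```

Under the second model, residuals no longer flag cells that are unusual only because of the Hair:Eye association among the predictors.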
Fit the model with MASS::loglm() or glm(), and use vcd::mosaic() with
shade = TRUE to visualize the outliers as cells shaded by the sign and
size of their residuals.
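A minimal sketch of that workflow, again using the built-in HairEyeColor table as stand-in data. The cutoff of 2 on the absolute Pearson residual is only a rule of thumb assumed here, not part of the advice above, and the MASS and vcd steps run only if those packages are installed.

```r
# Example data only: the built-in Hair x Eye x Sex table
tab <- HairEyeColor
df  <- as.data.frame(tab)           # columns: Hair, Eye, Sex, Freq

# Null model of mutual independence as a Poisson GLM (log link by default)
fit <- glm(Freq ~ Hair + Eye + Sex, family = poisson, data = df)

# Pearson residuals: (observed - expected) / sqrt(expected)
df$expected <- fitted(fit)
df$resid    <- residuals(fit, type = "pearson")

# Flag cells with large residuals; the cutoff of 2 is an assumed rule of thumb
unusual <- df[abs(df$resid) > 2, ]
unusual <- unusual[order(-abs(unusual$resid)), ]
print(unusual)

# The same model fitted directly to the table with MASS::loglm()
if (requireNamespace("MASS", quietly = TRUE)) {
  fit2 <- MASS::loglm(~ Hair + Eye + Sex, data = tab)
  print(fit2)
}

# Mosaic plot shaded by residuals; with no model given, vcd::mosaic()
# uses mutual independence as the null model
if (requireNamespace("vcd", quietly = TRUE)) {
  vcd::mosaic(tab, shade = TRUE)
}
```

Positive residuals in `unusual` mark combinations that occur more often than the null model expects, negative ones those that occur less often.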
HTH
--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA