[R] : unusual combinations of categorical data

Michael Friendly friendly at yorku.ca
Tue Nov 9 14:53:21 CET 2010


On 11/8/2010 5:25 PM, Alan Chalk wrote:
> Regarding unusual combinations of factors in categorical data.
> Are there any R packages that can be used to identify the outliers i.e.
> unusual combinations in categorical datasets ?

"Unusual combinations" of factors are those cells that have large residuals
in some loglinear model (or, equivalently, a glm with a Poisson family and
log link): positive if the observed frequency exceeds the expected
frequency, negative if it falls below.
The most basic 'null' loglinear model is that of mutual independence.
However, if some of the factors are predictors, it makes sense to
include their highest-order interaction in the null model, so that
associations among the predictors do not show up as unusual cells.
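
As a rough sketch of the idea, the code below fits the mutual independence
model with glm() and lists the cells with the largest Pearson residuals.
The built-in HairEyeColor table is used only as stand-in data, and the
cutoff of 2 is a common rule of thumb that I am assuming here, not a
fixed threshold.

```r
# Built-in 3-way frequency table, used purely as illustrative data
cells <- as.data.frame(HairEyeColor)   # columns: Hair, Eye, Sex, Freq

# Null model of mutual independence: main effects only
fit <- glm(Freq ~ Hair + Eye + Sex,
           family = poisson(link = "log"), data = cells)

# Pearson residuals: (observed - expected) / sqrt(expected)
cells$expected <- fitted(fit)
cells$resid    <- residuals(fit, type = "pearson")

# Keep cells whose residual exceeds the (assumed) cutoff of 2 in absolute
# value, largest first; positive = more frequent than expected
unusual <- cells[abs(cells$resid) > 2, ]
unusual <- unusual[order(-abs(unusual$resid)), ]
print(unusual)
```

If Hair and Eye were treated as predictors of Sex, the null model formula
would instead be Freq ~ Hair * Eye + Sex.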

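Fit the model with MASS::loglm() or glm(), and use vcd::mosaic() to visualize
the outliers; with shading turned on, the tiles are colored according to
the sign and size of their residuals.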
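
For instance, something along these lines should work, again with
HairEyeColor as stand-in data. The choice of Hair and Eye as the
"predictors" is my assumption for illustration only, and the plot is
skipped if the vcd package is not installed.

```r
library(MASS)   # provides loglm()

# Null model that keeps the Hair x Eye association and treats Sex as
# independent of it (Hair and Eye play the role of predictors here)
fit <- loglm(~ Hair * Eye + Sex, data = HairEyeColor, fit = TRUE)

# Pearson residuals, returned as an array shaped like the original table
res <- residuals(fit, type = "pearson")
print(round(res, 2))

# Mosaic display of the same model; shade = TRUE colors the tiles by
# their residuals under the model given in 'expected'
if (requireNamespace("vcd", quietly = TRUE)) {
  vcd::mosaic(HairEyeColor, expected = ~ Hair * Eye + Sex, shade = TRUE)
}
```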

HTH

-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA


