On 11/8/2010 5:25 PM, Alan Chalk wrote:
Regarding unusual combinations of factors in categorical data:
are there any R packages that can be used to identify outliers, i.e.
unusual combinations, in categorical datasets?

"Unusual combinations" of factors are those that have large residuals
in some loglinear model (equivalently, a glm with Poisson family and
log link): positive if the observed frequencies are greater than
expected, negative otherwise.
The most basic 'null' loglinear model is that of mutual independence.
However, if some of the factors are predictors, it makes sense to
include their highest-order interaction in the null model.
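
A minimal sketch of the two null models described above, using glm() from base R. The built-in UCBAdmissions table, the choice of Gender and Dept as the "predictors", and the cutoff of 2 on the absolute residual are all assumptions made for illustration, not part of the original question.

```r
# Built-in 3-way table (Admit x Gender x Dept), reshaped to one row per cell
ucb <- as.data.frame(UCBAdmissions)

# Null model 1: mutual independence of all three factors
fit_indep <- glm(Freq ~ Admit + Gender + Dept,
                 family = poisson, data = ucb)

# Null model 2: Gender and Dept treated as predictors of Admit,
# so their highest-order interaction goes into the null model
fit_pred <- glm(Freq ~ Gender * Dept + Admit,
                family = poisson, data = ucb)

# Pearson residuals: (observed - expected) / sqrt(expected)
ucb$res_indep <- residuals(fit_indep, type = "pearson")
ucb$res_pred  <- residuals(fit_pred,  type = "pearson")

# Candidate "unusual combinations" under model 2, largest |residual| first
# (the cutoff of 2 is an arbitrary, illustrative threshold)
unusual <- ucb[abs(ucb$res_pred) > 2, ]
unusual <- unusual[order(-abs(unusual$res_pred)), ]
print(unusual)
```

The sign of each residual says whether a cell is over- or under-represented relative to the chosen null model, so the list of flagged cells can change substantially between the two fits.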

Fit the model with MASS::loglm() or glm(family = poisson), and use
vcd::mosaic() with shade = TRUE to visualize the outliers as tiles
shaded by their residuals.
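
As a rough illustration of that workflow, the sketch below fits the mutual-independence model with loglm() and then draws a shaded mosaic. The UCBAdmissions data are again only an assumed stand-in, and the plot is skipped if the vcd package is not installed.

```r
library(MASS)  # provides loglm()

# Mutual-independence model fitted directly to the contingency table
fit <- loglm(~ Admit + Gender + Dept, data = UCBAdmissions)

# Pearson residuals are returned as an array shaped like the table
res <- residuals(fit, type = "pearson")
print(round(res, 1))

# Mosaic display of the same null model, with tiles shaded by residual
if (requireNamespace("vcd", quietly = TRUE)) {
  vcd::mosaic(UCBAdmissions,
              expected = ~ Admit + Gender + Dept,
              shade = TRUE)
}
```

To use a different null model, change the formula in both calls, e.g. ~ Gender * Dept + Admit when Gender and Dept are regarded as predictors.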

HTH

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
