On 11/8/2010 5:25 PM, Alan Chalk wrote:
> Regarding unusual combinations of factors in categorical data: are there any R packages that can identify outliers, i.e. unusual combinations, in categorical datasets?
"Unusual combinations" of factors are those cells that have large residuals under some loglinear model (or, equivalently, a glm with family = poisson and its log link): positive if the observed frequency is greater than expected, negative otherwise.

The most basic 'null' loglinear model is that of mutual independence. However, if some of the factors are predictors, it makes sense to include their highest-order interaction in the null model.

Fit the model with MASS::loglm() or glm(), and use vcd::mosaic() to visualize the outliers.

HTH

-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
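A minimal sketch of the approach described above, not code from the original post. It assumes the built-in HairEyeColor table as stand-in data and an illustrative cutoff of |Pearson residual| > 2; both are assumptions chosen for the example, so substitute your own table and threshold. The MASS and vcd calls are guarded in case those packages are not installed.

```r
# Stand-in data (assumption): built-in 4 x 4 x 2 table of Hair, Eye, Sex
tab <- HairEyeColor
df  <- as.data.frame(tab)   # one row per cell: Hair, Eye, Sex, Freq

# Null model: mutual independence, fitted as a Poisson glm (log link)
fit <- glm(Freq ~ Hair + Eye + Sex, family = poisson, data = df)

# If Hair and Eye were predictors of Sex, the null model would instead
# keep their highest-order interaction: Freq ~ Hair * Eye + Sex

# Pearson residuals: (observed - expected) / sqrt(expected)
df$expected <- fitted(fit)
df$resid    <- residuals(fit, type = "pearson")

# Flag cells beyond the illustrative cutoff, largest departures first
cutoff  <- 2
unusual <- df[abs(df$resid) > cutoff, ]
unusual <- unusual[order(-abs(unusual$resid)), ]
print(unusual)

# The same independence model fitted on the table itself with loglm()
if (requireNamespace("MASS", quietly = TRUE)) {
  fit2 <- MASS::loglm(~ Hair + Eye + Sex, data = tab)
  print(fit2)
}

# Mosaic plot shaded by residuals from the independence model
if (requireNamespace("vcd", quietly = TRUE)) {
  vcd::mosaic(tab, shade = TRUE)
}
```

Cells with a positive residual occur more often than the null model expects, and cells with a negative residual occur less often. Which cells get flagged depends entirely on the null model chosen, so the list should be read relative to that model.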