"More comprehensible" depends on context, which we don't have. You could
be simply trying to illustrate the logic of the transformation (solution 1
below) or providing a recipe which by its brevity is (perhaps?) memorable
(solution 2 below).
Solution 1:
appendDummys <- function( DF, keycol, d.base ) {
# get levels of key column
lvls <- levels( DF[[ keycol ]] )
# for each level in the key column
keyno <- 1L
for ( keylvl in lvls ) {
# name for new column
dname <- paste0( d.base, keyno )
# make the new column, filled with default value
DF[[ dname ]] <- 0L
# change those values in the new column where the value matches the
# current level
DF[ keylvl == DF[[ keycol ]], dname ] <- 1L
# prepare for next loop
keyno <- keyno + 1L
}
# return modified data frame
DF
}
haireye <- margin.table(HairEyeColor, 1:2)
haireye.df <- as.data.frame(haireye)
haireye.df <- appendDummys( haireye.df, "Hair", "h" )
haireye.df <- appendDummys( haireye.df, "Eye", "e" )
###
Solution 2
haireye <- margin.table(HairEyeColor, 1:2)
haireye.df <- as.data.frame(haireye)
dummykeys <- data.frame( h = factor( as.integer( haireye.df$Hair ) )
, e = factor( as.integer( haireye.df$Eye ) ) )
dummy.hair <- as.data.frame( model.matrix( ~ h - 1 ), data=dummykeys )
dummy.eye <- as.data.frame( model.matrix( ~ e - 1 ), data=dummykeys )
haireye.df <- data.frame( haireye.df, dummy.hair, dummy.eye )
###
FWIW I am not a fan of mixing the model matrix columns in with the
original data... the column names can (in general) clash.
On Tue, 30 Dec 2014, Michael Friendly wrote:
In a manuscript, I have the following code to illustrate dummy coding of two
factors in a contingency table.
It works, but is surely obscured by the method I used, involving outer() to
find equalities and 0+outer()
to convert to numeric. Can someone help simplify this code to be more
comprehensible and give the
*same* result? I'd prefer a solution that uses base R.
haireye <- margin.table(HairEyeColor, 1:2)
haireye.df <- as.data.frame(haireye)
dummy.hair <- 0+outer(haireye.df$Hair, levels(haireye.df$Hair), `==`)
colnames(dummy.hair) <- paste0('h', 1:4)
dummy.eye <- 0+outer(haireye.df$Eye, levels(haireye.df$Eye), `==`)
colnames(dummy.eye) <- paste0('e', 1:4)
haireye.df <- data.frame(haireye.df, dummy.hair, dummy.eye)
haireye.df
haireye.df
Hair Eye Freq h1 h2 h3 h4 e1 e2 e3 e4
1 Black Brown 68 1 0 0 0 1 0 0 0
2 Brown Brown 119 0 1 0 0 1 0 0 0
3 Red Brown 26 0 0 1 0 1 0 0 0
4 Blond Brown 7 0 0 0 1 1 0 0 0
5 Black Blue 20 1 0 0 0 0 1 0 0
6 Brown Blue 84 0 1 0 0 0 1 0 0
7 Red Blue 17 0 0 1 0 0 1 0 0
8 Blond Blue 94 0 0 0 1 0 1 0 0
9 Black Hazel 15 1 0 0 0 0 0 1 0
10 Brown Hazel 54 0 1 0 0 0 0 1 0
11 Red Hazel 14 0 0 1 0 0 0 1 0
12 Blond Hazel 10 0 0 0 1 0 0 1 0
13 Black Green 5 1 0 0 0 0 0 0 1
14 Brown Green 29 0 1 0 0 0 0 0 1
15 Red Green 14 0 0 1 0 0 0 0 1
16 Blond Green 16 0 0 0 1 0 0 0 1
--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street Web:http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<[email protected]> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.