I am working on a system to visualize survey responses. Survey responses typically consist of factors, numeric values, timestamps and text fields, and therefore fit nicely into data frames, which makes them easy to visualize using standard R functions.
However, I am currently working on a survey that also includes questions in which the respondent can check more than one answer on a single multi-choice item. In other words, this is a factor for which every row can hold multiple responses. I am looking for a way to put this into a data frame together with the other questions of the survey. I considered two workarounds, but both are problematic:

- Column-wise expansion: convert a single multi-choice item into N binary columns, one for every possible response (level), with 1/0 values indicating whether the answer was checked or not. The problem is that you lose the information that these N columns are in fact one question, and it becomes very hard to visualize this single question.

- Row-wise expansion: convert a single response into N rows, one for every answer checked. The problem is that if the factor is part of the data frame, all of the other items have to be duplicated as well, leading to artificial results.

Is there a more natural data structure for putting a multi-choice item into a data frame?

Some code for illustration:

    # 'residence' is the multi-choice item: one factor vector per respondent,
    # all sharing the same set of levels.
    people <- list(
      name = c("John", "Mary", "Jennifer", "Neil"),
      gender = factor(c("M", "F", "F", "M")),
      age = c(34, 23, 40, 30),
      residence = sapply(list("US", c("US", "CA"), "MX", c("MX", "US", "CA")),
                         factor, levels = c("US", "CA", "MX"))
    )
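To make the two workarounds concrete, here is a minimal sketch of both, using the same example data. The names `lev`, `wide` and `long` are only chosen for illustration, and storing `residence` as a list column via `I()` is an assumption made to get the data into a data frame at all, not a claim that it is the right structure.

```r
lev <- c("US", "CA", "MX")
residence <- lapply(list("US", c("US", "CA"), "MX", c("MX", "US", "CA")),
                    factor, levels = lev)

people <- data.frame(
  name   = c("John", "Mary", "Jennifer", "Neil"),
  gender = factor(c("M", "F", "F", "M")),
  age    = c(34, 23, 40, 30),
  stringsAsFactors = FALSE
)
# Assumption: keep the multi-choice item as a list column (one factor per row).
people$residence <- I(residence)

# Workaround 1: column-wise expansion into one 0/1 column per level.
wide <- t(vapply(residence, function(x) as.integer(lev %in% x),
                 integer(length(lev))))
colnames(wide) <- paste0("residence_", lev)
people_wide <- cbind(people[c("name", "gender", "age")], wide)

# Workaround 2: row-wise expansion, one row per checked answer.
n <- lengths(residence)                      # answers checked per respondent
long <- people[rep(seq_len(nrow(people)), n), c("name", "gender", "age")]
long$residence <- factor(unlist(lapply(residence, as.character)), levels = lev)
rownames(long) <- NULL

# The duplication distorts summaries of the other items:
mean(people$age)   # 31.75, one row per respondent
mean(long$age)     # 30, respondents with more answers are weighted more
```

In the wide form nothing ties the three `residence_*` columns back to a single question, and in the long form Neil counts three times, which is exactly the problem described above.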