Hi Bert,

Thanks for drawing my attention to the "simplify" argument and for the examples. I understand now.
Thanks.
Dan

-----Original Message-----
From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Wednesday, February 20, 2013 4:25 PM
To: Lopez, Dan
Cc: R help (r-help@r-project.org)
Subject: Re: [R] Having trouble converting a dataframe of character vectors to factors

Please re-read ?sapply and pay particular attention to the "simplify" argument.

The following should help explain the issues:

> z <- data.frame(a=letters[1:3],b=letters[4:6],stringsAsFactors=FALSE)
> sapply(z,class)
          a           b
"character" "character"
> z1 <- sapply(z,as.factor)
> sapply(z1,class)
          a           b           c           d           e           f
"character" "character" "character" "character" "character" "character"
> z2 <- sapply(z,factor, simplify = FALSE)
> sapply(z2,class)
       a        b
"factor" "factor"
> z3 <- lapply(z,factor)
> sapply(z3,class)
       a        b
"factor" "factor"
> z3
$a
[1] a b c
Levels: a b c

$b
[1] d e f
Levels: d e f

## Note that both z2 and z3 are lists, and would have to be converted back to data frames.

-- Bert

On Wed, Feb 20, 2013 at 4:09 PM, Lopez, Dan <lopez...@llnl.gov> wrote:
> R Experts,
>
> I have a dataframe made up of character vectors--these are results from
> survey questions. I need to convert them to factors.
>
> I tried the following, which did not work:
> scs2<-sapply(scs2,as.factor)
> This also didn't work:
> scs2<-sapply(scs2,function(x) as.factor(x))
>
> After doing either of the above I end up with
>>str(scs2)
> chr [1:10, 1:10] "very important" "very important" "very important" "very important" ...
> - attr(*, "dimnames")=List of 2
>  ..$ : NULL
>  ..$ : chr [1:10] "Q1_1" "Q1_2" "Q1_3" "Q1_4" ...
>>class(scs2)
> "matrix"
>
> But when I do it one at a time it works:
> scs2$Q1_1<-as.factor(scs2$Q1_1)
> scs2$Q1_2<- as.factor(scs2$Q1_2)
>
> What am I doing wrong? How do I accomplish this with sapply or a similar
> function?
>
> Data for reproducibility:
>
> scs2<-structure(list(Q1_1 = c("very important", "very important", "very important",
> "very important", "very important", "very important", "very important",
> "somewhat important", "important", "very important"), Q1_2 = c("important",
> "somewhat important", "very important", "important", "important",
> "very important", "somewhat important", "somewhat important",
> "very important", "very important"), Q1_3 = c("very important",
> "important", "very important", "very important", "important",
> "very important", "very important", "somewhat important", "not important",
> "important"), Q1_4 = c("very important", "important", "very important",
> "very important", "important", "important", "important", "very important",
> "somewhat important", "important"), Q1_5 = c("very important",
> "not important", "important", "very important", "not important",
> "important", "somewhat important", "important", "somewhat important",
> "not important"), Q1_6 = c("very important", "not important",
> "important", "very important", "somewhat important", "very important",
> "very important", "very important", "important", "important"),
> Q1_7 = c("very important", "somewhat important", "important",
> "somewhat important", "important", "important", "very important",
> "very important", "somewhat important", "not important"),
> Q2 = c("Somewhat", "Very Much", "Somewhat", "Very Much",
> "Very Much", "Very Much", "Very Much", "Very Much", "Very Much",
> "Very Much"), Q3 = c("yes", "yes", "yes", "yes", "yes", "yes",
> "yes", "yes", "yes", "yes"), Q4 = c("None", "None", "None",
> "None", "Confirmed Field of Study", "Confirmed Field of Study",
> "Confirmed Field of Study", "None", "None", "None")), .Names = c("Q1_1",
> "Q1_2", "Q1_3", "Q1_4", "Q1_5", "Q1_6", "Q1_7", "Q2", "Q3", "Q4"
> ), row.names = c(78L, 46L, 80L, 196L, 188L, 197L, 39L, 195L,
> 172L, 110L), class = "data.frame")
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm