David and Bill,

Thank you so much for your rapid and exceptional help.

Bill, the reason I had gone with lists within the list was that I thought I might use the list to hold other information, and because it made it easier to get the column name. Your simple-list suggestion is cleaner and I'm going with that for now. To better understand it, I tweaked David's fabulous lapply() statement to work with the simpler list and, for posterity, provide that below. The for-loop seems cleanest and I'll probably go with it on its own, outside of the function.
> y <- data.frame(colOne = c(1,2,3), colTwo = c("apple","pear","orange"),
+                 colThree=c(4,5,6) )
> my.factor.defs <- list(colOne = c(1,2,3,4,5,6),
+                        colTwo = c("apple", "pear", "orange", "fig", "banana"))
> y[ , names(my.factor.defs)] <- lapply(names(my.factor.defs), function(x) {
+     y[[x]] <- factor(y[[x]] , levels= my.factor.defs[[x]])})
> str(y)
'data.frame': 3 obs. of 3 variables:
 $ colOne  : Factor w/ 6 levels "1","2","3","4",..: 1 2 3
 $ colTwo  : Factor w/ 5 levels "apple","pear",..: 1 2 3
 $ colThree: num 4 5 6

Thank you again!
Best,
Tim Howard

>>> "William Dunlap" <wdun...@tibco.com> 2/9/2011 4:41 PM >>>
> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Tim Howard
> Sent: Wednesday, February 09, 2011 12:44 PM
> To: r-help@r-project.org
> Subject: [R] assign factor levels based on list
>
> All,
>
> Given a data frame and a list containing factor definitions
> for certain columns, how can I apply those definitions from
> the list, rather than doing it the standard way, as noted
> below. I'm lost in the world of do.call, assign, paste, and
> can't find my way through. For example:
>
> #set up df
> y <- data.frame(colOne = c(1,2,3), colTwo =
> c("apple","pear","orange"))
>
> factor.defs <- list(colOne = list(name = "colOne",
> lvl = c(1,2,3,4,5,6)),
> colTwo = list(name = "colTwo",
> lvl = c("apple","pear","orange","fig","banana")))

Why not the following format?

  my.factor.defs <- list(colOne = c(1,2,3,4,5,6),
                         colTwo = c("apple", "pear", "orange", "fig", "banana"))

Do you really want to support a case like the following?
  list(colOne = list( name = "anotherColumn", lvl=c(1,2,3,4,5,6)))

> #A standard way to define levels
> y$colTwo <- factor(y$colTwo , levels =
> c("apple","pear","orange","fig","banana"))
>
> # I'd like to use the definitions locally but also pass them
> (but not the data) to a function,
> # so, rather than defining each manually each time, I'd like
> to loop through the columns,
> # call them by name, find the definitions in the list and use
> them from there. Before I try to loop
> # or use some form of apply, I'd like to get a single factor
> definition working.

First write a function that takes a data.frame and list of
desired levels for each column and outputs a new data.frame.
E.g., if you use the simpler form of the levelsList I gave above,
the following might work well enough (it does no error checking):

  assignNewLevelsToDataFrameColumns <- function(x, levelsList) {
     for(colName in names(levelsList)) {
        # note that x$name is equivalent to x[["name"]], so
        # if you want to use a variable as the name, use [[.
        x[[colName]] <- factor(x[[colName]], levels=levelsList[[colName]])
     }
     x
  }

Test it:

  > fixedY <- assignNewLevelsToDataFrameColumns(y, my.factor.defs)
  > fixedY
    colOne colTwo
  1      1  apple
  2      2   pear
  3      3 orange
  > str(fixedY)
  'data.frame': 3 obs. of 2 variables:
   $ colOne: Factor w/ 6 levels "1","2","3","4",..: 1 2 3
   $ colTwo: Factor w/ 5 levels "apple","pear",..: 1 2 3

Do
  > y <- assignNewLevelsToDataFrameColumns(y, my.factor.defs)
if you want to overwrite the old y.

Now if you want a function that changes the data.frame
you give it, use a replacement function. If you want to
use the syntax
  > func(y) <- newStuff
then the function should be called `func<-` and the last argument
must be called 'value' (newStuff will be passed via value=newStuff).
E.g.,

  `func<-` <- function(x, value) {
      alteredX <- assignNewLevelsToDataFrameColumns(x, value)
      alteredX
  }

and use it as

  > func(y) <- my.factor.defs
  > str(y)
  'data.frame': 3 obs.
of 2 variables:
   $ colOne: Factor w/ 6 levels "1","2","3","4",..: 1 2 3
   $ colTwo: Factor w/ 5 levels "apple","pear",..: 1 2 3

The first command gets translated into
  y <- `func<-`(y, value=my.factor.defs)

If you write a replacement function, it is nice to
create a matching extractor function called 'func'. E.g.,

  > func <- function(x) lapply(x, levels)
  > func(y)
  $colOne
  [1] "1" "2" "3" "4" "5" "6"

  $colTwo
  [1] "apple"  "pear"   "orange" "fig"    "banana"

Note that this avoids assign(), get(), eval(), etc.,
and thus makes it easy to follow the flow of data
in the code: only things on the left side of the
assignment arrow can get changed.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> # this doesn't seem to see the dataframe properly
> do.call(factor,list((paste("y$",factor.defs[2][[1]]$name,sep=" ")),levels=factor.defs[2][[1]]$lvl))
>
> #adding "as.name" doesn't help
> do.call(factor,list(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),levels=factor.defs[2][[1]]$lvl))
>
> #Here's my attempt to mimic the standard way, using assign.
> Ha! what a joke.
> assign(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
> do.call(factor,
> list(as.name(paste("y$",factor.defs[2][[1]]$name,sep="")),
> levels = factor.defs[2][[1]]$lvl)))
> ##Error in function (x = character(), levels, labels =
> levels, exclude = NA, :
> ## object 'y$colTwo' not found
> Any help or perspective (or better way from the beginning!)
> would be greatly appreciated.
> Thanks in advance!
> Tim
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
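Bill notes in the thread that his function "does no error checking". A minimal sketch of what a checked variant might look like follows. The name applyFactorDefs and the specific checks are assumptions made for illustration and do not come from the thread; the loop itself is the same [[ ]]-based assignment Bill describes. The Map() call at the end is one possible alternative to the lapply() form, not something either poster suggested.

```r
# Hypothetical helper (name and checks are illustrative, not from the thread).
# Same loop as assignNewLevelsToDataFrameColumns, plus basic validation.
applyFactorDefs <- function(x, levelsList) {
  stopifnot(is.data.frame(x), is.list(levelsList))

  # Refuse definitions for columns the data frame does not have.
  missingCols <- setdiff(names(levelsList), names(x))
  if (length(missingCols) > 0) {
    stop("columns not in data frame: ", paste(missingCols, collapse = ", "))
  }

  for (colName in names(levelsList)) {
    values <- x[[colName]]
    lvls <- levelsList[[colName]]

    # factor() silently converts values outside 'levels' to NA, so warn first.
    present <- unique(as.character(values[!is.na(values)]))
    unknown <- setdiff(present, as.character(lvls))
    if (length(unknown) > 0) {
      warning("values in '", colName, "' not among the levels: ",
              paste(unknown, collapse = ", "))
    }

    x[[colName]] <- factor(values, levels = lvls)
  }
  x
}

# Data and definitions as used in the thread.
y <- data.frame(colOne = c(1, 2, 3),
                colTwo = c("apple", "pear", "orange"),
                colThree = c(4, 5, 6))
my.factor.defs <- list(colOne = c(1, 2, 3, 4, 5, 6),
                       colTwo = c("apple", "pear", "orange", "fig", "banana"))

fixedY <- applyFactorDefs(y, my.factor.defs)

# Alternative without the checks: Map() pairs each column with its levels,
# so the anonymous function never has to reach back into y by name.
y2 <- y
y2[names(my.factor.defs)] <- Map(function(col, lvls) factor(col, levels = lvls),
                                 y[names(my.factor.defs)], my.factor.defs)
```

As in Bill's version, the function returns a modified copy, so the caller decides whether to overwrite y with the result.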