I think substitute() or bquote() will do a better job here than gsub() because they work on the parsed formula rather than on the raw string. The terms() function will interpret the formula-specific operators like "+" and ":" to come up with a list of the 'variables' (or 'terms') in the formula. E.g., with the 'f' given below we get
  > f(y1 + y2 ~ a*(b + c) + d + f * (h == 3) + (sex == 'male')*i)
  Y1z + Y2z ~ Az * (Bz + Cz) + Dz + Fz * (h == 3) + (sex == "male") * Iz

Is that what you wanted? If you only wanted to keep intact the expressions of the form var==value (calls to `==`) but transform things like log(a) to log(Az) you could extend this code to do that as well.

f <- function(formula) {
    trms <- terms(formula)
    variables <- as.list(attr(trms, "variables"))[-1]
    # the 'variables' attribute is stored as a call to list(),
    # so we changed the call to a list and removed the first element
    # to get the variables themselves.
    if (attr(trms, "response") == 1) {
        # terms does not pull apart the left hand side (the response)
        # of the formula, so we assume each non-function is to be renamed.
        responseVars <- lapply(all.vars(variables[[1]]), as.name)
        variables <- variables[-1]
    } else {
        responseVars <- list()
    }
    # omit non-name variables from list of ones to change.
    # This is where you could expand calls to certain functions.
    variables <- variables[vapply(variables, is.name, TRUE)]
    variables <- c(responseVars, variables) # all are names now
    names(variables) <- vapply(variables, as.character, "")
    newVars <- lapply(variables, function(v) as.name(paste0(toupper(v), "z")))
    formula(do.call("substitute", list(formula, newVars)), env=environment(formula))
}

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Frank Harrell
> Sent: Wednesday, August 14, 2013 8:14 PM
> To: RHELP
> Subject: [R] regex challenge
>
> I would like to be able to use gsub or gsubfn to process a formula and
> to translate the variables but to ignore expressions in the formula.
> Supposing that the R formula has already been transformed into a
> character string and that the transformation is to convert variable
> names to upper case and to append z to the names, an example would be to
> convert y1 + y2 ~ a*(b + c) + d + f * (h == 3) + (sex == 'male')*i to
> Y1z + Y2z ~ Az*(Bz + Cz) + Dz + Fz * (h == 3) + (sex == 'male')*Iz. Any
> expression that is not just a simple variable name would be left alone.
>
> Does anyone want to try their hand at creating a regex that would
> accomplish this?
>
> Thanks
> Frank
> --
> Frank E Harrell Jr Professor and Chairman School of Medicine
> Department of Biostatistics Vanderbilt University
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
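
P.S. To illustrate the point at the top of this reply, that substitute() works on the parsed formula while gsub() works on the raw string, here is a small sketch. The formula 'fm' and the names 'viaRegex' and 'viaSubstitute' are made up for illustration and are not part of the original question; the regex shown is just one plausible attempt, not the only one possible.

```r
# Illustrative formula (an assumption, not from the thread): the name 'a'
# also appears inside a string constant.
fm <- y ~ a + (sex == 'a')

# Regex on the deparsed text: a word-boundary pattern cannot tell a
# variable name from the contents of a string, so both get rewritten.
viaRegex <- gsub("\\ba\\b", "Az", deparse(fm), perl = TRUE)

# substitute() on the parsed formula: only the symbol 'a' is replaced;
# the character constant "a" is left alone.
viaSubstitute <- do.call("substitute", list(fm, list(a = as.name("Az"))))

print(viaRegex)
print(viaSubstitute)
```

Note that substitute() replaces a symbol wherever it occurs in the expression, so in a formula such as y ~ a + log(a) both occurrences of 'a' would be renamed.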