I'm not sure what's so complicated about that (am I missing something?). You can search using grepl and replace using gsub, so

tmpDF <- read.table(text="V1 V2
A 5
a1 1
a2 1
a3 1
a4 1
a5 1
B 4
b1 1
b2 1
b3 1
b4 1",
header=TRUE)
tmpDF <- tmpDF[grepl("[0-9]", tmpDF$V1), ]
data.frame(tmpDF, V3 = toupper(gsub("[0-9]", "", tmpDF$V1)))

Seems to do the trick.

Best,
Ista

On Sat, Jan 3, 2015 at 9:41 PM, npretnar <npret...@gmail.com> wrote:
> I have a string variable (V1) in a data frame structured as follows:
>
> V1 V2
> A 5
> a1 1
> a2 1
> a3 1
> a4 1
> a5 1
> B 4
> b1 1
> b2 1
> b3 1
> b4 1
>
> I want the following:
>
> V1 V2 V3
> a1 1 A
> a2 1 A
> a3 1 A
> a4 1 A
> a5 1 A
> b1 1 B
> b2 1 B
> b3 1 B
> b4 1 B
>
> I am not sure how to go about making this transformation besides writing a
> long vector that contains each of the categorical string names (these are
> state names, so it would be a really long vector). Any help would be greatly
> appreciated.
>
> Thanks,
>
> Nicholas Pretnar
> Mizzou Economics Grad Assistant
> npret...@gmail.com
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.