On Feb 4, 2011, at 6:32 AM, D. Alain wrote:

> Dear R-List,
>
> I have a dataframe with one column "name.of.report" containing character
> values, e.g.
>
>> df$name.of.report
>
> "jeff_2001_teamx"
> "teamy_jeff_2002"
> "robert_2002_teamz"
> "mary_2002_teamz"
> "2003_mary_teamy"
> ...
> (i.e. the bit of interest is not always at same position)
>
> Now I want to recode the column "name.of.report" into the variables "person",
> "year","team", like this
>
>> new.df
>
> "person"  "year"  "team"
> jeff      2001    x
> jeff      2002    y
> robert    2002    z
> mary      2002    z
>
> I tried with grep()
>
> df$person<-grep("jeff",df$name.of.report)
>
> but of course it didn't exactly result in what I wanted to do. Could not find
> any solution via RSeek. Excuse me if it is a very silly question, but can
> anyone help me find a way out of this?
>
> Thanks a lot
>
> Alain

There will be several approaches, all largely involving the use of ?regex. Here is one:

DF <- data.frame(name.of.report = c("jeff_2001_teamx", "teamy_jeff_2002",
                                    "robert_2002_teamz", "mary_2002_teamz",
                                    "2003_mary_teamy"))

> DF
     name.of.report
1   jeff_2001_teamx
2   teamy_jeff_2002
3 robert_2002_teamz
4   mary_2002_teamz
5   2003_mary_teamy

# person: delete underscores, digits, and "team" plus the character after it
# year:   keep only the run of four digits
# team:   keep only the single character that follows "team"
DF.new <- data.frame(person = gsub("[_0-9]|team.", "", DF$name.of.report),
                     year = gsub(".*([0-9]{4}).*", "\\1", DF$name.of.report),
                     team = gsub(".*team(.).*", "\\1", DF$name.of.report))

> DF.new
  person year team
1   jeff 2001    x
2   jeff 2002    y
3 robert 2002    z
4   mary 2002    z
5   mary 2003    y

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
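
As a possible variation on the gsub() approach in the reply above: since the pieces are always separated by "_" but can appear in any order, one could also split each name and classify the pieces by pattern. The sketch below is only an illustration under stated assumptions. The helper name parse_report() is hypothetical (it is not from the original reply), and it assumes each name contains exactly one four-digit year, exactly one piece starting with "team", and exactly one remaining piece for the person. Anything that does not fit comes back as NA rather than as a silently wrong value.

```r
# Hypothetical helper (not part of the original reply): split on "_" and
# classify each piece by what it looks like, not by where it sits.
parse_report <- function(x) {
  x <- as.character(x)                        # guard against factor input
  parts <- strsplit(x, "_", fixed = TRUE)

  year_pat <- "^[0-9]{4}$"                    # assumption: years are 4 digits
  team_pat <- "^team.+$"                      # assumption: teams start with "team"

  # Return the single piece matching `pattern`, or NA if none/several match.
  pick <- function(p, pattern) {
    hit <- grep(pattern, p, value = TRUE)
    if (length(hit) == 1) hit else NA_character_
  }

  year <- vapply(parts, pick, character(1), pattern = year_pat)
  team <- vapply(parts, pick, character(1), pattern = team_pat)

  # The person is whatever single piece is neither a year nor a team.
  person <- vapply(parts, function(p) {
    rest <- p[!grepl(year_pat, p) & !grepl(team_pat, p)]
    if (length(rest) == 1) rest else NA_character_
  }, character(1))

  data.frame(person = person,
             year   = as.integer(year),
             team   = sub("^team", "", team),  # drop the "team" prefix
             stringsAsFactors = FALSE)
}

DF <- data.frame(name.of.report = c("jeff_2001_teamx", "teamy_jeff_2002",
                                    "robert_2002_teamz", "mary_2002_teamz",
                                    "2003_mary_teamy"))
DF.new <- parse_report(DF$name.of.report)
```

For the five example names this should give the same person, year, and team values as the gsub() version, with year stored as an integer. The trade-off is more code in exchange for explicit NA values on malformed names. Note that a person whose name itself starts with "team" would be misread as a team under these assumptions.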