Hi,

You could do either:

Lines<-readLines(textConnection("Name: John Smith Age: 35 Address: 32, street, sub, something
Name Adam Grey Age: 25 Address: 26, street, sub, something"))

# Add the missing colon after "Name" where needed.
# A logical index from grepl() is used because Lines[-grep(...)] selects nothing when no line contains "Name:".
fix<-!grepl("Name\\:",Lines)
Lines[fix]<-gsub("Name","Name:",Lines[fix])

# One pattern, three capture groups: keep one group per call
Name<-gsub("Name\\: (.*) Age\\: (.*) Address\\: (.*)","\\1",Lines)
age<-gsub("Name\\: (.*) Age\\: (.*) Address\\: (.*)","\\2",Lines)
Address<-gsub("Name\\: (.*) Age\\: (.*) Address\\: (.*)","\\3",Lines)
dat1<-data.frame(Name,age,Address,stringsAsFactors=FALSE)
dat1
#        Name age                    Address
#1 John Smith  35 32, street, sub, something
#2  Adam Grey  25 26, street, sub, something
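
On the "Name" versus "Name:" part of the question, a possible variant of the approach above is to make the colon optional in the pattern itself, so no repair step is needed. This is only a sketch: the vector Lines below is re-created by hand for illustration, the names pat, m, parts and dat2 are made up here, and it assumes every line matches the pattern.

```r
# Illustrative input: the second record omits the colon after "Name".
Lines <- c("Name: John Smith Age: 35 Address: 32, street, sub, something",
           "Name Adam Grey Age: 25 Address: 26, street, sub, something")

# ":?" makes each colon optional, so both spellings match the same pattern.
pat <- "^Name:? (.*) Age:? (.*) Address:? (.*)$"

# regexec() locates the full match plus each capture group;
# regmatches() turns those positions into character vectors, one per line.
m <- regmatches(Lines, regexec(pat, Lines))

# Drop the full match (first element) and bind the three groups into rows.
# Assumption: every line matches; a non-matching line would yield character(0).
parts <- do.call(rbind, lapply(m, function(x) x[-1]))

dat2 <- data.frame(Name = parts[, 1],
                   age = as.numeric(parts[, 2]),
                   Address = parts[, 3],
                   stringsAsFactors = FALSE)
dat2
```

Here age is converted to numeric, whereas the gsub() version above leaves it as character.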
#or
# Add the missing colon after "Name" where needed (logical index, safe when no line contains "Name:")
fix<-!grepl("Name\\:",Lines)
Lines[fix]<-gsub("Name","Name:",Lines[fix])

# Strip the labels, split on ":", and drop the empty first column
res<-read.table(text=gsub("Name|Age|Address","",Lines),sep=":",stringsAsFactors=FALSE)[-1]

# Trim leading and trailing whitespace in the character columns
res[sapply(res,is.character)]<-do.call(cbind,lapply(res[sapply(res,is.character)],function(x) sub("^[[:space:]]*(.*?)[[:space:]]*$","\\1",x)))
str(res)
#'data.frame':    2 obs. of  3 variables:
# $ V2: chr  "John Smith" "Adam Grey"
# $ V3: num  35 25
# $ V4: chr  "32, street, sub, something" "26, street, sub, something"

A.K.


----- Original Message -----
From: Sachinthaka Abeywardana <sachin.abeyward...@gmail.com>
To: "r-help@r-project.org" <r-help@r-project.org>
Cc:
Sent: Monday, January 14, 2013 4:30 AM
Subject: [R] Grabbing Specific Words from Content (basic text mining)

Hi all,

Suppose I have a data frame with mixed content (name, age and address).

a<-"Name: John Smith Age: 35 Address: 32, street, sub, something"
b<-data.frame(a)

1. I want to extract the name, age and address separately from this data frame (which may contain more people).

2. Also, in case I have to deal with it, how would the syntax change if I had "Name" as opposed to "Name:" (without the colon)?

Any thoughts are much appreciated.

Thanks,
Sachin

    [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.