Ravi,

Here is a third way to do this, but it doesn't make use of regular expressions per se:
> avec <- unlist(strsplit(astr, ""))  # First convert astr to a vector
> avec[c(1, 1 + grep(" ", avec))]
[1] "T" "i" "m" "t" "o" "w" "t" "d" "i"

This latter expression subscripts avec by concatenating the first position with 1 + the position of each blank in the character vector.

Here is yet a fourth way that does use a regular expression:

> avec[unlist(gregexpr("\\<[[:alpha:]]", astr))]  # avec from above
[1] "T" "i" "m" "t" "o" "w" "t" "d" "i"

The components of this regular expression can be broken down as follows:

"\\<"          The empty string at the beginning of a word. The backslash is doubled because a single backslash is an escape character inside an R string literal.
"[[:alpha:]]"  Any alphabetic character, upper or lower case.

gregexpr() returns a list; unlist() converts the list to a vector, each element of which gives the position of the first character of a word in astr. That result can be used to subscript avec.

Best regards,
Chuck Taylor
TIBCO Spotfire
Seattle, WA, USA

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ravi
Sent: Tuesday, August 04, 2009 10:28 AM
To: r-help@r-project.org
Subject: [R] regex question

Hi,
I am getting stuck over an apparently simple problem in the use of regular expressions: to collect together the first letters of the words from the Perl motto, “There is more than one way to do it”, in the following form: TIMTOWTDI.

I tried the following code:

##### A regex problem with the Perl motto
astr<-"There is more than one way to do it"
b1<-grep("\\<", astr,value=TRUE)
## This just retrieves the whole string
## Next trial with gregexpr
b2<-gregexpr("\\<",astr)
## This gives :
> b2
[[1]]
[1]  1  7 10 15 20 24 28 31 34
attr(,"match.length")
[1] 0 0 0 0 0 0 0 0 0

A vector of indices corresponding to the first letter of each word is obtained all right with gregexpr, but the next step is not so clear. I am not able to figure out how I can use this information to pick out the letters from the original string.
My problem is that I don’t know how I can treat the string as a vector and pluck out the letters. There may be many ways to do it, but I have not succeeded in coming up with even one way!

I will appreciate any tips that I can get.

Thanking you,
Ravi

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
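
For reference, the two approaches from the reply can be combined into one self-contained sketch that also joins the letters into the acronym the original question asked for. It assumes only base R. The names first_by_blank, first_by_regex, and acronym are illustrative and do not appear in the thread, and the final toupper()/paste() step is an addition, not part of either message.

```r
# astr is the motto from the original question.
astr <- "There is more than one way to do it"

# Split the string into a vector of single characters.
avec <- unlist(strsplit(astr, ""))

# No-regex approach: the first character, plus every character
# that immediately follows a blank.
first_by_blank <- avec[c(1, 1 + grep(" ", avec, fixed = TRUE))]

# Regex approach: positions where a word begins with a letter.
first_by_regex <- avec[unlist(gregexpr("\\<[[:alpha:]]", astr))]

# Illustrative extra step (not in the thread): collapse the letters
# into one string and upper-case it.
acronym <- toupper(paste(first_by_regex, collapse = ""))
print(acronym)
```

Note that the no-regex version relies on words being separated by exactly one blank; the regex version does not depend on the spacing.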