Re: [R] regular expression, stringr::str_view, grep

2020-04-29 Thread Andy Spada
This highlights the literal meaning of the last ] in your correct_brackets: aff <- c("affgfk]ing", "fgok", "rafgkah]e","a fgk", "bafghk]") To me, too, the missing_brackets looks more like what was desired, and returns correct results for a PCRE. Perhaps the regular expression should have been re

Re: [R] regular expression, stringr::str_view, grep

2020-04-28 Thread David Winsemius
On 4/28/20 2:29 AM, Sigbert Klinke wrote: Hi, we gave students the task to construct a regular expression selecting some texts. One sent us back a program that gives different results with stringr::str_view and grep. The problem is "[^[A-Z]]" / "[^[A-Z]" at the end of the regular expressio
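
A minimal sketch of how base R's default engine appears to read the two patterns, assuming POSIX bracket rules in which a "[" inside a class is an ordinary character. The vector x is made up for illustration and does not come from the thread.

```r
# Hypothetical inputs, chosen so the result does not depend on locale ranges.
x <- c("12]", "12", "A]")

# "[^[A-Z]]" reads as the class [^[A-Z] followed by a literal "]".
with_literal <- grepl("[^[A-Z]]", x)

# "[^[A-Z]" is the class alone: any character other than "[" or A-Z.
class_only <- grepl("[^[A-Z]", x)

print(with_literal)
print(class_only)
```

stringr uses a different engine (ICU), so the same pattern text may be parsed differently there; that would be one way to get diverging results.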

Re: [R] Regular expression help

2017-10-10 Thread David Winsemius
> On Oct 9, 2017, at 6:08 PM, Georges Monette wrote: > > How about this (I'm showing it as a pipe because it's easier to read that > way): > > library(magrittr) > "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587" %>% > strsplit(' ') %>% > unlist %>% > sub('^[^/]*/*','',.) %>% >

Re: [R] Regular expression help

2017-10-09 Thread Georges Monette
How about this (I'm showing it as a pipe because it's easier to read that way): library(magrittr) "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587" %>%   strsplit(' ') %>%   unlist %>%   sub('^[^/]*/*','',.) %>%   sub('^[^/]*/*','',.) %>%   paste(collapse = ' ') Georges Monette -- Geo

Re: [R] Regular expression help

2017-10-09 Thread Duncan Murdoch
On 09/10/2017 12:06 PM, William Dunlap wrote: "(^| +)([^/ ]*/?){0,2}", with the first "*" replaced by "+" would be a bit better. Thanks! I think I actually need the *, because theoretically the b part of the word could be empty, i.e. "a//c" would be legal and should become "c". Duncan Murd

Re: [R] Regular expression help

2017-10-09 Thread Duncan Murdoch
On 09/10/2017 11:23 AM, Ulrik Stervbo wrote: Hi Duncan, why not split on / and take the correct elements? It is not as elegant as regex but could do the trick. Thanks for the suggestion. There are likely many thousands of lines of data like the two real examples (which had about 5000 and 60

Re: [R] Regular expression help

2017-10-09 Thread William Dunlap via R-help
"(^| +)([^/ ]*/?){0,2}", with the first "*" replaced by "+" would be a bit better. Bill Dunlap TIBCO Software wdunlap tibco.com On Mon, Oct 9, 2017 at 8:50 AM, William Dunlap wrote: > > x <- "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587" > > gsub("(^| *)([^/ ]*/?){0,2}", "\\1", x) >

Re: [R] Regular expression help

2017-10-09 Thread William Dunlap via R-help
> x <- "f 147/1315/587 2820/1320/587 3624/1321/587 1852/1322/587" > gsub("(^| *)([^/ ]*/?){0,2}", "\\1", x) [1] " 587 587 587 587" > y <- "aa aa/ aa/bb aa/bb/ aa/bb/cc aa/bb/cc/ aa/bb/cc/dd aa/bb/cc/dd/" > gsub("(^| *)([^/ ]*/?){0,2}", "\\1", y) [1] "cc cc/ cc/dd cc/dd/" Bill Dunlap TIBCO Sof

Re: [R] Regular expression help

2017-10-09 Thread peter dalgaard
> On 9 Oct 2017, at 17:02 , Duncan Murdoch wrote: > > I have a file containing "words" like > > > a > > a/b > > a/b/c > > where there may be multiple words on a line (separated by spaces). The a, b, > and c strings can contain non-space, non-slash characters. I'd like to use > gsub() to

Re: [R] Regular expression help

2017-10-09 Thread Eric Berger
Hi Duncan, You can try this: library(readr) f <- function(s) { t <- unlist(readr::tokenize(paste0(gsub(" ",",",s),"\n",collapse=""))) i <- grep("[a-zA-Z0-9]*/[a-zA-Z0-9]*/",t) u <- sub("[a-zA-Z0-9]*/[a-zA-Z0-9]*/","",t[i]) paste0(u,collapse=" ") } f("f 147/1315/587 2820/1320/587 3624/1321

Re: [R] Regular expression help

2017-10-09 Thread Ulrik Stervbo
Hi Duncan, why not split on / and take the correct elements? It is not as elegant as regex but could do the trick. Best, Ulrik On Mon, 9 Oct 2017 at 17:03 Duncan Murdoch wrote: > I have a file containing "words" like > > > a > > a/b > > a/b/c > > where there may be multiple words on a line (se
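
The split-and-pick idea could be sketched roughly as follows. This is only an illustration: the input combines the example line quoted in the thread with the "a//c" case, and dropping words with fewer than three parts is an assumption about the desired result.

```r
# Part of the example line from the thread, plus the "a//c" edge case.
x <- "f 147/1315/587 2820/1320/587 a//c"

words <- strsplit(x, " +")[[1]]               # split the line into words
parts <- strsplit(words, "/", fixed = TRUE)   # split each word on "/"

# Keep the third component where it exists (assumption: shorter words are dropped).
third <- vapply(parts,
                function(p) if (length(p) >= 3) p[3] else NA_character_,
                character(1))
result <- paste(third[!is.na(third)], collapse = " ")
print(result)
```

This is slower than a single gsub() on many thousands of lines, but each step can be inspected on its own.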

Re: [R] regular expression help

2017-06-08 Thread Ashim Kapoor
Dear Enrico, Many thanks and Best Regards, Ashim. On Thu, Jun 8, 2017 at 5:11 PM, Enrico Schumann wrote: > > Zitat von Ashim Kapoor : > > > Dear All, >> >> My query is: >> >> Do we always need to use perl = TRUE option when doing ignore.case=TRUE? >> >> A small example : >> >> my_text = >> "RE

Re: [R] regular expression help

2017-06-08 Thread Enrico Schumann
Zitat von Ashim Kapoor : Dear All, My query is: Do we always need to use perl = TRUE option when doing ignore.case=TRUE? A small example : my_text = "RECOVERY OFFICER-II\nDEBTS RECOVERY TRIBUNAL-III\n RC No. 162/2015\nSBI VS RAMESH GUPTA.\nDated: 01.03.2016 Item no.01
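
As a quick check of the question itself, a small sketch showing ignore.case = TRUE with and without perl = TRUE. The text is only the beginning of the example above, and the search pattern is made up.

```r
# Shortened version of the example text; the pattern is hypothetical.
my_text <- "RECOVERY OFFICER-II\nDEBTS RECOVERY TRIBUNAL-III"

# Case-insensitive matching with the default engine and with PCRE.
default_engine <- grepl("debts recovery", my_text, ignore.case = TRUE)
perl_engine <- grepl("debts recovery", my_text, ignore.case = TRUE, perl = TRUE)

print(c(default_engine, perl_engine))
```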

Re: [R] regular expression question

2015-01-14 Thread MacQueen, Don
uary 14, 2015 at 8:47 AM To: dh m mailto:macque...@llnl.gov>> Cc: Mark Leeds mailto:marklee...@gmail.com>>, "r-help-stat.math.ethz.ch" mailto:r-h...@stat.math.ethz.ch>> Subject: Re: [R] regular expression question On Wed, Jan 14, 2015 at 10:03 AM, MacQueen, Don m

Re: [R] regular expression question

2015-01-14 Thread John McKown
On Wed, Jan 14, 2015 at 10:03 AM, MacQueen, Don wrote: > I know you already have a couple of solutions, but I would like to mention > that it can be done in two steps with very simple regular expressions. I > would have done: > > s <- c("lngimbintrhofixed","lngimbnointnorhofixed","test", >

Re: [R] regular expression question

2015-01-14 Thread MacQueen, Don
I know you already have a couple of solutions, but I would like to mention that it can be done in two steps with very simple regular expressions. I would have done: s <- c("lngimbintrhofixed","lngimbnointnorhofixed","test", 'rhofixedtest','norhofixedtest') res <- gsub('norhofixed$', '',s) r
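
The two-step idea can be sketched like this. The second step is an assumption, since the message is cut off after the first gsub() call.

```r
s <- c("lngimbintrhofixed", "lngimbnointnorhofixed", "test",
       "rhofixedtest", "norhofixedtest")

# Step 1: remove the longer ending first.
res <- gsub("norhofixed$", "", s)
# Step 2 (assumed): remove the shorter ending from what is left.
res <- gsub("rhofixed$", "", res)

print(res)
```

Strings that only contain the endings somewhere other than the end are left untouched because of the "$" anchor.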

Re: [R] regular expression question

2015-01-13 Thread Loris Bennett
Hi Mark, Mark Leeds writes: > Hi All: I have a regular expression problem. If a character string ends > with "rhofixed" or "norhofixed", I want that part of the string to be > removed. If it doesn't end with either of those two endings, then the > result should be the same as the original. Below

Re: [R] regular expression question

2015-01-12 Thread John McKown
No HTML please. it makes me itchy! > s <- c("lngimbintrhofixed","lngimbnointnorhofixed","test") > sub('(no)?rhofixed$','',s) [1] "lngimbint" "lngimbnoint" "test" > On Mon, Jan 12, 2015 at 1:37 PM, Mark Leeds wrote: > Hi All: I have a regular expression problem. If a character string ends >

Re: [R] regular expression help

2014-06-30 Thread C Lin
, 29 Jun 2014 13:16:26 -0700 > Subject: Re: [R] regular expression help > To: bac...@hotmail.com > CC: dwinsem...@comcast.net; r-help@r-project.org > >> what's the difference between [:space:]+ and[[:space:]]+ ? > > The pattern '[:space:]' matches any of
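
A small sketch of the difference with the default engine, using made-up strings: with single brackets the pattern is an ordinary character set containing ":", "s", "p", "a", "c" and "e", while double brackets give the whitespace class.

```r
# Hypothetical inputs: one with a space, one without, one built from set letters.
x <- c("a b", "xyz", "pace")

single <- grepl("[:space:]", x)     # set of the characters : s p a c e
double <- grepl("[[:space:]]", x)   # POSIX whitespace class

print(single)
print(double)
```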

Re: [R] regular expression help

2014-06-29 Thread William Dunlap
ace:]+ instead > of [[:space:]]+ > what's the difference between [:space:]+ and[[:space:]]+ ? > > Thanks so much! > Lin > > >> From: wdun...@tibco.com >> Date: Fri, 27 Jun 2014 02:35:54 -0700 >> Subje

Re: [R] regular expression help

2014-06-27 Thread arun
e space or // or nothing so, from test <- c('AARSD11','AARSD1-','AARSD1//','AARSD1 //','//AARSD1','AARSD1'); I want to match only 'AARSD1//','AARSD1 //','//AARSD1','AARSD1' Thanks, Lin   ---

Re: [R] regular expression help

2014-06-27 Thread C Lin
! Lin > From: wdun...@tibco.com > Date: Fri, 27 Jun 2014 02:35:54 -0700 > Subject: Re: [R] regular expression help > To: dwinsem...@comcast.net > CC: bac...@hotmail.com; r-help@r-project.org > > You can use parentheses to factor out the common string in David's >

Re: [R] regular expression help

2014-06-27 Thread William Dunlap
,'AARSD1'); >> >> I want to match only 'AARSD1//','AARSD1 //','//AARSD1','AARSD1' > > Perhaps you want just > > grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1', test) > >> grepl('^AARSD1//$|^AARSD1 //$

Re: [R] regular expression help

2014-06-26 Thread David Winsemius
D1$|^AARSD1', test) > grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test) [1] FALSE FALSE TRUE TRUE TRUE TRUE -- David. > > Thanks, > Lin > > >> From: dulca...@bigpond.com >> To: bac...@hotmail.co
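
One possible way to factor out the common string with parentheses, checked against the fully anchored alternation. The factored pattern is a sketch and not necessarily the one proposed in the thread.

```r
test <- c('AARSD11', 'AARSD1-', 'AARSD1//', 'AARSD1 //', '//AARSD1', 'AARSD1')

# Every alternative anchored at both ends.
long_form <- grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)

# Same idea with the common part factored out (assumed form).
factored <- grepl('^(AARSD1( ?//)?|//AARSD1)$', test)

print(long_form)
print(factored)
```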

Re: [R] regular expression help

2014-06-26 Thread C Lin
'//AARSD1','AARSD1'); I want to match only 'AARSD1//','AARSD1 //','//AARSD1','AARSD1' Thanks, Lin ---------------- > From: dulca...@bigpond.com > To: bac...@hotmail.com; r-help@r-project.org > Subject: RE: [R] r

Re: [R] regular expression help

2014-06-26 Thread Duncan Mackay
Hi You only have a vector of length 5 and I am not quite sure of the string you are testing so try this grep('[/]*\\[/]*',test) Duncan Duncan Mackay Department of Agronomy and Soil Science University of New England Armidale NSW 2351 Email: home: mac...@northnet.com.au -Original Message

Re: [R] Regular Expression returning unexpected results

2013-10-29 Thread Lopez, Dan
3 11:08 AM To: Lopez, Dan; R help (r-help@r-project.org) Subject: Re: [R] Regular Expression returning unexpected results Please read and follow the Posting Guide, in particular re plain text email. You need to keep in mind that the characters in literal strings in R source have to make it into

Re: [R] Regular Expression returning unexpected results

2013-10-29 Thread David Carlson
From ?regex "(do remember that backslashes need to be doubled when entering R character strings, e.g. from the keyboard)." > lines[grep("^([a-z]+) +\\1 +[a-z]+ [0-9]",lines)] [1] "night night at 8" - David L Carlson Department of Anthropology Texas A&M Univer

Re: [R] Regular Expression returning unexpected results

2013-10-29 Thread Jeff Newmiller
Please read and follow the Posting Guide, in particular re plain text email. You need to keep in mind that the characters in literal strings in R source have to make it into RAM before the regex code can parse it. Since regex needs a single backslash to escape normal parsing and interpret 1 as a

Re: [R] Regular Expression returning unexpected results

2013-10-29 Thread Sarah Goslee
On Tue, Oct 29, 2013 at 1:13 PM, Lopez, Dan wrote: > grep("^([a-z]+) +\1 +[a-z]+ [0-9]",lines) Your expression has a typo: R> grep("^([a-z]+) +\\1 +[a-z]+ [0-9]",lines) [1] 2 -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.or
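
A short sketch of why the doubled backslash matters. The vector lines is hypothetical, chosen so that the second element matches as in the output above.

```r
# Hypothetical data; the original 'lines' vector is not shown in this listing.
lines <- c("night at 8", "night night at 8")

one_char <- nchar("\1")     # "\1" is a single control character in an R string
two_chars <- nchar("\\1")   # "\\1" is a backslash followed by "1"

# The regex engine needs to see \1, so the R source has to say "\\1".
hits <- grep("^([a-z]+) +\\1 +[a-z]+ [0-9]", lines)
print(hits)
```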

Re: [R] regular expression strikes again

2013-07-09 Thread jim holtman
> test <- c("pH 9,36 2", "pH 9,36 3", "pH 9,66 1", "pH 9,66 2", "pH 9,66 3", + "pH 10,04 1", "pH 10,04 2", "pH 10,04 3", "RGLP 144006 pH 6,13 1", + "RGLP 144006 pH 6,13 2", "RGLP 144006 pH 6,13 3") > > # make it less greedy with a "?" > gsub("^.*?([[:digit:]]+,[[:digit:]]*).*$", "\\1", test) [1] "
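
To see what the "?" changes, here is a sketch comparing the greedy and the lazy prefix on three of the strings above. perl = TRUE is used here only so that PCRE's lazy-quantifier semantics apply; that choice is an assumption and not part of the original answer.

```r
test <- c("pH 9,36 2", "pH 10,04 1", "RGLP 144006 pH 6,13 1")

# Greedy ".*" takes as much as it can and leaves a single digit before the comma.
greedy <- sub("^.*([[:digit:]]+,[[:digit:]]*).*$", "\\1", test, perl = TRUE)

# Lazy ".*?" stops as early as possible, so the whole number is captured.
lazy <- sub("^.*?([[:digit:]]+,[[:digit:]]*).*$", "\\1", test, perl = TRUE)

print(greedy)
print(lazy)
```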

Re: [R] regular expression strikes again

2013-07-09 Thread arun
Hi, May be this helps:   gsub(".*\\w+\\s+(.*)\\s+.*","\\1",test)  #[1] "9,36"  "9,36"  "9,66"  "9,66"  "9,66"  "10,04" "10,04" "10,04" "6,13" #[10] "6,13"  "6,13" A.K. - Original Message - From: PIKAL Petr To: r-help Cc: Sent: Tuesday, July 9, 2013 5:45 AM Subject: [R] regular expre

Re: [R] regular expression strikes again

2013-07-09 Thread peter dalgaard
On Jul 9, 2013, at 12:19 , PIKAL Petr wrote: > Thanks, it works to some extent. > > The test comes from some file which is not filled properly. If I use your > suggestion I get correct values for those 2 digit numbers before "," but I > get some other values which do not have space before nu

Re: [R] regular expression strikes again

2013-07-09 Thread PIKAL Petr
"9,66" "9,66" "9,66" "10,04""10,04""10,04" [19] "6,13" "6,13" "6,13" > Basically I would like to get one or two digits before comma and two digits after comma. Thanks a

Re: [R] regular expression strikes again

2013-07-09 Thread Jan Kim
On Tue, Jul 09, 2013 at 09:45:55AM +, PIKAL Petr wrote: > Dear experts in regexpr. > > I have this > > dput(test[500:510]) > c("pH 9,36 2", "pH 9,36 3", "pH 9,66 1", "pH 9,66 2", "pH 9,66 3", > "pH 10,04 1", "pH 10,04 2", "pH 10,04 3", "RGLP 144006 pH 6,13 1", > "RGLP 144006 pH 6,13 2", "RG

Re: [R] regular expression strikes again

2013-07-09 Thread peter dalgaard
On Jul 9, 2013, at 11:45 , PIKAL Petr wrote: > Dear experts in regexpr. > > I have this > > dput(test[500:510]) > c("pH 9,36 2", "pH 9,36 3", "pH 9,66 1", "pH 9,66 2", "pH 9,66 3", > "pH 10,04 1", "pH 10,04 2", "pH 10,04 3", "RGLP 144006 pH 6,13 1", > "RGLP 144006 pH 6,13 2", "RGLP 144006 pH

Re: [R] Regular expression

2013-01-15 Thread arun
HI, vec1<-"'asd'f"  vec2<-'"asd"f'  gsub("[\"]","",vec2) #[1] "asdf"  gsub("[']","",vec1) #[1] "asdf" A.K. - Original Message - From: Christofer Bogaso To: r-help Cc: Sent: Tuesday, January 15, 2013 4:38 PM Subject: [R] Regular expression Hello again, I am having a problem on Regu

Re: [R] Regular expression

2013-01-15 Thread William Dunlap
>> gsub("[',"]", "", "'asd'f") >Error: unexpected ']' in "gsub("[',"]" > >What is the right way to include the 'double quote' in the search field? The 'search field' is a string and to put a double quote into a double-quote delimited string you need to escape it with a backslash so it is not inter

Re: [R] Regular Expression

2012-07-24 Thread arun
Hi, Try this: dat1$MONTH<- gsub("^[0-9]+\\-","",dat1$MONTH) [1] "07" "07" "01" dat1$QUARTER<- gsub("^[0-9]+\\-","",dat1$QUARTER) [1] "3" "3" "1" dat1   MONTH QUARTER YEAR 1    07   3 2012 2    07   3 2001 3    01   1 2002 A.K. - Original Message - From: Fred G To: r-help

Re: [R] Regular Expression

2012-07-24 Thread Fred G
Thank you! :) On Tue, Jul 24, 2012 at 1:42 PM, Sarah Goslee wrote: > To delete everything from the beginning of the string to and including > the hyphen, use > sub("^.*-", "", tmp) > > Sarah > > On Tue, Jul 24, 2012 at 1:36 PM, Fred G wrote: > > Hi-- > > > > I have three columns in an input file

Re: [R] Regular Expression

2012-07-24 Thread Gabor Grothendieck
On Tue, Jul 24, 2012 at 1:36 PM, Fred G wrote: > Hi-- > > I have three columns in an input file: > MONTH QUARTER YEAR > 2012-07 2012-3 2012 > 2001-07 2001-3 2001 > 2002-01 2002-1 2002 > > I want to make output like so: > MONTH QUARTER YEAR > 07 3

Re: [R] Regular Expression

2012-07-24 Thread David L Carlson
If they are all formatted as your example, substr() would be simpler: MONTH <- c("2012-07", "2001-07", "2002-01") QUARTER <- c("2012-3", "2001-3", "2002-1") YEAR <- c(2012, 2001, 2002) Inp <- data.frame(MONTH, QUARTER, YEAR) Out <- data.frame(MONTH=substr(MONTH, 6, 8), QUARTER=substr(QUARTER,

Re: [R] Regular Expression

2012-07-24 Thread Rui Barradas
Hello, I believe the following will do it. d <- read.table(text=" MONTH QUARTER YEAR 2012-07 2012-3 2012 2001-07 2001-3 2001 2002-01 2002-1 2002 ", header=TRUE) search <- "^.*-([[:digit:]]+)$" sapply(d, function(x) as.integer(sub(search, "\\1", x))) Hope this he

Re: [R] Regular Expression

2012-07-24 Thread jim holtman
Is this what you want: > x <- read.table(text = "MONTH QUARTER YEAR + 2012-07 2012-3 2012 + 2001-07 2001-3 2001 + 2002-01 2002-1 2002", header = TRUE, as.is = TRUE) > x MONTH QUARTER YEAR 1 2012-07 2012-3 2012 2 2001-07 2001-3 2001 3 2002-01 2002-1 2002 > x$MON

Re: [R] Regular Expression

2012-07-24 Thread Sarah Goslee
To delete everything from the beginning of the string to and including the hyphen, use sub("^.*-", "", tmp) Sarah On Tue, Jul 24, 2012 at 1:36 PM, Fred G wrote: > Hi-- > > I have three columns in an input file: > MONTH QUARTER YEAR > 2012-07 2012-3 2012 > 2001-07 2001-3 2001
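
Applied to the example data, the sub() call could be used along these lines. The data frame name d is made up, and the values are the three rows from the question.

```r
# The three example rows from the question; 'd' is a hypothetical name.
d <- data.frame(MONTH = c("2012-07", "2001-07", "2002-01"),
                QUARTER = c("2012-3", "2001-3", "2002-1"),
                YEAR = c(2012, 2001, 2002),
                stringsAsFactors = FALSE)

# Delete everything up to and including the hyphen in both columns.
d$MONTH <- sub("^.*-", "", d$MONTH)
d$QUARTER <- sub("^.*-", "", d$QUARTER)

print(d)
```

The results stay character strings, so the leading zero in "07" is kept.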

Re: [R] Regular Expression

2012-07-24 Thread R. Michael Weylandt
Hi Fred, I'm no regex ninja (and I imagine one will be along shortly to solve your problem) but in your case does it simply suffice to drop the first 5 characters? That might be an easier sub() to write. Best, Michael On Tue, Jul 24, 2012 at 12:36 PM, Fred G wrote: > Hi-- > > I have three colum

Re: [R] Regular Expression

2012-07-24 Thread Henrik Singmann
Hi, one problem, many solutions, only one of which uses regular expressions, but they all work equally well. dat1<-read.table(text=" MONTH QUARTER YEAR 2012-07 2012-3 2012 2001-07 2001-3 2001 2002-01 2002-1 2002 ",sep="",as.is = TRUE, header=TRUE) # using substr: substr(dat1

Re: [R] Regular Expression

2012-07-24 Thread jose Bartolomei
If you want that output. substr() Can help in your task too. I can not help with regular expression, I will learn too. > Date: Tue, 24 Jul 2012 13:36:25 -0400 > From: bayespoker...@gmail.com > To: r-help@r-project.org > Subject: [R] Regular Expression > > Hi-- > > I have three columns i

Re: [R] regular expression and R

2012-06-04 Thread Gabor Grothendieck
On Mon, Jun 4, 2012 at 4:48 PM, Erin Hodgess wrote: > Dear R People: > > Are there any courses which describe how to use regular expressions in > R, please?  Or any books, please? > > I know a little bit (very little) but would like to know more. > You might want to go through the regular express

Re: [R] regular expression and R

2012-06-04 Thread Marc Schwartz
On Jun 4, 2012, at 3:48 PM, Erin Hodgess wrote: > Dear R People: > > Are there any courses which describe how to use regular expressions in > R, please? Or any books, please? > > I know a little bit (very little) but would like to know more. > > Thanks, > Erin Hi Erin, The two places that

Re: [R] regular expression

2012-02-29 Thread Justin Haynes
gsub('.+; (.+);.+','\\1',x) or if you just want the value out: gsub('.+; Surv\\(months\\): ([0-9]+);.+','\\1',x) You can also look at strsplit: > strsplit(x,';') [[1]] [1] "99-625: Cell type: S" " Surv(months): 21" " STATUS(0=alive, 1=dead): 1" > lapply(strsplit(x,';'),'[',2) [

Re: [R] regular expression

2012-02-29 Thread Gabor Grothendieck
On Wed, Feb 29, 2012 at 2:24 PM, Fred G wrote: > Computer Friends, > > with the following example lines: > > [107] "98-610: Cell type: S; Surv(months): 6; STATUS(0=alive, 1=dead): 1" > > [108] "99-625: Cell type: S; Surv(months): 21; STATUS(0=alive, 1=dead): 1" > > i want to be able to isolate the

Re: [R] regular expression

2012-02-29 Thread David Winsemius
On Feb 29, 2012, at 2:24 PM, Fred G wrote: Computer Friends, with the following example lines: Modified to be correct R code. Please emulate my example in the future. inp <-c( "98-610: Cell type: S; Surv(months): 6; STATUS(0=alive, 1=dead): 1", "99-625: Cell type: S; Surv(months): 21; ST

Re: [R] regular expression for selection

2011-11-14 Thread Uwe Ligges
On 14.11.2011 11:27, Petr PIKAL wrote: Hi Thank you. It is a pure magic, something taught in Unseen University. this is what I got as a help for selecting only letters from set of character vector. vzor [1] "61A" "62C/27" "65A/27" "66C/29" "69A/29" "70C/31" "73A/31" [8] "74C/33

Re: [R] regular expression for selection

2011-11-14 Thread Petr PIKAL
Hi Thank you. It is a pure magic, something taught in Unseen University. this is what I got as a help for selecting only letters from set of character vector. > vzor [1] "61A" "62C/27" "65A/27" "66C/29" "69A/29" "70C/31" "73A/31" [8] "74C/33" "77A/33" "81A/35" "82C/37" "85A/37"

Re: [R] regular expression for selection

2011-11-14 Thread Rainer Schuermann
Does library( stringr ) str_extract( mena, "m5[0-9]" ) achieve what you are looking for? Rgds, Rainer On Monday 14 November 2011 10:22:09 Petr PIKAL wrote: > Hi > > > On 11/14/2011 07:45 PM, Petr PIKAL wrote: > > > Dear all > > > > > > I am again (as usual) lost in regular expression use for

Re: [R] regular expression for selection

2011-11-14 Thread Uwe Ligges
On 14.11.2011 10:22, Petr PIKAL wrote: Hi On 11/14/2011 07:45 PM, Petr PIKAL wrote: Dear all I am again (as usual) lost in regular expression use for selection. Here are my data: dput(mena) c("138516_10g_50ml_50c_250utes1_m53.00-_s1.imp", "138516_10g_50ml_50c_250utes1_m54.00_s1.imp", "

Re: [R] regular expression for selection

2011-11-14 Thread Petr PIKAL
Hi > On 11/14/2011 07:45 PM, Petr PIKAL wrote: > > Dear all > > > > I am again (as usual) lost in regular expression use for selection. Here > > are my data: > > > >> dput(mena) > > c("138516_10g_50ml_50c_250utes1_m53.00-_s1.imp", > > "138516_10g_50ml_50c_250utes1_m54.00_s1.imp", > > "138516_10g_

Re: [R] regular expression for selection

2011-11-14 Thread Petr PIKAL
Hi > > Hi, > > Try grepl instead of sub, > > mena[grepl("m5.", mena)] It does not select those "m5?" strings from those character vectors. I need as an output a vector m53, m54, m55, m56, m57, m58, m59 Regards Petr > > HTH, > > baptiste > > On 14 November 2011 21:45, Petr PIKAL wrote:
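
If the goal is the matched pieces themselves rather than the elements that contain them, one base R option is regmatches() with regexpr(). A sketch on the first three file names from the question:

```r
mena <- c("138516_10g_50ml_50c_250utes1_m53.00-_s1.imp",
          "138516_10g_50ml_50c_250utes1_m54.00_s1.imp",
          "138516_10g_50ml_50c_250utes1_m55.00_s1.imp")

# regexpr() locates the first match in each element; regmatches() extracts it.
codes <- regmatches(mena, regexpr("m5[0-9]", mena))
print(codes)
```

Elements without a match are dropped from the result, so the output can be shorter than the input.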

Re: [R] regular expression for selection

2011-11-14 Thread Jim Lemon
On 11/14/2011 07:45 PM, Petr PIKAL wrote: Dear all I am again (as usual) lost in regular expression use for selection. Here are my data: dput(mena) c("138516_10g_50ml_50c_250utes1_m53.00-_s1.imp", "138516_10g_50ml_50c_250utes1_m54.00_s1.imp", "138516_10g_50ml_50c_250utes1_m55.00_s1.imp", "138

Re: [R] regular expression for selection

2011-11-14 Thread baptiste auguie
Hi, Try grepl instead of sub, mena[grepl("m5.", mena)] HTH, baptiste On 14 November 2011 21:45, Petr PIKAL wrote: > Dear all > > I am again (as usual) lost in regular expression use for selection. Here > are my data: > >> dput(mena) > c("138516_10g_50ml_50c_250utes1_m53.00-_s1.imp", > "138516

Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
That works like a charm! Thanks so much Duncan. On Fri, Apr 29, 2011 at 6:37 PM, Duncan Murdoch wrote: > On 29/04/2011 9:34 PM, Miao wrote: > >> Thanks Duncan for clarifying this. I'm pretty a newbie to such type of >> characters and special characters. In R's gsub() what regular >> expression

Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Mike Miller
On Fri, 29 Apr 2011, Duncan Murdoch wrote: On 29/04/2011 7:41 PM, Miao wrote: Can anyone help on gsub() in R? I have a string like something below, and wanted to delete all the strings with leading backslash, including "\xa0On", "\023, "\xab", and many others. How should I write a regular

Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch
On 29/04/2011 9:34 PM, Miao wrote: Thanks Duncan for clarifying this. I'm pretty a newbie to such type of characters and special characters. In R's gsub() what regular expressions shall I use to handle all these situations? I don't know. This might work: gsub("[\x01-\x1f\x7f-\xff]", "", x)

Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Miao
Thanks Duncan for clarifying this. I'm pretty a newbie to such type of characters and special characters. In R's gsub() what regular expressions shall I use to handle all these situations? On Fri, Apr 29, 2011 at 6:07 PM, Duncan Murdoch wrote: > On 29/04/2011 7:41 PM, Miao wrote: > >> Hello, >

Re: [R] regular expression in gsub() for strings with leading backslash

2011-04-29 Thread Duncan Murdoch
On 29/04/2011 7:41 PM, Miao wrote: Hello, Can anyone help on gsub() in R? I have a string like something below, and wanted to delete all the strings with leading backslash, including "\xa0On", "\023, "\xab", and many others. How should I write a regular expression pattern in gsub()? I don't

Re: [R] regular expression for nth character in a string

2011-04-25 Thread Gabor Grothendieck
2011/4/25 Gonçalo Ferraz : > Hi, I have a string > > "InTrouble" > > and want to extract, say, the first two characters: "In" > or the last three: "blee" > or the 3rd, 4th, and 5th: "Trou" > > Is there an easy way of doing this quickly with regular expressions in gsub, > grep or similar? > strapp

Re: [R] regular expression for nth character in a string

2011-04-25 Thread David Winsemius
On Apr 25, 2011, at 6:17 AM, Gonçalo Ferraz wrote: Hi, I have a string "InTrouble" and want to extract, say, the first two characters: "In" or the last three: "blee" or the 3rd, 4th, and 5th: "Trou" Is there an easy way of doing this quickly with regular expressions in gsub, grep or simila

Re: [R] regular expression for nth character in a string

2011-04-25 Thread Jim Lemon
On 04/25/2011 08:17 PM, Gonçalo Ferraz wrote: Hi, I have a string "InTrouble" and want to extract, say, the first two characters: "In" or the last three: "blee" or the 3rd, 4th, and 5th: "Trou" Is there an easy way of doing this quickly with regular expressions in gsub, grep or similar? Hi

Re: [R] regular expression for nth character in a string

2011-04-25 Thread jim holtman
will this do it: > x <- "InTrouble" > sub("^(..).*", "\\1", x) # first two [1] "In" > sub(".*(...)$", "\\1", x) # last three [1] "ble" > sub("^..(...).*", "\\1", x) # 3rd,4th,5th char [1] "Tro" > 2011/4/25 Gonçalo Ferraz : > Hi, I have a string > > "InTrouble" > > and want to extract, say, t
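
For fixed positions, substr() gives the same pieces without a regular expression. A minimal sketch:

```r
x <- "InTrouble"
n <- nchar(x)

first_two <- substr(x, 1, 2)        # characters 1 to 2
last_three <- substr(x, n - 2, n)   # last three characters
middle <- substr(x, 3, 5)           # characters 3 to 5

print(c(first_two, last_three, middle))
```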

Re: [R] regular expression question

2011-04-11 Thread Joshua Wiley
Hi Erin, Please read ?grep. It is clearly not the function you want (neither is strsplit() either really). This does what you want and you can modify for upper/lower case if you need it. Also note that regular expressions exist separate from R, so while ":" may have seemed natural to select a r

Re: [R] regular expression question

2011-04-11 Thread Peter Langfelder
On Mon, Apr 11, 2011 at 10:49 PM, Erin Hodgess wrote: > Dear R People: > > I have a data frame with the following column names: > >> names(funky) >  [1] "UHD.1"   "UHD.2"   "UHD.3"   "UHD.4"   "L..W..1" "L..W..2" "L..W..3" >  [8] "L..W..4" "B..W..1" "B..W..2" "B..W..3" "B..W..4" "W..B..1" "W..B..2

Re: [R] regular expression

2011-04-01 Thread Henrique Dallazuanna
> Great. thank you Bernd! Learned a new thing here. > > John > > > > > > From: Bernd Weiss > > Cc: r-help@r-project.org > Sent: Thu, March 31, 2011 6:19:25 PM > Subject: Re: [R] regular expression > > Am 31.03.2011 21:06, schrieb array ch

Re: [R] regular expression

2011-04-01 Thread array chip
Great. thank you Bernd! Learned a new thing here. John From: Bernd Weiss Cc: r-help@r-project.org Sent: Thu, March 31, 2011 6:19:25 PM Subject: Re: [R] regular expression Am 31.03.2011 21:06, schrieb array chip: > Ok then this code didn't do what

Re: [R] regular expression

2011-03-31 Thread Bernd Weiss
Am 31.03.2011 21:06, schrieb array chip: Ok then this code didn't do what I wanted. I want "not including 'arg' before '.symptom'", not individual letters of "arg", but rather as a word. Bill Dunlap suggested using invert=T, it works for single 1 condition, but not for 2 conditions here: not inc

Re: [R] regular expression

2011-03-31 Thread array chip
t including "arg" before ".", but at the same time, does include ".symptom". Any other suggestions would be appreciated John From: Peter Langfelder Cc: Bernd Weiss ; r-help@r-project.org Sent: Thu, March 31, 2011 5:55:26 PM

Re: [R] regular expression

2011-03-31 Thread Peter Langfelder
On Thu, Mar 31, 2011 at 5:49 PM, array chip wrote: > Thanks Bernd! I tried your approach with my real example, sometimes it worked, > sometimes it didn't. For example > > grep('[^(arg)]\\.symptom',"stomach.symptom",value=T) > [1] "stomach.symptom" > > grep('[^(arg)]\\.symptom',"liver.symptom",valu
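
A sketch of why the bracket version behaves this way, next to one possible alternative. Inside brackets, "(arg)" is just a set of the characters "(", "a", "r", "g" and ")", so "[^(arg)]" rejects any single one of them in front of the dot. The lookbehind pattern is an assumption about what was wanted; it needs perl = TRUE and is not necessarily the solution reached in the thread. The third element of x is made up.

```r
x <- c("stomach.symptom", "liver.symptom", "arg.symptom")

# Character class: "liver" ends in "r", which is in the excluded set.
char_class <- grep("[^(arg)]\\.symptom", x, value = TRUE)

# Negative lookbehind: reject only the whole word "arg" before ".symptom".
lookbehind <- grep("(?<!arg)\\.symptom", x, perl = TRUE, value = TRUE)

print(char_class)
print(lookbehind)
```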

Re: [R] regular expression

2011-03-31 Thread array chip
) character(0) I think both examples should return the text, but the 2nd example didn't. What was wrong here? Thanks John From: Bernd Weiss Sent: Thu, March 31, 2011 5:32:25 PM Subject: Re: [R] regular expression Am 31.03.2011 19:31, schrieb array chip:

Re: [R] regular expression

2011-03-31 Thread Bernd Weiss
Am 31.03.2011 19:31, schrieb array chip: Hi, I am stuck on this: how to specify a match pattern that means not to include "abc"? I tried: grep("^(abc)", "hello", value=T) should return "hello". > grep("[^(abc)]", "hello", value=T) [1] "hello" HTH, Bernd ___

Re: [R] Regular Expression

2011-02-14 Thread Gabor Grothendieck
On Mon, Feb 14, 2011 at 4:13 AM, Deb Midya wrote: > Hi R users, > > Thanks in advance. > > I am using R-2.12.1 on Windows XP. > > I am looking for some good literature on Regular Expression. May I request > you to assist me please. There are regular expression links on the gsubfn home page: http

Re: [R] Regular expression to find value between brackets

2010-10-13 Thread Gabor Grothendieck
On Wed, Oct 13, 2010 at 2:16 PM, Bart Joosen wrote: > > Hi, > > this should be an easy one, but I can't figure it out. > I have a vector of tests, with their units between brackets (if they have > units). > eg tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)") > strapply in gsubfn

Re: [R] Regular expression to find value between brackets

2010-10-13 Thread Matt Shotwell
Here's a shorter (but more cryptic) one: > gsub("^([^\\(]+)(\\((.+)\\))?", "\\2", tests) [1] "" "(%)" "(%)" "(mg/ml)" > gsub("^([^\\(]+)(\\((.+)\\))?", "\\3", tests) [1] "" "%" "%" "mg/ml" -Matt On Wed, 2010-10-13 at 14:34 -0400, Henrique Dallazuanna wrote: > Try th

Re: [R] Regular expression to find value between brackets

2010-10-13 Thread Bert Gunter
Note: My original proposal, not quite right, can be made quite right via: gsub(".*\\((.*)\\).*||[^()]+", "\\1",tests) The "||" or clause at the end handles the case where there are no parentheses in the string. -- Bert On Wed, Oct 13, 2010 at 11:16 AM, Bart Joosen wrote: > > Hi, > > this shou

Re: [R] Regular expression to find value between brackets

2010-10-13 Thread Bert Gunter
One way: gsub(".*\\(([^()]*)\\).*", "\\1",tests) Idea: Pick out the units designation between the "()" and replace the whole expression with it. The "\\1" refers to the "[^()]* parenthesized expression in the middle that picks out the units. Cheers, Bert On Wed, Oct 13, 2010 at 11:16 AM, Bart
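
The same idea, with the no-parentheses case handled explicitly, might look like this. Returning an empty string for tests without units is an assumption about the desired output.

```r
tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)")

# Which elements contain a parenthesized unit at all?
has_units <- grepl("\\([^()]*\\)", tests)

# Replace the whole string by the captured unit; use "" where there is none.
units <- ifelse(has_units,
                sub(".*\\(([^()]*)\\).*", "\\1", tests),
                "")
print(units)
```

Without the grepl() guard, an element such as "pH" would come back unchanged, because sub() leaves non-matching strings as they are.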

Re: [R] Regular expression to find value between brackets

2010-10-13 Thread Erik Iverson
Bart, I'm hardly one of the lists regex gurus: but this can get you started... tests <- c("pH", "Assay (%)", "Impurity A(%)", "content (mg/ml)") x <- regexpr("\\((.*)\\)", tests) substr(tests, x + 1, x + attr(x, "match.length") - 2) Bart Joosen wrote: Hi, this should be an easy one, but I c

Re: [R] Regular expression to find value between brackets

2010-10-13 Thread Henrique Dallazuanna
Try this: replace(gsub(".*\\((.*)\\)$", "\\1", tests), !grepl("\\(.*\\)", tests), "") On Wed, Oct 13, 2010 at 3:16 PM, Bart Joosen wrote: > > Hi, > > this should be an easy one, but I can't figure it out. > I have a vector of tests, with their units between brackets (if they have > units). > e

Re: [R] Regular Expression

2010-08-08 Thread Michael Bedward
And my \\3 should have been a \\2 anyway ! On 9 August 2010 12:23, Michael Bedward wrote: > Was going to suggest gsub("^[0-9]+ (SPE )?([^ -])( -.*)?", "\\3", s) > but I see Wu Gong beat me to the punch with a nicer one :) > > On 9 August 2010 12:13, Wu Gong wrote: >> >> gsub(pattern = "^[0-9]+ (

Re: [R] Regular Expression

2010-08-08 Thread Michael Bedward
Was going to suggest gsub("^[0-9]+ (SPE )?([^ -])( -.*)?", "\\3", s) but I see Wu Gong beat me to the punch with a nicer one :) On 9 August 2010 12:13, Wu Gong wrote: > > gsub(pattern = "^[0-9]+ (SPE )*(\\w+) - .*$", "\\2", dat) > __ R-help@r-project.o

Re: [R] Regular Expression

2010-08-08 Thread Wu Gong
gsub(pattern = "^[0-9]+ (SPE )*(\\w+) - .*$", "\\2", dat) - A R learner. -- View this message in context: http://r.789695.n4.nabble.com/Regular-Expression-tp2318086p2318101.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r

Re: [R] regular expression help to extract specific strings from text

2010-04-01 Thread Tony B
Thank you guys, both solutions work great! Seems I have two new packages to investigate :) Regards, Tony Breyal On 31 Mar, 14:20, Tony B wrote: > Dear all, > > Lets say I have the following: > > > x <- c("Eve: Going to try something new today...", "Adam: Hey @Eve, how are > > you finding R? #rs

Re: [R] regular expression help to extract specific strings from text

2010-03-31 Thread hadley wickham
On Wed, Mar 31, 2010 at 8:20 AM, Tony B wrote: > Dear all, > > Lets say I have the following: > >> x <- c("Eve: Going to try something new today...", "Adam: Hey @Eve, how are >> you finding R? #rstats", "Eve: @Adam, It's awesome, so much better at >> statistics that #Excel ever was! @Cain & @Abl

Re: [R] regular expression help to extract specific strings from text

2010-03-31 Thread Gabor Grothendieck
strapply in gsubfn can extract matches based on content which seems to be what you want: library(gsubfn) f <- function(...) sapply(list(...), paste, collapse = ", ") DF <- data.frame(x, Source = strapply(x, "^(\\w+):", c, simplify = f), Mentions = strapply(x, "@(\\w+)", c, simpli

Re: [R] regular expression

2010-03-26 Thread jim holtman
try this: > x <- c("XXX184_YYY_ZZZ.dat", "YY123_YY_ZZ.dat") > sub("(^[[:alpha:]]+)[[:digit:]]+.*", "\\1", x, perl=TRUE) [1] "XXX" "YY" > On Fri, Mar 26, 2010 at 2:27 PM, arnaud chozo wrote: > Hi, > > I need to select a substring from the filename of a file in a list. > I can find all the filen

Re: [R] regular expression submatch?

2010-02-01 Thread sjaffe
Thanks for the suggestions. gsub("hello (.*)", "\\1", "hello world") seems simplest. Setting value=TRUE returns the whole match, not the subexpression. (I always read the man pages carefully before asking for help, gratuituous comments notwithstanding. I didn't see a solution using gregexpr;
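
In current versions of base R, regexec() with regmatches() returns both the full match and the parenthesized submatches, which is close to the Perl idiom in the question. A minimal sketch:

```r
x <- "hello world"

# regexec() records the positions of the match and of each submatch.
m <- regmatches(x, regexec("hello (.*)", x))

matched <- length(m[[1]]) > 0   # did the pattern match at all?
submatch <- m[[1]][2]           # first parenthesized group, like $1 in Perl
print(submatch)
```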

Re: [R] regular expression submatch?

2010-02-01 Thread Gabor Grothendieck
Try this: > library(gsubfn) > strapply("hello world", "hello (.*)")[[1]] [1] "world" On Mon, Feb 1, 2010 at 1:57 PM, sjaffe wrote: > > What is the simplest way to extract a matched subexpression? > > Eg. in perl you can do > > "hello world" =~ m/hello (.*)/ > > which would return 1(true) and se

Re: [R] regular expression submatch?

2010-02-01 Thread Henrique Dallazuanna
Try this: gsub("hello (.*)", "\\1", "hello world") On Mon, Feb 1, 2010 at 4:57 PM, sjaffe wrote: > > What is the simplest way to extract a matched subexpression? > > Eg. in perl you can do > > "hello world" =~ m/hello (.*)/ > > which would return 1(true) and set $1 to the matched subexpression "

Re: [R] regular expression submatch?

2010-02-01 Thread David Winsemius
On Feb 1, 2010, at 1:57 PM, sjaffe wrote: What is the simplest way to extract a matched subexpression? Eg. in perl you can do "hello world" =~ m/hello (.*)/ which would return 1(true) and set $1 to the matched subexpression "world". If you wanted a logical value returned, then ?grep If

Re: [R] regular expression submatch?

2010-02-01 Thread Steve Lianoglou
Hi, On Mon, Feb 1, 2010 at 1:57 PM, sjaffe wrote: > > What is the simplest way to extract a matched subexpression? > > Eg. in perl you can do > > "hello world" =~ m/hello (.*)/ > > which would return 1(true) and set $1 to the matched subexpression "world". Read through the documentation and exam

Re: [R] Regular expression help

2009-12-07 Thread Gabor Grothendieck
If I understand correctly you wish to extract strings of digits more than 5 characters long: s <- c("UV7C11-F9-E1 MCS#9831019", "MCS Lot #9512516") library(gsubfn) strapply(s, "\\d{6,}", c) Depending on what you want to get back you might wish to add the simplify=TRUE argument to strapply, as wel

Re: [R] Regular expression help

2009-12-07 Thread Phil Spector
Ramya - Try strings = c('UV7C11-F9-E1 MCS#9831019','MCS Lot #9512516') sub('^.*?(\\d{5,}).*?$','\\1',strings,perl=TRUE) [1] "9831019" "9512516" The regular expression finds the first run of five or more digits in each string. Since you said the numbers could occur anywhere in the stri
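
An alternative sketch that extracts the digits directly instead of deleting everything around them, using regmatches() and regexpr() from base R:

```r
strings <- c("UV7C11-F9-E1 MCS#9831019", "MCS Lot #9512516")

# First run of five or more digits in each element.
ids <- regmatches(strings, regexpr("[0-9]{5,}", strings))
print(ids)
```

Unlike sub(), this drops elements that contain no such run instead of returning them unchanged.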
