subject:"\[R\] regular expressions"

Re: [R] Regular expressions and 2 dots

2019-06-28 Thread Rui Barradas

Hello, Please always cc the list. To know more about the regular expressions used by R, read help("regex"). The one I used is not very complicated. \\. matches a dot; it is a meta-character so it needs to be escaped. {2,} repeated at least 2 times, at most an undetermined number of times. .* a
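The pieces of that pattern can be tried one at a time. The sketch below reuses the column names and the pattern from this thread; the variable names `cleaned` and `unescaped` are only illustrative. It also shows what happens if the dot is left unescaped.

```r
# Column names taken from the original question in this thread.
s <- c("colone..xx.", "coltwo.ft..rr.", "colthree.gh..az.", "colfour.DG..lm.")

# "\\."   a literal dot (escaped, because "." is a metacharacter)
# "{2,}"  the preceding dot repeated two or more times
# ".*$"   whatever follows, up to the end of the string
cleaned <- sub("\\.{2,}.*$", "", s)
print(cleaned)

# Without the escape, "." matches any character, so the pattern
# matches from the very first character and nothing is left.
unescaped <- sub(".{2,}.*$", "", s)
print(unescaped)
```

A single dot such as the one in "coltwo.ft" does not satisfy {2,}, which is why it survives in the escaped version.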

Re: [R] Regular expressions and 2 dots

2019-06-28 Thread Rui Barradas

Hello, Try s <- c( "colone..xx.","coltwo.ft..rr.","colthree.gh..az.","colfour.DG..lm.") sub("\\.{2,}.*$", "", s) #[1] "colone" "coltwo.ft" "colthree.gh" "colfour.DG" At 09:00 on 28/06/19, lionel sicot via R-help wrote: c( "colone..xx.","coltwo.ft..rr.","colthree.gh..az.","colfour.DG

[R] Regular expressions and 2 dots

2019-06-28 Thread lionel sicot via R-help

Hello, I have files from a piece of equipment with column names including dots. I would like to simplify these names but all my attempts with sub and regular expressions are unsuccessful. I have c( "colone..xx.","coltwo.ft..rr.","colthree.gh..az.","colfour.DG..lm.") and I would like to have c( "colone","c

Re: [R] Regular expressions, genbank

2014-02-06 Thread arun

HI, May be this helps: lines1 <- readLines(textConnection('text to be ignored... CDS 687..3158 /gene="AXL2" /note="plasma membrane glycoprotein" other text to be ignored... CDS complement(3300..4037)

Re: [R] Regular expressions, genbank

2014-02-06 Thread arun

You could also try: library(gsubfn) strapply(gsub("\\d+<|>\\d+","",vec1),"([0-9]+)",as.numeric,simplify=c) A.K. On Thursday, February 6, 2014 1:55 PM, arun wrote: Hi, One way would be: vec1 <- c("CDS 3300..4037", "CDS complement(3300..4037)", "CDS 3300

Re: [R] Regular expressions, genbank

2014-02-06 Thread arun

Hi, One way would be: vec1 <- c("CDS 3300..4037", "CDS complement(3300..4037)", "CDS 3300<..4037", "CDS join(21467..26641,27577..28890)", "CDS complement(join(30708..31700,31931..31984))", "CDS 3300<..>4037") library(s

Re: [R] Regular expressions on filenames

2014-01-15 Thread Wojtek Poppe

Try sub("\\.[^.]+$", "", basename(FILELIST)) Thanks, Wojtek On Wed, Jan 15, 2014 at 4:37 PM, Fisher Dennis wrote: > R 3.0.2 > OS X > > Colleagues > > I am writing code to read a large number of files in a particular folder. > In some situations, there may be two versions of the file with di

Re: [R] Regular expressions on filenames

2014-01-15 Thread David Winsemius

On Jan 15, 2014, at 4:37 PM, Fisher Dennis wrote: > R 3.0.2 > OS X > > Colleagues > > I am writing code to read a large number of files in a particular folder. In > some situations, there may be two versions of the file with different > extensions, e.g.: > FILE.csv > FILE.xls > I

Re: [R] Regular expressions on filenames

2014-01-15 Thread Jeff Newmiller

You want to match a period and anything that follows to the end of the string, as long as what follows has no period in it. "\\.[^.]*$"
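For comparison, here is a small sketch of the greedy pattern from the original question next to the one suggested above, run on the file names that appear elsewhere in this thread. The call to tools::file_path_sans_ext() is an addition not mentioned in the thread; it ships with base R and may be a simpler option for ordinary extensions.

```r
# File names as shown in another reply in this thread.
files <- c("FILE.csv", "FILE.XXX.csv", "FILE.YYY.xls")

# Greedy: removes everything from the FIRST period onward.
greedy <- sub("\\..*$", "", files)
print(greedy)

# "[^.]*" cannot cross a period, so only the LAST extension is removed.
last_only <- sub("\\.[^.]*$", "", files)
print(last_only)

# Base R helper (not from the thread) that strips an alphanumeric extension.
base_r <- tools::file_path_sans_ext(files)
print(base_r)
```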

Re: [R] Regular expressions on filenames

2014-01-15 Thread arun

Hi, Try: FILELIST <- list.files() FILELIST #[1] "FILE.csv" "FILE.XXX.csv" "FILE.YYY.xls" sub("(.*)\\..*$", "\\1", basename(FILELIST)) #[1] "FILE" "FILE.XXX" "FILE.YYY" A.K. On Wednesday, January 15, 2014 7:35 PM, Fisher Dennis wrote: R 3.0.2 OS X Colleagues I am writing code to

Re: [R] Regular expressions on filenames

2014-01-15 Thread jim holtman

try this: > x <- c("FILE.XXX.csv", "FILE.YYY.xls") > sub("\\.[^.]*$", "", x) [1] "FILE.XXX" "FILE.YYY" The '[^.]*' says to match anything BUT a period.

[R] Regular expressions on filenames

2014-01-15 Thread Fisher Dennis

R 3.0.2 OS X Colleagues I am writing code to read a large number of files in a particular folder. In some situations, there may be two versions of the file with different extensions, e.g.: FILE.csv FILE.xls I extracted the portion before the extension with: sub("\\..*$"

Re: [R] R Regular Expressions - Metacharacters

2013-02-05 Thread David Winsemius

On Feb 5, 2013, at 9:49 AM, Seth Dickey wrote: > I thought that I can use metacharacters such as \w to match word characters > with one backslash. But for some reason, I need to include two backslashes. > >> grepl(pattern='\w', x="what") > Error: '\w' is an unrecognized escape in character stri

Re: [R] R Regular Expressions - Metacharacters

2013-02-05 Thread Duncan Murdoch

On 05/02/2013 12:49 PM, Seth Dickey wrote: I thought that I can use metacharacters such as \w to match word characters with one backslash. But for some reason, I need to include two backslashes. > grepl(pattern='\w', x="what") Error: '\w' is an unrecognized escape in character string starting "

[R] R Regular Expressions - Metacharacters

2013-02-05 Thread Seth Dickey

I thought that I can use metacharacters such as \w to match word characters with one backslash. But for some reason, I need to include two backslashes. > grepl(pattern='\w', x="what") Error: '\w' is an unrecognized escape in character string starting "\w" > grepl(pattern='\\w', x="what") [1] TRU
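The doubling happens because R's string parser consumes one level of backslashes before the regex engine ever sees the pattern. A minimal sketch of this follows; the raw-string form at the end is an addition that assumes R 4.0.0 or later and was not available when this thread was written.

```r
# In source code "\\w" is a two-character string: a backslash and a "w".
pattern <- "\\w"
print(nchar(pattern))
cat(pattern, "\n")   # shows the string as the regex engine receives it

# The regex engine therefore sees \w and matches a word character.
matched <- grepl(pattern, "what")
print(matched)

# Assumes R >= 4.0.0: a raw string needs no doubling.
raw_pattern <- r"(\w)"
print(identical(pattern, raw_pattern))
```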

Re: [R] Regular expressions: stuck again...

2012-08-24 Thread Noia Raindrops

Hello, try this: x <- c("SELECT [public_tblFiche].[Fichenr], [public_tblArtnr].[Artnr]", "SELECT public_tblFiche.Fichenr, public_tblArtnr.Artnr") # > The square backets [ and ] should removed x <- gsub("[][]", "", x) # > and xxx_xxx.xxx should become \"xxx\".\"xxx\"\".\"xxx\" x <- gsub("([[:al

[R] Regular expressions: stuck again...

2012-08-23 Thread Bart Joosen

Hi, I'm currently reworking a report, originating from a MS Access database, but should be implemented in R. Now I'm facing the task to convert a lot of queries to postgreSQL. What I want to do is make a function which takes the MS Access query as an argument and returns the pgSQL version. So: SE

Re: [R] Regular Expressions in grep - Solution and function to determine significant figures of a number

2012-08-23 Thread Dr. Holger van Lishaut

On 22.08.2012 at 21:46, Dr. Holger van Lishaut wrote: SignifStellen<-function(x){ strx=as.character(x) nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1 } returns the significant figures of a number. Perhaps this can help someone. Sorry, to work, it must

Re: [R] Regular Expressions in grep - Solution and function to determine significant figures of a number

2012-08-22 Thread Bert Gunter

... On Wed, Aug 22, 2012 at 12:46 PM, Dr. Holger van Lishaut wrote: > Dear all, > > regmatches works. > > And, since this has been asked here before: > > SignifStellen<-function(x){ > strx=as.character(x) > nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1 > } > > retur

Re: [R] Regular Expressions in grep - Solution and function to determine significant figures of a number

2012-08-22 Thread Dr. Holger van Lishaut

Dear all, regmatches works. And, since this has been asked here before: SignifStellen<-function(x){ strx=as.character(x) nchar(regmatches(strx, regexpr("[1-9][0-9]*\\.[0-9]*[1-9]",strx)))-1 } returns the significant figures of a number. Perhaps this can help someone. Thanks & best reg

Re: [R] Regular Expressions in grep

2012-08-21 Thread arun

HI, Try this: gsub("^-\\d(\\d{4}.).*","\\1",a) #[1] "1020." gsub("^.*(.\\d{5}).","\\1",a) #[1] ".90920" A.K. - Original Message - From: Dr. Holger van Lishaut To: "r-help@r-project.org" Cc: Sent: Tuesday, August 2

Re: [R] Regular Expressions in grep

2012-08-21 Thread R. Michael Weylandt

You're misreading the docs: from grep, value: if ‘FALSE’, a vector containing the (‘integer’) indices of the matches determined by ‘grep’ is returned, and if ‘TRUE’, a vector containing the matching elements themselves is returned. Since there's a match somewhere

Re: [R] Regular Expressions in grep

2012-08-21 Thread Noia Raindrops

'grep' does not change strings. Use 'gsub' or 'regmatches': # gsub Front <- gsub("^.*?([1-9][0-9]*\\.).*?$", "\\1", a) End <- gsub("^.*?(\\.[0-9]*[1-9]).*?$", "\\1", a) # regexpr and regmatches (R >= 2.14.0) Front <- regmatches(a, regexpr("[1-9][0-9]*\\.", a)) End <- regmatches(a, regexpr("\\.[0-9

Re: [R] Regular Expressions in grep

2012-08-21 Thread Bert Gunter

grep() returns the matches. You want regexpr() and regmatches() -- Bert On Tue, Aug 21, 2012 at 12:24 PM, Dr. Holger van Lishaut wrote: > Dear r-help members, > > I have a number in the form of a string, say: > > a<-"-01020.909200" > > I'd like to extract "1020." as well as ".9092" > > Front<-gr
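To make the difference concrete, here is a short sketch using the string from the original question. grep(value = TRUE) hands back the whole element that matched, while regexpr() plus regmatches() returns only the matched part. The variable names are illustrative.

```r
a <- "-01020.909200"

# grep() with value = TRUE returns the matching ELEMENT, unchanged.
whole <- grep("[1-9]+[0-9]*\\.", a, value = TRUE)
print(whole)

# regexpr() locates the first match; regmatches() extracts just that part.
front <- regmatches(a, regexpr("[1-9][0-9]*\\.", a))
end <- regmatches(a, regexpr("\\.[0-9]*[1-9]", a))
print(front)
print(end)
```

The trailing zeros are dropped from `end` because the pattern must finish on a digit from 1 to 9, so the greedy [0-9]* backtracks to the last non-zero digit.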

[R] Regular Expressions in grep

2012-08-21 Thread Dr. Holger van Lishaut

Dear r-help members, I have a number in the form of a string, say: a<-"-01020.909200" I'd like to extract "1020." as well as ".9092" Front<-grep(pattern="[1-9]+[0-9]*\\.", value=TRUE, x=a, fixed=FALSE) End<-grep(pattern="\\.[0-9]*[1-9]+", value=TRUE, x=a, fixed=FALSE) However, both strings gi

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Fred G

6 Chicago Blacksox 1701 made up > 7 7 Chicago Cubs 1702 made up > 8 8 Chicago Whitesox 1703 made up > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > > > > -Original Message----- > > From: r-help-boun...@r-project.

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread William Dunlap

p-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf > Of Rui Barradas > Sent: Friday, August 10, 2012 11:18 AM > To: Fred G > Cc: r-help > Subject: Re: [R] Regular Expressions + Matrices > > Hello, > > Try the following. > > > d

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Fred G

x,1918,ESPN >>> 4,Washington Nationals,2010,ESPN >>> 5,Detroit Tigers,1990,ESPN >>> ",sep=",",header=TRUE,stringsAsFactors=FALSE) >>> >>> index<-grep("New York.*",dat1$NAME) >>> dat1[i

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Rui Barradas

ESPN 4,Washington Nationals,2010,ESPN 5,Detroit Tigers,1990,ESPN ",sep=",",header=TRUE,stringsAsFactors=FALSE) index<-grep("New York.*",dat1$NAME) dat1[index,] # ID NAME YEAR SOURCE #1 1New York Mets 1900ESPN #2

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread arun

help@r-project.org Cc: Sent: Friday, August 10, 2012 1:41 PM Subject: [R] Regular Expressions + Matrices Hi all, My code looks like the following: inname = read.csv("ID_error_checker.csv", as.is=TRUE) outname = read.csv("output.csv", as.is=TRUE) #My algorithm is the following: #for

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Rui Barradas

Hello, Try the following. d <- read.table(textConnection(" ID NAME YEAR SOURCE 1 'New York Mets' 1900 ESPN 2 'New York Yankees' 1920 Cooperstown 3 'Boston Redsox' 1918 ESPN 4 'Washington Nationals' 2010

Re: [R] Regular Expressions + Matrices

2012-08-10 Thread Fred G

LSE) > > index<-grep("New York.*",dat1$NAME) > dat1[index,] > # ID NAME YEAR SOURCE > #1 1New York Mets 1900ESPN > #2 2 New York Yankees 1920 Cooperstown > > A.K. > > > > - Original Message - > From: Fred G

[R] Regular Expressions + Matrices

2012-08-10 Thread Fred G

Hi all, My code looks like the following: inname = read.csv("ID_error_checker.csv", as.is=TRUE) outname = read.csv("output.csv", as.is=TRUE) #My algorithm is the following: #for line in inname #if first string up to whitespace in row in inname$name = first string up to whitespace in row + 1 in in

Re: [R] regular expressions in R

2011-12-21 Thread jim holtman

To be correct for the regular expression, it should be: dir(pattern = "\\.(txt|doc)$") The form dir(pattern="*.txt") will match 'txt' appearing anywhere in the name; this looks like the argument you would have used to "Sys.glob" which is a UNIX style file name match and not a regular expression
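The anchored pattern can be checked without touching the file system by applying it to a character vector. The file names below are made up for illustration. utils::glob2rx() is an addition not mentioned in this message; it converts a shell-style wildcard into an anchored regular expression and may be convenient if you prefer to think in globs.

```r
# Hypothetical file names, chosen to include some near misses.
files <- c("a.txt", "b.doc", "c.docx", "txt_notes.csv", "notes.txt.bak")

# "\\." is a literal dot, (txt|doc) is either extension, "$" anchors at the end.
keep <- grepl("\\.(txt|doc)$", files)
print(files[keep])

# A glob translated to a regular expression (helper from the utils package).
txt_rx <- utils::glob2rx("*.txt")
print(txt_rx)
glob_keep <- grepl(txt_rx, files)
print(files[glob_keep])
```

The same pattern string can be passed as the pattern argument of dir() or list.files().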

Re: [R] regular expressions in R

2011-12-21 Thread R. Michael Weylandt

Do you wish to include .docx files as well or just .doc? Michael On Wed, Dec 21, 2011 at 10:04 AM, Alaios wrote: > Dear all > I would like to ask from dir function in R (?dir) > to give me only the files that end with .txt or .doc. > > The dir functions supports the use of patterns (is not that

Re: [R] regular expressions in R

2011-12-21 Thread Sarah Goslee

>From the help for dir: File naming conventions are platform dependent. The pattern matching works with the case of file names as returned by the OS On my linux system, this works: > dir(pattern="*.txt") [1] "a.txt" "b.txt" > > dir(pattern="*.doc") [1] "c.doc" > > dir(pattern="*.doc|*

[R] regular expressions in R

2011-12-21 Thread Alaios

Dear all I would like to ask from dir function in R (?dir) to give me only the files that end with .txt or .doc. The dir functions supports the use of patterns (is not that regular expressions) for doing that. print(dir(i,full.names=TRUE,pattern=.)) Could you please help me compose such a

Re: [R] Regular expressions in R

2011-11-16 Thread Michael Griffiths

Thanks to everyone who contributed to my questions. As ever, I am extremely grateful to all those on the R-list who make it what it is. Regards Mike Griffiths On Tue, Nov 15, 2011 at 5:47 PM, Joshua Wiley wrote: > Hi Michael, > > Your strings were long so I made a bit smaller example. Sarah ma

Re: [R] Regular expressions in R

2011-11-15 Thread Joshua Wiley

Hi Michael, Your strings were long so I made a bit smaller example. Sarah made one good point, you want to be using gsub() not sub(), but when I use your code, I do not think it even works precisely for one instance. Try this on for size, you were 99% there: ## simplified cases form1 <- c('produ

Re: [R] Regular expressions in R

2011-11-15 Thread Sarah Goslee

Hi Michael, You need to take another look at the examples you were given, and at the help for ?sub(): The two ‘*sub’ functions differ only in that ‘sub’ replaces only the first occurrence of a ‘pattern’ whereas ‘gsub’ replaces all occurrences. If ‘replacement’ contains backreferen
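A small sketch of the difference quoted from the help page, on a shortened, made-up formula string rather than the long ones from the original post:

```r
# Shortened stand-in for the formula strings in the question.
form <- "~ a + b + c"

# sub() replaces only the first " + ".
first_only <- sub(" \\+ ", "+", form)
print(first_only)

# gsub() replaces every " + ".
all_of_them <- gsub(" \\+ ", "+", form)
print(all_of_them)

# Backreferences: \\1 and \\2 refer to the parenthesised groups.
swapped <- gsub("([a-z]+) \\* ([a-z]+)", "\\2 * \\1", "x * y")
print(swapped)
```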

[R] Regular expressions in R

2011-11-15 Thread Michael Griffiths

Good afternoon list, I have the following character strings; one with spaces between the maths operators and variable names, and one without said spaces. form<-c('~ Sentence + LEGAL + Intro + Intro / Intro1 + Intro * LEGAL + benefit + benefit / benefit1 + product + action * mean + CTA + help + me

Re: [R] Regular Expressions for "Large" Data Set

2011-06-07 Thread Marc Schwartz

On Jun 7, 2011, at 3:55 PM, Abraham Mathew wrote: > I'm running R 2.13 on Ubuntu 10.10 > > I have a data set which is comprised of character strings. > > site = readLines('http://www.census.gov/tiger/tms/gazetteer/zips.txt') > > dat <- c("01, 35004, AL, ACMAR, 86.51557, 33.584132, 6055, 0.00149

[R] Regular Expressions for "Large" Data Set

2011-06-07 Thread Abraham Mathew

I'm running R 2.13 on Ubuntu 10.10 I have a data set which is comprised of character strings. site = readLines('http://www.census.gov/tiger/tms/gazetteer/zips.txt') dat <- c("01, 35004, AL, ACMAR, 86.51557, 33.584132, 6055, 0.001499") dat I want to loop through the data and construct a data fra

Re: [R] Regular Expressions in Column Headings

2011-03-09 Thread Gabor Grothendieck

On Wed, Mar 9, 2011 at 8:52 AM, Matthew DeAngelis wrote: > Hi all, > > I am hoping that someone can help me with a problem I am having with column > headings. I have read a table into R using read.table: the rows are > documents, and the columns are counts of regular expression matches (so that >

[R] Regular Expressions in Column Headings

2011-03-09 Thread Matthew DeAngelis

Hi all, I am hoping that someone can help me with a problem I am having with column headings. I have read a table into R using read.table: the rows are documents, and the columns are counts of regular expression matches (so that the column heading is the given regular expression). My problem is

Re: [R] Regular Expressions

2010-11-05 Thread Gabor Grothendieck

2010/11/5 Brian Diggs : > Is there a standard, built in way to get both (all) backreferences at the > same time with just one call to sub (or the appropriate function)? I can > cobble something together specifically for 2 backreferences (not extensively > tested): > > both_backrefs <- function(patt

Re: [R] Regular Expressions

2010-11-05 Thread Brian Diggs

On 11/5/2010 12:09 AM, Prof Brian Ripley wrote: On Thu, 4 Nov 2010, Noah Silverman wrote: Hi, I'm trying to figure out how to use capturing parenthesis in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example

Re: [R] Regular Expressions

2010-11-05 Thread Noah Silverman

That's perfect! Don't know how I missed that. I want to start playing with some modeling of financial data and the only format I can download is rather ugly. So my plan is to use a series of Regex to extract what I want. Noticed that you are a Prof. in applied stats. I'm at UCLA working on an

Re: [R] Regular Expressions

2010-11-05 Thread Prof Brian Ripley

On Thu, 4 Nov 2010, Noah Silverman wrote: Hi, I'm trying to figure out how to use capturing parenthesis in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string:"10 Nov 13.00 (PFE1020

[R] Regular Expressions

2010-11-04 Thread Noah Silverman

Hi, I'm trying to figure out how to use capturing parentheses in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string: "10 Nov 13.00 (PFE1020K13)" I want to capture the first two digits

Re: [R] Regular expressions: offsets of groups

2010-09-30 Thread Titus von der Malsburg

Ok, we decided to have a shot at modifying gregexpr. Let's see how it works out. If anybody is interested in discussing this please contact me. R-help doesn't seem like the right place for further discussion. Is there a default place for discussing things like that? Thanks everybody for your re

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Titus von der Malsburg

On Wed, Sep 29, 2010 at 1:58 PM, Michael Bedward wrote: > How is your C coding ? Bill ? Anyone else ? I could have a go at > writing some prototype code to test in the next few days, though if > someone else with decent C skills is itching to do it please speak up. We have a skilled C- and R-pr

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Michael Bedward

I'd definitely be a customer for it Titus. And it does seem like an obvious hole in regex processing in R that cries out to be filled. Um, ggregexpr isn't the sexiest of function names :) Perhaps we can think of something a little easier ? How is your C coding ? Bill ? Anyone else ? I could hav

Re: [R] Regular expressions: offsets of groups

2010-09-29 Thread Titus von der Malsburg

Bill, Michael, good to see I'm not the only one who sees potential for improvements in the regexpr domain. Adding a subpattern argument is certainly a step in the right direction and would make my life much easier. However, in my application I need to know not only the position of one group but a

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Michael Bedward

Ah, that's interesting - thanks Bill. That's certainly on the right track for me (Titus, you too ?) especially if the subpattern argument accepted a vector of multiple group indices. As you say, this is straightforward in C. I'd be happy to (try to) make a patch for the R sources if there was some

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread William Dunlap

> -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Bedward > Sent: Tuesday, September 28, 2010 12:46 AM > To: Titus von der Malsburg > Cc: r-help@r-project.org > Subject: Re: [R] Regular expressio

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Gabor Grothendieck

On Tue, Sep 28, 2010 at 6:52 AM, Titus von der Malsburg wrote: > On Tue, Sep 28, 2010 at 9:46 AM, Michael Bedward > wrote: >> What Titus wants to do is akin to retrieving capturing groups from a >> Matcher object in Java. > > Precisely. Here's the description: > > http://download.oracle.com/jav

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Titus von der Malsburg

On Tue, Sep 28, 2010 at 9:46 AM, Michael Bedward wrote: > What Titus wants to do is akin to retrieving capturing groups from a > Matcher object in Java. Precisely. Here's the description: http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html#start(int) Gabor's lookbe

Re: [R] Regular expressions: offsets of groups

2010-09-28 Thread Michael Bedward

What Titus wants to do is akin to retrieving capturing groups from a Matcher object in Java. I also thought there must be an existing, elegant solution to this some time ago and searched for it, including looking at the sources (albeit with not much expertise) but came up blank. I also looked at t

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Gabor Grothendieck

On Mon, Sep 27, 2010 at 1:34 PM, Titus von der Malsburg wrote: > On Mon, Sep 27, 2010 at 7:29 PM, Gabor Grothendieck > wrote: >> Try this zero width negative look behind expression: >> >>> gregexpr("(?!a+)(b+)", "abcdaabbc", perl = TRUE) >> [[1]] >> [1] 2 7 >> attr(,"match.length") >> [1] 1 2 > >

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Henrique Dallazuanna

You've tried: gregexpr("b+", "abcdaabbc") On Mon, Sep 27, 2010 at 12:48 PM, Titus von der Malsburg wrote: > Dear list! > > > gregexpr("a+(b+)", "abcdaabbc") > [[1]] > [1] 1 5 > attr(,"match.length") > [1] 2 4 > > What I want is the offsets of the matches for the group (b+), i.e. 2 > and 7, not

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Henrique Dallazuanna

You could do this: gregexpr("ab+", "abcdaabbcbb")[[1]] + 1 On Mon, Sep 27, 2010 at 2:25 PM, Titus von der Malsburg wrote: > On Mon, Sep 27, 2010 at 7:16 PM, Henrique Dallazuanna > wrote: > > You've tried: > > > > gregexpr("b+", "abcdaabbc") > > But this would match the third occurrence of b+ in

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg

On Mon, Sep 27, 2010 at 7:29 PM, Gabor Grothendieck wrote: > Try this zero width negative look behind expression: > >> gregexpr("(?!a+)(b+)", "abcdaabbc", perl = TRUE) > [[1]] > [1] 2 7 > attr(,"match.length") > [1] 1 2 Thanks Gabor, but this gives me the same result as gregexpr("b+", "abcdaab

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Gabor Grothendieck

On Mon, Sep 27, 2010 at 11:48 AM, Titus von der Malsburg wrote: > Dear list! > >> gregexpr("a+(b+)", "abcdaabbc") > [[1]] > [1] 1 5 > attr(,"match.length") > [1] 2 4 > > What I want is the offsets of the matches for the group (b+), i.e. 2 > and 7, not the offsets of the complete matches. Is there

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg

On Mon, Sep 27, 2010 at 7:16 PM, Henrique Dallazuanna wrote: > You've tried: > > gregexpr("b+", "abcdaabbc") But this would match the third occurrence of b+ in "abcdaabbcbb". But in this example I'm only interested in b+ if it's preceded by a+. Titus _

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg

Thank you Jim, but just as the solution that I discussed, your proposal involves deconstructing the pattern and searching several times. I'm looking for a general and efficient solution. Internally, the regexpr engine has all necessary information after one pass through the string. What I need i

Re: [R] Regular expressions: offsets of groups

2010-09-27 Thread jim holtman

try this: > x <- gregexpr("a+(b+)", "abcdaabbcaaacaaab") > justA <- gregexpr("a+", "abcdaabbcaaacaaab") > # find matches in 'x' for 'justA' > indx <- which(justA[[1]] %in% x[[1]]) > # now determine where 'b' starts > justA[[1]][indx] + attr(justA[[1]], 'match.length')[indx] [1] 2 7 17 > On M

[R] Regular expressions: offsets of groups

2010-09-27 Thread Titus von der Malsburg

Dear list! > gregexpr("a+(b+)", "abcdaabbc") [[1]] [1] 1 5 attr(,"match.length") [1] 2 4 What I want is the offsets of the matches for the group (b+), i.e. 2 and 7, not the offsets of the complete matches. Is there a way in R to get that? I know about gsubfn and strapply, but they only give me
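For readers finding this thread later: more recent versions of R (2.14.0 onward, as far as I know) attach group offsets to the result of regexpr() and gregexpr() when perl = TRUE is used. This was not available when the question was asked, so the sketch below is a later addition rather than an answer given in the thread.

```r
txt <- "abcdaabbc"

# With perl = TRUE the result carries "capture.start" and
# "capture.length" attributes, one column per capturing group.
m <- gregexpr("a+(b+)", txt, perl = TRUE)[[1]]

starts <- as.vector(attr(m, "capture.start"))
lens <- as.vector(attr(m, "capture.length"))
print(starts)
print(lens)

# Extract the group text from its offsets.
groups <- substring(txt, starts, starts + lens - 1)
print(groups)
```

With several groups in the pattern, the attributes are matrices with one row per match and one column per group, so all group positions come from a single pass.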

Re: [R] regular expressions

2009-10-26 Thread baptiste auguie

Perfect, thanks! baptiste 2009/10/26 Gabor Grothendieck : > Assuming only START fields match pat: > >> ## this one has more fields: how do I generalize the regular expression? >> st2 = c("START text1 1 text2 2.3 text3 5", "whatever intermediate text", > + "START text1 23.4 text2 3.1415 text3 6")

Re: [R] regular expressions

2009-10-26 Thread Gabor Grothendieck

Assuming only START fields match pat: > ## this one has more fields: how do I generalize the regular expression? > st2 = c("START text1 1 text2 2.3 text3 5", "whatever intermediate text", + "START text1 23.4 text2 3.1415 text3 6") > > pat <- "[[:alnum:]]+ +([0-9.]+)" > s <- strapply(st2, pat, c, s

[R] regular expressions

2009-10-26 Thread baptiste auguie

Dear list, I have the following text to parse (originating from readLines as some lines have unequal size), st = c("START text1 1 text2 2.3", "whatever intermediate text", "START text1 23.4 text2 3.1415") from which I'd like to extract the lines starting with "START", and group the subsequent fi

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Duncan Murdoch

On 06/07/2008 7:37 PM, Gabor Grothendieck wrote: Look at the discussion of zero width lookahead assertions in ?regex . Use perl = TRUE as previously indicated. Thanks, this seems to work: gsub( "(? On Sun, Jul 6, 2008 at 7:29 PM, Duncan Murdoch <[EMAIL PROTECTED]> wrote: On 06/07/2008 5:37 P

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Gabor Grothendieck

Look at the discussion of zero width lookahead assertions in ?regex . Use perl = TRUE as previously indicated. On Sun, Jul 6, 2008 at 7:29 PM, Duncan Murdoch <[EMAIL PROTECTED]> wrote: > On 06/07/2008 5:37 PM, (Ted Harding) wrote: >> >> On 06-Jul-08 21:17:04, Duncan Murdoch wrote: >>> >>> I'm tryi

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Duncan Murdoch

On 06/07/2008 5:37 PM, (Ted Harding) wrote: On 06-Jul-08 21:17:04, Duncan Murdoch wrote: I'm trying to write a gsub() call that takes a string and escapes all the unescaped quote marks in it. So the string \" would be left unchanged, but \\" would be changed to \\\" because the double ba

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Ted Harding

On 06-Jul-08 21:17:04, Duncan Murdoch wrote: > I'm trying to write a gsub() call that takes a string and escapes all > the unescaped quote marks in it. So the string > > \" > > would be left unchanged, but > > \\" > > would be changed to > > \\\" > > because the double backslash doesn't act

Re: [R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Gabor Grothendieck

Try adding perl = TRUE On Sun, Jul 6, 2008 at 5:17 PM, Duncan Murdoch <[EMAIL PROTECTED]> wrote: > I'm trying to write a gsub() call that takes a string and escapes all the > unescaped quote marks in it. So the string > > \" > > would be left unchanged, but > > \\" > > would be changed to > > \\\

[R] Regular expressions: bug or misunderstanding?

2008-07-06 Thread Duncan Murdoch

I'm trying to write a gsub() call that takes a string and escapes all the unescaped quote marks in it. So the string \" would be left unchanged, but \\" would be changed to \\\" because the double backslash doesn't act as an escape for the quote, the first just escapes the second. I have

Re: [R] Regular Expressions

2008-05-13 Thread Gabor Grothendieck

On Tue, May 13, 2008 at 5:02 AM, Shubha Vishwanath Karanth <[EMAIL PROTECTED]> wrote: > Suppose, > > S=c("World_is_beautiful", "one_two_three_four","My_book") > > I need to extract the last but one element of the strings. So, my output > should look like: > > Ans=c("is","three","My") > > gsub() ca

Re: [R] Regular Expressions

2008-05-13 Thread Richard . Cotton

> S=c("World_is_beautiful", "one_two_three_four","My_book") > I need to extract the last but one element of the strings. So, my > output should look like: > Ans=c("is","three","My") > gsub() can do this...but wondering how do I give the regular expression sapply(strsplit(S, "_"), functio

Re: [R] Regular Expressions

2008-05-13 Thread Dimitris Rizopoulos

hubha Vishwanath Karanth" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Tuesday, May 13, 2008 11:02 AM Subject: [R] Regular Expressions Hi R, Again struck with regular expressions... Suppose, S=c("World_is_beautiful", "one_two_three_four","My_

[R] Regular Expressions

2008-05-13 Thread Shubha Vishwanath Karanth

Hi R, Again stuck with regular expressions... Suppose, S=c("World_is_beautiful", "one_two_three_four","My_book") I need to extract the last but one element of the strings. So, my output should look like: Ans=c("is","three","My") gsub() can do this...but wondering how do I giv
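Two possible ways to get the last-but-one element are sketched below. The first follows the strsplit() idea that one of the replies starts with; the anonymous function is my own guess, since that reply is cut off in this listing. The second uses two sub() calls and is likewise only one of several regular expressions that would work.

```r
S <- c("World_is_beautiful", "one_two_three_four", "My_book")

# Split on "_" and take the element before the last one.
by_split <- sapply(strsplit(S, "_"), function(x) x[length(x) - 1])
print(by_split)

# Regex route: drop the last "_element", then drop everything
# up to and including the last remaining underscore.
without_last <- sub("_[^_]+$", "", S)
by_regex <- sub("^.*_", "", without_last)
print(by_regex)
```

Both versions assume every string contains at least one underscore; a string without one would need separate handling.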

Re: [R] Regular Expressions Help

2008-04-19 Thread Hans-Jörg Bibiko

On 19.04.2008, at 06:46, maud wrote: > I am having some trouble learning regular expressions. Let me describe > the general problem I am dealing with. Consider the following setup: > > Joe<- c(1,2,3) > Bob<- c(2,4,6) > Alice <- c(9,8,7) > > Matrix <- cbind(Joe, Bob, Alice) > St <- c("Bob", "Alice"

[R] Regular Expressions Help

2008-04-19 Thread maud

I am having some trouble learning regular expressions. Let me describe the general problem I am dealing with. Consider the following setup: Joe<- c(1,2,3) Bob<- c(2,4,6) Alice <- c(9,8,7) Matrix <- cbind(Joe, Bob, Alice) St <- c("Bob", "Alice", "Alice:Bob") Now I want to make a new matrix having

Re: [R] regular expressions

2008-03-12 Thread Christos Hatzis

David > Sent: Wednesday, March 12, 2008 12:15 PM > To: [EMAIL PROTECTED] > Subject: [R] regular expressions > > Hello all, > > Still fighting with regular expressions and such, I am again stuck: > > Suppose I have a vector of character chains. In this vector, >

[R] regular expressions

2008-03-12 Thread GOUACHE David

Hello all, Still fighting with regular expressions and such, I am again stuck: Suppose I have a vector of character chains. In this vector, I wish to identify which character chains start with a given pattern, and then replace everything that comes after said pattern. Here is a quick exampl

85 matches
