Re: [R] using regular expressions to retrieve a digit-digit-dot structure from a string

Gabor Grothendieck Mon, 08 Jun 2009 16:50:19 -0700

On Mon, Jun 8, 2009 at 7:18 PM, Wacek Kusnierczyk
<waclaw.marcin.kusnierc...@idi.ntnu.no> wrote:
> Gabor Grothendieck wrote:
>> Try this.  See ?regex for more.
>>
>>
>>> x <- 'This happened in the 21. century." (the dot behind 21 is'
>>> regexpr("(?![0-9]+)[.]", x, perl = TRUE)
>>>
>> [1] 24
>> attr(,"match.length")
>> [1] 1
>>
>
> yes, but
>
>    gregexpr('(?![0-9]+)[.]', 'a. 1. a1.', perl=TRUE)
>    # 2 5 9


Yes, it should use a lookbehind rather than a negative lookahead:

> gregexpr('(?<=[0-9])[.]', 'a. 1. a1.', perl=TRUE)
[[1]]
[1] 5 9
attr(,"match.length")
[1] 1 1

which displays the position of every dot that is immediately
preceded by a digit.  Or just replace gregexpr with regexpr
if it's intended that it match only the first one.
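
To see why the two patterns differ, here is a minimal sketch that runs both on the same test string. The variable names (s, ahead, behind) are only for illustration. The negative lookahead is tested at the position of the dot itself, and a dot is never a digit, so the assertion always succeeds and every dot matches; the lookbehind instead inspects the character just before the dot.

```r
s <- 'a. 1. a1.'

# Negative lookahead: asserts that the text starting at the current
# position is not a run of digits. At a dot this is always true,
# so the pattern matches every dot in the string.
ahead <- gregexpr('(?![0-9]+)[.]', s, perl = TRUE)[[1]]

# Lookbehind: asserts that the character immediately before the
# current position is a digit, so only dots after a digit match.
behind <- gregexpr('(?<=[0-9])[.]', s, perl = TRUE)[[1]]

as.integer(ahead)   # 2 5 9
as.integer(behind)  # 5 9
```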

>
> which, i guess, is not what you want.  if what you want is to match all
> and only dots that follow at least one digit preceded by a word
> boundary, then the following should do, as far as i can see:
>
>    gregexpr('\\b[0-9]+\\K[.]', 'a. 1. a1.', perl=TRUE)
>    # 5
>
> vQ
>
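
For comparison, the \K pattern quoted above can be checked the same way. This is only a sketch with illustrative variable names (s, kept): \\b[0-9]+ requires the run of digits to start at a word boundary, and \\K then drops everything matched so far from the reported match, so only the dot itself is returned.

```r
s <- 'a. 1. a1.'

# \\b[0-9]+ matches a run of digits that begins at a word boundary;
# \\K resets the start of the reported match, leaving just the dot.
# The dot in 'a1.' is skipped because there is no word boundary
# between 'a' and '1'.
kept <- gregexpr('\\b[0-9]+\\K[.]', s, perl = TRUE)[[1]]

as.integer(kept)  # 5
```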

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
