> Date: Mon, 27 Jun 2011 15:10:43 +0200
> From: Paolo Bonzini
> To: Aharon Robbins
> CC: egg...@cs.ucla.edu, ebl...@redhat.com, bug-g...@gnu.org,
> bug-gnulib@gnu.org, k...@freefriends.org
> Subject: Re: Dealing with character ranges in grep
>
> On 06/16/2011
Paolo Bonzini wrote:
> On 06/15/2011 09:12 PM, Jim Meyering wrote:
>> However, backreferences force these tools to skip the DFA-based
>> optimization and resort to running the regexp code. In that case,
>> there is a dichotomy. Adding a backreference to a range-including
>> regexp would have the
On 06/16/2011 09:06 PM, Aharon Robbins wrote:
I have already corresponded with Chet. He plans to add a shell option
to enable RRI, and we can hope that at some point, it might become the
default. So that has already been started.
I'd do the other way round---make it the default, and add a shell
On 06/15/2011 09:12 PM, Jim Meyering wrote:
However, backreferences force these tools to skip the DFA-based
optimization and resort to running the regexp code. In that case,
there is a dichotomy. Adding a backreference to a range-including
regexp would have the surprising consequence of changin
On 06/16/2011 10:44 AM, Jim Meyering wrote:
> To make this proposed change go through, that configure-time option would
> have to be eliminated, so that we always build with the gnulib-provided
> regex code. Of course, if glibc ever changes, we can detect that and
> automatically prefer it w
Johannes Meixner wrote:
> Hello Jim,
>
> On Jun 16 10:44 Jim Meyering wrote (excerpt):
Thus, if we go this route, we are effectively saying
that people who want self-consistent regex-handling
in our tools must build with --with-included-regex or end
up causing subtle problems.
>
Hello Jim,
On Jun 16 10:44 Jim Meyering wrote (excerpt):
Thus, if we go this route, we are effectively saying
that people who want self-consistent regex-handling
in our tools must build with --with-included-regex or end
up causing subtle problems.
...
It goes like this (at least for gawk, gre
Hi.
> From: Jim Meyering
> To: Bruno Haible
> Cc: Paolo Bonzini, Aharon Robbins,
> bug-gnulib@gnu.org, bug-grep, k...@freefriends.org
> Subject: Re: Dealing with character ranges in grep
> Date: Thu, 16 Jun 2011 07:58:05 +0200
>
> To make this proposed
Hi All.
> Date: Wed, 15 Jun 2011 14:09:45 -0600
> From: Eric Blake
> To: Paul Eggert
> CC: Aharon Robbins, bonz...@gnu.org, bug-g...@gnu.org,
> bug-gnulib@gnu.org, k...@freefriends.org
> Subject: Re: Dealing with character ranges in grep
>
> > Doesn'
* Jim Meyering (j...@meyering.net) [20110616 10:55]:
> For the record, at least Fedora's grep and sed both build
> --without-included-regex, so would be affected.
SLES and openSUSE also build sed and grep --without-included-regex, so would
also be affected.
Philipp
Hello,
On Jun 16 15:51 Stanislav Brabec wrote:
Johannes Meixner wrote:
Again:
I do not care if this or that special feature is supported or not
because I think that consistent behaviour has topmost priority.
Do you prefer "consistent behavior of regexp in all applications across
the whole d
Johannes Meixner wrote:
> Again:
> I do not care if this or that special feature is supported or not
> because I think that consistent behaviour has topmost priority.
Do you prefer "consistent behavior of regexp in all applications across
the whole distribution" or "consistent behavior of (GNU) g
Jim Meyering wrote:
> Johannes Meixner wrote:
> > recently I became openSUSE package maintainer for grep and gawk.
> >
> > I added Stanislav Brabec, openSUSE package maintainer for sed.
> >
> > In short:
> > I support and appreciate everything which leads to consistence.
> ..
>
> Thanks for the qu
Hello,
On Jun 16 13:51 Stanislav Brabec wrote (excerpt):
grep in openSUSE uses glibc regex by default.
Yes.
Currently grep in openSUSE is built using
"configure --without-included-regex"
as it was built for openSUSE all the time.
Perhaps there is a misunderstanding about what I mean.
What I mea
Hello,
recently I became openSUSE package maintainer for grep and gawk.
I added Stanislav Brabec, openSUSE package maintainer for sed.
In short:
I support and appreciate everything which leads to consistence.
On Jun 16 07:58 Jim Meyering wrote:
Jim Meyering wrote:
Bruno Haible wrote:
Pao
Jim Meyering wrote:
...
>> Thus, if we go this route, we are effectively saying
>> that people who want self-consistent regex-handling
>> in our tools must build with --with-included-regex or end
>> up causing subtle problems.
>>
>> That's a big leap.
>> I'm not saying I won't take upstream grep ov
Johannes Meixner wrote:
> recently I became openSUSE package maintainer for grep and gawk.
>
> I added Stanislav Brabec, openSUSE package maintainer for sed.
>
> In short:
> I support and appreciate everything which leads to consistence.
...
Thanks for the quick reply and the support.
Jim Meyering wrote:
> Bruno Haible wrote:
>> Paolo,
>>
>>> > [=e=] to match "e" as well as accented versions like é, è and ê).
>>> > That is the one feature that you get with glibc, and that you would
>>> > sacrifice when building --with-included-regex.
>>>
>>> I agree. It's up to distros to choo
On 06/15/2011 12:36 PM, Paul Eggert wrote:
> On 06/15/11 10:00, Aharon Robbins wrote:
>> Can I get a clear "yes, grep and sed are going to change to Reasonable
>> Range Interpretation"?
>
> I can't speak for grep and sed since I'm not a maintainer of
> either, but to my mind the only thing that ma
Bruno Haible wrote:
> Paolo,
>
>> > [=e=] to match "e" as well as accented versions like é, è and ê).
>> > That is the one feature that you get with glibc, and that you would
>> > sacrifice when building --with-included-regex.
>>
>> I agree. It's up to distros to choose, of course.
>
> If you are
On 06/15/11 10:00, Aharon Robbins wrote:
> Can I get a clear "yes, grep and sed are going to change to Reasonable
> Range Interpretation"?
I can't speak for grep and sed since I'm not a maintainer of
either, but to my mind the only thing that makes sense is for
regular expressions like [a-z] to ha
Hi All.
Can I get a clear "yes, grep and sed are going to change to Reasonable
Range Interpretation"?
I was looking into the code, in terms of not using RE_RANGES_IGNORE_LOCALES
but simply always doing it based on character set ordering.
Doing so lets us throw away hard_locale.[ch] also.
Befor
Hi.
> From: Paolo Bonzini
> Date: Tue, 14 Jun 2011 13:11:32 +0200
> Subject: Re: Dealing with character ranges in grep
> To: Aharon Robbins
> Cc: egg...@cs.ucla.edu, k...@freefriends.org, bug-g...@gnu.org,
> bug-gnulib@gnu.org
>
> > ? In principle, I'm al
* Karl Berry (k...@freefriends.org) [20110611 01:50]:
> Because whatever changes they might or might not agree to make, they
> obviously won't reach user systems for years.
Not necessarily. Linux distributors do backports of changes they deem good
to have now and then.
Philipp
> In principle, I'm all for this, but in practice, I'm going to leave gawk's
> code alone for now (there's always 4.1 :-).
As long as --posix is not affecting the choice, that's fine. However,
please make sure that compiling gawk --without-included-regex works
(it should go without saying)!
Hi All.
> Date: Thu, 09 Jun 2011 10:14:01 -0700
> From: Paul Eggert
> To: Paolo Bonzini
> CC: Aharon Robbins, bug-grep,
> bug-gnulib, k...@freefriends.org
> Subject: Re: Dealing with character ranges in grep
>
> On 06/08/2011 10:14 PM, Aharon Robbins wrote:
>
I guess I don't follow the purpose of involving glibc now. Because
whatever changes they might or might not agree to make, they obviously
won't reach user systems for years. So for anyone to make use of the
new options, it all has to be implemented in gnulib regex anyway. If
the goal is to minim
Bruno Haible wrote:
>> With my proposal, distros/people that use --with-included-regex would
>> get understandable semantics + no equivalence classes
>> ...
>> locale behavior of regex are irremediably
>> broken. For example, when you have a collation element, you can match
>> it using ranges (e.g
> Subject: Re: Dealing with character ranges in grep
>
> On 06/08/2011 10:14 PM, Aharon Robbins wrote:
>
> > So, for the upcoming gawk 4.0, I decided (as Karl put it) to cut the
> > Gordian knot and make ranges behave like the C locale, the way it's long
> > been documented, an
On Thu, Jun 9, 2011 at 19:14, Paul Eggert wrote:
> On 06/08/2011 10:14 PM, Aharon Robbins wrote:
>
>> So, for the upcoming gawk 4.0, I decided (as Karl put it) to cut the
>> Gordian knot and make ranges behave like the C locale, the way it's long
>> been documented, and as most people expect. Tho
On 06/08/2011 10:14 PM, Aharon Robbins wrote:
> So, for the upcoming gawk 4.0, I decided (as Karl put it) to cut the
> Gordian knot and make ranges behave like the C locale, the way it's long
> been documented, and as most people expect. Those who want the POSIX
> behavior can still get it using
On 06/09/2011 01:53 PM, Bruno Haible wrote:
Paolo,
My proposal wouldn't change defaults, which is why I believe that this
is a separate topic.
But at the same time you are pushing for the use of --with-included-regex.
We found out that by doing this, the equivalence classes feature gets lost,
Paolo,
> My proposal wouldn't change defaults, which is why I believe that this
> is a separate topic.
But at the same time you are pushing for the use of --with-included-regex.
We found out that by doing this, the equivalence classes feature gets lost,
and the divergence between glibc and gnuli
On 06/09/2011 01:12 PM, Bruno Haible wrote:
What would it take to let distros/people use --with-included-regex and
get understandable semantics for ranges + working equivalence classes?
I would prefer that to your proposal, because it cannot be seen as a
regression by people who care about equiv
Paolo,
> With my proposal, distros/people that use --with-included-regex would
> get understandable semantics + no equivalence classes
> ...
> locale behavior of regex are irremediably
> broken. For example, when you have a collation element, you can match
> it using ranges (e.g. [d-i] matches
On 06/09/2011 11:58 AM, Bruno Haible wrote:
Paolo,
[=e=] to match "e" as well as accented versions like é, è and ê).
That is the one feature that you get with glibc, and that you would
sacrifice when building --with-included-regex.
I agree. It's up to distros to choose, of course.
If you a
Paolo,
> > [=e=] to match "e" as well as accented versions like é, è and ê).
> > That is the one feature that you get with glibc, and that you would
> > sacrifice when building --with-included-regex.
>
> I agree. It's up to distros to choose, of course.
If you are on the point of sacrificing a
On 06/09/2011 11:33 AM, Jim Meyering wrote:
I like the idea.
However a potential sticking point is the equivalence class (e.g., using
[=e=] to match "e" as well as accented versions like é, è and ê).
That is the one feature that you get with glibc, and that you would
sacrifice when building --wit
Paolo Bonzini wrote:
> [making this public, there should be no reason not to]
>
> On 06/08/2011 10:14 PM, Aharon Robbins wrote:
>> Hi. As we've discussed a little previously, I finally got tired of
>> trying to explain to users why the character range [a-z] was matching
>> most uppercase letters a
[making this public, there should be no reason not to]
On 06/08/2011 10:14 PM, Aharon Robbins wrote:
Hi. As we've discussed a little previously, I finally got tired of
trying to explain to users why the character range [a-z] was matching
most uppercase letters also. ("I've found a bug in gawk!
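
To illustrate the complaint that opens the thread, here is a minimal sketch of why [a-z] can match most uppercase letters when a range is evaluated by collation order rather than by character code. The collation sequence below (a A b B ... z Z) is a hypothetical toy ordering chosen for illustration, not the actual order of any particular locale, and the function names are likewise made up for this example.

```python
import string

# Hypothetical collation order for illustration only: each lowercase
# letter is followed by its uppercase form (a A b B ... z Z).
# Real locales define their own, often more complex, orderings.
COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)
RANK = {ch: i for i, ch in enumerate(COLLATION)}


def in_range_codepoint(ch, lo, hi):
    """Range membership by character code, as in the C locale."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_range_collation(ch, lo, hi):
    """Range membership by position in the toy collation order."""
    if ch not in RANK:
        return False
    return RANK[lo] <= RANK[ch] <= RANK[hi]


# Which uppercase letters fall inside the range a-z under each rule?
uppercase_hits_codepoint = [
    c for c in string.ascii_uppercase if in_range_codepoint(c, "a", "z")
]
uppercase_hits_collation = [
    c for c in string.ascii_uppercase if in_range_collation(c, "a", "z")
]

print(uppercase_hits_codepoint)
print("".join(uppercase_hits_collation))
```

Under the character-code rule no uppercase letter is in a-z. Under the toy collation order every uppercase letter except Z sorts between a and z, which is the kind of result that users reported as a bug.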