Paolo Bonzini wrote:
> * tests/inconsistent-range: New.
> * tests/Makefile.am (TESTS): Add it.
> ---
> tests/Makefile.am | 1 +
> tests/inconsistent-range | 17 +++++++++++++++++
> 2 files changed, 18 insertions(+), 0 deletions(-)
> create mode 100644 tests/inconsistent-range
>
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index f66543f..3db1cfb 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -59,6 +59,7 @@ TESTS = \
> help-version \
> ignore-mmap \
> include-exclude \
> + inconsistent-range \
> khadafy \
> max-count-vs-context \
> options \
> diff --git a/tests/inconsistent-range b/tests/inconsistent-range
> new file mode 100644
> index 0000000..e28acde
> --- /dev/null
> +++ b/tests/inconsistent-range
> @@ -0,0 +1,17 @@
> +#!/bin/sh
> +# This would fail for grep-2.6
> +. "${srcdir=.}/init.sh"; path_prepend_ ../src
> +
> +printf '00a\n00g\n00z\n00A\n00G\n00Z\n' > in || framework_failure_
> +
> +fail=0
> +
> +for LOC in en_US.UTF-8 en_US zh_CN $LOCALE_FR_UTF8 C; do
Hi Paolo,
Thanks for the fix and the test.
Both look fine. Nice trick to cross-check that way,
so the test passes even on systems lacking support for those locales.
> + out1=out1-$LOC
> + LC_ALL=$LOC grep -E '(.)\1[A-Z]' in > $out1 || fail=1
> + out2=out2-$LOC
> + LC_ALL=$LOC grep -E '[A-Z]' in > $out2 || fail=1
> + compare $out1 $out2 || fail=1
> +done
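To spell out why the cross-check is robust, here is a minimal sketch (not part of the patch) that runs the same comparison under a locale that is assumed not to exist. The locale name, the temporary directory, and the `crosscheck` variable are made up for illustration. When the locale cannot be set, grep should simply behave as in the C locale for both patterns, so the two outputs still agree:

```shell
# Sketch only: "xx_XX.no-such-locale" is a made-up name, assumed not installed.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '00a\n00g\n00z\n00A\n00G\n00Z\n' > in

loc=xx_XX.no-such-locale
# Every input line starts with "00", so (.)\1 always matches there and
# both patterns should select the same lines in any given locale.
LC_ALL=$loc grep -E '(.)\1[A-Z]' in > out1
LC_ALL=$loc grep -E '[A-Z]' in > out2

# Compare the two outputs instead of comparing against a fixed expectation.
if cmp -s out1 out2; then crosscheck=consistent; else crosscheck=inconsistent; fi
echo "$crosscheck"
```

Because the test only compares the two outputs against each other, it makes no assumption about which lines a given locale selects.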
For the record, this changes how ranges work.
The following shows the old, bad behavior (grep-2.7 and earlier);
note how [A-Z] also matches lowercase letters:
$ printf '00a\n00g\n00z\n00A\n00G\n00Z\n' > in
$ LC_ALL=en_US.UTF-8 /bin/grep -E '[A-Z]' in
00g
00z
00A
00G
00Z
With Paolo's change we avoid that common source of confusion:
$ LC_ALL=en_US.UTF-8 ./grep -E '[A-Z]' in
00A
00G
00Z
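
For anyone stuck with an older grep, two spellings should give the uppercase-only result regardless of this change: forcing the C locale, or using the POSIX character class. This is a sketch, not something taken from the patch; the temporary directory and the variable names are only for illustration:

```shell
# Sketch only: work in a scratch directory with the same input as above.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '00a\n00g\n00z\n00A\n00G\n00Z\n' > in

# In the C locale, [A-Z] is a plain byte range and covers only A..Z.
c_range=$(LC_ALL=C grep -E '[A-Z]' in)

# [[:upper:]] asks for uppercase letters directly, with no collation involved.
upper_class=$(grep -E '[[:upper:]]' in)

printf '%s\n' "$c_range"
```

Both variables should hold the three lines 00A, 00G and 00Z.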