tags 751828 + patch retitle 751828 pcre3: FTBFS: test failure: no longer produces "number too big in {} quantifier" found 751828 1:8.35-2 thanks
On Wed, 18 Jun 2014 at 11:38:11 -0300, Mauricio Faria de Oliveira wrote: > /a{11111111111111111111}/I > -Failed: number too big in {} quantifier at offset 22 > +Capturing subpattern count = 0 > +No options > +First char = 'a' > +No need char > > /(){64294967295}/I > -Failed: number too big in {} quantifier at offset 14 > +Failed: regular expression is too large at offset 15 > > /(){2,4294967295}/I > -Failed: number too big in {} quantifier at offset 15 > +Failed: numbers out of order in {} quantifier at offset 15 > [...] I can reproduce this on amd64 in 1:8.35-2, and it seems likely that the FTBFS on every architecture except amd64 has the same cause. These numbers are parsed using this loop, or matching code to parse max: while (IS_DIGIT(*p)) min = min * 10 + (int)(*p++ - CHAR_0); if (min < 0 || min > 65535) { *errorcodeptr = ERR5; return p; If min gets larger than 214748365 (slightly more than 2^31 / 10) in any given iteration, then the next multiplication by 10 is a signed integer overflow, which is formally undefined behaviour in C. Older versions of gcc effectively guaranteed that signed integer overflow wraps around in the way you might expect for twos-complement, e.g. 214748365 * 10 == (int) 2147483650U == -2147483646, but more recent gcc (when -fwrapv is not given) optimizes on the assumption that undefined behaviour will not be reached, so this overflow leads to unpredictable results. In the current implementation, even with older gcc or -fwrapv, it is probably also possible to construct invalid regexes that are incorrectly parsed because a large limit wraps all the way round into the valid range [0,65535] and is interpreted as a smaller limit. One solution that seems reasonable is to check for out-of-range after each digit is added. This should be portable to every platform with at least 32-bit int (strictly speaking, every platform where 655359 fits in an int). The attached patch 0002 fixes this. I also attach patch 0001 to fix #755052, which is non-essential but would help a lot with debugging any future pcre3 test failures. I might NMU this if it isn't fixed for a while, but I'll give the maintainer a chance to sanity-check the patch first. Regards, S
>From d0a1e185e351cde74b7fc0c072087d8d234e85b5 Mon Sep 17 00:00:00 2001 From: Simon McVittie <s...@debian.org> Date: Thu, 17 Jul 2014 10:21:06 +0100 Subject: [PATCH 1/2] Run tests with VERBOSE=1 so we can see the logs for failing tests --- debian/changelog | 8 ++++++++ debian/rules | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/debian/changelog b/debian/changelog index 0467f9c..5e10cfc 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,11 @@ +pcre3 (1:8.35-2.1) UNRELEASED; urgency=medium + + * Non-maintainer upload. + * Run tests with VERBOSE=1 so we can see the logs for failing tests + (Closes: #755052) + + -- Simon McVittie <s...@debian.org> Thu, 17 Jul 2014 09:27:30 +0100 + pcre3 (1:8.35-2) unstable; urgency=medium * Build-depends on auto-reconf (Closes: 754540) diff --git a/debian/rules b/debian/rules index 108a738..41ef1a3 100755 --- a/debian/rules +++ b/debian/rules @@ -50,7 +50,7 @@ build-stamp: configure-stamp # Add here commands to compile the package. $(MAKE) ifeq ($(DEB_BUILD_GNU_TYPE),$(DEB_HOST_GNU_TYPE)) - $(MAKE) check + $(MAKE) check VERBOSE=1 endif touch build-stamp -- 2.0.1
>From 536eda8ebff29c0bb08817da0b45298ac312b6b7 Mon Sep 17 00:00:00 2001 From: Simon McVittie <s...@debian.org> Date: Thu, 17 Jul 2014 10:21:25 +0100 Subject: [PATCH 2/2] Alter checks for too-large repeat count so they do not rely on integer overflow behaviour (closes: #751828) --- debian/changelog | 2 + ...g-repeat-count-check-for-overflow-after-e.patch | 93 ++++++++++++++++++++++ debian/patches/series | 1 + 3 files changed, 96 insertions(+) create mode 100644 debian/patches/0001-When-parsing-repeat-count-check-for-overflow-after-e.patch diff --git a/debian/changelog b/debian/changelog index 5e10cfc..f6880a2 100644 --- a/debian/changelog +++ b/debian/changelog @@ -3,6 +3,8 @@ pcre3 (1:8.35-2.1) UNRELEASED; urgency=medium * Non-maintainer upload. * Run tests with VERBOSE=1 so we can see the logs for failing tests (Closes: #755052) + * Alter checks for too-large repeat count so they do not rely on + integer overflow behaviour (closes: #751828) -- Simon McVittie <s...@debian.org> Thu, 17 Jul 2014 09:27:30 +0100 diff --git a/debian/patches/0001-When-parsing-repeat-count-check-for-overflow-after-e.patch b/debian/patches/0001-When-parsing-repeat-count-check-for-overflow-after-e.patch new file mode 100644 index 0000000..a909502 --- /dev/null +++ b/debian/patches/0001-When-parsing-repeat-count-check-for-overflow-after-e.patch @@ -0,0 +1,93 @@ +From 4bd1f2d7b149e8e2eb62447d32bfc2a21b6c1486 Mon Sep 17 00:00:00 2001 +From: Simon McVittie <s...@debian.org> +Date: Thu, 17 Jul 2014 10:15:29 +0100 +Subject: [PATCH] When parsing repeat count, check for overflow after each + digit + +Otherwise, if we try to parse a large number, we might wrap around +far enough that we get back into the allowed range [0,65535] and +incorrectly interpret a large number as a smaller number. + +This results in detecting overflows sooner, so the expected test +output changes. + +Strictly speaking, signed overflow is undefined behaviour in C, +and recent gcc takes advantage of this for optimizations, which might +be why this code broke recently. + +The check for min < 0, max < 0 is no longer needed with this +implementation (as long as an int is large enough to hold 655359) so +it can be deleted or kept. I left it in for now. + +Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=751828 +--- + pcre_compile.c | 22 ++++++++++++++-------- + testdata/testoutput2 | 6 +++--- + 2 files changed, 17 insertions(+), 11 deletions(-) + +diff --git a/pcre_compile.c b/pcre_compile.c +index 8a5b723..068a2df 100644 +--- a/pcre_compile.c ++++ b/pcre_compile.c +@@ -1586,11 +1586,14 @@ int max = -1; + /* Read the minimum value and do a paranoid check: a negative value indicates + an integer overflow. */ + +-while (IS_DIGIT(*p)) min = min * 10 + (int)(*p++ - CHAR_0); +-if (min < 0 || min > 65535) ++while (IS_DIGIT(*p)) + { +- *errorcodeptr = ERR5; +- return p; ++ min = min * 10 + (int)(*p++ - CHAR_0); ++ if (min < 0 || min > 65535) ++ { ++ *errorcodeptr = ERR5; ++ return p; ++ } + } + + /* Read the maximum value if there is one, and again do a paranoid on its size. +@@ -1601,11 +1604,14 @@ if (*p == CHAR_RIGHT_CURLY_BRACKET) max = min; else + if (*(++p) != CHAR_RIGHT_CURLY_BRACKET) + { + max = 0; +- while(IS_DIGIT(*p)) max = max * 10 + (int)(*p++ - CHAR_0); +- if (max < 0 || max > 65535) ++ while(IS_DIGIT(*p)) + { +- *errorcodeptr = ERR5; +- return p; ++ max = max * 10 + (int)(*p++ - CHAR_0); ++ if (max < 0 || max > 65535) ++ { ++ *errorcodeptr = ERR5; ++ return p; ++ } + } + if (max < min) + { +diff --git a/testdata/testoutput2 b/testdata/testoutput2 +index b6da7df..cfb446e 100644 +--- a/testdata/testoutput2 ++++ b/testdata/testoutput2 +@@ -5821,13 +5821,13 @@ No match + No match + + /a{11111111111111111111}/I +-Failed: number too big in {} quantifier at offset 22 ++Failed: number too big in {} quantifier at offset 8 + + /(){64294967295}/I +-Failed: number too big in {} quantifier at offset 14 ++Failed: number too big in {} quantifier at offset 9 + + /(){2,4294967295}/I +-Failed: number too big in {} quantifier at offset 15 ++Failed: number too big in {} quantifier at offset 11 + + "(?i:a)(?i:b)(?i:c)(?i:d)(?i:e)(?i:f)(?i:g)(?i:h)(?i:i)(?i:j)(k)(?i:l)A\1B"I + Capturing subpattern count = 1 +-- +2.0.1 + diff --git a/debian/patches/series b/debian/patches/series index aef6044..354651c 100644 --- a/debian/patches/series +++ b/debian/patches/series @@ -4,3 +4,4 @@ pcre_info.patch pcregrep.1-patch soname.patch no_jit_ppc64el.patch +0001-When-parsing-repeat-count-check-for-overflow-after-e.patch -- 2.0.1