Re: ubsan: d-demangle.c:214 signed integer overflow

Iain Buclaw via Gcc-patches Thu, 03 Sep 2020 14:03:24 -0700

Excerpts from Alan Modra's message of September 3, 2020 3:01 pm:
> Running the libiberty testsuite
> ./test-demangle < libiberty/testsuite/d-demangle-expected
> libiberty/d-demangle.c:214:14: runtime error: signed integer overflow: 
> 922337203 * 10 cannot be represented in type 'long int'
> 
> On looking at silencing ubsan, I found a real bug in dlang_number.
> For a 32-bit long, some overflows won't be detected.  For example,
> 21474836480.  Why?  Well 214748364 * 10 is 0x7FFFFFF8 (no overflow so
> far).  Adding 8 gives 0x80000000 (which does overflow but there is no
> test for that overflow in the code).  Then multiplying 0x80000000 * 10
> = 0x500000000 = 0 won't be caught by the multiplication overflow test.
> The same holds for a 64-bit long using similarly crafted digit
> sequences.
> 
> This patch replaces the mod 10 test with a simpler limit test, and
> similarly the mod 26 test in dlang_decode_backref.
> 
> About the limit test:
>   val * 10 + digit > ULONG_MAX is the condition for overflow
> ie.
>   val * 10 > ULONG_MAX - digit
> or
>   val > (ULONG_MAX - digit) / 10
> or assuming the largest digit
>   val > (ULONG_MAX - 9) / 10
> 
> I resisted the aesthetic appeal of simplifying this further to
>   val > -10UL / 10
> since -1UL for ULONG_MAX is only correct for 2's complement numbers.
> 
> Passes all the libiberty tests, on both 32-bit and 64-bit hosts.  OK
> to apply?
>


Thanks Alan, change seems reasonable, however on giving it a mull over,
I see that the largest number that dlang_number would need to be able to
handle is UINT_MAX.  These two tests which decode a wchar value are
representative of that (first is valid, second invalid).

#
--format=dlang
_D4test21__T3funVwi4294967295Z3funFNaNbNiNfZv
test.fun!('\Uffffffff').fun()
#
--format=dlang
_D4test21__T3funVwi4294967296Z3funFNaNbNiNfZv
_D4test21__T3funVwi4294967296Z3funFNaNbNiNfZv

I'm fine with creating a new PR and dealing with the above in a separate
change though, as it will require a few more replacements to adjust the
result parameter type to 'unsigned' or 'long long'.  

Iain.


>       * d-demangle.c: Include limits.h.
>       (ULONG_MAX): Provide fall-back definition.
>       (dlang_number): Simplify and correct overflow test.  Only
>       write *ret on returning non-NULL.
>       (dlang_decode_backref): Likewise.
> 
> diff --git a/libiberty/d-demangle.c b/libiberty/d-demangle.c
> index f2d6946eca..59e6ae007a 100644
> --- a/libiberty/d-demangle.c
> +++ b/libiberty/d-demangle.c
> @@ -31,6 +31,9 @@ If not, see <http://www.gnu.org/licenses/>.  */
>  #ifdef HAVE_CONFIG_H
>  #include "config.h"
>  #endif
> +#ifdef HAVE_LIMITS_H
> +#include <limits.h>
> +#endif
>  
>  #include "safe-ctype.h"
>  
> @@ -45,6 +48,10 @@ If not, see <http://www.gnu.org/licenses/>.  */
>  #include <demangle.h>
>  #include "libiberty.h"
>  
> +#ifndef ULONG_MAX
> +#define      ULONG_MAX       (~0UL)
> +#endif
> +
>  /* A mini string-handling package */
>  
>  typedef struct string                /* Beware: these aren't required to be 
> */
> @@ -207,24 +214,24 @@ dlang_number (const char *mangled, long *ret)
>    if (mangled == NULL || !ISDIGIT (*mangled))
>      return NULL;
>  
> -  (*ret) = 0;
> +  unsigned long val = 0;
>  
>    while (ISDIGIT (*mangled))
>      {
> -      (*ret) *= 10;
> -
> -      /* If an overflow occured when multiplying by ten, the result
> -      will not be a multiple of ten.  */
> -      if ((*ret % 10) != 0)
> +      /* Check for overflow.  Yes, we return NULL here for some digits
> +      that don't overflow "val * 10 + digit", but that doesn't
> +      matter given the later "(long) val < 0" test.  */
> +      if (val > (ULONG_MAX - 9) / 10)
>       return NULL;
>  
> -      (*ret) += mangled[0] - '0';
> +      val = val * 10 + mangled[0] - '0';
>        mangled++;
>      }
>  
> -  if (*mangled == '\0' || *ret < 0)
> +  if (*mangled == '\0' || (long) val < 0)
>      return NULL;
>  
> +  *ret = val;
>    return mangled;
>  }
>  
> @@ -294,24 +301,24 @@ dlang_decode_backref (const char *mangled, long *ret)
>           [A-Z] NumberBackRef
>           ^
>     */
> -  (*ret) = 0;
> +  unsigned long val = 0;
>  
>    while (ISALPHA (*mangled))
>      {
> -      (*ret) *= 26;
> +      /* Check for overflow.  */
> +      if (val > (ULONG_MAX - 25) / 26)
> +     break;
>  
> -      /* If an overflow occured when multiplying by 26, the result
> -      will not be a multiple of 26.  */
> -      if ((*ret % 26) != 0)
> -     return NULL;
> +      val *= 26;
>  
>        if (mangled[0] >= 'a' && mangled[0] <= 'z')
>       {
> -       (*ret) += mangled[0] - 'a';
> +       val += mangled[0] - 'a';
> +       *ret = val;
>         return mangled + 1;
>       }
>  
> -      (*ret) += mangled[0] - 'A';
> +      val += mangled[0] - 'A';
>        mangled++;
>      }
>  
> -- 
> Alan Modra
> Australia Development Lab, IBM
>

Re: ubsan: d-demangle.c:214 signed integer overflow

Reply via email to