For some reason I split out sanitizing sizetypes from the no-undefined-overflow branch. In fact the following patch tries to fix only one thing: make unsigned sizetypes no longer sign-extended.
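To make the change concrete, here is a minimal sketch of the two ways a 32-bit sizetype constant can be widened into a 64-bit host integer. It is plain C written for illustration, not GCC code: the function names are hypothetical, `int64_t` stands in for a 64-bit HOST_WIDE_INT, and the cast through `int32_t` assumes the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour (illustrative): treat the 32 bits as signed, so the top
   bit is copied into the upper half of the host integer.  The
   uint32_t -> int32_t cast is implementation-defined for large values;
   two's-complement wrapping is assumed here.  */
static int64_t
widen_sign_extended (uint32_t size_bits)
{
  return (int64_t) (int32_t) size_bits;
}

/* Behaviour intended for unsigned sizetype: the upper half is zero.  */
static int64_t
widen_zero_extended (uint32_t size_bits)
{
  return (int64_t) size_bits;
}

/* Small self-check that documents the difference.  */
static void
check_widening_examples (void)
{
  /* Small values look the same either way.  */
  assert (widen_sign_extended (16u) == widen_zero_extended (16u));
  /* The largest 32-bit size differs: -1 versus 4294967295.  */
  assert (widen_sign_extended (0xFFFFFFFFu) == -1);
  assert (widen_zero_extended (0xFFFFFFFFu) == 4294967295LL);
}
```

Only values with the top bit set are affected, which is why the two representations agree for most sizes and offsets seen in practice.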
The change has (unfortunately) interesting side-effects, and some of them have been mitigated by patches already committed during the last half year. The main remaining issue I run into (and that sort-of blocks me from really pursuing this) is that there is a lot of code in GCC that assumes it can do modulo arithmetic on HOST_WIDE_INTs for sizetypes (pointers / offsets in general). That is wrong when a HOST_WIDE_INT is wider than sizetype (with the -m32 multilib on x86_64, for example). It happens to work in almost all cases if we sign-extend from 32 bits to a 64-bit HWI, but not when we zero-extend.

I'm not exactly sure how to proceed here, other than adding fixups in various places (as seen in the patch below) and hoping to catch all existing errors. [Yes, I already tried making all sizetypes signed, really signed, but that has loads of fallout as well.]

That said, the fact that sizetypes are sign-extended has caused wrong-code bugs in the past, and TYPE_UNSIGNED is used really inconsistently. So I think this is definitely worth fixing.

Possibly a first move would be to (finally) get rid of the requirement of sizetype offset arguments for POINTER_PLUS_EXPR. I haven't started on that project yet, though.

As far as I remember, the following bootstrapped fine on x86_64-unknown-linux-gnu, at least for C, with a few remaining regressions. I'm trying a full all-lang bootstrap & regtest now.

Comments? Questions?

Thanks,
Richard.

2011-06-16  Richard Guenther  <rguent...@suse.de>

        * fold-const.c (div_if_zero_remainder): sizetypes no longer
        sign-extend.
        * stor-layout.c (initialize_sizetypes): Likewise.
        * tree-ssa-ccp.c (bit_value_unop_1): Likewise.
        (bit_value_binop_1): Likewise.
        * tree.c (double_int_to_tree): Likewise.
        (double_int_fits_to_tree_p): Likewise.
        (force_fit_type_double): Likewise.
        (host_integerp): Likewise.
        (int_fits_type_p): Likewise.
        * expr.c (get_inner_reference): Sign-extend offset.
        * tree-ssa-structalias.c (get_constraint_for_ptr_offset): Likewise.
        * tree-cfg.c (verify_types_in_gimple_reference): Do not compare
        sizes by pointer.

Index: trunk/gcc/fold-const.c
===================================================================
*** trunk.orig/gcc/fold-const.c 2011-06-10 12:57:13.000000000 +0200
--- trunk/gcc/fold-const.c 2011-06-14 15:29:04.000000000 +0200
*************** div_if_zero_remainder (enum tree_code co
*** 194,202 ****
       does the correct thing for POINTER_PLUS_EXPR where we want
       a signed division.  */
    uns = TYPE_UNSIGNED (TREE_TYPE (arg2));
-   if (TREE_CODE (TREE_TYPE (arg2)) == INTEGER_TYPE
-       && TYPE_IS_SIZETYPE (TREE_TYPE (arg2)))
-     uns = false;
  
    quo = double_int_divmod (tree_to_double_int (arg1),
                             tree_to_double_int (arg2),
--- 194,199 ----
Index: trunk/gcc/stor-layout.c
===================================================================
*** trunk.orig/gcc/stor-layout.c 2011-06-09 14:49:35.000000000 +0200
--- trunk/gcc/stor-layout.c 2011-06-14 15:31:36.000000000 +0200
*************** initialize_sizetypes (void)
*** 2232,2242 ****
    TYPE_SIZE_UNIT (sizetype) = size_int (GET_MODE_SIZE (TYPE_MODE (sizetype)));
    set_min_and_max_values_for_integral_type (sizetype, precision,
                                              /*is_unsigned=*/true);
-   /* sizetype is unsigned but we need to fix TYPE_MAX_VALUE so that it is
-      sign-extended in a way consistent with force_fit_type.  */
-   TYPE_MAX_VALUE (sizetype)
-     = double_int_to_tree (sizetype,
-                           tree_to_double_int (TYPE_MAX_VALUE (sizetype)));
  
    SET_TYPE_MODE (bitsizetype, smallest_mode_for_size (bprecision, MODE_INT));
    TYPE_ALIGN (bitsizetype) = GET_MODE_ALIGNMENT (TYPE_MODE (bitsizetype));
--- 2232,2237 ----
*************** initialize_sizetypes (void)
*** 2245,2251 ****
      = size_int (GET_MODE_SIZE (TYPE_MODE (bitsizetype)));
    set_min_and_max_values_for_integral_type (bitsizetype, bprecision,
                                              /*is_unsigned=*/true);
-   /* ??? TYPE_MAX_VALUE is not properly sign-extended.  */
  
    /* Create the signed variants of *sizetype.  */
    ssizetype = make_signed_type (TYPE_PRECISION (sizetype));
--- 2240,2245 ----
Index: trunk/gcc/tree-ssa-ccp.c
===================================================================
*** trunk.orig/gcc/tree-ssa-ccp.c 2011-05-03 15:05:25.000000000 +0200
--- trunk/gcc/tree-ssa-ccp.c 2011-06-14 15:29:04.000000000 +0200
*************** bit_value_unop_1 (enum tree_code code, t
*** 1152,1165 ****
        bool uns;
  
        /* First extend mask and value according to the original type.  */
!       uns = (TREE_CODE (rtype) == INTEGER_TYPE && TYPE_IS_SIZETYPE (rtype)
!              ? 0 : TYPE_UNSIGNED (rtype));
        *mask = double_int_ext (rmask, TYPE_PRECISION (rtype), uns);
        *val = double_int_ext (rval, TYPE_PRECISION (rtype), uns);
  
        /* Then extend mask and value according to the target type.  */
!       uns = (TREE_CODE (type) == INTEGER_TYPE && TYPE_IS_SIZETYPE (type)
!              ? 0 : TYPE_UNSIGNED (type));
        *mask = double_int_ext (*mask, TYPE_PRECISION (type), uns);
        *val = double_int_ext (*val, TYPE_PRECISION (type), uns);
        break;
--- 1152,1163 ----
        bool uns;
  
        /* First extend mask and value according to the original type.  */
!       uns = TYPE_UNSIGNED (rtype);
        *mask = double_int_ext (rmask, TYPE_PRECISION (rtype), uns);
        *val = double_int_ext (rval, TYPE_PRECISION (rtype), uns);
  
        /* Then extend mask and value according to the target type.  */
!       uns = TYPE_UNSIGNED (type);
        *mask = double_int_ext (*mask, TYPE_PRECISION (type), uns);
        *val = double_int_ext (*val, TYPE_PRECISION (type), uns);
        break;
*************** bit_value_binop_1 (enum tree_code code,
*** 1181,1188 ****
                     tree r1type, double_int r1val, double_int r1mask,
                     tree r2type, double_int r2val, double_int r2mask)
  {
!   bool uns = (TREE_CODE (type) == INTEGER_TYPE
!               && TYPE_IS_SIZETYPE (type) ? 0 : TYPE_UNSIGNED (type));
    /* Assume we'll get a constant result.  Use an initial varying value,
       we fall back to varying in the end if necessary.  */
    *mask = double_int_minus_one;
--- 1179,1185 ----
                     tree r1type, double_int r1val, double_int r1mask,
                     tree r2type, double_int r2val, double_int r2mask)
  {
!   bool uns = TYPE_UNSIGNED (type);
    /* Assume we'll get a constant result.  Use an initial varying value,
       we fall back to varying in the end if necessary.  */
    *mask = double_int_minus_one;
*************** bit_value_binop_1 (enum tree_code code,
*** 1249,1261 ****
              }
            else if (shift < 0)
              {
-               /* ??? We can have sizetype related inconsistencies in
-                  the IL.  */
-               if ((TREE_CODE (r1type) == INTEGER_TYPE
-                    && (TYPE_IS_SIZETYPE (r1type)
-                        ? 0 : TYPE_UNSIGNED (r1type))) != uns)
-                 break;
- 
                shift = -shift;
                *mask = double_int_rshift (r1mask, shift,
                                           TYPE_PRECISION (type), !uns);
--- 1246,1251 ----
*************** bit_value_binop_1 (enum tree_code code,
*** 1367,1378 ****
            break;
  
          /* For comparisons the signedness is in the comparison operands.  */
!         uns = (TREE_CODE (r1type) == INTEGER_TYPE
!                && TYPE_IS_SIZETYPE (r1type) ? 0 : TYPE_UNSIGNED (r1type));
!         /* ??? We can have sizetype related inconsistencies in the IL.  */
!         if ((TREE_CODE (r2type) == INTEGER_TYPE
!              && TYPE_IS_SIZETYPE (r2type) ? 0 : TYPE_UNSIGNED (r2type)) != uns)
!           break;
  
          /* If we know the most significant bits we know the values
             value ranges by means of treating varying bits as zero
--- 1357,1363 ----
            break;
  
          /* For comparisons the signedness is in the comparison operands.  */
!         uns = TYPE_UNSIGNED (r1type);
  
          /* If we know the most significant bits we know the values
             value ranges by means of treating varying bits as zero
Index: trunk/gcc/tree.c
===================================================================
*** trunk.orig/gcc/tree.c 2011-06-07 16:36:52.000000000 +0200
--- trunk/gcc/tree.c 2011-06-14 15:29:04.000000000 +0200
*************** tree
*** 1051,1059 ****
  double_int_to_tree (tree type, double_int cst)
  {
    /* Size types *are* sign extended.  */
!   bool sign_extended_type = (!TYPE_UNSIGNED (type)
!                              || (TREE_CODE (type) == INTEGER_TYPE
!                                  && TYPE_IS_SIZETYPE (type)));
  
    cst = double_int_ext (cst, TYPE_PRECISION (type), !sign_extended_type);
  
--- 1051,1057 ----
  double_int_to_tree (tree type, double_int cst)
  {
    /* Size types *are* sign extended.  */
!   bool sign_extended_type = !TYPE_UNSIGNED (type);
  
    cst = double_int_ext (cst, TYPE_PRECISION (type), !sign_extended_type);
  
*************** bool
*** 1067,1075 ****
  double_int_fits_to_tree_p (const_tree type, double_int cst)
  {
    /* Size types *are* sign extended.  */
!   bool sign_extended_type = (!TYPE_UNSIGNED (type)
!                              || (TREE_CODE (type) == INTEGER_TYPE
!                                  && TYPE_IS_SIZETYPE (type)));
  
    double_int ext
      = double_int_ext (cst, TYPE_PRECISION (type), !sign_extended_type);
--- 1065,1071 ----
  double_int_fits_to_tree_p (const_tree type, double_int cst)
  {
    /* Size types *are* sign extended.  */
!   bool sign_extended_type = !TYPE_UNSIGNED (type);
  
    double_int ext
      = double_int_ext (cst, TYPE_PRECISION (type), !sign_extended_type);
*************** force_fit_type_double (tree type, double
*** 1099,1107 ****
    bool sign_extended_type;
  
    /* Size types *are* sign extended.  */
!   sign_extended_type = (!TYPE_UNSIGNED (type)
!                         || (TREE_CODE (type) == INTEGER_TYPE
!                             && TYPE_IS_SIZETYPE (type)));
  
    /* If we need to set overflow flags, return a new unshared node.  */
    if (overflowed || !double_int_fits_to_tree_p(type, cst))
--- 1095,1101 ----
    bool sign_extended_type;
  
    /* Size types *are* sign extended.  */
!   sign_extended_type = !TYPE_UNSIGNED (type);
  
    /* If we need to set overflow flags, return a new unshared node.  */
    if (overflowed || !double_int_fits_to_tree_p(type, cst))
*************** host_integerp (const_tree t, int pos)
*** 6425,6433 ****
                 && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
                || (! pos && TREE_INT_CST_HIGH (t) == -1
                    && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
!                   && (!TYPE_UNSIGNED (TREE_TYPE (t))
!                       || (TREE_CODE (TREE_TYPE (t)) == INTEGER_TYPE
!                           && TYPE_IS_SIZETYPE (TREE_TYPE (t)))))
                || (pos && TREE_INT_CST_HIGH (t) == 0)));
  }
  
--- 6419,6425 ----
                 && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
                || (! pos && TREE_INT_CST_HIGH (t) == -1
                    && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
!                   && !TYPE_UNSIGNED (TREE_TYPE (t)))
                || (pos && TREE_INT_CST_HIGH (t) == 0)));
  }
  
*************** int_fits_type_p (const_tree c, const_tre
*** 8117,8134 ****
    dc = tree_to_double_int (c);
    unsc = TYPE_UNSIGNED (TREE_TYPE (c));
  
-   if (TREE_CODE (TREE_TYPE (c)) == INTEGER_TYPE
-       && TYPE_IS_SIZETYPE (TREE_TYPE (c))
-       && unsc)
-     /* So c is an unsigned integer whose type is sizetype and type is not.
-        sizetype'd integers are sign extended even though they are
-        unsigned. If the integer value fits in the lower end word of c,
-        and if the higher end word has all its bits set to 1, that
-        means the higher end bits are set to 1 only for sign extension.
-        So let's convert c into an equivalent zero extended unsigned
-        integer.  */
-     dc = double_int_zext (dc, TYPE_PRECISION (TREE_TYPE (c)));
- 
  retry:
    type_low_bound = TYPE_MIN_VALUE (type);
    type_high_bound = TYPE_MAX_VALUE (type);
--- 8109,8114 ----
*************** retry:
*** 8147,8156 ****
    if (type_low_bound && TREE_CODE (type_low_bound) == INTEGER_CST)
      {
        dd = tree_to_double_int (type_low_bound);
-       if (TREE_CODE (type) == INTEGER_TYPE
-           && TYPE_IS_SIZETYPE (type)
-           && TYPE_UNSIGNED (type))
-         dd = double_int_zext (dd, TYPE_PRECISION (type));
        if (unsc != TYPE_UNSIGNED (TREE_TYPE (type_low_bound)))
          {
            int c_neg = (!unsc && double_int_negative_p (dc));
--- 8127,8132 ----
*************** retry:
*** 8172,8181 ****
    if (type_high_bound && TREE_CODE (type_high_bound) == INTEGER_CST)
      {
        dd = tree_to_double_int (type_high_bound);
-       if (TREE_CODE (type) == INTEGER_TYPE
-           && TYPE_IS_SIZETYPE (type)
-           && TYPE_UNSIGNED (type))
-         dd = double_int_zext (dd, TYPE_PRECISION (type));
        if (unsc != TYPE_UNSIGNED (TREE_TYPE (type_high_bound)))
          {
            int c_neg = (!unsc && double_int_negative_p (dc));
--- 8148,8153 ----
Index: trunk/gcc/expr.c
===================================================================
*** trunk.orig/gcc/expr.c 2011-06-07 16:38:18.000000000 +0200
--- trunk/gcc/expr.c 2011-06-14 15:29:04.000000000 +0200
*************** get_inner_reference (tree exp, HOST_WIDE
*** 6137,6148 ****
    /* If OFFSET is constant, see if we can return the whole thing as a
       constant bit position.  Make sure to handle overflow during
       this conversion.  */
!   if (host_integerp (offset, 0))
      {
!       double_int tem = double_int_lshift (tree_to_double_int (offset),
!                                           BITS_PER_UNIT == 8
!                                           ? 3 : exact_log2 (BITS_PER_UNIT),
!                                           HOST_BITS_PER_DOUBLE_INT, true);
        tem = double_int_add (tem, bit_offset);
        if (double_int_fits_in_shwi_p (tem))
          {
--- 6137,6157 ----
    /* If OFFSET is constant, see if we can return the whole thing as a
       constant bit position.  Make sure to handle overflow during
       this conversion.  */
!   if (TREE_CODE (offset) == INTEGER_CST)
      {
!       double_int tem = tree_to_double_int (offset);
!       /* Sign-extend the offset as HWI can have more precision
!          than sizetype.
!          ??? Alternatively we could truncate and sign-extend the final
!          result to sizetype bit precision.  But that is a functional
!          change compared to the idea of tracking bit position with
!          infinite precision (which isn't fulfilled already because of
!          tracking the tree part in machine precision).  */
!       tem = double_int_sext (tem, TYPE_PRECISION (sizetype));
!       tem = double_int_lshift (tem,
!                                BITS_PER_UNIT == 8
!                                ? 3 : exact_log2 (BITS_PER_UNIT),
!                                HOST_BITS_PER_DOUBLE_INT, true);
        tem = double_int_add (tem, bit_offset);
        if (double_int_fits_in_shwi_p (tem))
          {
Index: trunk/gcc/tree-ssa-structalias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-structalias.c 2011-06-14 14:26:14.000000000 +0200
--- trunk/gcc/tree-ssa-structalias.c 2011-06-14 15:29:04.000000000 +0200
*************** get_constraint_for_ptr_offset (tree ptr,
*** 2856,2862 ****
  {
    struct constraint_expr c;
    unsigned int j, n;
!   HOST_WIDE_INT rhsunitoffset, rhsoffset;
  
    /* If we do not do field-sensitive PTA adding offsets to pointers
       does not change the points-to solution.  */
--- 2856,2862 ----
  {
    struct constraint_expr c;
    unsigned int j, n;
!   HOST_WIDE_INT rhsoffset;
  
    /* If we do not do field-sensitive PTA adding offsets to pointers
       does not change the points-to solution.  */
*************** get_constraint_for_ptr_offset (tree ptr,
*** 2875,2882 ****
          rhsoffset = UNKNOWN_OFFSET;
        else
          {
!           /* Make sure the bit-offset also fits.  */
!           rhsunitoffset = TREE_INT_CST_LOW (offset);
            rhsoffset = rhsunitoffset * BITS_PER_UNIT;
            if (rhsunitoffset != rhsoffset / BITS_PER_UNIT)
              rhsoffset = UNKNOWN_OFFSET;
--- 2875,2884 ----
          rhsoffset = UNKNOWN_OFFSET;
        else
          {
!           /* Sign-extend the unit offset and make sure the bit-offset also fits.  */
!           HOST_WIDE_INT rhsunitoffset
!             = double_int_sext (tree_to_double_int (offset),
!                                TYPE_PRECISION (TREE_TYPE (offset))).low;
            rhsoffset = rhsunitoffset * BITS_PER_UNIT;
            if (rhsunitoffset != rhsoffset / BITS_PER_UNIT)
              rhsoffset = UNKNOWN_OFFSET;
Index: trunk/gcc/tree-cfg.c
===================================================================
*** trunk.orig/gcc/tree-cfg.c 2011-06-08 16:45:50.000000000 +0200
--- trunk/gcc/tree-cfg.c 2011-06-14 15:29:04.000000000 +0200
*************** verify_types_in_gimple_reference (tree e
*** 3033,3039 ****
            return true;
          }
        else if (TREE_CODE (op) == SSA_NAME
!                && TYPE_SIZE (TREE_TYPE (expr)) != TYPE_SIZE (TREE_TYPE (op)))
          {
            error ("conversion of register to a different size");
            debug_generic_stmt (expr);
--- 3033,3040 ----
            return true;
          }
        else if (TREE_CODE (op) == SSA_NAME
!                && !tree_int_cst_equal (TYPE_SIZE (TREE_TYPE (expr)),
!                                        TYPE_SIZE (TREE_TYPE (op))))
          {
            error ("conversion of register to a different size");
            debug_generic_stmt (expr);