Hi!

On Wed, 25 Apr 2012 13:51:16 +0200, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Wed, Apr 25, 2012 at 1:27 PM, Thomas Schwinge
> <tho...@codesourcery.com> wrote:
> > On Thu, 19 Apr 2012 19:46:17 +0200, I wrote:
> >> Here is my current test case, reduced from gcc.dg/tree-ssa/20030922-1.c:
> >>
> >> extern void abort (void);
> >>
> >> enum tree_code
> >> {
> >>   BIND_EXPR,
> >> };
> >> struct tree_common
> >> {
> >>   enum tree_code code:8;
> >> };
> >> void
> >> voidify_wrapper_expr (struct tree_common *common)
> >> {
> >>   switch (common->code)
> >>     {
> >>     case BIND_EXPR:
> >>       if (common->code != BIND_EXPR)
> >>         abort ();
> >>     }
> >> }
> >>
> >> The diff for -fdump-tree-all between compiling with
> >> -fno-strict-volatile-bitfields and -fstrict-volatile-bitfields begins
> >> with the following, right in 20030922-1.c.003t.original:
> >>
> >> diff -ru fnsvb/20030922-1.c.003t.original fsvb/20030922-1.c.003t.original
> >> --- fnsvb/20030922-1.c.003t.original   2012-04-19 16:51:18.322150866 +0200
> >> +++ fsvb/20030922-1.c.003t.original    2012-04-19 16:49:18.132088498 +0200
> >> @@ -7,7 +7,7 @@
> >>    switch ((int) common->code)
> >>      {
> >>        case 0:;
> >> -        if (common->code != 0)
> >> +        if ((BIT_FIELD_REF <*common, 32, 0> & 255) != 0)
> >>            {
> >>              abort ();
> >>            }
> >>
> >> That is, for -fno-strict-volatile-bitfields the second instance of
> >> »common->code« is a component_ref, whereas for
> >> -fstrict-volatile-bitfields it is a bit_field_ref.  The former will be
> >> optimized as intended, the latter will not.  So, what should be emitted
> >> in the -fstrict-volatile-bitfields case?  A component_ref -- but this is
> >> what Bernd's commit r182545 (for PR51200) changed, I think?  Or should
> >> later optimization stages learn to better deal with a bit_field_ref, and
> >> where would this have to happen?
> >
> > I figured out why this generic code is behaving differently for SH
> > targets.  I compared to ARM as well as x86, for which indeed the
> > optimization possibility is not lost even when compiling with
> > -fstrict-volatile-bitfields.
> >
> > The component_ref inside the if statement (but not the one in the switch
> > statement) is fed into optimize_bit_field_compare, where it is optimized
> > to »BIT_FIELD_REF <*common, 32, 0> & 255« only for SH, because
> > get_best_mode for SH returns SImode (ARM, x86: QImode), because SH has
> > »#define SLOW_BYTE_ACCESS 1«, and thus the original QImode is widened to
> > SImode, hoping for better code generation »since it may eliminate
> > subsequent memory access if subsequent accesses occur to other fields in
> > the same word of the structure, but to different bytes«.  (Well, the
> > converse happens here...)  After that, in optimize_bit_field_compare,
> > »nbitsize == lbitsize« holds for ARM and x86, and thus the optimization
> > is aborted, whereas for SH it will be performed, that is, the
> > component_ref is replaced with »BIT_FIELD_REF <*common, 32, 0> & 255«.
> >
> > If I manually force SH's SImode back to QImode (that is, abort
> > optimize_bit_field_compare), the later optimization passes work as they
> > do for ARM and x86.
> >
> > Before Bernd's r182545, optimize_bit_field_compare returned early (that
> > is, aborted this optimization attempt), because »lbitsize ==
> > GET_MODE_BITSIZE (lmode)« (8 bits).  (With the current sources, lmode is
> > VOIDmode.)
> >
> > Is emitting »BIT_FIELD_REF <*common, 32, 0> & 255« wrong in this case,
> > or should a later optimization pass be able to figure out that
> > »BIT_FIELD_REF <*common, 32, 0> & 255« is in fact the same as
> > common->code, and then be able to conflate these?  Any suggestions
> > where/how to tackle this?
>
> The BIT_FIELD_REF is somewhat of a red herring.  It is created by fold-const.c
> in optimize_bit_field_compare, code that I think should be removed completely.
> Or it needs to be made aware of strict-volatile bitfield and C++ memory model
> details.
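For reference, here is a hand-written C approximation of the two forms being
compared (illustrative only: the function names are made up, and it assumes
the 8-bit field occupies the low byte of a 32-bit word covering the whole
struct, which need not hold for every target or endianness):

enum tree_code { BIND_EXPR };

struct tree_common
{
  enum tree_code code:8;
};

/* The component_ref form, as emitted with -fno-strict-volatile-bitfields;
   later passes can relate this to the switch condition.  */
int
compare_component_ref (struct tree_common *common)
{
  return common->code != BIND_EXPR;
}

/* Roughly what »BIT_FIELD_REF <*common, 32, 0> & 255« stands for: load the
   containing 32-bit word and mask out the low 8 bits.  (The real
   BIT_FIELD_REF is a tree/GIMPLE construct, not C source.)  */
int
compare_bit_field_ref (struct tree_common *common)
{
  unsigned int word;
  __builtin_memcpy (&word, common, sizeof word);  /* whole-word load */
  return (word & 255) != 0;                       /* mask the 8-bit field */
}

The first form keeps the link to common->code that dom/vrp can use to remove
the redundant test inside the switch; the second form hides it behind a
word-sized load and mask.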
As discussed with richi on IRC, I have now done testing (just the GCC
testsuite) with optimize_bit_field_compare completely removed.  For SH, this
cures my test case posted above as well as the following FAILs, all as
expected:

-FAIL: gcc.dg/tree-ssa/20030922-1.c scan-tree-dump-times dom2 "if " 0
+PASS: gcc.dg/tree-ssa/20030922-1.c scan-tree-dump-times dom2 "if " 0
-FAIL: gcc.dg/tree-ssa/foldconst-3.c scan-tree-dump-not optimized "tree_code_type"
+PASS: gcc.dg/tree-ssa/foldconst-3.c scan-tree-dump-not optimized "tree_code_type"
-FAIL: gcc.dg/tree-ssa/vrp15.c scan-tree-dump-times vrp1 "tree_code_length.42." 1
+PASS: gcc.dg/tree-ssa/vrp15.c scan-tree-dump-times vrp1 "tree_code_length.42." 1

No regressions for ARM and x86.  I also manually confirmed that the
x86-specific tests introduced for PR37248 (gcc.target/i386/pr37248-*.c) still
produce the optimized (intermediate and assembly) code as suggested in the
*.c files.  (So, this optimization is (also) done elsewhere, instead of in
optimize_bit_field_compare.)

So, is removing it indeed the way to go?  Instead of removing it completely --
are the diagnostic messages (warnings) that it emits worth preserving, or are
these covered elsewhere?  (A small example of code that triggers such a
warning is appended after the patch.)

gcc/
	* fold-const.c (fold_binary_loc): Don't invoke
	optimize_bit_field_compare.
	(optimize_bit_field_compare): Remove function.

Index: fold-const.c
===================================================================
--- fold-const.c	(revision 186856)
+++ fold-const.c	(working copy)
@@ -103,8 +103,6 @@ static tree pedantic_omit_one_operand_loc (locatio
 static tree distribute_bit_expr (location_t, enum tree_code, tree, tree, tree);
 static tree make_bit_field_ref (location_t, tree, tree,
 				HOST_WIDE_INT, HOST_WIDE_INT, int);
-static tree optimize_bit_field_compare (location_t, enum tree_code,
-					tree, tree, tree);
 static tree decode_field_reference (location_t, tree, HOST_WIDE_INT *,
 				    HOST_WIDE_INT *, enum machine_mode *,
 				    int *, int *,
@@ -3306,182 +3304,6 @@ make_bit_field_ref (location_t loc, tree inner, tr
   return result;
 }
 
-
-/* Optimize a bit-field compare.
-
-   There are two cases:  First is a compare against a constant and the
-   second is a comparison of two items where the fields are at the same
-   bit position relative to the start of a chunk (byte, halfword, word)
-   large enough to contain it.  In these cases we can avoid the shift
-   implicit in bitfield extractions.
-
-   For constants, we emit a compare of the shifted constant with the
-   BIT_AND_EXPR of a mask and a byte, halfword, or word of the operand being
-   compared.  For two fields at the same position, we do the ANDs with the
-   similar mask and compare the result of the ANDs.
-
-   CODE is the comparison code, known to be either NE_EXPR or EQ_EXPR.
-   COMPARE_TYPE is the type of the comparison, and LHS and RHS
-   are the left and right operands of the comparison, respectively.
-
-   If the optimization described above can be done, we return the resulting
-   tree.  Otherwise we return zero.  */
-
-static tree
-optimize_bit_field_compare (location_t loc, enum tree_code code,
-			    tree compare_type, tree lhs, tree rhs)
-{
-  HOST_WIDE_INT lbitpos, lbitsize, rbitpos, rbitsize, nbitpos, nbitsize;
-  tree type = TREE_TYPE (lhs);
-  tree signed_type, unsigned_type;
-  int const_p = TREE_CODE (rhs) == INTEGER_CST;
-  enum machine_mode lmode, rmode, nmode;
-  int lunsignedp, runsignedp;
-  int lvolatilep = 0, rvolatilep = 0;
-  tree linner, rinner = NULL_TREE;
-  tree mask;
-  tree offset;
-
-  /* Get all the information about the extractions being done.  If the bit size
-     if the same as the size of the underlying object, we aren't doing an
-     extraction at all and so can do nothing.  We also don't want to
-     do anything if the inner expression is a PLACEHOLDER_EXPR since we
-     then will no longer be able to replace it.  */
-  linner = get_inner_reference (lhs, &lbitsize, &lbitpos, &offset, &lmode,
-				&lunsignedp, &lvolatilep, false);
-  if (linner == lhs || lbitsize == GET_MODE_BITSIZE (lmode) || lbitsize < 0
-      || offset != 0 || TREE_CODE (linner) == PLACEHOLDER_EXPR)
-    return 0;
-
-  if (!const_p)
-    {
-      /* If this is not a constant, we can only do something if bit positions,
-	 sizes, and signedness are the same.  */
-      rinner = get_inner_reference (rhs, &rbitsize, &rbitpos, &offset, &rmode,
-				    &runsignedp, &rvolatilep, false);
-
-      if (rinner == rhs || lbitpos != rbitpos || lbitsize != rbitsize
-	  || lunsignedp != runsignedp || offset != 0
-	  || TREE_CODE (rinner) == PLACEHOLDER_EXPR)
-	return 0;
-    }
-
-  /* See if we can find a mode to refer to this field.  We should be able to,
-     but fail if we can't.  */
-  if (lvolatilep
-      && GET_MODE_BITSIZE (lmode) > 0
-      && flag_strict_volatile_bitfields > 0)
-    nmode = lmode;
-  else
-    nmode = get_best_mode (lbitsize, lbitpos, 0, 0,
-			   const_p ? TYPE_ALIGN (TREE_TYPE (linner))
-			   : MIN (TYPE_ALIGN (TREE_TYPE (linner)),
-				  TYPE_ALIGN (TREE_TYPE (rinner))),
-			   word_mode, lvolatilep || rvolatilep);
-  if (nmode == VOIDmode)
-    return 0;
-
-  /* Set signed and unsigned types of the precision of this mode for the
-     shifts below.  */
-  signed_type = lang_hooks.types.type_for_mode (nmode, 0);
-  unsigned_type = lang_hooks.types.type_for_mode (nmode, 1);
-
-  /* Compute the bit position and size for the new reference and our offset
-     within it. If the new reference is the same size as the original, we
-     won't optimize anything, so return zero.  */
-  nbitsize = GET_MODE_BITSIZE (nmode);
-  nbitpos = lbitpos & ~ (nbitsize - 1);
-  lbitpos -= nbitpos;
-  if (nbitsize == lbitsize)
-    return 0;
-
-  if (BYTES_BIG_ENDIAN)
-    lbitpos = nbitsize - lbitsize - lbitpos;
-
-  /* Make the mask to be used against the extracted field.  */
-  mask = build_int_cst_type (unsigned_type, -1);
-  mask = const_binop (LSHIFT_EXPR, mask, size_int (nbitsize - lbitsize));
-  mask = const_binop (RSHIFT_EXPR, mask,
-		      size_int (nbitsize - lbitsize - lbitpos));
-
-  if (! const_p)
-    /* If not comparing with constant, just rework the comparison
-       and return.  */
-    return fold_build2_loc (loc, code, compare_type,
-			    fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-					     make_bit_field_ref (loc, linner,
-								 unsigned_type,
-								 nbitsize, nbitpos,
-								 1),
-					     mask),
-			    fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-					     make_bit_field_ref (loc, rinner,
-								 unsigned_type,
-								 nbitsize, nbitpos,
-								 1),
-					     mask));
-
-  /* Otherwise, we are handling the constant case.  See if the constant is too
-     big for the field.  Warn and return a tree of for 0 (false) if so.  We do
-     this not only for its own sake, but to avoid having to test for this
-     error case below.  If we didn't, we might generate wrong code.
-
-     For unsigned fields, the constant shifted right by the field length should
-     be all zero.  For signed fields, the high-order bits should agree with
-     the sign bit.  */
-
-  if (lunsignedp)
-    {
-      if (! integer_zerop (const_binop (RSHIFT_EXPR,
-					fold_convert_loc (loc,
-							  unsigned_type, rhs),
-					size_int (lbitsize))))
-	{
-	  warning (0, "comparison is always %d due to width of bit-field",
-		   code == NE_EXPR);
-	  return constant_boolean_node (code == NE_EXPR, compare_type);
-	}
-    }
-  else
-    {
-      tree tem = const_binop (RSHIFT_EXPR,
-			      fold_convert_loc (loc, signed_type, rhs),
-			      size_int (lbitsize - 1));
-      if (! integer_zerop (tem) && ! integer_all_onesp (tem))
-	{
-	  warning (0, "comparison is always %d due to width of bit-field",
-		   code == NE_EXPR);
-	  return constant_boolean_node (code == NE_EXPR, compare_type);
-	}
-    }
-
-  /* Single-bit compares should always be against zero.  */
-  if (lbitsize == 1 && ! integer_zerop (rhs))
-    {
-      code = code == EQ_EXPR ? NE_EXPR : EQ_EXPR;
-      rhs = build_int_cst (type, 0);
-    }
-
-  /* Make a new bitfield reference, shift the constant over the
-     appropriate number of bits and mask it with the computed mask
-     (in case this was a signed field).  If we changed it, make a new one.  */
-  lhs = make_bit_field_ref (loc, linner, unsigned_type, nbitsize, nbitpos, 1);
-  if (lvolatilep)
-    {
-      TREE_SIDE_EFFECTS (lhs) = 1;
-      TREE_THIS_VOLATILE (lhs) = 1;
-    }
-
-  rhs = const_binop (BIT_AND_EXPR,
-		     const_binop (LSHIFT_EXPR,
-				  fold_convert_loc (loc, unsigned_type, rhs),
-				  size_int (lbitpos)),
-		     mask);
-
-  lhs = build2_loc (loc, code, compare_type,
-		    build2 (BIT_AND_EXPR, unsigned_type, lhs, mask), rhs);
-  return lhs;
-}
 
 /* Subroutine for fold_truth_andor_1: decode a field reference.
@@ -12811,18 +12633,6 @@ fold_binary_loc (location_t loc,
 	  return omit_one_operand_loc (loc, type, rslt, arg0);
 	}
 
-      /* If this is a comparison of a field, we may be able to simplify it.  */
-      if ((TREE_CODE (arg0) == COMPONENT_REF
-	   || TREE_CODE (arg0) == BIT_FIELD_REF)
-	  /* Handle the constant case even without -O
-	     to make sure the warnings are given.  */
-	  && (optimize || TREE_CODE (arg1) == INTEGER_CST))
-	{
-	  t1 = optimize_bit_field_compare (loc, code, type, arg0, arg1);
-	  if (t1)
-	    return t1;
-	}
-
       /* Optimize comparisons of strlen vs zero to a compare of the
	 first character of the string vs zero.  To wit,
		strlen(ptr) == 0   =>  *ptr == 0


Grüße,
 Thomas
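P.S. As to the »comparison is always %d due to width of bit-field« warnings:
they fire when a bit-field is compared against a constant that cannot be
represented in the field.  A minimal made-up example (not from the testsuite):

struct s
{
  unsigned int f:3;
};

int
f (struct s *p)
{
  /* 9 does not fit into three bits, so the comparison can never be true;
     optimize_bit_field_compare warns "comparison is always 0 due to width
     of bit-field" and folds the expression to 0.  */
  return p->f == 9;
}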