On Wed, Nov 28, 2012 at 3:35 PM, Richard Biener <richard.guent...@gmail.com> wrote: > On Wed, Nov 28, 2012 at 3:29 PM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Wed, Nov 28, 2012 at 3:05 PM, Tom de Vries <vr...@codesourcery.com> wrote: >>> On 27/11/12 13:41, Richard Biener wrote: >>>> On Mon, Nov 19, 2012 at 3:33 AM, Tom de Vries <tom_devr...@mentor.com> >>>> wrote: >>>>> Richard, >>>>> >>>>> Consider the PR55124 example test.c: >>>>> ... >>>>> int a, b; >>>>> long c; >>>>> >>>>> static void inline >>>>> f2 (void) >>>>> { >>>>> unsigned long k = 1; >>>>> >>>>> foo (b ? k = 0 : 0); >>>>> >>>>> b = (((c = b) >>>>> ? (k ?: (c = 0)) >>>>> : a) >>>>> * c); >>>>> } >>>>> >>>>> void >>>>> f1 (void) >>>>> { >>>>> f2 (); >>>>> a = b | c; >>>>> } >>>>> ... >>>>> >>>>> when compiling with -O2, we're running into the following assertion in >>>>> pre: >>>>> ... >>>>> test.c:18:1: internal compiler error: in find_or_generate_expression, at >>>>> tree-ssa-pre.c:2802 >>>>> f1 (void) >>>>> ^ >>>>> 0xcf41d3 find_or_generate_expression >>>>> gcc/tree-ssa-pre.c:2802 >>>>> 0xcf42f6 create_expression_by_pieces >>>>> gcc/tree-ssa-pre.c:2861 >>>>> 0xcf4193 find_or_generate_expression >>>>> gcc/tree-ssa-pre.c:2799 >>>>> 0xcf42f6 create_expression_by_pieces >>>>> gcc/tree-ssa-pre.c:2861 >>>>> 0xcf4e28 insert_into_preds_of_block >>>>> gcc/tree-ssa-pre.c:3096 >>>>> 0xcf5c7d do_regular_insertion >>>>> gcc/tree-ssa-pre.c:3386 >>>>> ... >>>>> >>>>> We're hitting the assert at the end of find_or_generate_expression: >>>>> ... >>>>> static tree >>>>> find_or_generate_expression (basic_block block, tree op, gimple_seq >>>>> *stmts) >>>>> { >>>>> pre_expr expr = get_or_alloc_expr_for (op); >>>>> unsigned int lookfor = get_expr_value_id (expr); >>>>> pre_expr leader = bitmap_find_leader (AVAIL_OUT (block), lookfor); >>>>> if (leader) >>>>> { >>>>> if (leader->kind == NAME) >>>>> return PRE_EXPR_NAME (leader); >>>>> else if (leader->kind == CONSTANT) >>>>> return PRE_EXPR_CONSTANT (leader); >>>>> } >>>>> >>>>> /* It must be a complex expression, so generate it recursively. */ >>>>> bitmap exprset = VEC_index (bitmap, value_expressions, lookfor); >>>>> bitmap_iterator bi; >>>>> unsigned int i; >>>>> EXECUTE_IF_SET_IN_BITMAP (exprset, 0, i, bi) >>>>> { >>>>> pre_expr temp = expression_for_id (i); >>>>> if (temp->kind != NAME) >>>>> return create_expression_by_pieces (block, temp, stmts, >>>>> get_expr_type (expr)); >>>>> } >>>>> >>>>> gcc_unreachable (); >>>>> } >>>>> ... >>>>> >>>>> The state in which we're asserting is the following: >>>>> ... >>>>> #5 0x0000000000cf41d4 in find_or_generate_expression >>>>> (block=0x7ffff6210f08, >>>>> op=0x7ffff62384c8, stmts=0x7fffffffdb78) at gcc/tree-ssa-pre.c:2802 >>>>> 2802 gcc_unreachable (); >>>>> (gdb) p block.index >>>>> $13 = 4 >>>>> (gdb) call debug_generic_expr (op) >>>>> b.4_3 >>>>> (gdb) call debug_pre_expr (expr) >>>>> b.4_3 >>>>> (gdb) p lookfor >>>>> $11 = 7 >>>>> (gdb) call debug_bitmap_set (((bb_value_sets_t) >>>>> ((block)->aux))->avail_out) >>>>> debug[0] := { b.4_8 (0012), a.10_13 (0013), _14 (0014), iftmp.5_15 (0015) >>>>> } >>>>> (gdb) p leader >>>>> $12 = (pre_expr) 0x0 >>>>> (gdb) call debug_bitmap ( exprset ) >>>>> first = 0x21fb058 current = 0x21fb058 indx = 0 >>>>> 0x21fb058 next = (nil) prev = (nil) indx = 0 >>>>> bits = { 22 25 } >>>>> (gdb) call debug_pre_expr (expression_for_id (22)) >>>>> b.4_3 >>>>> (gdb) call debug_pre_expr (expression_for_id (25)) >>>>> b.4_31 >>>>> ... >>>>> We're trying to find or generate an expr with value-id 0007 in block 4. >>>>> We can't >>>>> find it (there's no leader) and we can't generate it because there are no >>>>> exprs >>>>> with that value that are not names. >>>>> >>>>> Higher up in the call stack and skipping create_expression_by_pieces, the >>>>> state >>>>> is as follows: >>>>> ... >>>>> #7 0x0000000000cf4194 in find_or_generate_expression >>>>> (block=0x7ffff6210f08, >>>>> op=0x7ffff6238558, stmts=0x7fffffffdb78) at gcc/tree-ssa-pre.c:2799 >>>>> 2799 get_expr_type (expr)); >>>>> (gdb) call debug_generic_expr (op) >>>>> c.6_5 >>>>> (gdb) call debug_pre_expr (expr) >>>>> c.6_5 >>>>> (gdb) p lookfor >>>>> $14 = 9 >>>>> (gdb) p leader >>>>> $15 = (pre_expr) 0x0 >>>>> (gdb) call debug_bitmap ( exprset ) >>>>> first = 0x21fb0f8 current = 0x21fb0f8 indx = 0 >>>>> 0x21fb0f8 next = (nil) prev = (nil) indx = 0 >>>>> bits = { 23 24 26 27 } >>>>> (gdb) call debug_pre_expr (temp) >>>>> {nop_expr,b.4_3} >>>>> (gdb) call debug_pre_expr (expression_for_id (23)) >>>>> c.6_5 >>>>> (gdb) call debug_pre_expr (expression_for_id (24)) >>>>> {nop_expr,b.4_3} >>>>> (gdb) call debug_pre_expr (expression_for_id (26)) >>>>> c.6_32 >>>>> (gdb) call debug_pre_expr (expression_for_id (27)) >>>>> {mem_ref<0B>,addr_expr<&c>}@.MEM_28 >>>>> ... >>>>> We're trying to find or generate an expr with value-id 0009 (in block 4). >>>>> We >>>>> can't find it. We're trying to generate it using {nop_expr,b.4_3}, but as >>>>> we've >>>>> seen above that won't work. The generation using expr 27 would work >>>>> though. >>>>> >>>>> Again higher up in the call stack and skipping >>>>> create_expression_by_pieces, the >>>>> state is as follows: >>>>> ... >>>>> (gdb) up >>>>> #9 0x0000000000cf4e29 in insert_into_preds_of_block >>>>> (block=0x7ffff6210f70, >>>>> exprnum=19, avail=0x22102e0) at gcc/tree-ssa-pre.c:3096 >>>>> 3096 &stmts, type); >>>>> (gdb) l >>>>> 3091 eprime = VEC_index (pre_expr, avail, pred->dest_idx); >>>>> 3092 >>>>> 3093 if (eprime->kind != NAME && eprime->kind != CONSTANT) >>>>> 3094 { >>>>> 3095 builtexpr = create_expression_by_pieces (bprime, eprime, >>>>> 3096 &stmts, type); >>>>> (gdb) p block.index >>>>> $17 = 5 >>>>> (gdb) call debug_pre_expr (expr) >>>>> {convert_expr,c.7_16} >>>>> (gdb) p val >>>>> $18 = 8 >>>>> (gdb) call debug_pre_expr (eprime) >>>>> {convert_expr,c.6_5} >>>>> (gdb) call get_expr_value_id (eprime) >>>>> $16 = 26 >>>>> ... >>>>> So we're trying to insert expr {convert_expr,c.6_5} with value-id 0026 >>>>> into >>>>> block 4. The expr is the phi-translation of expr {convert_expr,c.7_16} >>>>> with >>>>> value-id 0008 in block 5. >>>>> >>>>> One of the reasons why we're inserting the phi-translation of expr >>>>> {convert_expr,c.7_16} in block 4 is because it's a member of ANTIC_IN[5]: >>>>> ... >>>>> ANTIC_IN[5] := { iftmp.5_18 (0018), {mem_ref<0B>,addr_expr<&c>}@.MEM_23 >>>>> (0016), >>>>> {nop_expr,c.7_16} (0017), {mult_expr,_17,iftmp.5_18} (0019), >>>>> {nop_expr,_19} (0020), {convert_expr,c.7_16} (0008), >>>>> {bit_ior_expr,_4,b.11_20} (0010) } >>>>> ... >>>>> A requirement for an expr to be in ANTIC_IN is that that it's either 'a >>>>> live >>>>> temporary or a non-simple expression whose operands are represented in the >>>>> anti-leader set'. The operand is c.7_16, which has value id 0016, as we >>>>> can see >>>>> here: >>>>> ... >>>>> tmp_gen[5] := { c.7_16 (0016), _17 (0017), _19 (0019), b.11_20 (0020), _4 >>>>> (0008), a.2_6 (0010) } >>>>> ... >>>>> And it has this representation in ANTIC_IN[5] in expr >>>>> {mem_ref<0B>,addr_expr<&c>}@.MEM_23. So that looks ok. >>>>> >>>>> The order in which we traverse ANTIC_IN[5] in do_regular_insertion is >>>>> this: >>>>> ... >>>>> (gdb) call debug_pre_expr ( exprs.vec_[0] ) >>>>> {convert_expr,c.7_16} >>>>> (gdb) call debug_pre_expr ( exprs.vec_[1] ) >>>>> {bit_ior_expr,_4,b.11_20} >>>>> (gdb) call debug_pre_expr ( exprs.vec_[2] ) >>>>> {mem_ref<0B>,addr_expr<&c>}@.MEM_23 >>>>> (gdb) call debug_pre_expr ( exprs.vec_[3] ) >>>>> {nop_expr,c.7_16} >>>>> (gdb) call debug_pre_expr ( exprs.vec_[4] ) >>>>> iftmp.5_18 >>>>> (gdb) call debug_pre_expr ( exprs.vec_[5] ) >>>>> {mult_expr,_17,iftmp.5_18} >>>>> (gdb) call debug_pre_expr ( exprs.vec_[6] ) >>>>> {nop_expr,_19} >>>>> ... >>>>> >>>>> The order is indeed in increasing value-id order: >>>>> ... >>>>> (gdb) call get_expr_value_id ( exprs.vec_[0] ) >>>>> $11 = 8 >>>>> (gdb) call get_expr_value_id ( exprs.vec_[1] ) >>>>> $12 = 10 >>>>> (gdb) call get_expr_value_id ( exprs.vec_[2] ) >>>>> $13 = 16 >>>>> (gdb) call get_expr_value_id ( exprs.vec_[3] ) >>>>> $14 = 17 >>>>> (gdb) call get_expr_value_id ( exprs.vec_[4] ) >>>>> $15 = 18 >>>>> (gdb) call get_expr_value_id ( exprs.vec_[5] ) >>>>> $16 = 19 >>>>> (gdb) call get_expr_value_id ( exprs.vec_[6] ) >>>>> $17 = 20 >>>>> ... >>>>> >>>>> But the operand of the first expr {convert_expr,c.7_16} has value-id >>>>> 0016, which >>>>> corresponds to the 3rd expr {mem_ref<0B>,addr_expr<&c>}@.MEM_23. So if I >>>>> understand the intended topological sort correctly, this is in the wrong >>>>> order, >>>>> we should be processing the 3rd element before the first element. I'm not >>>>> quite >>>>> sure this is the root cause of the problem though. >>>>> >>>>> Assuming for the moment that the order is correct, I've written a >>>>> tentative >>>>> patch that fixes the assert, simply by predicting whether >>>>> create_expression_by_pieces will succeed or not, and to skip those calls >>>>> that >>>>> will fail in find_or_generate_expression. The patch has only been tested >>>>> with a >>>>> tree-ssa.exp testrun, but no issues found there. >>>>> >>>>> Do you think this patch is the way to fix this ICE, or is it the order of >>>>> generation that needs fixing, or is the problem yet something else? >>>> >>>> This looks like an ordering issue. But rather in what value-numbers were >>>> assigned to the expressions, not the sorting itself. >>> >>> The sorting done by sorted_array_from_bitmap_set assumes that value_id >>> order is >>> in topological order: >>> ... >>> FOR_EACH_VALUE_ID_IN_SET (set, i, bi) >>> { >>> /* The number of expressions having a given value is usually >>> relatively small. Thus, rather than making a vector of all >>> the expressions and sorting it by value-id, we walk the values >>> and check in the reverse mapping that tells us what expressions >>> have a given value, to filter those in our set. As a result, >>> the expressions are inserted in value-id order, which means >>> topological order. >>> >>> If this is somehow a significant lose for some cases, we can >>> choose which set to walk based on the set size. */ >>> bitmap exprset = VEC_index (bitmap, value_expressions, i); >>> EXECUTE_IF_SET_IN_BITMAP (exprset, 0, j, bj) >>> { >>> if (bitmap_bit_p (&set->expressions, j)) >>> VEC_safe_push (pre_expr, heap, result, expression_for_id (j)); >>> } >>> } >>> ... >>> >>> The relevant ssa-names are _4 and _16: >>> ... >>> # VUSE <.MEM_23> >>> c.7_16 = cD.1716; >>> _4 = (intD.6) c.7_16; >>> ... >>> >>> which have the following value ids, which means that they're not in >>> topological >>> order: >>> ... >>> _4 = _4 value_id 8 >>> c.7_16 = c.7_16 value_id 16 >>> ... >>> >>> If I revert patch r189321, the compiler doesn't assert anymore. But if I >>> look at >>> the relevant ssa-names, the value numbers are still not in topological >>> order: >>> ... >>> _4 = _4 value_id 5 >>> c.7_16 = c.7_16 value_id 13 >>> ... >>> >>> Assigning these value_ids is done in run_scc_vn. I don't find any evidence >>> there >>> that an effort is done to number values in topological order, so my >>> conclusion >>> is that the premise in sorted_array_from_bitmap_set about value-id order >>> meaning >>> topological order is invalid. I suspect that value_ids introduced after >>> value >>> numbering, by pre itself, are in topological order though. >> >> Ick. >> >>>> I suppose it may >>>> result from your vitrual operand numbering changes and compute_avail >>>> doing >>>> >>>> case VN_REFERENCE: >>>> { >>>> vn_reference_t ref; >>>> vn_reference_lookup (gimple_assign_rhs1 (stmt), >>>> gimple_vuse (stmt), >>>> VN_WALK, &ref); >>>> >>>> which valueizes the VUSE here? >>>> >>> >>> The value numbers are out of order, with and without the patch, so I don't >>> see >>> the connection with the patch or with virtual operand numbering changes. >>> >>> >>> I can think of a few ways to fix this: >>> - add assignment of value_id during value numbering rather than after >>> value numbering >>> - try to add a topo_id <-> value_id mapping during building up the >>> pre_exprs, >>> and use that in sorted_array_from_bitmap_set >>> - do actual topological sorting in sorted_array_from_bitmap_set (FWIW, patch >>> attached, passes tree-ssa.exp) >>> >>> In what way do you think should this be fixed? >> >> I think that now that FRE (and thus PRE eliminate ()) no longer needs >> value-id's we should move value-id's back to PRE and assign them at >> compute_avail time (which walks in proper oder). >> >> Something like the attached, testing in progress. > > Ok, that doesn't seem to work because vn_set_hashtable_value_ids is called > too late. I suppose we cannot do without an extra topological walk over > SSA name defs :/
But maybe it can be made to work if also assigning value-ids in compute_avail when we lookup from the hashtable (and that value-id assigning doing the propagation bit itself). Eventually the whole ->value_id stuff should be moved to the PRE IL instead of being in the SCCVN IL. Richard. > >> Richard. >> >>> Thanks, >>> - Tom