Michael Matz <[EMAIL PROTECTED]> writes:

> Both, the assessment of far-stretchedness and these numbers seem to be
> invented ad hoc. The latter is irrelevant (it's not interesting how many
> cases there are, but how important those cases which occur are, for some
> metric, let's say performance). And the former isn't true, i.e. the
> concern is not far-stretched. For 456.hmmer for instance it is crucial
> that this transformation happens, the basic situation looks like so:

What do people think of this patch? This seems to fix the problem case
without breaking Michael's case. It basically avoids store speculation:
we don't write to a MEM unless the function unconditionally writes to
the MEM anyhow.

This is basically a public relations exercise. I doubt this
optimization is especially important, so I think it's OK to disable it
to keep people happy. Even though the optimization has been there since
gcc 3.4 and nobody noticed.

Of course this kind of thing will break again until somebody takes the
time to fully implement something like the C++0x memory model.

I haven't tested this patch.

Ian

Index: ifcvt.c
===================================================================
--- ifcvt.c (revision 128958)
+++ ifcvt.c (working copy)
@@ -2139,6 +2139,32 @@ noce_mem_write_may_trap_or_fault_p (cons
   return false;
 }

+/* Return whether a MEM is unconditionally set in the function
+   following TOP_BB. */
+
+static bool
+noce_mem_unconditionally_set_p (basic_block top_bb, const_rtx mem)
+{
+  basic_block dominator;
+
+  for (dominator = get_immediate_dominator (CDI_POST_DOMINATORS, top_bb);
+       dominator != NULL;
+       dominator = get_immediate_dominator (CDI_POST_DOMINATORS, dominator))
+    {
+      rtx insn;
+
+      FOR_BB_INSNS (dominator, insn)
+        {
+          if (memory_modified_in_insn_p (mem, insn))
+            return true;
+          if (modified_in_p (XEXP (mem, 0), insn))
+            return false;
+        }
+    }
+
+  return false;
+}
+
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
    it without using conditional execution. Return TRUE if we were successful
    at converting the block. */
@@ -2292,17 +2318,31 @@ noce_process_if_block (struct noce_if_in
         goto success;
     }

-  /* Disallow the "if (...) x = a;" form (with an implicit "else x = x;")
-     for optimizations if writing to x may trap or fault, i.e. it's a memory
-     other than a static var or a stack slot, is misaligned on strict
-     aligned machines or is read-only.
-     If x is a read-only memory, then the program is valid only if we
-     avoid the store into it. If there are stores on both the THEN and
-     ELSE arms, then we can go ahead with the conversion; either the
-     program is broken, or the condition is always false such that the
-     other memory is selected. */
-  if (!set_b && MEM_P (orig_x) && noce_mem_write_may_trap_or_fault_p (orig_x))
-    return FALSE;
+  if (!set_b && MEM_P (orig_x))
+    {
+      /* Disallow the "if (...) x = a;" form (implicit "else x = x;")
+         for optimizations if writing to x may trap or fault,
+         i.e. it's a memory other than a static var or a stack slot,
+         is misaligned on strict aligned machines or is read-only. If
+         x is a read-only memory, then the program is valid only if we
+         avoid the store into it. If there are stores on both the
+         THEN and ELSE arms, then we can go ahead with the conversion;
+         either the program is broken, or the condition is always
+         false such that the other memory is selected. */
+      if (noce_mem_write_may_trap_or_fault_p (orig_x))
+        return FALSE;
+
+      /* Avoid store speculation: given "if (...) x = a" where x is a
+         MEM, we only want to do the store if x is always set
+         somewhere in the function. This avoids cases like
+           if (pthread_mutex_trylock(mutex))
+             ++global_variable;
+         where we only want global_variable to be changed if the mutex
+         is held. FIXME: This should ideally be expressed directly in
+         RTL somehow. */
+      if (!noce_mem_unconditionally_set_p (test_bb, orig_x))
+        return FALSE;
+    }

   if (noce_try_move (if_info))
     goto success;
@@ -3957,7 +3997,7 @@ dead_or_predicable (basic_block test_bb,
 /* Main entry point for all if-conversion. */

 static void
-if_convert (bool recompute_dominance)
+if_convert (void)
 {
   basic_block bb;
   int pass;
@@ -3977,9 +4017,8 @@ if_convert (bool recompute_dominance)
   loop_optimizer_finalize ();
   free_dominance_info (CDI_DOMINATORS);

-  /* Compute postdominators if we think we'll use them. */
-  if (HAVE_conditional_execution || recompute_dominance)
-    calculate_dominance_info (CDI_POST_DOMINATORS);
+  /* Compute postdominators. */
+  calculate_dominance_info (CDI_POST_DOMINATORS);

   df_set_flags (DF_LR_RUN_DCE);

@@ -4068,7 +4107,7 @@ rest_of_handle_if_conversion (void)
       if (dump_file)
         dump_flow_info (dump_file, dump_flags);
       cleanup_cfg (CLEANUP_EXPENSIVE);
-      if_convert (false);
+      if_convert ();
     }

   cleanup_cfg (0);
@@ -4105,7 +4144,7 @@ gate_handle_if_after_combine (void)
 static unsigned int
 rest_of_handle_if_after_combine (void)
 {
-  if_convert (true);
+  if_convert ();
   return 0;
 }

@@ -4138,7 +4177,7 @@ gate_handle_if_after_reload (void)
 static unsigned int
 rest_of_handle_if_after_reload (void)
 {
-  if_convert (true);
+  if_convert ();
   return 0;
 }
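
For anyone following the thread without the RTL in front of them, the
store speculation being discussed can be sketched at the C source level.
This is only an illustration under stated assumptions, not what ifcvt.c
literally emits: store_global, store_count, conditional_store and
speculated_store are hypothetical names made up for the example, and
global_variable merely echoes the comment in the patch.

```c
/* Illustration only: a source-level picture of the "if (...) x = a;"
   conversion.  All names here are hypothetical, not GCC code.  */

static int global_variable;   /* stands in for data guarded by a lock */
static int store_count;       /* number of stores to global_variable */

/* Hypothetical helper: perform the store and record that it happened,
   so the two forms below can be compared.  */
static void
store_global (int value)
{
  global_variable = value;
  store_count++;
}

/* What the programmer wrote: the store happens only when COND holds.  */
static void
conditional_store (int cond)
{
  if (cond)
    store_global (global_variable + 1);
}

/* Roughly what the converted form amounts to: load the old value,
   select between old and new, and store unconditionally.  */
static void
speculated_store (int cond)
{
  int old_value = global_variable;
  store_global (cond ? old_value + 1 : old_value);
}
```

With cond equal to 0, conditional_store performs no store, while
speculated_store still writes the old value back. In a single thread the
two are indistinguishable, but if another thread updates global_variable
between the load and the store, that update is lost, which is the case
the new check is meant to rule out.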