On 1 June 2011 15:14, Richard Guenther <[email protected]> wrote:
> On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen <[email protected]> wrote:
>> On 1 June 2011 12:42, Richard Guenther <[email protected]> wrote:
>>
>>> Did you think about moving pass_optimize_widening_mul before
>>> loop optimizations? Does that pass catch the cases you are
>>> teaching the pattern recognizer? I think we should try to expose
>>> these more complicated instructions to loop optimizers.
>>>
>>
>> pass_optimize_widening_mul doesn't catch these cases, but I can try to
>> teach it instead of the vectorizer.
>> I am now testing
>>
>> Index: passes.c
>> ===================================================================
>> --- passes.c (revision 174391)
>> +++ passes.c (working copy)
>> @@ -870,6 +870,7 @@
>> NEXT_PASS (pass_split_crit_edges);
>> NEXT_PASS (pass_pre);
>> NEXT_PASS (pass_sink_code);
>> + NEXT_PASS (pass_optimize_widening_mul);
>> NEXT_PASS (pass_tree_loop);
>> {
>> struct opt_pass **p = &pass_tree_loop.pass.sub;
>> @@ -934,7 +935,6 @@
>> NEXT_PASS (pass_forwprop);
>> NEXT_PASS (pass_phiopt);
>> NEXT_PASS (pass_fold_builtins);
>> - NEXT_PASS (pass_optimize_widening_mul);
>> NEXT_PASS (pass_tail_calls);
>> NEXT_PASS (pass_rename_ssa_copies);
>> NEXT_PASS (pass_uncprop);
>>
>> to see how it affects other loop optimizations (vectorizer pattern
>> tests obviously fail).
It looks like the pass also needs copy_prop and dce to run before it:
Index: passes.c
===================================================================
--- passes.c (revision 174391)
+++ passes.c (working copy)
@@ -870,6 +870,9 @@
NEXT_PASS (pass_split_crit_edges);
NEXT_PASS (pass_pre);
NEXT_PASS (pass_sink_code);
+ NEXT_PASS (pass_copy_prop);
+ NEXT_PASS (pass_dce);
+ NEXT_PASS (pass_optimize_widening_mul);
NEXT_PASS (pass_tree_loop);
{
struct opt_pass **p = &pass_tree_loop.pass.sub;
@@ -934,7 +937,6 @@
NEXT_PASS (pass_forwprop);
NEXT_PASS (pass_phiopt);
NEXT_PASS (pass_fold_builtins);
- NEXT_PASS (pass_optimize_widening_mul);
NEXT_PASS (pass_tail_calls);
NEXT_PASS (pass_rename_ssa_copies);
NEXT_PASS (pass_uncprop);
Otherwise, I get the following failures on x86_64-suse-linux:
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd
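
For readers unfamiliar with what pass_optimize_widening_mul matches, here is a minimal sketch of the kind of scalar source involved. The function names are hypothetical and this is not the contents of fma4-fma-2.c; it only assumes that the scanned instructions correspond to the usual multiply-add, multiply-subtract and negated multiply-add shapes, plus one widening multiply for comparison.

```c
/* Hypothetical illustrations only; not taken from the GCC testsuite. */

/* a * b + c: a candidate for a fused multiply-add (e.g. vfmaddss). */
float
fmadd_f (float a, float b, float c)
{
  return a * b + c;
}

/* a * b - c: a candidate for a fused multiply-subtract (e.g. vfmsubss). */
float
fmsub_f (float a, float b, float c)
{
  return a * b - c;
}

/* -(a * b) + c: a candidate for a negated multiply-add (e.g. vfnmaddss). */
float
fnmadd_f (float a, float b, float c)
{
  return -(a * b) + c;
}

/* Widening multiply: 16-bit operands are converted to int before the
   multiplication, so the product is computed in the wider type.  */
int
widen_mul (short a, short b)
{
  return (int) a * (int) b;
}
```

Whether the fused forms are actually emitted depends on the target and options (something like -O2 -mfma4 is assumed here) and on the multiply and the add being directly connected in the IL when the pass runs, which is presumably why stale copies and dead statements left by earlier passes get in the way.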
Ira
>
> Thanks. I would hope that we eventually can get rid of the
> pattern recognizer ... at least for SSE there is also always
> a scalar variant instruction for each vectorized one.
>
> Richard.
>