> On 15.07.2024 at 19:08, Richard Sandiford <richard.sandif...@arm.com> wrote:
> 
> Richard Biener <rguent...@suse.de> writes:
>> The following adds a new --param for debugging the vectorizer's alignment
>> peeling by increasing the cost of aligned loads and stores.
>> 
>> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>> 
>> This makes the PR115843 testcase fail again on trunk (but not on the
>> branch), seemingly uncovering another backend issue.  It makes the
>> testcase get alignment peeling even with the zen4 costs fixed.
>> 
>> Any objection?
> 
> Seems ok to me.  Not sure I understand the mechanics of how this makes
> the testcase get alignment peeling though.  I'd assumed increasing the
> cost of aligned loads & stores would discourage peeling relative to
> unaligned accesses.

I guess it simulates the bug in the x86 backend where unaligned loads are
cheaper than aligned ones.  Unfortunately params are not signed integers, so we
can only bias with a positive value.  I also chose to bias aligned accesses
because for unaligned accesses the cost depends on the actual misalignment and
the exact mode of operation.
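
As a rough illustration of the arithmetic only, here is a standalone toy
model.  It is not vectorizer code: AccessCost, aligned_cost and
unaligned_cost are made-up names, the unit costs are invented, and the
peeling decision itself is not modelled.  It just shows that an unsigned
bias applied per copy to aligned accesses can only make them look more
expensive relative to unaligned ones.

```cpp
// Toy model only: AccessCost, aligned_cost and unaligned_cost are
// hypothetical names, not GCC interfaces, and the unit costs are made up.
struct AccessCost
{
  unsigned aligned;    // cost of one aligned vector access
  unsigned unaligned;  // cost of one unaligned vector access
};

// Same shape as the patch: the bias is added once per copy and only to
// aligned accesses.  It is unsigned, so it can only raise the cost.
constexpr unsigned
aligned_cost (AccessCost c, unsigned ncopies, unsigned bias)
{
  return (c.aligned + bias) * ncopies;
}

// Unaligned accesses are left alone by the bias.
constexpr unsigned
unaligned_cost (AccessCost c, unsigned ncopies)
{
  return c.unaligned * ncopies;
}

// A target where both kinds of access nominally cost the same.
constexpr AccessCost equal_costs = {1, 1};

// Without a bias the two costs are equal ...
static_assert (aligned_cost (equal_costs, 2, 0)
               == unaligned_cost (equal_costs, 2), "no bias: equal");
// ... with a bias of 100 and two copies the aligned cost is (1 + 100) * 2.
static_assert (aligned_cost (equal_costs, 2, 100) == 202, "biased cost");
static_assert (unaligned_cost (equal_costs, 2)
               < aligned_cost (equal_costs, 2, 100), "unaligned cheaper");
```

With the bias left at 0 nothing changes; any nonzero value tilts the
comparison towards unaligned accesses, which is the situation the x86 bug
produced.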

Richard 

> Thanks,
> Richard
> 
>> 
>>    * params.opt (--param=vect-aligned-ldst-cost-bias): New.
>>    * doc/invoke.texi (--param=vect-aligned-ldst-cost-bias): Document.
>>    * tree-vect-stmts.cc (vect_get_store_cost): Honor
>>    param_vect_aligned_ldst_cost_bias.
>>    (vect_get_load_cost): Likewise.
>> 
>>    * gcc.dg/vect/pr115843.c: Use it.
>> ---
>> gcc/doc/invoke.texi                  | 4 ++++
>> gcc/params.opt                       | 4 ++++
>> gcc/testsuite/gcc.dg/vect/pr115843.c | 1 +
>> gcc/tree-vect-stmts.cc               | 2 ++
>> 4 files changed, 11 insertions(+)
>> 
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 1360cae3986..e542cefbb4a 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -16914,6 +16914,10 @@ permit performing redundancy elimination after reload.
>> The maximum number of insns in loop header duplicated
>> by the copy loop headers pass.
>> 
>> +@item vect-aligned-ldst-cost-bias
>> +Bias to apply to the cost of aligned loads and stores.  This
>> +is useful for debugging only.
>> +
>> @item vect-epilogues-nomask
>> Enable loop epilogue vectorization using smaller vector size.
>> 
>> diff --git a/gcc/params.opt b/gcc/params.opt
>> index 3c4369fa052..5f86d564421 100644
>> --- a/gcc/params.opt
>> +++ b/gcc/params.opt
>> @@ -1166,6 +1166,10 @@ Use direct poisoning/unpoisoning instructions for variables smaller or equal to
>> Common Joined UInteger Var(param_use_canonical_types) Init(1) IntegerRange(0, 1) Param
>> Whether to use canonical types.
>> 
>> +-param=vect-aligned-ldst-cost-bias=
>> +Common Joined UInteger Var(param_vect_aligned_ldst_cost_bias) Init(0) Param Optimization
>> +Bias to apply to the cost of aligned loads and stores.
>> +
>> -param=vect-epilogues-nomask=
>> Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1) IntegerRange(0, 1) Param Optimization
>> Enable loop epilogue vectorization using smaller vector size.
>> diff --git a/gcc/testsuite/gcc.dg/vect/pr115843.c b/gcc/testsuite/gcc.dg/vect/pr115843.c
>> index 1b3fe277209..6701fa3499a 100644
>> --- a/gcc/testsuite/gcc.dg/vect/pr115843.c
>> +++ b/gcc/testsuite/gcc.dg/vect/pr115843.c
>> @@ -1,3 +1,4 @@
>> +/* { dg-additional-options "--param vect-aligned-ldst-cost-bias=100" } */
>> /* { dg-additional-options "-mavx512vl --param vect-partial-vector-usage=2" { target { avx512f_runtime && avx512vl } } } */
>> 
>> #include "tree-vect.h"
>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> index fc02e84b4b4..2502dbd5413 100644
>> --- a/gcc/tree-vect-stmts.cc
>> +++ b/gcc/tree-vect-stmts.cc
>> @@ -997,6 +997,7 @@ vect_get_store_cost (vec_info *, stmt_vec_info stmt_info, int ncopies,
>>    *inside_cost += record_stmt_cost (body_cost_vec, ncopies,
>>                      vector_store, stmt_info, 0,
>>                      vect_body);
>> +    *inside_cost += param_vect_aligned_ldst_cost_bias * ncopies;
>> 
>>         if (dump_enabled_p ())
>>           dump_printf_loc (MSG_NOTE, vect_location,
>> @@ -1049,6 +1050,7 @@ vect_get_load_cost (vec_info *, stmt_vec_info stmt_info, int ncopies,
>>       {
>>    *inside_cost += record_stmt_cost (body_cost_vec, ncopies, vector_load,
>>                      stmt_info, 0, vect_body);
>> +    *inside_cost += param_vect_aligned_ldst_cost_bias * ncopies;
>> 
>>         if (dump_enabled_p ())
>>           dump_printf_loc (MSG_NOTE, vect_location,
