When diverting to VMAT_GATHER_SCATTER we fail to zero *poffset,
which was previously set when a load was classified as
VMAT_CONTIGUOUS_REVERSE.  The following refactors
get_group_load_store_type a bit to avoid this, but this all needs
some serious TLC.
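To make the stale-output-parameter issue concrete, here is a minimal,
self-contained C++ sketch (not GCC code; classify, classify_negative,
access_kind and force_gather are invented stand-ins for
get_group_load_store_type, get_negative_load_store_type and the VMAT_*
classification) showing the pattern the patch moves to: stage the
negative-stride offset in a local and commit it to *poffset only when
the final classification still needs it.

#include <cstdint>
#include <iostream>

enum class access_kind
{
  contiguous,
  contiguous_down,
  contiguous_reverse,
  gather_scatter
};

/* Hypothetical analogue of get_negative_load_store_type: classify a
   negative-stride access and compute the offset that goes with it.  */
static access_kind
classify_negative (int64_t *offset_out)
{
  *offset_out = -1;             /* some non-zero adjustment */
  return access_kind::contiguous_reverse;
}

/* Hypothetical analogue of get_group_load_store_type.  force_gather
   models the late check that diverts the access to gather/scatter.  */
static access_kind
classify (bool negative_stride, bool force_gather, int64_t *poffset)
{
  int64_t neg_ldst_offset = 0;  /* staged locally, as in the patch */
  access_kind kind = access_kind::contiguous;

  if (negative_stride)
    kind = classify_negative (&neg_ldst_offset);

  if (force_gather)
    kind = access_kind::gather_scatter;

  /* Commit the offset only for the classifications that use it.  Writing
     *poffset eagerly above would let the value survive a later diversion
     to gather/scatter.  */
  if (kind == access_kind::contiguous_down
      || kind == access_kind::contiguous_reverse)
    *poffset = neg_ldst_offset;

  return kind;
}

int
main ()
{
  int64_t offset = 0;
  access_kind kind = classify (/*negative_stride=*/true,
                               /*force_gather=*/true, &offset);
  /* With the staged local the offset stays zero here.  */
  std::cout << "kind=" << static_cast<int> (kind)
            << " offset=" << offset << "\n";
  return 0;
}

The design point is simply that the output parameter is written once,
after the final classification is known, rather than at each
intermediate classification step.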
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
Richard.
PR tree-optimization/117709
* tree-vect-stmts.cc (get_group_load_store_type): Only
set *poffset when we end up with VMAT_CONTIGUOUS_DOWN
or VMAT_CONTIGUOUS_REVERSE.
---
gcc/tree-vect-stmts.cc | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 752ee457f6d..522e9f7f90f 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2048,6 +2048,7 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
unsigned int group_size;
unsigned HOST_WIDE_INT gap;
bool single_element_p;
+ poly_int64 neg_ldst_offset = 0;
if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
{
first_stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
@@ -2105,7 +2106,8 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
/* ??? The VMAT_CONTIGUOUS_REVERSE code generation is
only correct for single element "interleaving" SLP. */
*memory_access_type = get_negative_load_store_type
- (vinfo, stmt_info, vectype, vls_type, 1, poffset);
+ (vinfo, stmt_info, vectype, vls_type, 1,
+ &neg_ldst_offset);
else
{
/* Try to use consecutive accesses of DR_GROUP_SIZE elements,
@@ -2375,6 +2377,10 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
masked_p, gs_info, elsvals))
*memory_access_type = VMAT_GATHER_SCATTER;
+ if (*memory_access_type == VMAT_CONTIGUOUS_DOWN
+ || *memory_access_type == VMAT_CONTIGUOUS_REVERSE)
+ *poffset = neg_ldst_offset;
+
if (*memory_access_type == VMAT_GATHER_SCATTER
|| *memory_access_type == VMAT_ELEMENTWISE
|| *memory_access_type == VMAT_STRIDED_SLP
--
2.43.0