Hi,
This patch fixes an ICE in testcase gcc.dg/vect/vect-ctor-1.c:
during GIMPLE pass: vect
dump file: vect-ctor-1.c.159t.vect
.../gcc.dg/vect/vect-ctor-1.c: In function 'intrapred_luma_16x16':
.../gcc.dg/vect/vect-ctor-1.c:9:6: internal compiler error: in
exact_div, at poly-int.h:2162
0xdf845f poly_int<1u, poly_result<unsigned long, if_nonpoly<unsigned
long, unsigned long, poly_int_traits<unsigned long>::is_poly>::type,
poly_coeff_pair_traits<unsigned long, if_nonpoly<unsigned long, unsigned
long, poly_int_traits<unsigned
long>::is_poly>::type>::result_kind>::type> exact_div<1u, unsigned long,
unsigned long>(poly_int_pod<1u, unsigned long> const&, unsigned long)
/scratch/astubbs/amd/src/gcc-mainline/gcc/poly-int.h:2162
0xdf649a poly_int<1u, poly_result<unsigned long, unsigned long,
poly_coeff_pair_traits<unsigned long, unsigned
long>::result_kind>::type> exact_div<1u, unsigned long, unsigned
long>(poly_int_pod<1u, unsigned long> const&, poly_int_pod<1u, unsigned
long> const&)
/scratch/astubbs/amd/src/gcc-mainline/gcc/poly-int.h:2175
0x1c473cd vect_get_num_vectors
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-vectorizer.h:1520
0x1c4bd35 vect_enhance_data_refs_alignment(_loop_vec_info*)
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-vect-data-refs.c:1798
0x1596732 vect_analyze_loop_2
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-vect-loop.c:2095
0x15980f3 vect_analyze_loop(loop*, vec_info_shared*)
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-vect-loop.c:2536
0x15d7b36 try_vectorize_loop_1
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-vectorizer.c:892
0x15d831f try_vectorize_loop
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-vectorizer.c:1044
0x15d84f9 vectorize_loops()
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-vectorizer.c:1125
0x144f0af execute
/scratch/astubbs/amd/src/gcc-mainline/gcc/tree-ssa-loop.c:414
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
The problem is that exact_div is being asked to do "8 / 64", which it
won't. The comment on the function says "NUNITS should be based on the
vectorization factor, so it is always a known multiple of the number of
elements in VECTYPE". This is on the amdgcn target where the
vectorization factor is always 64, but smaller tasks can be vectorized
using masking.
I think what's happening here is that the assumption described in the
comment is invalid in the presence of masked vectors.
The attached patch fixes the ICE in the testcase, but I suspect does not
go far enough. Can it happen that NUNITS can be greater than the
vectorization factor, but not a multiple? Is this even a valid fix in
the first place? Must it be conditionalized on masking being available?
Is the exactness even worth checking, in the presence of exceptions?
Thanks
Andrew
WIP Fix vect-ctor-1.c ICE
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 51a13f1d207..bf1c3eeda85 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1513,6 +1513,10 @@ vect_use_loop_mask_for_alignment_p (loop_vec_info loop_vinfo)
static inline unsigned int
vect_get_num_vectors (poly_uint64 nunits, tree vectype)
{
+ /* Masked vectors can cause partial vector use. */
+ if (known_lt (nunits, TYPE_VECTOR_SUBPARTS (vectype)))
+ return 1;
+
return exact_div (nunits, TYPE_VECTOR_SUBPARTS (vectype)).to_constant ();
}