The support for the new boolean reduction optabs didn't quite work for VLA
because the code later on insists on the target still having a shift-and-insert
optab.
However, this is not needed if the target can do the reduction using the new
optabs. This change makes it an OR, not an AND, requirement to have
shift-and-insert and the reduc sbool optabs.
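As a rough illustration of the logic change only, the check can be modelled with plain booleans. This is a minimal sketch, not the GCC code: the struct, its field names and the two helper functions are hypothetical stand-ins for the expressions in the hunk below, and it assumes that the guarded block rejects the reduction.

```cpp
// Hypothetical model of the inputs to the check in vectorizable_reduction.
// Each field stands in for one expression from the patched condition.
struct ReductionQuery {
  bool needs_neutral_fill;   // double_reduc || neutral_op
  bool nunits_is_constant;   // nunits_out.is_constant ()
  bool has_reduc_fn;         // reduc_fn != IFN_LAST
  bool is_boolean_vector;    // VECTOR_BOOLEAN_TYPE_P (vectype_out)
  bool has_vec_shl_insert;   // IFN_VEC_SHL_INSERT supported for vectype_out
};

// Old condition: a VLA reduction needing a neutral fill is rejected whenever
// shift-and-insert is missing.
bool rejects_before(const ReductionQuery &q) {
  return q.needs_neutral_fill
         && !q.nunits_is_constant
         && !q.has_vec_shl_insert;
}

// New condition: the shift-and-insert requirement is skipped when a
// reduction function is available for a boolean vector type.
bool rejects_after(const ReductionQuery &q) {
  return q.needs_neutral_fill
         && !q.nunits_is_constant
         && (!q.has_reduc_fn || !q.is_boolean_vector)
         && !q.has_vec_shl_insert;
}
```

In this model, a VLA boolean reduction with a reduction function but no shift-and-insert is rejected by the old check and accepted by the new one; non-boolean reductions behave as before.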
Bootstrapped and regtested on aarch64-none-linux-gnu,
arm-none-linux-gnueabihf and x86_64-pc-linux-gnu
(-m32, -m64) with no issues.
Any objections for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-loop.cc (vectorizable_reduction): Don't require
IFN_VEC_SHL_INSERT when using reduc sbool optabs.
---
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 6c2420249718237c4f70720b2bd03d4951bd8a5d..ca9ff35c0bb8d8c4a161a922bae9c19028492b66 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -7565,6 +7565,7 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
values into the low-numbered elements. */
if ((double_reduc || neutral_op)
&& !nunits_out.is_constant ()
+ && (reduc_fn == IFN_LAST || !VECTOR_BOOLEAN_TYPE_P (vectype_out))
&& !direct_internal_fn_supported_p (IFN_VEC_SHL_INSERT,
vectype_out, OPTIMIZE_FOR_SPEED))
{