Do not mark vector element broadcast resulting from OpenCL operations as 
having side effects for targets that have a suitable hardware operation, 
so that the `vec_duplicateM' standard RTL pattern can be directly used 
for them.  This does not happen currently, because any RTL pattern named 
`vec_duplicateM' cannot express the side effects presumed, and therefore 
a `!TREE_SIDE_EFFECTS' condition placed in `store_constructor' correctly 
prevents any `vec_duplicateM' pattern from being used.

This side-effect annotation was introduced for the C frontend with: 
<https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01075.html>, and then: 
<https://gcc.gnu.org/ml/gcc-patches/2011-08/msg00891.html> ("Scalar 
vector binary operation"), and finally landed with commit 0e3a99ae913c 
("c-typeck.c (scalar_to_vector): New function."), along with original 
support for scalar-to-vector OpenCL operations.

Support for these operations was then gradually added to C++ with: 
<https://gcc.gnu.org/ml/gcc-patches/2012-09/msg01557.html>, and then: 
<https://gcc.gnu.org/ml/gcc-patches/2012-10/msg00778.html> ("[C++] Mixed 
scalar-vector operations"), landing with commit a212e43fca22 ("re PR 
c++/54427 (Expose more vector extensions)"), followed by: 
<https://gcc.gnu.org/ml/gcc-patches/2012-10/msg01665.html>, and then: 
<https://gcc.gnu.org/ml/gcc-patches/2012-10/msg02253.html> ("[C++] 
Handle ?: for vectors"), landing with commit 93100c6b5b45 ("re PR 
c++/54427 (Expose more vector extensions)").

And side-effect annotation for C++ was retrofitted, again gradually, 
with: <https://gcc.gnu.org/ml/gcc-patches/2013-05/msg00254.html> ("[C++] 
Missing save_expr in vector-scalar ops"), landing with commit 
6698175d1591 ("typeck.c (cp_build_binary_op): Call save_expr before 
build_vector_from_val."), and then: 
<https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00233.html> ("vector 
conditional expression with scalar arguments"), landing with commit 
07298ffd6f7c ("call.c (build_conditional_expr_1): Handle the case with 
1 vector and 2 scalars.")

This was all long before we gained support for the `vec_duplicateM' 
standard RTL pattern, with commit be4c1d4a42c5 ("Add VEC_DUPLICATE_EXPR 
and associated optab") and it may have had sense for where the element 
broadcast operation was always open-coded.  However where provided by 
hardware the element broadcast operation is just like any other vector 
operation, presumably cheap, and whether to repeat it or not should be 
based on the cost of the operation and not hardwired.

So there is no point in artificially marking the operation as having a 
side effect for the lone purpose of preventing repetition.  This does 
not currently prevent targets like the Aarch64 and MIPS ones from using 
the hardware element broadcast operation, but it requires the backend to 
wire it explicitly via the `vec_initMN' standard RTL pattern rather than 
relying on the middle end to pick it up via the `vec_duplicateM' pattern 
automatically.  This change lifts this limitation.

        gcc/
        * c/c-typeck.cc (build_binary_op) Do not annotate vector element 
        broadcast operations as having a side effect if there is such an 
        operation provided by the backend.
        * cp/call.cc (build_conditional_expr): Likewise.
        * cp/typeck.cc (cp_build_binary_op): Likewise.
---
Hi,

 I've come across this while working on an out-of-tree microarchitecture, 
which does not require a `vec_initMN' RTL pattern, however has hardware 
support suitable for cheap `vec_duplicateM'.  I could not persuade the 
middle end to make use of the new `vec_duplicateM' pattern however until I 
removed the side-effect anbnotation.  So I think this is a move in the 
right direction even if we do not have support for said microarchitecture 
at the moment.

 For some reason I could not persuade our current Aarch64 code to use the 
`vec_duplicateM' pattern directly either.  This is because at the time 
`store_constructor' is called the vector mode for the element broadcast 
operation is say `V4SFmode' (for a vector of floats), however there is no 
corresponding `vec_duplicatev4sf' pattern.  Code in the `vec_initv4sfsf' 
pattern does the right thing though, referring to `aarch64_simd_dupv4sf' 
by hand instead.  There is a `vec_duplicatevnx4sf' pattern corresponding 
to `VNx4SFmode', but I was unable to figure out how to activate it.

 This change has passed regression-testing with the `aarch64-linux-gnu' 
target (with `-march=armv9-a') and QEMU in the user emulation mode across 
all the GCC languages and libraries.

 NB there are indentation issues with code surrounding places I have 
modified, but I have chosen not to address them so as not to obfuscate 
this change.  Likewise there are unnecessary compound statements used in 
the switch statements in the C++ frontend parts modified.

 Any questions or comments?  Otherwise OK once GCC 13 has opened?

  Maciej
---
 gcc/c/c-typeck.cc |    9 +++++++--
 gcc/cp/call.cc    |   19 +++++++++++++++----
 gcc/cp/typeck.cc  |    9 +++++++--
 3 files changed, 29 insertions(+), 8 deletions(-)

gcc-scalar-to-vector-no-side-effect.diff
Index: gcc/gcc/c/c-typeck.cc
===================================================================
--- gcc.orig/gcc/c/c-typeck.cc
+++ gcc/gcc/c/c-typeck.cc
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
 #include "gomp-constants.h"
 #include "spellcheck-tree.h"
 #include "gcc-rich-location.h"
+#include "optabs-query.h"
 #include "stringpool.h"
 #include "attribs.h"
 #include "asan.h"
@@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
               bool maybe_const = true;
               tree sc;
               sc = c_fully_fold (op0, false, &maybe_const);
-              sc = save_expr (sc);
+             if (optab_handler (vec_duplicate_optab,
+                                TYPE_MODE (type1)) == CODE_FOR_nothing)
+               sc = save_expr (sc);
               sc = convert (TREE_TYPE (type1), sc);
               op0 = build_vector_from_val (type1, sc);
               if (!maybe_const)
@@ -11938,7 +11941,9 @@ build_binary_op (location_t location, en
              bool maybe_const = true;
              tree sc;
              sc = c_fully_fold (op1, false, &maybe_const);
-             sc = save_expr (sc);
+             if (optab_handler (vec_duplicate_optab,
+                                TYPE_MODE (type0)) == CODE_FOR_nothing)
+               sc = save_expr (sc);
              sc = convert (TREE_TYPE (type0), sc);
              op1 = build_vector_from_val (type0, sc);
              if (!maybe_const)
Index: gcc/gcc/cp/call.cc
===================================================================
--- gcc.orig/gcc/cp/call.cc
+++ gcc/gcc/cp/call.cc
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.
 #include "langhooks.h"
 #include "c-family/c-objc.h"
 #include "internal-fn.h"
+#include "optabs-query.h"
 #include "stringpool.h"
 #include "attribs.h"
 #include "gcc-rich-location.h"
@@ -5435,12 +5436,18 @@ build_conditional_expr (const op_locatio
              return error_mark_node;
            }
 
+         bool have_vec_duplicate
+           = optab_handler (vec_duplicate_optab,
+                            TYPE_MODE (vtype)) != CODE_FOR_nothing;
+
          arg2 = cp_convert (stype, arg2, complain);
-         arg2 = save_expr (arg2);
+         if (!have_vec_duplicate)
+           arg2 = save_expr (arg2);
          arg2 = build_vector_from_val (vtype, arg2);
          arg2_type = vtype;
          arg3 = cp_convert (stype, arg3, complain);
-         arg3 = save_expr (arg3);
+         if (!have_vec_duplicate)
+           arg3 = save_expr (arg3);
          arg3 = build_vector_from_val (vtype, arg3);
          arg3_type = vtype;
        }
@@ -5458,7 +5465,9 @@ build_conditional_expr (const op_locatio
                return error_mark_node;
              case stv_firstarg:
                {
-                 arg2 = save_expr (arg2);
+                 if (optab_handler (vec_duplicate_optab,
+                                    TYPE_MODE (arg3_type)) == CODE_FOR_nothing)
+                   arg2 = save_expr (arg2);
                  arg2 = convert (TREE_TYPE (arg3_type), arg2);
                  arg2 = build_vector_from_val (arg3_type, arg2);
                  arg2_type = TREE_TYPE (arg2);
@@ -5466,7 +5475,9 @@ build_conditional_expr (const op_locatio
                }
              case stv_secondarg:
                {
-                 arg3 = save_expr (arg3);
+                 if (optab_handler (vec_duplicate_optab,
+                                    TYPE_MODE (arg2_type)) == CODE_FOR_nothing)
+                   arg3 = save_expr (arg3);
                  arg3 = convert (TREE_TYPE (arg2_type), arg3);
                  arg3 = build_vector_from_val (arg2_type, arg3);
                  arg3_type = TREE_TYPE (arg3);
Index: gcc/gcc/cp/typeck.cc
===================================================================
--- gcc.orig/gcc/cp/typeck.cc
+++ gcc/gcc/cp/typeck.cc
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.
 #include "c-family/c-objc.h"
 #include "c-family/c-ubsan.h"
 #include "gcc-rich-location.h"
+#include "optabs-query.h"
 #include "stringpool.h"
 #include "attribs.h"
 #include "asan.h"
@@ -5066,7 +5067,9 @@ cp_build_binary_op (const op_location_t
           case stv_firstarg:
             {
               op0 = convert (TREE_TYPE (type1), op0);
-             op0 = save_expr (op0);
+             if (optab_handler (vec_duplicate_optab,
+                                TYPE_MODE (type1)) == CODE_FOR_nothing)
+               op0 = save_expr (op0);
               op0 = build_vector_from_val (type1, op0);
               type0 = TREE_TYPE (op0);
               code0 = TREE_CODE (type0);
@@ -5076,7 +5079,9 @@ cp_build_binary_op (const op_location_t
           case stv_secondarg:
             {
               op1 = convert (TREE_TYPE (type0), op1);
-             op1 = save_expr (op1);
+             if (optab_handler (vec_duplicate_optab,
+                                TYPE_MODE (type0)) == CODE_FOR_nothing)
+               op1 = save_expr (op1);
               op1 = build_vector_from_val (type0, op1);
               type1 = TREE_TYPE (op1);
               code1 = TREE_CODE (type1);

Reply via email to