On Thu, Jan 27, 2022 at 1:16 PM Maciej W. Rozycki <ma...@embecosm.com> wrote:
>
> Do not mark vector element broadcast resulting from OpenCL operations as
> having side effects for targets that have a suitable hardware operation,
> so that the `vec_duplicateM' standard RTL pattern can be directly used
> for them.  This does not happen currently, because any RTL pattern named
> `vec_duplicateM' cannot express the side effects presumed, and therefore
> a `!TREE_SIDE_EFFECTS' condition placed in `store_constructor' correctly
> prevents any `vec_duplicateM' pattern from being used.
>
> This side-effect annotation was introduced for the C frontend with:
> <https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01075.html>, and then:
> <https://gcc.gnu.org/ml/gcc-patches/2011-08/msg00891.html> ("Scalar
> vector binary operation"), and finally landed with commit 0e3a99ae913c
> ("c-typeck.c (scalar_to_vector): New function."), along with original
> support for scalar-to-vector OpenCL operations.
>
> Support for these operations was then gradually added to C++ with:
> <https://gcc.gnu.org/ml/gcc-patches/2012-09/msg01557.html>, and then:
> <https://gcc.gnu.org/ml/gcc-patches/2012-10/msg00778.html> ("[C++] Mixed
> scalar-vector operations"), landing with commit a212e43fca22 ("re PR
> c++/54427 (Expose more vector extensions)"), followed by:
> <https://gcc.gnu.org/ml/gcc-patches/2012-10/msg01665.html>, and then:
> <https://gcc.gnu.org/ml/gcc-patches/2012-10/msg02253.html> ("[C++]
> Handle ?: for vectors"), landing with commit 93100c6b5b45 ("re PR
> c++/54427 (Expose more vector extensions)").
>
> And side-effect annotation for C++ was retrofitted, again gradually,
> with: <https://gcc.gnu.org/ml/gcc-patches/2013-05/msg00254.html> ("[C++]
> Missing save_expr in vector-scalar ops"), landing with commit
> 6698175d1591 ("typeck.c (cp_build_binary_op): Call save_expr before
> build_vector_from_val."), and then:
> <https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00233.html> ("vector
> conditional expression with scalar arguments"), landing with commit
> 07298ffd6f7c ("call.c (build_conditional_expr_1): Handle the case with
> 1 vector and 2 scalars.")
>
> This was all long before we gained support for the `vec_duplicateM'
> standard RTL pattern, with commit be4c1d4a42c5 ("Add VEC_DUPLICATE_EXPR
> and associated optab") and it may have had sense for where the element
> broadcast operation was always open-coded.  However where provided by
> hardware the element broadcast operation is just like any other vector
> operation, presumably cheap, and whether to repeat it or not should be
> based on the cost of the operation and not hardwired.
>
> So there is no point in artificially marking the operation as having a
> side effect for the lone purpose of preventing repetition.  This does
> not currently prevent targets like the Aarch64 and MIPS ones from using
> the hardware element broadcast operation, but it requires the backend to
> wire it explicitly via the `vec_initMN' standard RTL pattern rather than
> relying on the middle end to pick it up via the `vec_duplicateM' pattern
> automatically.  This change lifts this limitation.
>
>         gcc/
>         * c/c-typeck.cc (build_binary_op) Do not annotate vector element
>         broadcast operations as having a side effect if there is such an
>         operation provided by the backend.
>         * cp/call.cc (build_conditional_expr): Likewise.
>         * cp/typeck.cc (cp_build_binary_op): Likewise.
> ---
> Hi,
>
>  I've come across this while working on an out-of-tree microarchitecture,
> which does not require a `vec_initMN' RTL pattern, however has hardware
> support suitable for cheap `vec_duplicateM'.  I could not persuade the
> middle end to make use of the new `vec_duplicateM' pattern however until I
> removed the side-effect anbnotation.  So I think this is a move in the
> right direction even if we do not have support for said microarchitecture
> at the moment.
>
>  For some reason I could not persuade our current Aarch64 code to use the
> `vec_duplicateM' pattern directly either.  This is because at the time
> `store_constructor' is called the vector mode for the element broadcast
> operation is say `V4SFmode' (for a vector of floats), however there is no
> corresponding `vec_duplicatev4sf' pattern.  Code in the `vec_initv4sfsf'
> pattern does the right thing though, referring to `aarch64_simd_dupv4sf'
> by hand instead.  There is a `vec_duplicatevnx4sf' pattern corresponding
> to `VNx4SFmode', but I was unable to figure out how to activate it.
>
>  This change has passed regression-testing with the `aarch64-linux-gnu'
> target (with `-march=armv9-a') and QEMU in the user emulation mode across
> all the GCC languages and libraries.
>
>  NB there are indentation issues with code surrounding places I have
> modified, but I have chosen not to address them so as not to obfuscate
> this change.  Likewise there are unnecessary compound statements used in
> the switch statements in the C++ frontend parts modified.
>
>  Any questions or comments?  Otherwise OK once GCC 13 has opened?
>
>   Maciej
> ---
>  gcc/c/c-typeck.cc |    9 +++++++--
>  gcc/cp/call.cc    |   19 +++++++++++++++----
>  gcc/cp/typeck.cc  |    9 +++++++--
>  3 files changed, 29 insertions(+), 8 deletions(-)
>
> gcc-scalar-to-vector-no-side-effect.diff
> Index: gcc/gcc/c/c-typeck.cc
> ===================================================================
> --- gcc.orig/gcc/c/c-typeck.cc
> +++ gcc/gcc/c/c-typeck.cc
> @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
>  #include "gomp-constants.h"
>  #include "spellcheck-tree.h"
>  #include "gcc-rich-location.h"
> +#include "optabs-query.h"
>  #include "stringpool.h"
>  #include "attribs.h"
>  #include "asan.h"
> @@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
>                bool maybe_const = true;
>                tree sc;
>                sc = c_fully_fold (op0, false, &maybe_const);
> -              sc = save_expr (sc);
> +             if (optab_handler (vec_duplicate_optab,
> +                                TYPE_MODE (type1)) == CODE_FOR_nothing)
> +               sc = save_expr (sc);

This doesn't make much sense - I suppose the CONSTRUCTOR retains
TREE_SIDE_EFFECTS but such flag has no meaning on GIMPLE
and thus should have been cleared during gimplification or in the end
ignored by RTL expansion.

You do not add a testcase so it's hard to see what goes wrong where exactly.

Richard.

>                sc = convert (TREE_TYPE (type1), sc);
>                op0 = build_vector_from_val (type1, sc);
>                if (!maybe_const)
> @@ -11938,7 +11941,9 @@ build_binary_op (location_t location, en
>               bool maybe_const = true;
>               tree sc;
>               sc = c_fully_fold (op1, false, &maybe_const);
> -             sc = save_expr (sc);
> +             if (optab_handler (vec_duplicate_optab,
> +                                TYPE_MODE (type0)) == CODE_FOR_nothing)
> +               sc = save_expr (sc);
>               sc = convert (TREE_TYPE (type0), sc);
>               op1 = build_vector_from_val (type0, sc);
>               if (!maybe_const)
> Index: gcc/gcc/cp/call.cc
> ===================================================================
> --- gcc.orig/gcc/cp/call.cc
> +++ gcc/gcc/cp/call.cc
> @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.
>  #include "langhooks.h"
>  #include "c-family/c-objc.h"
>  #include "internal-fn.h"
> +#include "optabs-query.h"
>  #include "stringpool.h"
>  #include "attribs.h"
>  #include "gcc-rich-location.h"
> @@ -5435,12 +5436,18 @@ build_conditional_expr (const op_locatio
>               return error_mark_node;
>             }
>
> +         bool have_vec_duplicate
> +           = optab_handler (vec_duplicate_optab,
> +                            TYPE_MODE (vtype)) != CODE_FOR_nothing;
> +
>           arg2 = cp_convert (stype, arg2, complain);
> -         arg2 = save_expr (arg2);
> +         if (!have_vec_duplicate)
> +           arg2 = save_expr (arg2);
>           arg2 = build_vector_from_val (vtype, arg2);
>           arg2_type = vtype;
>           arg3 = cp_convert (stype, arg3, complain);
> -         arg3 = save_expr (arg3);
> +         if (!have_vec_duplicate)
> +           arg3 = save_expr (arg3);
>           arg3 = build_vector_from_val (vtype, arg3);
>           arg3_type = vtype;
>         }
> @@ -5458,7 +5465,9 @@ build_conditional_expr (const op_locatio
>                 return error_mark_node;
>               case stv_firstarg:
>                 {
> -                 arg2 = save_expr (arg2);
> +                 if (optab_handler (vec_duplicate_optab,
> +                                    TYPE_MODE (arg3_type)) == 
> CODE_FOR_nothing)
> +                   arg2 = save_expr (arg2);
>                   arg2 = convert (TREE_TYPE (arg3_type), arg2);
>                   arg2 = build_vector_from_val (arg3_type, arg2);
>                   arg2_type = TREE_TYPE (arg2);
> @@ -5466,7 +5475,9 @@ build_conditional_expr (const op_locatio
>                 }
>               case stv_secondarg:
>                 {
> -                 arg3 = save_expr (arg3);
> +                 if (optab_handler (vec_duplicate_optab,
> +                                    TYPE_MODE (arg2_type)) == 
> CODE_FOR_nothing)
> +                   arg3 = save_expr (arg3);
>                   arg3 = convert (TREE_TYPE (arg2_type), arg3);
>                   arg3 = build_vector_from_val (arg2_type, arg3);
>                   arg3_type = TREE_TYPE (arg3);
> Index: gcc/gcc/cp/typeck.cc
> ===================================================================
> --- gcc.orig/gcc/cp/typeck.cc
> +++ gcc/gcc/cp/typeck.cc
> @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.
>  #include "c-family/c-objc.h"
>  #include "c-family/c-ubsan.h"
>  #include "gcc-rich-location.h"
> +#include "optabs-query.h"
>  #include "stringpool.h"
>  #include "attribs.h"
>  #include "asan.h"
> @@ -5066,7 +5067,9 @@ cp_build_binary_op (const op_location_t
>            case stv_firstarg:
>              {
>                op0 = convert (TREE_TYPE (type1), op0);
> -             op0 = save_expr (op0);
> +             if (optab_handler (vec_duplicate_optab,
> +                                TYPE_MODE (type1)) == CODE_FOR_nothing)
> +               op0 = save_expr (op0);
>                op0 = build_vector_from_val (type1, op0);
>                type0 = TREE_TYPE (op0);
>                code0 = TREE_CODE (type0);
> @@ -5076,7 +5079,9 @@ cp_build_binary_op (const op_location_t
>            case stv_secondarg:
>              {
>                op1 = convert (TREE_TYPE (type0), op1);
> -             op1 = save_expr (op1);
> +             if (optab_handler (vec_duplicate_optab,
> +                                TYPE_MODE (type0)) == CODE_FOR_nothing)
> +               op1 = save_expr (op1);
>                op1 = build_vector_from_val (type0, op1);
>                type1 = TREE_TYPE (op1);
>                code1 = TREE_CODE (type1);

Reply via email to