apinski--- via Gcc-patches <[email protected]> writes:
> From: Andrew Pinski <[email protected]>
>
> This patch adds simple folding of __builtin_aarch64_im_lane_boundsi for
> the cases where we are not going to error out. It fixes the problem by
> removing the call from the IR.
>
> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-builtins.c (aarch64_fold_builtin_lane_check):
> New function.
> (aarch64_general_fold_builtin): Handle AARCH64_SIMD_BUILTIN_LANE_CHECK.
> (aarch64_general_gimple_fold_builtin): Likewise.
> ---
> gcc/config/aarch64/aarch64-builtins.c | 35 +++++++++++++++++++++++++++
> 1 file changed, 35 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
> index f6b41d9c200..d4414373aa4 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -29,6 +29,7 @@
> #include "rtl.h"
> #include "tree.h"
> #include "gimple.h"
> +#include "ssa.h"
> #include "memmodel.h"
> #include "tm_p.h"
> #include "expmed.h"
> @@ -2333,6 +2334,27 @@ aarch64_general_builtin_rsqrt (unsigned int fn)
> return NULL_TREE;
> }
>
> +/* Return true if the lane check can be removed, as no error will
> +   be emitted.  */
> +static bool
> +aarch64_fold_builtin_lane_check (tree arg0, tree arg1, tree arg2)
> +{
> + if (TREE_CODE (arg0) != INTEGER_CST)
> + return false;
> + if (TREE_CODE (arg1) != INTEGER_CST)
> + return false;
> + if (TREE_CODE (arg2) != INTEGER_CST)
> + return false;
> +
> + auto totalsize = wi::to_widest (arg0);
> + auto elementsize = wi::to_widest (arg1);
> + if (totalsize == 0 || elementsize == 0)
> + return false;
> + auto lane = wi::to_widest (arg2);
> + auto high = wi::udiv_trunc (totalsize, elementsize);
> + return wi::ltu_p (lane, high);
> +}
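FTR, the arguments here are (total vector size in bytes, element size
in bytes, lane index), as set up by the __AARCH64_LANE_CHECK macro in
arm_neon.h if I'm reading it correctly, so "high" is just the number
of lanes.  For illustration, a call like this (any lane intrinsic
would do) is the kind the fold now removes:

  #include <arm_neon.h>

  int32_t
  f (int32x4_t v)
  {
    /* The intrinsic expands to a __builtin_aarch64_im_lane_boundsi
       (16, 4, 2) check; lane 2 < 16/4 lanes, so the check can go.  */
    return vgetq_lane_s32 (v, 2);
  }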
> +
> #undef VAR1
> #define VAR1(T, N, MAP, FLAG, A) \
> case AARCH64_SIMD_BUILTIN_##T##_##N##A:
> @@ -2353,6 +2375,11 @@ aarch64_general_fold_builtin (unsigned int fcode, tree type,
> VAR1 (UNOP, floatv4si, 2, ALL, v4sf)
> VAR1 (UNOP, floatv2di, 2, ALL, v2df)
> return fold_build1 (FLOAT_EXPR, type, args[0]);
> + case AARCH64_SIMD_BUILTIN_LANE_CHECK:
> + if (n_args == 3
Do we need this check?  If it's safe to rely on the frontend's checking
for aarch64_general_gimple_fold_builtin then hopefully it's safe to rely
on it here too.  I think an assert would be better.
(FTR, we have extra checking for SVE because the overloaded functions
don't have definite prototypes.)
> + && aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
> + return fold_convert (void_type_node, integer_zero_node);
Could this just be void_node instead?  void_node is the VOID_CST
constant of type void_type_node.
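Combining that with the assert suggestion above, roughly (untested
sketch):

  case AARCH64_SIMD_BUILTIN_LANE_CHECK:
    gcc_assert (n_args == 3);
    if (aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
      return void_node;
    break;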
It would be good to add the testcase too, as a -fdump-tree-optimized
test that checks for a single instance of { = \*ptr}.
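Untested sketch of what I mean (the intrinsic is only a placeholder
and the exact pattern would need checking against the PR):

  /* { dg-do compile } */
  /* { dg-options "-O2 -fdump-tree-optimized" } */
  #include <arm_neon.h>

  uint8_t
  foo (uint8x8_t *ptr)
  {
    /* Once the lane-check calls are folded away, the two loads
       of *ptr should be CSEd into a single one.  */
    return vget_lane_u8 (*ptr, 7) + vget_lane_u8 (*ptr, 6);
  }

  /* { dg-final { scan-tree-dump-times { = \*ptr} 1 "optimized" } } */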
Thanks,
Richard