Richard Sandiford <richard.sandif...@arm.com> writes:
> Richard Biener <richard.guent...@gmail.com> writes:
>> On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>>Richard Biener <richard.guent...@gmail.com> writes:
>>>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>>>>Richard Biener <richard.guent...@gmail.com> writes:
>>>>>>>>>The AArch64 port emits an error if calls pass values of SVE type to
>>>>>>>>>an unprototyped function.  To do that we need to know whether the
>>>>>>>>>value really is an SVE type rather than a plain vector.
>>>>>>>>>
>>>>>>>>>For varargs the ABI is the same for 256 bits+.  But we'll have the
>>>>>>>>>same problem there once we support -msve-vector-bits=128, since the
>>>>>>>>>layouts of SVE and Advanced SIMD vectors differ for big-endian.
>>>>>>>>
>>>>>>>> But then why don't you have different modes?
>>>>>>>
>>>>>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
>>>>>>>difference.  But from a vector value POV, a vector of 4 ints is a
>>>>>>>vector of 4 ints, so even distinguishing based on the mode is
>>>>>>>artificial.
>>>>>>
>>>>>> True. 
>>>>>>
>>>>>>>SVE is AFAIK the first target to have different modes for
>>>>>>>potentially the "same" vector type, and I had to add new
>>>>>>>infrastructure to allow targets to define multiple modes of the
>>>>>>>same size.  So the fact that gimple distinguishes otherwise
>>>>>>>identical vectors based on mode is a relatively recent thing.
>>>>>>>AFAIK it just fell out in the wash rather than being deliberately
>>>>>>>planned.  It happens to be convenient in this context, but it
>>>>>>>hasn't been important until now.
>>>>>>>
>>>>>>>The hook doesn't seem any worse than distinguishing based on the
>>>>>>>mode.  Another way to avoid this would have been to define separate
>>>>>>>SVE modes for the predefined vectors.  The big downside of that is
>>>>>>>that we'd end up doubling the number of SVE patterns.
>>>>>>>
>>>>>>>Extra on-the-side metadata is going to be easy to drop accidentally,
>>>>>>>and this is something we need for correctness rather than
>>>>>>>optimisation.
>>>>>>
>>>>>> Still selecting the ABI during call expansion only and based on
>>>>>> value types at that point is fragile.
>>>>>
>>>>>Agreed.  But it's fragile in general, not just for this case.
>>>>>Changing something as fundamental as that would be a lot of work
>>>>>and seems likely to introduce accidental ABI breakage.
>>>>>
>>>>>> The frontends are in charge of specifying the actual argument type,
>>>>>> and at that point the target may fix the ABI.  The ABI can be
>>>>>> recorded in the call's fntype, either via its TYPE_ARG_TYPES or in
>>>>>> more awkward ways for varargs functions (in full generality that
>>>>>> would mean attaching varargs ABI meta to each call).
>>>>>>
>>>>>> The alternative is to have an actual argument type vector
>>>>>> associated with each call.
>>>>>
>>>>>I think multiple pieces of gimple code would then have to cope with
>>>>>that as a special case.  E.g. if:
>>>>>
>>>>>   void foo (int, ...);
>>>>>
>>>>>   type1 a;
>>>>>   b = VIEW_CONVERT_EXPR<type2> (a);
>>>>>   if (a)
>>>>>     foo (1, a);
>>>>>   else
>>>>>     foo (1, b);
>>>>>
>>>>>gets converted to:
>>>>>
>>>>>   if (a)
>>>>>     foo (1, a);
>>>>>   else
>>>>>     foo (1, a);
>>>>>
>>>>>on the basis that type1 and type2 are "the same" despite having
>>>>>different calling conventions, we have to be sure that the calls
>>>>>are not treated as equivalent:
>>>>>
>>>>>   foo (1, a);
>>>>>
>>>>>Things like IPA clones would also need to handle this specially.
>>>>>Anything that generates new calls based on old ones will need
>>>>>to copy this information too.
>>>>>
>>>>>This also sounds like it would be fragile and seems a bit too
>>>>>invasive for stage 3.
>>>>
>>>> But we are already relying on this to work (fntype non-propagation)
>>>> because function pointer conversions are dropped on the floor.
>>>>
>>>> The real change would be introducing (per call) fntype for calls to
>>>> unprototyped functions and somehow dealing with varargs.
>>>
>>>It looks like this itself relies on useless_type_conversion_p,
>>>is that right?  E.g. we have things like:
>>>
>>>bool
>>>func_checker::compare_gimple_call (gcall *s1, gcall *s2)
>>>{
>>>  ...
>>>  tree fntype1 = gimple_call_fntype (s1);
>>>  tree fntype2 = gimple_call_fntype (s2);
>>>  if ((fntype1 && !fntype2)
>>>      || (!fntype1 && fntype2)
>>>      || (fntype1 && !types_compatible_p (fntype1, fntype2)))
>>>    return return_false_with_msg ("call function types are not compatible");
>>>
>>>and useless_type_conversion_p has:
>>>
>>>  else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
>>>            || TREE_CODE (inner_type) == METHOD_TYPE)
>>>           && TREE_CODE (inner_type) == TREE_CODE (outer_type))
>>>    {
>>>      tree outer_parm, inner_parm;
>>>
>>>      /* If the return types are not compatible bail out.  */
>>>      if (!useless_type_conversion_p (TREE_TYPE (outer_type),
>>>                                      TREE_TYPE (inner_type)))
>>>        return false;
>>>
>>>      /* Method types should belong to a compatible base class.  */
>>>      if (TREE_CODE (inner_type) == METHOD_TYPE
>>>          && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
>>>                                         TYPE_METHOD_BASETYPE (inner_type)))
>>>        return false;
>>>
>>>      /* A conversion to an unprototyped argument list is ok.  */
>>>      if (!prototype_p (outer_type))
>>>        return true;
>>>
>>>      /* If the unqualified argument types are compatible the conversion
>>>         is useless.  */
>>>      if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
>>>        return true;
>>>
>>>      for (outer_parm = TYPE_ARG_TYPES (outer_type),
>>>           inner_parm = TYPE_ARG_TYPES (inner_type);
>>>           outer_parm && inner_parm;
>>>           outer_parm = TREE_CHAIN (outer_parm),
>>>           inner_parm = TREE_CHAIN (inner_parm))
>>>        if (!useless_type_conversion_p
>>>              (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
>>>               TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
>>>          return false;
>>>
>>>So it looks like we'd still need to distinguish the vector types in
>>>useless_type_conversion_p even if we went the fntype route.  The
>>>difference is that the fntype route would give us the option of only
>>>distinguishing the vectors for return and argument types and not in
>>>general.
>>>
>>>But if we are going to have to distinguish the vectors here anyway
>>>in some form, could we go with the patch as-is for stage 3 and leave
>>>restricting this to just return and argument types as a follow-on
>>>optimisation?
>>
>> How does this get around the LTO canonical type merging machinery?  That
>> is, how are those types streamed and how are they identified by the
>> backend?  Just by means of being pointer-equal to some statically built
>> type in the backend?
>> Or does the type have some attribute on it or on the component?  How does
>> the middle end build a related type with the same ABI, like a vector with
>> half the number of elements?
>
> Hmm...
>
> At the moment it's based on pointer equality between the TYPE_MAIN_VARIANT
> and statically-built types.  We predefine the only available SVE "ABI types"
> and there's no way to create "new" ones.
>
> But you're right that that doesn't work for LTO -- in general, not just
> for this conversion patch -- because no streamed types end up as ABI types.
> So we'll need an attribute after all, with the ABI decisions keyed off that
> rather than TYPE_MAIN_VARIANT pointer equality.  Will fix...

Now fixed :-)
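
For reference, the fix keys the ABI decision off an attribute on the
predefined types rather than TYPE_MAIN_VARIANT pointer equality.  Below is
a minimal sketch of the idea; the attribute name and helper are
illustrative rather than the exact committed code:

  /* Sketch only: treat TYPE as an SVE ABI type if it carries the
     (assumed) "SVE type" attribute.  The attribute is streamed with the
     type, so LTO-imported copies keep their ABI identity, unlike a check
     that compares TYPE_MAIN_VARIANT against statically-built types.  */
  static bool
  aarch64_sve_abi_type_p (tree type)
  {
    return lookup_attribute ("SVE type", TYPE_ATTRIBUTES (type)) != NULL_TREE;
  }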

> Once that's fixed, the fact that we use SET_TYPE_STRUCTURAL_EQUALITY
> for the ABI types means that the types remain distinct from "normal"
> vector types even for TYPE_CANONICAL purposes, since:
>
>      As a special case, if TYPE_CANONICAL is NULL_TREE, and thus
>      TYPE_STRUCTURAL_EQUALITY_P is true, then it cannot
>      be used for comparison against other types.  Instead, the type is
>      said to require structural equality checks, described in
>      TYPE_STRUCTURAL_EQUALITY_P.
>      [...]
>   #define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type_common.canonical)
>   /* Indicates that the type node requires structural equality
>      checks.  The compiler will need to look at the composition of the
>      type to determine whether it is equal to another type, rather than
>      just comparing canonical type pointers.  For instance, we would need
>      to look at the return and parameter types of a FUNCTION_TYPE
>      node.  */
>   #define TYPE_STRUCTURAL_EQUALITY_P(NODE) (TYPE_CANONICAL (NODE) == NULL_TREE)
>
> We also have:
>
> /* Return true if get_alias_set cares about TYPE_CANONICAL of given type.
>    We don't define the types for pointers, arrays and vectors.  The reason is
>    that pointers are handled specially: ptr_type_node accesses conflict with
>    accesses to all other pointers.  This is done by alias.c.
>    Because alias sets of arrays and vectors are the same as types of their
>    elements, we can't compute canonical type either.  Otherwise we could go
>    from void *[10] to int *[10] (because they are equivalent for canonical
>    type machinery) and get wrong TBAA.  */
>
> inline bool
> canonical_type_used_p (const_tree t)
> {
>   return !(POINTER_TYPE_P (t)
>          || TREE_CODE (t) == ARRAY_TYPE
>          || TREE_CODE (t) == VECTOR_TYPE);
> }
>
> So with the attribute added (needed anyway), the patch does seem to
> work for LTO too.

Given the above, is the patch OK?  I agree it isn't very elegant,
but at the moment we have no choice but to distinguish the vector
types at some point during gimple.
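
To make the correctness requirement concrete, here is a hand-written
illustration (not part of the patch): with -msve-vector-bits=256, the two
argument types below have the same mode and element type, but different
calling conventions, which is what the new tests check:

  #include <arm_sve.h>

  typedef int8_t gnu_int8x32_t __attribute__((vector_size (32)));

  void sve_callee (svint8_t);       /* Passed in SVE register z0.  */
  void gnu_callee (gnu_int8x32_t);  /* Passed by reference, with the
                                       pointer in x0.  */

If gimple treated the two types as interchangeable, it could substitute
one for the other across a VIEW_CONVERT_EXPR and miscompile the calls.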

Thanks,
Richard


2020-01-07  Richard Sandiford  <richard.sandif...@arm.com>

gcc/
        * target.def (compatible_vector_types_p): New target hook.
        * hooks.h (hook_bool_const_tree_const_tree_true): Declare.
        * hooks.c (hook_bool_const_tree_const_tree_true): New function.
        * doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
        * doc/tm.texi: Regenerate.
        * gimple-expr.c: Include target.h.
        (useless_type_conversion_p): Use targetm.compatible_vector_types_p.
        * config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
        function.
        (TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
        * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
        Use the original predicate if it already has a suitable type.

gcc/testsuite/
        * gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
        * gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.

Index: gcc/target.def
===================================================================
--- gcc/target.def      2020-01-06 12:57:55.753930730 +0000
+++ gcc/target.def      2020-01-07 10:24:01.546344751 +0000
@@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
  hook_bool_mode_false)
 
 DEFHOOK
+(compatible_vector_types_p,
+ "Return true if there is no target-specific reason for treating\n\
+vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
+has already checked for target-independent reasons, meaning that the\n\
+types are known to have the same mode, to have the same number of elements,\n\
+and to have what the caller considers to be compatible element types.\n\
+\n\
+The main reason for defining this hook is to reject pairs of types\n\
+that are handled differently by the target's calling convention.\n\
+For example, when a new @var{N}-bit vector architecture is added\n\
+to a target, the target may want to handle normal @var{N}-bit\n\
+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
+before, to maintain backwards compatibility.  However, it may also\n\
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
+and returned in a more efficient way.  It is then important to maintain\n\
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
+architecture-specific ones.\n\
+\n\
+The default implementation returns true, which is correct for most targets.",
+ bool, (const_tree type1, const_tree type2),
+ hook_bool_const_tree_const_tree_true)
+
+DEFHOOK
 (vector_alignment,
  "This hook can be used to define the alignment for a vector of type\n\
 @var{type}, in order to comply with a platform ABI.  The default is to\n\
Index: gcc/hooks.h
===================================================================
--- gcc/hooks.h 2020-01-06 12:57:54.749937335 +0000
+++ gcc/hooks.h 2020-01-07 10:24:01.542344777 +0000
@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
 extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
 extern bool hook_bool_tree_false (tree);
 extern bool hook_bool_const_tree_false (const_tree);
+extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
 extern bool hook_bool_tree_true (tree);
 extern bool hook_bool_const_tree_true (const_tree);
 extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
Index: gcc/hooks.c
===================================================================
--- gcc/hooks.c 2020-01-06 12:57:54.745937361 +0000
+++ gcc/hooks.c 2020-01-07 10:24:01.542344777 +0000
@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
 }
 
 bool
+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
+{
+  return true;
+}
+
+bool
 hook_bool_tree_true (tree)
 {
   return true;
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in  2020-01-06 12:57:53.657944518 +0000
+++ gcc/doc/tm.texi.in  2020-01-07 10:24:01.542344777 +0000
@@ -3365,6 +3365,8 @@ stack.
 
 @hook TARGET_VECTOR_MODE_SUPPORTED_P
 
+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
+
 @hook TARGET_ARRAY_MODE
 
 @hook TARGET_ARRAY_MODE_SUPPORTED_P
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi     2020-01-06 12:57:53.649944570 +0000
+++ gcc/doc/tm.texi     2020-01-07 10:24:01.542344777 +0000
@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
 must have move patterns for this mode.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
+Return true if there is no target-specific reason for treating
+vector types @var{type1} and @var{type2} as distinct types.  The caller
+has already checked for target-independent reasons, meaning that the
+types are known to have the same mode, to have the same number of elements,
+and to have what the caller considers to be compatible element types.
+
+The main reason for defining this hook is to reject pairs of types
+that are handled differently by the target's calling convention.
+For example, when a new @var{N}-bit vector architecture is added
+to a target, the target may want to handle normal @var{N}-bit
+@code{VECTOR_TYPE} arguments and return values in the same way as
+before, to maintain backwards compatibility.  However, it may also
+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
+and returned in a more efficient way.  It is then important to maintain
+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
+architecture-specific ones.
+
+The default implementation returns true, which is correct for most targets.
+@end deftypefn
+
 @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
 Return the mode that GCC should use for an array that has
 @var{nelems} elements, with each element having mode @var{mode}.
Index: gcc/gimple-expr.c
===================================================================
--- gcc/gimple-expr.c   2020-01-06 12:58:10.545833431 +0000
+++ gcc/gimple-expr.c   2020-01-07 10:24:01.542344777 +0000
@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
 #include "tree-pass.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "target.h"
 
 /* ----- Type related -----  */
 
@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
 
   /* Recurse for vector types with the same number of subparts.  */
   else if (TREE_CODE (inner_type) == VECTOR_TYPE
-          && TREE_CODE (outer_type) == VECTOR_TYPE
-          && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
-    return useless_type_conversion_p (TREE_TYPE (outer_type),
-                                     TREE_TYPE (inner_type));
+          && TREE_CODE (outer_type) == VECTOR_TYPE)
+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
+                     TYPE_VECTOR_SUBPARTS (outer_type))
+           && useless_type_conversion_p (TREE_TYPE (outer_type),
+                                         TREE_TYPE (inner_type))
+           && targetm.compatible_vector_types_p (inner_type, outer_type));
 
   else if (TREE_CODE (inner_type) == ARRAY_TYPE
           && TREE_CODE (outer_type) == ARRAY_TYPE)
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c        2020-01-07 10:18:06.572651552 +0000
+++ gcc/config/aarch64/aarch64.c        2020-01-07 10:24:01.538344801 +0000
@@ -2098,6 +2098,15 @@ aarch64_fntype_abi (const_tree fntype)
   return default_function_abi;
 }
 
+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
+
+static bool
+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
+{
+  return (aarch64_sve::builtin_type_p (type1)
+         == aarch64_sve::builtin_type_p (type2));
+}
+
 /* Return true if we should emit CFI for register REGNO.  */
 
 static bool
@@ -22099,6 +22108,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
 
+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
+#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
+
 #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   aarch64_builtin_support_vector_misalignment
Index: gcc/config/aarch64/aarch64-sve-builtins.cc
===================================================================
--- gcc/config/aarch64/aarch64-sve-builtins.cc  2020-01-07 10:21:17.575410530 +0000
+++ gcc/config/aarch64/aarch64-sve-builtins.cc  2020-01-07 10:24:01.534344828 +0000
@@ -2265,9 +2265,13 @@ tree
 gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
                             unsigned int argno)
 {
-  tree predtype = truth_type_for (vectype);
   tree pred = gimple_call_arg (call, argno);
-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
+               TYPE_VECTOR_SUBPARTS (vectype)))
+    return pred;
+
+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
+                      truth_type_for (vectype), pred);
 }
 
 /* Return a pointer to the address in a contiguous load or store,
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
===================================================================
--- /dev/null   2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c    2020-01-07 10:24:01.546344751 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (float16x16_t);
+void float32_callee (float32x8_t);
+void float64_callee (float64x4_t);
+void int8_callee (int8x32_t);
+void int16_callee (int16x16_t);
+void int32_callee (int32x8_t);
+void int64_callee (int64x4_t);
+void uint8_callee (uint8x32_t);
+void uint16_callee (uint16x16_t);
+void uint32_callee (uint32x8_t);
+void uint64_callee (uint64x4_t);
+
+void
+float16_caller (void)
+{
+  float16_callee (svdup_f16 (1.0));
+}
+
+void
+float32_caller (void)
+{
+  float32_callee (svdup_f32 (2.0));
+}
+
+void
+float64_caller (void)
+{
+  float64_callee (svdup_f64 (3.0));
+}
+
+void
+int8_caller (void)
+{
+  int8_callee (svindex_s8 (0, 1));
+}
+
+void
+int16_caller (void)
+{
+  int16_callee (svindex_s16 (0, 2));
+}
+
+void
+int32_caller (void)
+{
+  int32_callee (svindex_s32 (0, 3));
+}
+
+void
+int64_caller (void)
+{
+  int64_callee (svindex_s64 (0, 4));
+}
+
+void
+uint8_caller (void)
+{
+  uint8_callee (svindex_u8 (1, 1));
+}
+
+void
+uint16_caller (void)
+{
+  uint16_callee (svindex_u16 (1, 2));
+}
+
+void
+uint32_caller (void)
+{
+  uint32_callee (svindex_u32 (1, 3));
+}
+
+void
+uint64_caller (void)
+{
+  uint64_callee (svindex_u64 (1, 4));
+}
+
+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
===================================================================
--- /dev/null   2019-09-17 11:41:18.176664108 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c    2020-01-07 10:24:01.546344751 +0000
@@ -0,0 +1,99 @@
+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
+
+#include <arm_sve.h>
+
+typedef float16_t float16x16_t __attribute__((vector_size (32)));
+typedef float32_t float32x8_t __attribute__((vector_size (32)));
+typedef float64_t float64x4_t __attribute__((vector_size (32)));
+typedef int8_t int8x32_t __attribute__((vector_size (32)));
+typedef int16_t int16x16_t __attribute__((vector_size (32)));
+typedef int32_t int32x8_t __attribute__((vector_size (32)));
+typedef int64_t int64x4_t __attribute__((vector_size (32)));
+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
+
+void float16_callee (svfloat16_t);
+void float32_callee (svfloat32_t);
+void float64_callee (svfloat64_t);
+void int8_callee (svint8_t);
+void int16_callee (svint16_t);
+void int32_callee (svint32_t);
+void int64_callee (svint64_t);
+void uint8_callee (svuint8_t);
+void uint16_callee (svuint16_t);
+void uint32_callee (svuint32_t);
+void uint64_callee (svuint64_t);
+
+void
+float16_caller (float16x16_t arg)
+{
+  float16_callee (arg);
+}
+
+void
+float32_caller (float32x8_t arg)
+{
+  float32_callee (arg);
+}
+
+void
+float64_caller (float64x4_t arg)
+{
+  float64_callee (arg);
+}
+
+void
+int8_caller (int8x32_t arg)
+{
+  int8_callee (arg);
+}
+
+void
+int16_caller (int16x16_t arg)
+{
+  int16_callee (arg);
+}
+
+void
+int32_caller (int32x8_t arg)
+{
+  int32_callee (arg);
+}
+
+void
+int64_caller (int64x4_t arg)
+{
+  int64_callee (arg);
+}
+
+void
+uint8_caller (uint8x32_t arg)
+{
+  uint8_callee (arg);
+}
+
+void
+uint16_caller (uint16x16_t arg)
+{
+  uint16_callee (arg);
+}
+
+void
+uint32_caller (uint32x8_t arg)
+{
+  uint32_callee (arg);
+}
+
+void
+uint64_caller (uint64x4_t arg)
+{
+  uint64_callee (arg);
+}
+
+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */
