<alfie.richa...@arm.com> writes:
> From: Alfie Richards <alfie.richa...@arm.com>
>
> Adds an optimisation in FMV to redirect to a specific target if possible.
>
> A call is redirected to a specific target if both:
> - the caller can always call the callee version
> - and, it is possible to rule out all higher priority versions of the callee
>   fmv set. That is estabilished either by the callee being the highest 
> priority
>   version, or each higher priority version of the callee implying that, were 
> it
>   resolved, a higher priority version of the caller would have been selected.
>
> For this logic, introduces the new TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A
> hook. Adds a full implementation for Aarch64, and a weaker default version
> for other targets.
>
> This allows the target to replace the previous optimisation as the new one is
> able to cover the same case where two function sets implement the same 
> versions.

Thanks for the update.  The new semantics look good to me.

>
> gcc/ChangeLog:
>
>       * config/aarch64/aarch64.cc (aarch64_functions_b_resolvable_from_a): New
>       function.
>       (TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A): New define.
>       * doc/tm.texi: Regenerate.
>       * doc/tm.texi.in: Add documentation for version_a_implies_version_b.
>       * multiple_target.cc (redirect_to_specific_clone): Add new optimisation
>       logic.
>       (ipa_target_clone): Remove check for TARGET_HAS_FMV_TARGET_ATTRIBUTE.
>       * target.def: Document new hook..
>       * attribs.cc: (functions_b_resolvable_from_a) New function.
>       * attribs.h: (functions_b_resolvable_from_a) New function.
>
> gcc/testsuite/ChangeLog:
>
>       * g++.target/aarch64/fmv-selection1.C: New test.
>       * g++.target/aarch64/fmv-selection2.C: New test.
>       * g++.target/aarch64/fmv-selection3.C: New test.
>       * g++.target/aarch64/fmv-selection4.C: New test.
>       * g++.target/aarch64/fmv-selection5.C: New test.
>       * g++.target/aarch64/fmv-selection6.C: New test.
>       * g++.target/aarch64/fmv-selection7.C: New test.
> ---
>  gcc/attribs.cc                                |  20 ++++
>  gcc/attribs.h                                 |   1 +
>  gcc/config/aarch64/aarch64.cc                 |  28 +++++
>  gcc/doc/tm.texi                               |   8 ++
>  gcc/doc/tm.texi.in                            |   2 +
>  gcc/multiple_target.cc                        | 110 +++++++++++-------
>  gcc/target.def                                |  11 ++
>  .../g++.target/aarch64/fmv-selection1.C       |  40 +++++++
>  .../g++.target/aarch64/fmv-selection2.C       |  40 +++++++
>  .../g++.target/aarch64/fmv-selection3.C       |  25 ++++
>  .../g++.target/aarch64/fmv-selection4.C       |  30 +++++
>  .../g++.target/aarch64/fmv-selection5.C       |  28 +++++
>  .../g++.target/aarch64/fmv-selection6.C       |  27 +++++
>  .../g++.target/aarch64/fmv-selection7.C       |  65 +++++++++++
>  14 files changed, 392 insertions(+), 43 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection1.C
>  create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection2.C
>  create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection3.C
>  create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection4.C
>  create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection5.C
>  create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection6.C
>  create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection7.C
>
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index 2ca82674f7c..fc1751a5f2e 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -1095,6 +1095,26 @@ common_function_versions (string_slice fn1 
> ATTRIBUTE_UNUSED,
>    gcc_unreachable ();
>  }
>  
> +/* Default implementation of 
> TARGET_OPTION_FUNCTION_B_RESOLVBLE_FROM_FUNCTION_A.

TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A

> +   Used to check very basically if FN2 is callable from FN1.
> +   For now this checks if the version strings are the same.  */
> +
> +bool
> +functions_b_resolvable_from_a (tree fn1, tree fn2, tree base 
> ATTRIBUTE_UNUSED)

Calling these decl_a and decl_b might be better, as for the aarch64
version.  I realise it doesn't matter here, since the function is
symmetric, but it would make it clearer what "a" and "b" refer to.

> +{
> +  const char *attr_name = TARGET_HAS_FMV_TARGET_ATTRIBUTE
> +                       ? "target"
> +                       : "target_version";
> +
> +  tree attr1 = lookup_attribute (attr_name, DECL_ATTRIBUTES (fn1));
> +  tree attr2 = lookup_attribute (attr_name, DECL_ATTRIBUTES (fn2));

Motivated by the documentation suggestion below, I think we should
assert attr2 (attr_b).  Same for the aarch64 version.

> +
> +  if (!attr1 || !attr2)
> +    return false;
> +
> +  return attribute_value_equal (attr1, attr2);
> +}
> +
>  /* Comparator function to be used in qsort routine to sort attribute
>     specification strings to "target".  */
>  
> [...]
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 47755ad32eb..fddca8c83e0 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -10972,6 +10972,14 @@ This target hook returns @code{true} if the 
> target/target-version strings
>  @var{fn1} and @var{fn2} imply the same function version.
>  @end deftypefn
>  
> +@deftypefn {Target Hook} bool TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A 
> (tree @var{v_a}, tree @var{v_b}, tree @var{base})
> +Determines if a function with the arch implied by @var{v_b}'s

Here too, I think decl_a and decl_b would be easier to follow.

I think it would be worth expanding "arch" to "architecture".

> +FMV version attributes (either @code{target} or @code{target_version} for

function multi-versioning (FMV)

> +target FMV or target_version FMV semantics respectively) would always be
> +resolvable from a function with @var{v_a}'s FMV version attributes.
> +Uses a function decl @var{base} to establish the baseline architecture.

It would be good to expand on what the default is.

How about:

------------------------------------------------------------------------------
@var{decl_b} is a function declaration with a function multi-versioning (FMV)
attribute; this attribute is either @code{target} or @code{target_version},
depending on @code{TARGET_HAS_FMV_TARGET_ATTRIBUTE}.  @var{decl_a} is
a function declaration that may or may not have an FMV attribute.

Return true if we have enough information to determine that the
requirements of @var{decl_b}'s FMV attribute are met whenever @var{decl_a}
is executed, given that the target supports all features required by
function declaration @var{base}.

The default implementation just checks whether @var{decl_a} has the same
FMV attribute as @var{decl_b}.  This is conservatively correct,
but ports can do better by taking the relationships between architecture
features into account.  For example, on AArch64, @code{sve} is present
whenever @code{sve2} is present.
------------------------------------------------------------------------------

Feel free to reword.  It was just a way of listing the information
that seemed useful to include.

(It looks like we're missing documentation for
TARGET_HAS_FMV_TARGET_ATTRIBUTE though.)

> +@end deftypefn
> +
>  @deftypefn {Target Hook} bool TARGET_CAN_INLINE_P (tree @var{caller}, tree 
> @var{callee})
>  This target hook returns @code{false} if the @var{caller} function
>  cannot inline @var{callee}, based on target specific information.  By
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index eaab62652a7..99f47cb9d78 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -7132,6 +7132,8 @@ with the target specific attributes.  The default value 
> is @code{','}.
>  
>  @hook TARGET_OPTION_FUNCTION_VERSIONS
>  
> +@hook TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A
> +
>  @hook TARGET_CAN_INLINE_P
>  
>  @hook TARGET_UPDATE_IPA_FN_TARGET_INFO
> diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
> index 7ab025d7d46..681b586e61c 100644
> --- a/gcc/multiple_target.cc
> +++ b/gcc/multiple_target.cc
> @@ -423,61 +423,86 @@ expand_target_clones (struct cgraph_node *node, bool 
> definition)
>    return true;
>  }
>  
> -/* When NODE is a target clone, consider all callees and redirect
> -   to a clone with equal target attributes.  That prevents multiple
> -   multi-versioning dispatches and a call-chain can be optimized.
> -
> -   This optimisation might pick the wrong version in some cases, since 
> knowing
> -   that we meet the target requirements for a matching callee version does 
> not
> -   tell us that we won't also meet the target requirements for a higher
> -   priority callee version at runtime.  Since this is longstanding behaviour
> -   for x86 and powerpc, we preserve it for those targets, but skip the 
> optimisation
> -   for targets that use the "target_version" attribute for multi-versioning. 
>  */
> +/* When NODE is part of an FMV function set, consider all callees and check 
> if
> +   any can provably always resolve a certain version and then call that 
> version
> +   directly.  */
>  
>  static void
>  redirect_to_specific_clone (cgraph_node *node)
>  {
> -  cgraph_function_version_info *fv = node->function_version ();
> -  if (fv == NULL)
> +  if (!targetm.compare_version_priority
> +      || !targetm.target_option.functions_b_resolvable_from_a

The condition on the last line should always be true.

> +      || !optimize)
>      return;
>  
> -  gcc_assert (TARGET_HAS_FMV_TARGET_ATTRIBUTE);
> -  tree attr_target = lookup_attribute ("target", DECL_ATTRIBUTES 
> (node->decl));
> -  if (attr_target == NULL_TREE)
> -    return;
> -
> -  /* We need to remember NEXT_CALLER as it could be modified in the loop.  */
> +  /* We need to remember NEXT_CALLER as it could be modified in the
> +     loop.  */

There's no need to reflow this comment.

>    for (cgraph_edge *e = node->callees; e ; e = e->next_callee)
>      {
> -      cgraph_function_version_info *fv2 = e->callee->function_version ();
> -      if (!fv2)
> +      /* Only if this is a call to a dispatched symbol.  */
> +      if (!e->callee->dispatcher_function)
>       continue;
>  
> -      tree attr_target2 = lookup_attribute ("target",
> -                                         DECL_ATTRIBUTES (e->callee->decl));
> +      cgraph_function_version_info *callee_v
> +     = e->callee->function_version ();
> +      cgraph_function_version_info *caller_v
> +     = e->caller->function_version ();
>  
> -      /* Function is not calling proper target clone.  */
> -      if (attr_target2 == NULL_TREE
> -       || !attribute_value_equal (attr_target, attr_target2))
> -     {
> -       while (fv2->prev != NULL)
> -         fv2 = fv2->prev;
> +      /* If this is not the TU that contains the definition of the default
> +      version we are not guaranteed to have visibility of all versions
> +      so cannot reason about them.  */
> +      if (!callee_v->next->this_node->binds_to_current_def_p ())
> +     continue;
>  
> -       /* Try to find a clone with equal target attribute.  */
> -       for (; fv2 != NULL; fv2 = fv2->next)
> -         {
> -           cgraph_node *callee = fv2->this_node;
> -           attr_target2 = lookup_attribute ("target",
> -                                            DECL_ATTRIBUTES (callee->decl));
> -           if (attr_target2 != NULL_TREE
> -               && attribute_value_equal (attr_target, attr_target2))
> +      cgraph_function_version_info *highest_callable_fn = NULL;
> +      for (cgraph_function_version_info *ver = callee_v->next;
> +        ver;
> +        ver = ver->next)
> +     if (targetm.target_option.functions_b_resolvable_from_a
> +           (node->decl, ver->this_node->decl, node->decl))
> +       highest_callable_fn = ver;
> +
> +      if (!highest_callable_fn)
> +     continue;
> +
> +      /* If there are higher priority versions of callee and caller has no
> +      more version information, then not callable.  */
> +      if (!caller_v && highest_callable_fn->next)
> +     continue;

I think this should be:

    if (highest_callable_fn->next
        && (!caller_v
            || !caller_v->next->this_node->binds_to_current_def_p ()))
      continue;

We don't need to check whether node->binds_to_current_def_p (),
because if node is preempted/overridden, then it doesn't matter
what node does.  But for highest_callable_fn->next we're assuming
that all of caller's versions are visible in the current TU.

> +
> +      bool inlinable = true;
> +      /* If every higher priority version would imply a higher priority
> +      version of caller would have been selected, then this is
> +      callable.  */
> +      if (caller_v)
> +     for (cgraph_function_version_info *callee_ver
> +            = highest_callable_fn->next;
> +          callee_ver;
> +          callee_ver = callee_ver->next)
> +       {
> +         bool is_possible = true;
> +         for (cgraph_function_version_info *caller_ver = caller_v->next;
> +              caller_ver;
> +              caller_ver = caller_ver->next)
> +           if (targetm.target_option.functions_b_resolvable_from_a
> +                (callee_ver->this_node->decl,
> +                 caller_ver->this_node->decl,
> +                 node->decl))
>               {
> -               e->redirect_callee (callee);
> -               cgraph_edge::redirect_call_stmt_to_callee (e);
> +               is_possible = false;
>                 break;
>               }
> -         }
> -     }
> +         if (is_possible)
> +           {
> +             inlinable = false;
> +             break;
> +           }
> +       }
> +      if (!inlinable)
> +     continue;

I suppose this is personal preference, but I think the flow would
be easier to follow if the loop was conditional on highest_callable_fn->next.
Putting that together:

    if (highest_callable_fn->next)
      {
        /* If there are higher priority versions of callee and caller has no
           more version information, then not callable.  */
        if (!caller_v
            || !caller_v->next->this_node->binds_to_current_def_p ())
          continue;

        /* If every higher priority version would imply a higher priority
           version of caller would have been selected, then this is
           callable.  */
        for (cgraph_function_version_info *callee_ver
               = highest_callable_fn->next;
             callee_ver;
             callee_ver = callee_ver->next)

LGTM with those changes.

Thanks,
Richard

> +
> +      e->redirect_callee (highest_callable_fn->this_node);
> +      cgraph_edge::redirect_call_stmt_to_callee (e);
>      }
>  }
>  

Reply via email to