On Thu, Dec 19, 2024 at 12:01 AM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> In a later patch, I need to add "@" to a pattern that uses subst
> iterators.  This combination is problematic for two reasons:
>
> (1) define_substs are applied and filtered at a later stage than the
>     handling of "@" patterns, so that the handling of "@" patterns
>     doesn't know which subst variants are valid and which will later be
>     dropped.  Just adding a "@" therefore triggers a build error due to
>     references to non-existent patterns.
>
> (2) Currently, the code will treat a single "@" pattern as contributing
>     to a single set of overloaded functions.  These overloaded functions
>     will have an integer argument for every subst iterator.  For example,
>     the vczle and vczbe in:
>
>       "@aarch64_rev<REVERSE:rev_op><mode><vczle><vczbe>"
>
>     are subst iterators, and so currently we'd try to generate a
>     single set of overloads that take four arguments: one for rev_op,
>     one for the mode, one for vczle, and one for vczbe.  The gen_*
>     and maybe_gen_* functions will also have one rtx argument for
>     each operand in the original pattern.
>
>     This model doesn't really make sense for define_substs, since
>     define_substs are allowed to add extra operands to an instruction.
>     The number of rtx operands to the generators would then be
>     incorrect.
>
> I think a more sensible way of handling define_substs would be to
> apply them first (and thus expand things like <vczle> and <vczbe>
> above) and then apply "@".  However, that's a relatively invasive
> change and not suitable for stage 3.
>
> This patch instead skips over subst iterators and restricts "@"
> overload handling to the cases where no define_subst is applied.
> I looked through all uses of "@" names in target code and there
> seemed to be only one current use of "@" with define_substs,
> in x86 vector code.  The current behaviour seemed to be unwanted there,
> and the x86 code was having to work around it.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  I think the x86
> changes are bordering on obvious, but just in case: ok for those?
LGTM for x86 part.
>
> Thanks,
> Richard
>
>
> gcc/
>         * read-rtl.cc (md_reader::handle_overloaded_name): Don't add
>         arguments for uses of subst attributes.
>         (apply_iterators): Only add instructions to an overloaded helper
>         if they use the default subst iterator values.
>         * doc/md.texi: Update documentation accordingly.
>         * config/i386/i386-expand.cc (expand_vec_perm_broadcast_1): Update
>         accordingly.
> ---
>  gcc/config/i386/i386-expand.cc |  4 ++--
>  gcc/doc/md.texi                |  5 +++++
>  gcc/read-rtl.cc                | 18 ++++++++++++++----
>  3 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index 6d25477841a..7f1dcd0937b 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -23884,7 +23884,7 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d 
> *d)
>        if (d->testing_p)
>         return true;
>
> -      rtx (*gen_interleave) (machine_mode, int, rtx, rtx, rtx);
> +      rtx (*gen_interleave) (machine_mode, rtx, rtx, rtx);
>        if (elt >= nelt2)
>         {
>           gen_interleave = gen_vec_interleave_high;
> @@ -23895,7 +23895,7 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d 
> *d)
>        nelt2 /= 2;
>
>        dest = gen_reg_rtx (vmode);
> -      emit_insn (gen_interleave (vmode, 1, dest, op0, op0));
> +      emit_insn (gen_interleave (vmode, dest, op0, op0));
>
>        vmode = V4SImode;
>        op0 = gen_lowpart (vmode, dest);
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index f0b63a144ad..a997ab07f21 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -12359,4 +12359,9 @@ output and three inputs).  This combination would 
> produce separate
>  each operand count, but it would still produce a single
>  @samp{maybe_code_for_@var{name}} and a single @samp{code_for_@var{name}}.
>
> +Currently, these @code{@@} patterns only take into account patterns for
> +which no @code{define_subst} has been applied (@pxref{Define Subst}).
> +Any @samp{<@dots{}>} placeholders that refer to subst attributes
> +(@pxref{Subst Iterators}) are ignored.
> +
>  @end ifset
> diff --git a/gcc/read-rtl.cc b/gcc/read-rtl.cc
> index 195f78bd5e1..49a5306254b 100644
> --- a/gcc/read-rtl.cc
> +++ b/gcc/read-rtl.cc
> @@ -814,9 +814,14 @@ md_reader::handle_overloaded_name (rtx original, 
> vec<mapping *> *iterators)
>           pending_underscore_p = false;
>         }
>
> -      /* Record an argument for ITERATOR.  */
> -      iterators->safe_push (iterator);
> -      tmp_oname.arg_types.safe_push (iterator->group->type);
> +      /* Skip define_subst iterators, since define_substs are allowed to
> +        add new match_operands in their output templates.  */
> +      if (iterator->group != &substs)
> +       {
> +         /* Record an argument for ITERATOR.  */
> +         iterators->safe_push (iterator);
> +         tmp_oname.arg_types.safe_push (iterator->group->type);
> +       }
>      }
>    if (base == copy)
>      fatal_with_file_and_line ("no iterator attributes in name `%s'", name);
> @@ -951,6 +956,7 @@ apply_iterators (rtx original, vec<rtx> *queue)
>        /* We apply subst iterator after RTL-template is copied, as during
>          subst-iterator processing, we could add an attribute to the
>          RTL-template, and we don't want to do it in the original one.  */
> +      bool add_oname = true;
>        FOR_EACH_VEC_ELT (iterator_uses, i, iuse)
>         {
>           v = iuse->iterator->current_value;
> @@ -961,10 +967,14 @@ apply_iterators (rtx original, vec<rtx> *queue)
>               current_iterator_name = iuse->iterator->name;
>               iuse->iterator->group->apply_iterator (iuse->x, iuse->index,
>                                                      v->number);
> +             /* Only handle '@' overloading for the default value.
> +                See handle_overloaded_name for details.  */
> +             if (v != iuse->iterator->values)
> +               add_oname = false;
>             }
>         }
>
> -      if (oname)
> +      if (oname && add_oname)
>         add_overload_instance (oname, iterators, x);
>
>        /* Add the new rtx to the end of the queue.  */
> --
> 2.25.1
>


-- 
BR,
Hongtao

Reply via email to