On Mon, Sep 15, 2025 at 7:57 AM Uros Bizjak <[email protected]> wrote:
>
> On Sun, Sep 14, 2025 at 9:14 PM H.J. Lu <[email protected]> wrote:
> >
> > If a single instruction can store or move the whole block of memory, use
> > vector instruction and don't align destination.
> >
> > gcc/
> >
> >         PR target/121934
> >         * config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): If a
> >         single instruction can store or move the whole block of memory,
> >         use vector instruction and don't align destination.
> >
> > gcc/testsuite/
> >
> >         PR target/121934
> >         * gcc.target/i386/pr121934-1a.c: New test.
> >         * gcc.target/i386/pr121934-1b.c: Likewise.
> >         * gcc.target/i386/pr121934-2a.c: Likewise.
> >         * gcc.target/i386/pr121934-2b.c: Likewise.
> >         * gcc.target/i386/pr121934-3a.c: Likewise.
> >         * gcc.target/i386/pr121934-3b.c: Likewise.
> >         * gcc.target/i386/pr121934-4a.c: Likewise.
> >         * gcc.target/i386/pr121934-4b.c: Likewise.
> >         * gcc.target/i386/pr121934-5a.c: Likewise.
> >         * gcc.target/i386/pr121934-5b.c: Likewise.
>
> OK.
>
> Thanks,
> Uros.
>
> >
> > Signed-off-by: H.J. Lu <[email protected]>
> > ---
> >  gcc/config/i386/i386-expand.cc              | 62 +++++++++++++--------
> >  gcc/testsuite/gcc.target/i386/pr121934-1a.c | 22 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-1b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-2a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-2b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-3a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-3b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-4a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-4b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-5a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-5b.c |  7 +++
> >  11 files changed, 187 insertions(+), 24 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-1a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-1b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-2a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-2b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-3a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-3b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-4a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-4b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-5a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-5b.c
> >
> > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> > index dc26b3452cb..b0b9e6da946 100644
> > --- a/gcc/config/i386/i386-expand.cc
> > +++ b/gcc/config/i386/i386-expand.cc
> > @@ -9552,9 +9552,20 @@ ix86_expand_set_or_cpymem (rtx dst, rtx src, rtx count_exp, rtx val_exp,
> >    if (!issetmem)
> >      srcreg = ix86_copy_addr_to_reg (XEXP (src, 0));
> >
> > +  bool aligned_dstmem = false;
> > +  unsigned int nunits = issetmem ? STORE_MAX_PIECES : MOVE_MAX;
> > +  bool single_insn_p = count && count <= nunits;

Should the above also consider X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL
and/or X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL?
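
To make the question concrete, here is a minimal standalone sketch of the predicate, not the code in i386-expand.cc. The names tune_model, single_insn_p and single_insn_tuned_p are hypothetical, the two bool fields merely stand in for the tuning flags named above, and nunits is passed as a parameter rather than taken from STORE_MAX_PIECES or MOVE_MAX. The second function shows one possible way the tuning flags could gate the fast path, assuming a setmem only stores while a cpymem both loads and stores.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL and
   X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL; not GCC's own data structures.  */
struct tune_model
{
  bool unaligned_load_optimal;
  bool unaligned_store_optimal;
};

/* Mirrors the quoted hunk: COUNT is the constant byte count, or 0 when
   it is not known at compile time.  NUNITS is the widest single piece
   (STORE_MAX_PIECES for setmem, MOVE_MAX for cpymem in the patch).  */
static bool
single_insn_p (unsigned long count, unsigned int nunits)
{
  return count != 0 && count <= nunits;
}

/* Assumed stricter variant: additionally require that unaligned vector
   accesses are cheap.  A setmem only stores; a cpymem loads and stores.  */
static bool
single_insn_tuned_p (unsigned long count, unsigned int nunits,
		     bool issetmem, const struct tune_model *tune)
{
  if (!single_insn_p (count, nunits))
    return false;
  if (!tune->unaligned_store_optimal)
    return false;
  return issetmem || tune->unaligned_load_optimal;
}
```

With such a check, a target where unaligned stores are slow would keep the existing destination-alignment path even for blocks that fit in one vector register.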

Uros.
