On Mon, Sep 15, 2025 at 7:57 AM Uros Bizjak <[email protected]> wrote:
>
> On Sun, Sep 14, 2025 at 9:14 PM H.J. Lu <[email protected]> wrote:
> >
> > If a single instruction can store or move the whole block of memory, use
> > vector instruction and don't align destination.
> >
> > gcc/
> >
> >         PR target/121934
> >         * config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): If a
> >         single instruction can store or move the whole block of memory,
> >         use vector instruction and don't align destination.
> >
> > gcc/testsuite/
> >
> >         PR target/121934
> >         * gcc.target/i386/pr121934-1a.c: New test.
> >         * gcc.target/i386/pr121934-1b.c: Likewise.
> >         * gcc.target/i386/pr121934-2a.c: Likewise.
> >         * gcc.target/i386/pr121934-2b.c: Likewise.
> >         * gcc.target/i386/pr121934-3a.c: Likewise.
> >         * gcc.target/i386/pr121934-3b.c: Likewise.
> >         * gcc.target/i386/pr121934-4a.c: Likewise.
> >         * gcc.target/i386/pr121934-4b.c: Likewise.
> >         * gcc.target/i386/pr121934-5a.c: Likewise.
> >         * gcc.target/i386/pr121934-5b.c: Likewise.
>
> OK.
>
> Thanks,
> Uros.
>
> >
> > Signed-off-by: H.J. Lu <[email protected]>
> > ---
> >  gcc/config/i386/i386-expand.cc              | 62 +++++++++++++--------
> >  gcc/testsuite/gcc.target/i386/pr121934-1a.c | 22 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-1b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-2a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-2b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-3a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-3b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-4a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-4b.c |  7 +++
> >  gcc/testsuite/gcc.target/i386/pr121934-5a.c | 23 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr121934-5b.c |  7 +++
> >  11 files changed, 187 insertions(+), 24 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-1a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-1b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-2a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-2b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-3a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-3b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-4a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-4b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-5a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr121934-5b.c
> >
> > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> > index dc26b3452cb..b0b9e6da946 100644
> > --- a/gcc/config/i386/i386-expand.cc
> > +++ b/gcc/config/i386/i386-expand.cc
> > @@ -9552,9 +9552,20 @@ ix86_expand_set_or_cpymem (rtx dst, rtx src, rtx
> > count_exp, rtx val_exp,
> >    if (!issetmem)
> >      srcreg = ix86_copy_addr_to_reg (XEXP (src, 0));
> >
> > +  bool aligned_dstmem = false;
> > +  unsigned int nunits = issetmem ? STORE_MAX_PIECES : MOVE_MAX;
> > +  bool single_insn_p = count && count <= nunits;

Should the above also consider X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL
and/or X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL?

Uros.
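
P.S. To make the question concrete, here is a rough standalone sketch of one
way the check could take the tuning flags into account. It is not GCC code:
tune_flags, its two fields and the free function single_insn_p are
hypothetical stand-ins for the X86_TUNE_* flags and for the local variable in
the patch, and whether the flags should gate this path at all is exactly the
open question.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two x86 tuning flags named above.
// In GCC these are per-CPU tuning bits, not a struct like this.
struct tune_flags
{
  bool unaligned_load_optimal;   // models X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL
  bool unaligned_store_optimal;  // models X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL
};

// Return true if a block of COUNT bytes could be handled by one vector
// instruction without aligning the destination.  NUNITS plays the role of
// "issetmem ? STORE_MAX_PIECES : MOVE_MAX" from the patch.
// Assumption (not in the patch): the tuning flags gate the result.
bool
single_insn_p (std::uint64_t count, unsigned int nunits, bool issetmem,
               const tune_flags &tune)
{
  // Same size test as in the patch: known, nonzero, fits in one insn.
  if (count == 0 || count > nunits)
    return false;

  // Both setmem and cpymem store to a possibly unaligned destination.
  if (!tune.unaligned_store_optimal)
    return false;

  // Only cpymem also loads from a possibly unaligned source.
  if (!issetmem && !tune.unaligned_load_optimal)
    return false;

  return true;
}
```

With this reading, a setmem of exactly nunits bytes stays on the
single-instruction path whenever unaligned stores are cheap, while a cpymem
additionally needs cheap unaligned loads.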
