On Fri, Jul 11, 2025 at 4:23 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 9:57 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > > gcc/
> > >
> > > PR target/121015
> > > * config/i386/constraints.md (BX): New constraint.
> > > * config/i386/i386.cc (ix86_print_operand): Support CONSTM1_RTX.
> > > * config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Replace C with
> > > BX for memory and integer register destination.  Replace <v,C>
> > > with <v,BX>.
> > > Update 32-bit MMXMODE move splitter to also split all 1s vector
> > > source operand.
> > > * config/i386/predicates.md (vector_const0_or_m1_operand): New
> > > predicate.
> > > (nonimm_or_vector_const0_or_m1_operand): Likewise.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/121015
> > > * gcc.target/i386/pr106022-2.c: Adjusted.
> > > * gcc.target/i386/pr121015-1.c: New test.
> > > * gcc.target/i386/pr121015-2.c: Likewise.
> > > * gcc.target/i386/pr121015-3.c: Likewise.
> > > * gcc.target/i386/pr121015-4.c: Likewise.
> > > * gcc.target/i386/pr121015-5.c: Likewise.
> > > * gcc.target/i386/pr121015-6.c: Likewise.
> > >
> > > OK for master?
> >
> > Please try the attached patch that introduces "all ones" handling to MMX 
> > moves.
>
> Bah, wrong version attached (missing 32bit modes in mmxconstm1) -
> please try this.
>
> Uros.

Attached are the test source and the two assembly outputs, generated with -O2 -march=x86-64-v3.
My patch generates:

movq $-1, %rax
...
movq %rax, 4(%rcx)
...
movq %rax, 4(%rcx)
...
movq %rax, 4(%rcx)

Yours generates:

vpcmpeqd %xmm0, %xmm0, %xmm0
...
vmovlps %xmm0, 4(%rdx)
...
vpcmpeqd %xmm1, %xmm1, %xmm1
...
vmovlps %xmm1, 4(%rdx)
...
vpcmpeqd %xmm2, %xmm2, %xmm2
...
vmovlps %xmm2, 4(%rdx)

I prefer the assembly generated by my patch.

-- 
H.J.

Attachment: pr121015-1.s.hjl
Description: Binary data

Attachment: pr121015-1.s.uros
Description: Binary data

/* { dg-do compile } */
/* { dg-options "-O2 -march=x86-64-v3" } */
/* { dg-final { scan-assembler-not "\tmovl\[\\t \]+\\\$-1, %" { target { ! ia32 } } } } */
/* { dg-final { scan-assembler "\tmovq\[\\t \]+\\\$-1, " { target { ! ia32 } } } } */

extern union {
  int i;
  float f;
} int_as_float_u;

extern int render_result_from_bake_w;
extern int render_result_from_bake_h_seed_pass;
extern float *render_result_from_bake_h_primitive;
extern float *render_result_from_bake_h_seed;

float
int_as_float(int i)
{
  int_as_float_u.i = i;
  return int_as_float_u.f;
}

void
render_result_from_bake_h(int tx)
{
  while (render_result_from_bake_w) {
    for (; tx < render_result_from_bake_w; tx++)
      render_result_from_bake_h_primitive[1] =
          render_result_from_bake_h_primitive[2] = int_as_float(-1);
    if (render_result_from_bake_h_seed_pass) {
      *render_result_from_bake_h_seed = 0;
    }
  }
}
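
The inner loop in the test above writes the bit pattern of -1 to elements 1 and 2 of a float array, which is why both assembly listings use a single 8-byte store at offset 4. The following is a minimal sketch of that equivalence, not part of either patch. The helper names store_pair and pair_bits are hypothetical, and the code assumes a 32-bit float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the test's two stores: write the bit
   pattern of (int) -1 into p[1] and p[2], one element at a time.  */
static void
store_pair (float *p)
{
  int32_t m1 = -1;
  memcpy (&p[1], &m1, sizeof m1);
  memcpy (&p[2], &m1, sizeof m1);
}

/* Hypothetical helper: return the 8 bytes that start at byte offset 4
   of the array, i.e. the span covered by a 64-bit store to 4(%reg).  */
static uint64_t
pair_bits (const float *p)
{
  uint64_t bits;
  assert (sizeof (float) == 4);  /* Assumption: 32-bit float.  */
  memcpy (&bits, &p[1], sizeof bits);
  return bits;
}
```

Calling store_pair on a zeroed four-element array and then pair_bits on it should yield an all-ones 64-bit value, the same immediate that appears in "movq $-1, %rax", while elements 0 and 3 stay untouched.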
