On Fri, Jul 11, 2025 at 4:23 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Fri, Jul 11, 2025 at 9:57 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Fri, Jul 11, 2025 at 6:05 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > > gcc/
> > >
> > > PR target/121015
> > > * config/i386/constraints.md (BX): New constraint.
> > > * config/i386/i386.cc (ix86_print_operand): Support CONSTM1_RTX.
> > > * config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Replace C with
> > > BX for memory and integer register destination.  Replace <v,C>
> > > with <v,BX>.
> > > Update 32-bit MMXMODE move splitter to also split all 1s vector
> > > source operand.
> > > * config/i386/predicates.md (vector_const0_or_m1_operand): New
> > > predicate.
> > > (nonimm_or_vector_const0_or_m1_operand): Likewise.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/121015
> > > * gcc.target/i386/pr106022-2.c: Adjusted.
> > > * gcc.target/i386/pr121015-1.c: New test.
> > > * gcc.target/i386/pr121015-2.c: Likewise.
> > > * gcc.target/i386/pr121015-3.c: Likewise.
> > > * gcc.target/i386/pr121015-4.c: Likewise.
> > > * gcc.target/i386/pr121015-5.c: Likewise.
> > > * gcc.target/i386/pr121015-6.c: Likewise.
> > >
> > > OK for master?
> >
> > Please try the attached patch that introduces "all ones" handling to MMX
> > moves.
>
> Bah, wrong version attached (missing 32bit modes in mmxconstm1) -
> please try this.
>
> Uros.

Here are the source and the two assembly outputs generated with
-O2 -march=x86-64-v3.  My patch generates:

        movq    $-1, %rax
        ...
        movq    %rax, 4(%rcx)
        ...
        movq    %rax, 4(%rcx)
        ...
        movq    %rax, 4(%rcx)

Yours generates:

        vpcmpeqd        %xmm0, %xmm0, %xmm0
        ...
        vmovlps %xmm0, 4(%rdx)
        ...
        vpcmpeqd        %xmm1, %xmm1, %xmm1
        ...
        vmovlps %xmm1, 4(%rdx)
        ...
        vpcmpeqd        %xmm2, %xmm2, %xmm2
        ...
        vmovlps %xmm2, 4(%rdx)

I prefer the assembly code generated by my patch.

-- 
H.J.
pr121015-1.s.hjl
pr121015-1.s.uros
/* { dg-do compile } */
/* { dg-options "-O2 -march=x86-64-v3" } */
/* { dg-final { scan-assembler-not "\tmovl\[\\t \]+\\\$-1, %" { target { ! ia32 } } } } */
/* { dg-final { scan-assembler "\tmovq\[\\t \]+\\\$-1, " { target { ! ia32 } } } } */

extern union {
  int i;
  float f;
} int_as_float_u;

extern int render_result_from_bake_w;
extern int render_result_from_bake_h_seed_pass;
extern float *render_result_from_bake_h_primitive;
extern float *render_result_from_bake_h_seed;

float
int_as_float(int i)
{
  int_as_float_u.i = i;
  return int_as_float_u.f;
}

void
render_result_from_bake_h(int tx)
{
  while (render_result_from_bake_w) {
    for (; tx < render_result_from_bake_w; tx++)
      render_result_from_bake_h_primitive[1] =
        render_result_from_bake_h_primitive[2] = int_as_float(-1);
    if (render_result_from_bake_h_seed_pass) {
      *render_result_from_bake_h_seed = 0;
    }
  }
}
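
For readers following the thread: the sketch below is not part of either patch. It only illustrates why a single 8-byte all-ones store at byte offset 4 (the "movq %rax, 4(%rcx)" above) can stand in for the two adjacent float stores to elements [1] and [2] in the test source. The helper names store_two_floats, store_one_quad and stores_match are hypothetical, and the code assumes a 4-byte float, as on x86-64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 4 bytes, so p[1] and p[2] occupy bytes 4..11.  */
static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");

/* Hypothetical helper: store the bit pattern of (int)-1 into p[1] and
   p[2] separately, mirroring the two assignments in the test source.  */
static void store_two_floats(float *p)
{
  int32_t m1 = -1;
  memcpy(&p[1], &m1, sizeof m1);
  memcpy(&p[2], &m1, sizeof m1);
}

/* Hypothetical helper: one 64-bit all-ones store at byte offset 4.  */
static void store_one_quad(float *p)
{
  int64_t m1 = -1;
  memcpy((unsigned char *)p + 4, &m1, sizeof m1);
}

/* Returns 1 when both strategies leave identical bytes in memory.  */
int stores_match(void)
{
  float a[4] = {0}, b[4] = {0};
  store_two_floats(a);
  store_one_quad(b);
  return memcmp(a, b, sizeof a) == 0;
}
```

Calling stores_match() from a small driver compares the two buffers byte for byte. It says nothing about which instruction sequence is faster, only that the stores write the same data.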