On Thu, 16 Apr 2026, Uros Bizjak wrote:

> Hello!
>
> After pass_reorder_blocks, there remain some propagating opportunities
> for late_combine. Looking at gcc.target/i386/pr90178.c, we get a
> trivial sequence of:
>
> gcc -O2 -mavx -mvzeroupper -m32:
>
> .L5:
>         xorl    %ecx, %ecx
>         ...
>         movl    %ecx, %eax
>         ret
>
> Putting another instance of pass_late_combine after
> pass_reorder_blocks improves the assembly in a non-trivial way:
>
> @@ -28,10 +28,8 @@
>         cmpl    %edx, %ebx
>         je      .L5
>  .L4:
> -       movl    %eax, %ecx
>         cmpl    %esi, (%eax)
>         jne     .L11
> -       movl    %ecx, %eax
>         popl    %ebx
>         .cfi_remember_state
>         .cfi_restore 3
> @@ -44,17 +42,16 @@
>         .p2align 3
>  .L5:
>         .cfi_restore_state
> -       xorl    %ecx, %ecx
> +       xorl    %eax, %eax
>         popl    %ebx
>         .cfi_restore 3
>         .cfi_def_cfa_offset 8
>         popl    %esi
>         .cfi_restore 6
>         .cfi_def_cfa_offset 4
> -       movl    %ecx, %eax
>         ret
>         .cfi_endproc
>  .LFE0:
>         .size   find_ptr, .-find_ptr
>
> which looks like it is worth putting a new pass here.
>
> A comparison of sizes of default x86_64 linux build shows noticeable
> code size improvement:
>
> $ size vmlinux-old.o vmlinux-new.o
>     text    data    bss      dec     hex filename
> 29432351 4932443 754228 35119022 217dfae vmlinux-old.o
> 29415516 4932443 754228 35102187 2179deb vmlinux-new.o
>
> which shows a code size reduction of 16835 bytes.
>
> Any thoughts?

Did you check other places to schedule the pass?  Did you try
moving the existing postreload late_combine later?

Richard.

> BR,
> Uros.
>

--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
