On Thu, 16 Apr 2026, Uros Bizjak wrote:

> Hello!
> 
> After pass_reorder_blocks, some propagation opportunities remain
> for late_combine.  Looking at gcc.target/i386/pr90178.c, we get a
> trivial sequence of:
> 
> gcc -O2 -mavx -mvzeroupper -m32:
> 
> .L5:
>     xorl    %ecx, %ecx
>     ...
>     movl    %ecx, %eax
>     ret
> 
> Putting another instance of pass_late_combine after
> pass_reorder_blocks improves the assembly in a non-trivial way:
> 
> @@ -28,10 +28,8 @@
>      cmpl    %edx, %ebx
>      je    .L5
>  .L4:
> -    movl    %eax, %ecx
>      cmpl    %esi, (%eax)
>      jne    .L11
> -    movl    %ecx, %eax
>      popl    %ebx
>      .cfi_remember_state
>      .cfi_restore 3
> @@ -44,17 +42,16 @@
>      .p2align 3
>  .L5:
>      .cfi_restore_state
> -    xorl    %ecx, %ecx
> +    xorl    %eax, %eax
>      popl    %ebx
>      .cfi_restore 3
>      .cfi_def_cfa_offset 8
>      popl    %esi
>      .cfi_restore 6
>      .cfi_def_cfa_offset 4
> -    movl    %ecx, %eax
>      ret
>      .cfi_endproc
>  .LFE0:
>      .size    find_ptr, .-find_ptr
> 
> which suggests that adding another instance of the pass here is worthwhile.
> 
> A size comparison of a default x86_64 Linux build shows a noticeable
> code size improvement:
> 
> $ size vmlinux-old.o vmlinux-new.o
>     text    data    bss      dec     hex filename
> 29432351 4932443 754228 35119022 217dfae vmlinux-old.o
> 29415516 4932443 754228 35102187 2179deb vmlinux-new.o
> 
> which shows a code size reduction of 16835 bytes.
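
As a quick sanity check on the figure above, here is a minimal sketch that
recomputes the text delta from the quoted `size` output.  The variable names
are illustrative only and are not part of the patch; the relative saving is
derived here and was not stated in the original message.

```python
# Text-segment sizes copied from the `size` output quoted above.
text_old = 29432351  # vmlinux-old.o
text_new = 29415516  # vmlinux-new.o

# Absolute and relative reduction of the text segment.
saved = text_old - text_new
percent = 100.0 * saved / text_old

print(f"text shrinks by {saved} bytes ({percent:.3f}% of the old text size)")
```

The absolute delta matches the 16835 bytes quoted above; data and bss are
identical in both builds, so the whole change is in text.
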
> 
> Any thoughts?

Did you check other places to schedule the pass?  Did you try
moving the existing postreload late_combine later?

Richard.

> BR,
> Uros.
> 

-- 
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
