sihuan wrote:

@joshua-arch1 
I tested this patch with LTO, and the results are consistent with what you describe. 
With the `-flto` option enabled, `digits_2` has two specializations 
whether the patch is applied or not.
However, with LTO enabled, LLVM performance is comparable to GCC (in terms of 
number of instructions). Here are some figures:
> on x86_64 ArchLinux
> via `perf stat ./exchange2_r 0`

| Compiler | Instructions |
|--------|--------|
| gfortran O3 (archlinux gcc-fortran 14.2.1+r134+gab884fffe3fc-2) | 54,865,827,741 |
| gfortran O3 lto (archlinux gcc-fortran 14.2.1+r134+gab884fffe3fc-2) | 54,186,161,980 |
| flang O3 (#62d44fbd) | 107,953,750,439 |
| flang O3 lto (#62d44fbd) | 53,391,922,257 |
| flang O3 (patched #62d44fbd) | 53,146,581,727 |
| flang O3 lto (patched #62d44fbd) | 53,391,760,016 |
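
For easier comparison, here is a small sketch that normalizes the counts above against the non-LTO gfortran run. The numbers are copied from the table; the `counts` dict and the `relative_to` helper are names made up for illustration only and are not part of the patch or the benchmark.

```python
# Instruction counts copied from the table above (perf stat ./exchange2_r 0).
counts = {
    "gfortran O3": 54_865_827_741,
    "gfortran O3 lto": 54_186_161_980,
    "flang O3": 107_953_750_439,
    "flang O3 lto": 53_391_922_257,
    "flang O3 (patched)": 53_146_581_727,
    "flang O3 lto (patched)": 53_391_760_016,
}


def relative_to(baseline: str) -> dict[str, float]:
    """Return each count divided by the baseline configuration's count."""
    base = counts[baseline]
    return {name: n / base for name, n in counts.items()}


# Hypothetical choice of baseline: the non-LTO gfortran build.
ratios = relative_to("gfortran O3")
for name, ratio in ratios.items():
    print(f"{name:<24} {ratio:.3f}x")
```

Only the unpatched non-LTO flang build stands out, at roughly 1.97x the gfortran O3 count; the other configurations land within a few percent of each other.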

With LTO, I don't think performance is worse than GCC's, so this patch is not 
specific to LTO; its benefit shows up in the non-LTO build.


https://github.com/llvm/llvm-project/pull/96620
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
