https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111332
--- Comment #9 from d_vampile <d_vampile at 163 dot com> ---
(In reply to Andrew Pinski from comment #8)
> (In reply to d_vampile from comment #7)
> > In terms of runtime, this code is the best.
>
> Depends on the core ....
> What does -mtune=native provide for the core which you are running on?
> Also what core are you testing with?

I also tried GCC 11 and GCC 12 with the same compilation options, but they do not even emit the 'vextracti128' instruction, so the program runs longer and performs worse.
The generated assembly does not change when using -mtune=native, and the test results were still worse than with GCC 7.

CPU info: Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz
-mtune=generic