[Bug target/120428] [14/15/16 regression] Suboptimal autovec involving blocked permutation and std::copy

2025-05-27 Thread shawn at shawnxu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120428 --- Comment #12 from Shawn Xu --- Bisecting with -mprefer-vector-width=256 leads to PR112824, which appears to change the move_max option. Compiling with -mmove-max=256 reproduces the issue in 12.1: https://godbolt.org/z/boP6148er

[Bug target/120428] [14/15/16 regression] Suboptimal autovec involving blocked permutation and std::copy

2025-05-27 Thread shawn at shawnxu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120428 --- Comment #11 from Shawn Xu --- Since r15-3078-g6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 mentions -mprefer-vector-width, I tried setting -mprefer-vector-width=256 for compilation. This leads to the same regression in 14.2, but not in 13.3.

[Bug target/120428] [14/15/16 regression] Suboptimal autovec involving blocked permutation and std::copy

2025-05-27 Thread shawn at shawnxu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120428 --- Comment #9 from Shawn Xu --- (In reply to Jonathan Wakely from comment #8) > (In reply to Shawn Xu from comment #0) > > On x86-64 with avx512, PR115444 caused the following code to vectorize > > sub-optimally: > > What made you blame PR1154

[Bug libstdc++/120428] [15/16 regression] Suboptimal autovec involving blocked permutation and std::copy

2025-05-26 Thread shawn at shawnxu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120428 --- Comment #2 from Shawn Xu --- Shorter reproduction: // https://godbolt.org/z/z8z4Ye4rq #include #include #include #include void permute(std::array& data) { static constexpr std::array order{0, 1}; std::array buffer{}; for (
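The reproduction above is cut off in this listing, so the following is only a hypothetical sketch of the general pattern the bug title describes (a blocked permutation implemented with std::copy). It is not the reporter's code: the element type, the block size, the block count, and the name permute_blocks are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Assumed sizes for illustration only; not taken from the bug report.
constexpr std::size_t kBlockSize = 8;
constexpr std::size_t kNumBlocks = 2;

// Hypothetical helper: output block i is input block order[i].
void permute_blocks(std::array<int, kBlockSize * kNumBlocks>& data,
                    const std::array<std::size_t, kNumBlocks>& order) {
    std::array<int, kBlockSize * kNumBlocks> buffer{};
    for (std::size_t i = 0; i < kNumBlocks; ++i) {
        // Copy one contiguous block; this is the std::copy call that an
        // auto-vectorizer would be expected to turn into wide moves.
        std::copy(data.begin() + order[i] * kBlockSize,
                  data.begin() + (order[i] + 1) * kBlockSize,
                  buffer.begin() + i * kBlockSize);
    }
    data = buffer;  // write the permuted blocks back in place
}
```

Compiling a function of this shape with and without the options mentioned in the thread (for example -mprefer-vector-width=256) is one way to compare the generated assembly across GCC versions, though whether this particular sketch triggers the reported behaviour is not established here.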

[Bug libstdc++/120428] [15/16 regression] Suboptimal autovec involving blocked permutation and std::copy

2025-05-26 Thread shawn at shawnxu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120428 --- Comment #3 from Shawn Xu --- This seems to matter only when -mavx or -mavx512f is enabled; otherwise the generated assembly is identical.

[Bug target/120428] New: [15/16 regression] Suboptimal autovec involving blocked permutation and std::copy

2025-05-24 Thread shawn at shawnxu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120428 Bug ID: 120428 Summary: [15/16 regression] Suboptimal autovec involving blocked permutation and std::copy Product: gcc Version: 15.1.1 Status: UNCONFIRMED Seve