[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2024-08-08 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #12 from Jorn Wolfgang Rennecke --- (In reply to Jakub Jelinek from comment #11) But the condition I quoted rejects the recognition of a bswap16 with non-promoted arguments. vectorizable_bswap doesn't do anything for processors that

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2024-08-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #11 from Jakub Jelinek --- Only 16-bit byteswap has the r>> 8 canonical form, larger byteswaps don't. Larger byteswaps certainly aren't rotates, but just permutes. So, if the vectorizer doesn't try that already, it should try to vect
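
A minimal C sketch of the distinction drawn in comment #11, assuming GCC-style fixed-width types. The helper names (rotr16_by8, bswap32_permute, rotr32) are hypothetical and do not come from the GCC sources. It shows that a 16-bit byte swap is the same operation as a rotate by 8, while a 32-bit byte swap is a byte permutation that a single rotate does not reproduce for a general input.

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 16-bit value right by 8 bits.
   With only two bytes, rotating by 8 exchanges them, so this is
   exactly a 16-bit byte swap. */
static uint16_t rotr16_by8(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

/* Hypothetical helper: 32-bit byte swap written as an explicit
   permutation of the four bytes (byte 0 <-> byte 3, byte 1 <-> byte 2). */
static uint32_t bswap32_permute(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}

/* Hypothetical helper: rotate a 32-bit value right by n bits, 0 < n < 32. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}
```

For an input such as 0x11223344, none of the byte-aligned rotates (8, 16, 24) gives the byte-swapped value, which is why the larger swaps have to be treated as permutes.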

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2024-08-07 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #10 from Jorn Wolfgang Rennecke --- Even if you add support for V2HI bswap, it won't help vectorization without support for V4QI vectors and permutations, because vectorizable_bswap won't recognize the bswap capability of the target a

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2024-08-07 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 Jorn Wolfgang Rennecke changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amylaar at gcc dot gnu.org

---

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2019-10-21 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2019-10-02 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #7 from Marc Glisse --- (In reply to Jakub Jelinek from comment #1)
> The loop with the rotate is vectorized, while the one with __builtin_bswap16
> is not.
It is a bit surprising that we do not canonicalize one to the other somewher
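
The two loop shapes contrasted in the quoted comment #1 might look roughly like the following sketch. The function names and signatures are hypothetical, since the original test case is not reproduced in this listing. Both loops compute the same result. Whether a particular GCC release vectorizes either one depends on the version, target and flags, so no outcome is claimed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical loop: byte swap written as an open-coded rotate by 8. */
void swap_with_rotate(uint16_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (uint16_t)((a[i] >> 8) | (a[i] << 8));
}

/* Hypothetical loop: the same operation via the GCC builtin. */
void swap_with_builtin(uint16_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = __builtin_bswap16(a[i]);
}
```

One way to compare the two on a given compiler is to build with -O3 -fopt-info-vec and see which loops are reported as vectorized.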

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2019-10-02 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #6 from Jakub Jelinek ---
Author: jakub
Date: Wed Oct 2 10:18:50 2019
New Revision: 276442
URL: https://gcc.gnu.org/viewcvs?rev=276442&root=gcc&view=rev
Log:
PR tree-optimization/91940
* tree-vect-patterns.c: Include

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2019-10-01 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #5 from Jakub Jelinek --- Created attachment 46985 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46985&action=edit gcc10-pr91940.patch Full untested patch.

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2019-10-01 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #4 from Richard Biener --- Looks good from a quick look.

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2019-10-01 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #3 from Jakub Jelinek --- Untested WIP patch that does both. If it finds vectorize_bswap will work (the corresponding permutation is supported), it will just undo the promotion, if target supports vector rotates, will use vector rotat

[Bug tree-optimization/91940] __builtin_bswap16 loop optimization

2019-10-01 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91940 --- Comment #2 from Richard Biener ---
Another option is to elide the promotion?

int foo (unsigned short x)
{
  return __builtin_bswap16 (x);
}

  return (int) __builtin_bswap16 ((int) x);

but BUILT_IN_BSWAP16 is BT_FN_UINT16_UINT16, not sure w
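
As a source-level illustration of the promotion mentioned in comment #2, the sketch below places the function from that comment next to a variant that spells out the int conversions. The name foo_promoted is hypothetical. This only shows that the two forms return the same value. It does not depict how GCC represents the call internally.

```c
/* From comment #2: the argument is an unsigned short. */
int foo(unsigned short x)
{
    return __builtin_bswap16(x);
}

/* Hypothetical variant with the promotion to int written explicitly.
   The builtin takes a 16-bit argument, so the int is converted back
   to 16 bits at the call and the result is unchanged. */
int foo_promoted(unsigned short x)
{
    return (int) __builtin_bswap16((int) x);
}
```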