[Bug rtl-optimization/100085] New: Bad code for union transfer from __float128 to vector types

2021-04-14 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 Bug ID: 100085 Summary: Bad code for union transfer from __float128 to vector types Product: gcc Version: 10.2.1 Status: UNCONFIRMED Severity: normal

[Bug rtl-optimization/100085] Bad code for union transfer from __float128 to vector types

2021-04-14 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 Steven Munroe changed:
What | Removed | Added
CC | | munroesj at gcc dot gnu.org
--- Comment

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2021-04-16 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #4 from Steven Munroe --- I am seeing a similar problem with union transfers from __float128 to __int128. static inline unsigned __int128 vec_xfer_bin128_2_int128t (__binary128 f128) { __VF_128 vunion;
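The union transfer pattern named in this comment can be sketched as follows. This is a minimal sketch, not the pveclib source: the union `xfer128_t` and the functions `xfer_f128_to_u128`, `u128_hi`, and `u128_lo` are hypothetical names, and it assumes a GCC-compatible compiler on a target where `__float128` (IEEE binary128) and `unsigned __int128` are both available.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the report's __VF_128 union; only the two
   members needed for this illustration are shown. */
typedef union {
    __float128        f128;
    unsigned __int128 u128;
} xfer128_t;

/* The transfer only makes sense if both views have the same size. */
static_assert(sizeof(__float128) == sizeof(unsigned __int128),
              "both members must be 128 bits wide");

/* Reinterpret the bits of a __float128 as an unsigned __int128.
   No bits change; only the type of the view does. */
static inline unsigned __int128 xfer_f128_to_u128(__float128 f)
{
    xfer128_t u;
    u.f128 = f;
    return u.u128;
}

/* Helpers to inspect the result 64 bits at a time. */
static inline uint64_t u128_hi(unsigned __int128 v) { return (uint64_t)(v >> 64); }
static inline uint64_t u128_lo(unsigned __int128 v) { return (uint64_t)v; }
```

For binary128, 1.0 has a biased exponent of 0x3FFF and a zero significand, so the high doubleword of the transferred value is 0x3FFF000000000000 and the low doubleword is 0. The excerpt does not show the code GCC actually generated for this pattern.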

[Bug target/98519] rs6000: @pcrel unsupported on this instruction error in pveclib

2021-01-04 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519 --- Comment #5 from Steven Munroe --- I would think you need to look at the instruction and the "m" constraint. In this case lxsd%X1 would need to be converted to plxsd and the "m" constraint would have to allow @pcrel. I would think a static va

[Bug target/98519] rs6000: @pcrel unsupported on this instruction error in pveclib

2021-01-04 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519 --- Comment #7 from Steven Munroe --- Then you have a problem, as @pcrel is never valid for an instruction like lxsd%X1. It seems you will need a new constraint or modifier specific to @pcrel.

[Bug middle-end/99293] New: Built-in vec_splat generates sub-optimal code for -mcpu=power10

2021-02-26 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293 Bug ID: 99293 Summary: Built-in vec_splat generates sub-optimal code for -mcpu=power10 Product: gcc Version: 10.2.1 Status: UNCONFIRMED Severity: normal

[Bug middle-end/99293] Built-in vec_splat generates sub-optimal code for -mcpu=power10

2021-02-26 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293 --- Comment #1 from Steven Munroe --- Created attachment 50264 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50264&action=edit Compile test for simplified test case Download vec_dummy.c and vec_int128_ppc.h into a local directory and compil

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2021-04-29 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #5 from Steven Munroe --- Any progress on this?

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2021-06-10 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #13 from Steven Munroe --- "We want to use plain TImode instead of V1TImode on newer cpus." Actually I disagree. We have vector __int128 in the ABI and, with POWER10, a complete set of 128-bit arithmetic operations in VRs. Also this

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2022-02-24 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #16 from Steven Munroe --- Created attachment 52510 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52510&action=edit Reduced tests for xfers from _float128 to vector or __int128 Cover more types including __int128 and vector _

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2022-02-24 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 Steven Munroe changed:
What | Removed | Added
Status | RESOLVED | REOPENED
Resolution | FIXED |

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2022-02-25 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #21 from Steven Munroe --- Yes, I was told by Peter Bergner that the fix from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085#c15 had been backported to AT15.0-1. But when I ran this test with AT15.0-1 I saw: :

[Bug target/100085] Bad code for union transfer from __float128 to vector types

2022-02-26 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085 --- Comment #23 from Steven Munroe --- Ok, but I strongly recommend a compiler test that verifies that the compiler is generating the expected code (for this and other cases). We have a history of common code changes (accidental or deliberate) ca

[Bug c/106755] New: Incorrect code gen for altivec intrinsics with constant inputs

2022-08-26 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106755 Bug ID: 106755 Summary: Incorrect code gen for altivec intrinsics with constant inputs Product: gcc Version: 12.2.1 Status: UNCONFIRMED Severity: blocker

[Bug target/104124] New: Poor optimization for vector splat DW with small consts

2022-01-19 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124 Bug ID: 104124 Summary: Poor optimization for vector splat DW with small consts Product: gcc Version: 11.1.1 Status: UNCONFIRMED Severity: normal Pri

[Bug target/104124] Poor optimization for vector splat DW with small consts

2022-01-19 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124 Steven Munroe changed:
What | Removed | Added
CC | | munroesj at gcc dot gnu.org
--- Comment

[Bug target/104124] Poor optimization for vector splat DW with small consts

2022-01-27 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124 Steven Munroe changed:
What | Removed | Added
Attachment #52236 is obsolete | 0 | 1

[Bug c/110795] New: Bad code gen for vector compare booleans

2023-07-24 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795 Bug ID: 110795 Summary: Bad code gen for vector compare booleans Product: gcc Version: 13.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c

[Bug target/110795] Bad code gen for vector compare booleans

2023-07-24 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795 --- Comment #1 from Steven Munroe --- Created attachment 55627 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55627&action=edit Main and unit-test. When compiled and linked with vec_divide.c will verify if the divide code is correct or not

[Bug target/110795] Bad code gen for vector compare booleans

2023-07-24 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795 --- Comment #2 from Steven Munroe --- Also fails with gcc11/12. Also fails with Advance Toolchain 10.0 GCC 6.4.1. It might fail for all versions between GCC 6 and 13.

[Bug target/110795] Bad code gen for vector compare booleans

2023-07-28 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795 --- Comment #5 from Steven Munroe --- Thanks, sorry I missed the obvious.

[Bug target/111645] New: Intrinsics vec_sldb /vec_srdb fail with __vector unsigned __int128

2023-09-29 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645 Bug ID: 111645 Summary: Intrinsics vec_sldb /vec_srdb fail with __vector unsigned __int128 Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal

[Bug target/111645] Intrinsics vec_sldb /vec_srdb fail with __vector unsigned __int128

2023-09-30 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645 Steven Munroe changed:
What | Removed | Added
Attachment #56018 is obsolete | 0 | 1

[Bug target/111645] Intrinsics vec_sldb /vec_srdb fail with __vector unsigned __int128

2023-09-30 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645 --- Comment #3 from Steven Munroe --- (In reply to Peter Bergner from comment #1) > I see that we have created built-in overloads for signed and unsigned vector > char through vector long long. That said, the rs6000-builtins.def only > seems to

[Bug target/111645] Intrinsics vec_sldb /vec_srdb fail with __vector unsigned __int128

2023-10-01 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645 --- Comment #4 from Steven Munroe --- Actually, the shift/rotate intrinsics vec_rl, vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra support vector __int128, as required for the PowerISA 3.1 POWER vector shift/rotate quadword instructions. But: vec_sld,

[Bug target/111645] Intrinsics vec_sldb /vec_srdb fail with __vector unsigned __int128

2023-10-25 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645 --- Comment #6 from Steven Munroe --- (In reply to Carl Love from comment #5) > There are a couple of issues with the test case in the attachment. For > example one of the tests is: > > > static inline vui64_t > vec_vsldbi_64 (vui64_t vra, vu

[Bug target/104124] Poor optimization for vector splat DW with small consts

2023-06-28 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124 --- Comment #5 from Steven Munroe --- Thanks

[Bug c/116004] New: PPC64 vector Intrinsic vec_first_mismatch_or_eos_index generates poor code

2024-07-19 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004 Bug ID: 116004 Summary: PPC64 vector Intrinsic vec_first_mismatch_or_eos_index generates poor code Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: n

[Bug target/116004] PPC64 vector Intrinsic vec_first_mismatch_or_eos_index generates poor code

2024-07-19 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004 --- Comment #1 from Steven Munroe --- Compile test code examples: int test_intrn_first_mismatch_or_eos_index_PWR9 (vui8_t vra, vui8_t vrb) { return vec_first_mismatch_or_eos_index (vra, vrb); } int test_first_mismatch_byte_or_eos_index_PWR9

[Bug target/116004] PPC64 vector Intrinsic vec_first_mismatch_or_eos_index generates poor code

2024-07-19 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004 --- Comment #2 from Steven Munroe --- Actually: abnez = (vui8_t) vec_cmpnez (vra, vrb); result = vec_cntlz_lsbb (abnez);
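The two-step sequence in this comment (a compare for "not equal or zero", then a count of leading bytes whose least-significant bit is clear) can be modeled in portable scalar C. This is only a reference model under stated assumptions: the `model_*` names are hypothetical, element 0 is taken to be the first byte in array order, and the endian-dependent element numbering of the real instructions is not modeled.

```c
#include <assert.h>
#include <stdint.h>

enum { VEC_BYTES = 16 };

/* Scalar model of a byte-wise "compare not equal or zero": the result byte
   is 0xFF when the inputs differ or either input byte is zero, else 0x00. */
static void model_cmpnez(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    assert(a && b && out);
    for (int i = 0; i < VEC_BYTES; i++)
        out[i] = (a[i] != b[i] || a[i] == 0 || b[i] == 0) ? 0xFF : 0x00;
}

/* Scalar model of "count leading bytes whose least-significant bit is 0",
   scanning from element 0. Returns 16 when no byte has its LSB set. */
static int model_cntlz_lsbb(const uint8_t *v)
{
    int n = 0;
    assert(v);
    while (n < VEC_BYTES && (v[n] & 1) == 0)
        n++;
    return n;
}

/* Index of the first mismatch or end-of-string byte, or 16 if there is none. */
static int model_first_mismatch_or_eos_index(const uint8_t *a, const uint8_t *b)
{
    uint8_t abnez[VEC_BYTES];
    model_cmpnez(a, b, abnez);
    return model_cntlz_lsbb(abnez);
}
```

Comparing a 16-byte buffer with itself returns the position of its first zero byte, which is the end-of-string case.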

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-10-11 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #4 from Steven Munroe --- Created attachment 59323 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59323&action=edit Examples for Vector DW int constant

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-10-08 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #3 from Steven Munroe --- I tested the attached example source on GCC 14.0.1 from Ubuntu on powerpc64le. Seeing the same results. So add GCC 14.0.1 to this list. Actually the last GCC version that did not have this bug was GCC 7. Lo

[Bug target/117007] New: Poor optimization for small vector constants needed for vector shift/rotate/mask generation.

2024-10-07 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 Bug ID: 117007 Summary: Poor optimization for small vector constants needed for vector shift/rotate/mask generation. Product: gcc Version: 13.2.1 Status: UNCONFIRMED

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-10-29 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #10 from Steven Munroe --- (In reply to Segher Boessenkool from comment #7) > It is always more and slower code. Yes. More examples: vui64_t test_sld_52_v1 (vui64_t vra) { vui32_t shft = vec_splat_u32(52-64); return vec_vsld (
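The `vec_splat_u32(52-64)` idiom in this example relies on two facts: the splat-immediate literal is a 5-bit signed value (-16 to 15), and a doubleword shift reads only the low-order 6 bits of the count. So the literal -12 behaves as a shift count of 52. A portable C model of that arithmetic, with hypothetical function names and no claim about what the compiler emits:

```c
#include <assert.h>
#include <stdint.h>

/* Range of a 5-bit signed splat-immediate literal. */
static int fits_simm5(int v)
{
    return v >= -16 && v <= 15;
}

/* Effective count seen by a shift that reads only the low `bits` bits
   of the count element (6 for a doubleword shift). */
static unsigned effective_count(int splat_value, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    return (unsigned)splat_value & ((1u << bits) - 1u);
}

/* Model of one doubleword lane shifted left by a splatted count. */
static uint64_t model_sld_lane(uint64_t x, int splat_value)
{
    return x << effective_count(splat_value, 6);
}
```

The count 52 does not fit the 5-bit literal directly, but 52-64 does, and both select the same shift once the count is reduced to 6 bits.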

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-10-29 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #9 from Steven Munroe --- (In reply to Segher Boessenkool from comment #7) > It is always more and slower code. Yes. Let's try some specific examples and examine the code generated for power8/9/10 vui32_t test_slw_23_v0 (vui32_t vr

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-10-26 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #6 from Steven Munroe --- I am starting to see a pattern, and wonder if the compiler is confused by assuming the shift count must match the width/type of the shift/rotate target. This is implied all the way back to the Altivec-PIM and

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-10-26 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 Steven Munroe changed:
What | Removed | Added
Attachment #59323 is obsolete | 0 | 1

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-11-07 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #12 from Steven Munroe --- It seems that even for small signed char values, vec_splats ((signed char)x) will sometimes generate 2 instructions where it should generate only a single xxspltib.

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-11-07 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #13 from Steven Munroe --- It seems that even for small signed char values, vec_splats ((signed char)x) for target -mcpu=power9 will sometimes generate 2 instructions where it should generate only a single xxspltib.

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-11-07 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #11 from Steven Munroe --- Created attachment 59560 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59560&action=edit Test cases for vec_splats(signed char) on -mcpu=power9 for any valid char value I would expect for example ve

[Bug target/117818] vec_add incorrectly generates vadduwm for vector char const inputs.

2024-11-27 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #2 from Steven Munroe --- Same issue when compiled for power9/10

[Bug target/117818] New: vec_add incorrectly generates vadduwm for vector char const inputs.

2024-11-27 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 Bug ID: 117818 Summary: vec_add incorrectly generates vadduwm for vector char const inputs. Product: gcc Version: 13.3.1 Status: UNCONFIRMED Severity: normal

[Bug target/117818] vec_add incorrectly generates vadduwm for vector char const inputs.

2024-11-27 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #1 from Steven Munroe --- May be related to 117007

[Bug target/117818] vec_add incorrectly generates vadduwm for vector char const inputs.

2024-11-28 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #3 from Steven Munroe --- Tried replacing generic vec_add with specific vec_addubm (__builtin_vec_add/__builtin_vec_vaddubm). No joy: the compiler still generates vadduwm and a load from .rodata.
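Why a word-wise modulo add (vadduwm) is not interchangeable in general with a byte-wise modulo add (vaddubm) can be shown on four byte lanes packed into one 32-bit word. This is a portable C illustration with hypothetical function names; the excerpt does not say whether the constants in the report actually produce different values.

```c
#include <assert.h>
#include <stdint.h>

/* Modulo add of four packed byte lanes: each lane wraps independently. */
static uint32_t add_bytes_modulo(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t sum = ((a >> (8 * lane)) & 0xFFu) + ((b >> (8 * lane)) & 0xFFu);
        r |= (sum & 0xFFu) << (8 * lane);
    }
    return r;
}

/* Modulo add of the same 32 bits as one word lane: carries cross byte boundaries. */
static uint32_t add_word_modulo(uint32_t a, uint32_t b)
{
    return a + b;
}

/* The two forms agree when no byte lane overflows and differ when one does. */
void check_add_models(void)
{
    assert(add_bytes_modulo(0x01010101u, 0x02020202u) == 0x03030303u);
    assert(add_word_modulo(0x01010101u, 0x02020202u) == 0x03030303u);
    assert(add_bytes_modulo(0x000000FFu, 0x00000001u) == 0x00000000u);
    assert(add_word_modulo(0x000000FFu, 0x00000001u) == 0x00000100u);
}
```

The word form is only a safe substitute when every byte lane is known not to carry.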

[Bug target/117818] vec_add incorrectly generates vadduwm for vector char const inputs.

2024-11-30 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #5 from Steven Munroe --- I expected compiling for -mcpu=power9 to do a better job generating splats for small constants. Given the new instructions like VSX Vector Splat Immediate Byte (xxspltib) and Vector Extend Sign Byte To Word
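The pairing of the two instructions named here can be modeled per word lane: xxspltib replicates an 8-bit immediate into every byte, and Vector Extend Sign Byte To Word sign-extends the low-order byte of each word to fill the word. Together they can form any word constant from -128 to 127 without a memory load. The sketch below is a portable C model with hypothetical names, not compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit lane after a byte splat: the byte fills all four positions. */
static uint32_t model_splat_byte_lane(int8_t imm)
{
    return (uint32_t)(uint8_t)imm * 0x01010101u;
}

/* Sign-extend the low-order byte of a word lane to 32 bits. */
static int32_t model_extend_sign_byte_to_word(uint32_t lane)
{
    int32_t low = (int32_t)(lane & 0xFFu);
    return low >= 128 ? low - 256 : low;
}

/* Word constant produced by the two-step sequence. */
static int32_t model_splat_word_const(int8_t imm)
{
    return model_extend_sign_byte_to_word(model_splat_byte_lane(imm));
}

/* Every 8-bit signed immediate survives the round trip unchanged. */
void check_splat_model(void)
{
    for (int v = -128; v <= 127; v++)
        assert(model_splat_word_const((int8_t)v) == v);
}
```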

[Bug target/117818] vec_add incorrectly generates vadduwm for vector char const inputs.

2024-11-30 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #4 from Steven Munroe --- Created attachment 59756 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59756&action=edit Updated Test case Vector shift long with const shift count -mcpu=power9 This is an extension of the original w

[Bug target/117718] Inefficient address computation for d-form vector loads

2024-12-01 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718 --- Comment #6 from Steven Munroe --- Another issue with vector loads from .rodata: sometimes the compiler will generate this sequence for power8: addis 9,2,.LC69@toc@ha addi 9,9,.LC69@toc@l rldicr 9,9,0,59 lxv
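In this sequence, `rldicr 9,9,0,59` rotates by zero and keeps bits 0 through 59 (IBM numbering, where bit 0 is the most significant), which clears the four low-order bits of the address and so rounds it down to a 16-byte boundary. In C terms, using hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Model of `rldicr rA,rS,0,59`: keep the 60 high-order bits, clear the low 4. */
static uint64_t model_rldicr_0_59(uint64_t addr)
{
    return addr & ~UINT64_C(15);
}

/* True when the address is already on a 16-byte boundary. */
static int is_aligned16(uint64_t addr)
{
    return (addr & UINT64_C(15)) == 0;
}

/* For an aligned address the mask changes nothing. */
void check_align_model(void)
{
    assert(model_rldicr_0_59(0x1000) == 0x1000);
    assert(model_rldicr_0_59(0x100F) == 0x1000);
    assert(is_aligned16(model_rldicr_0_59(0x1234567F)));
}
```

If the constant is already 16-byte aligned, the masking step leaves the address unchanged. The excerpt is cut off before the author's conclusion about the sequence.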

[Bug target/117818] [12/13/14/15 regression] vec_add incorrectly generates vadduwm for vector char const inputs.

2025-02-05 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #7 from Steven Munroe --- (In reply to Richard Biener from comment #6) > is that powerpc64le or powerpc{,64} big endian? (or both) Definitely powerpc64le because few distros support powerpc targets. I think the last GCC I have th

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-11-22 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 Steven Munroe changed:
What | Removed | Added
Attachment #59291 is obsolete | 0 | 1

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2024-11-22 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #15 from Steven Munroe --- Found that vec_splat_u32 constant shift counts are handled differently across the various shift/rotate intrinsics. Even for the 5-bit shift counts (the easy case) the behavior of the various s

[Bug target/117718] Inefficient address computation for d-form vector loads

2024-11-22 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718 --- Comment #5 from Steven Munroe --- (In reply to Michael Meissner from comment #3) > No, the issue is with DQ addressing (i.e. vector load/store with offset), we > can't guarantee that the external address will be properly aligned with the > b

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-01-14 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #2 from Steven Munroe --- Created attachment 60156 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60156&action=edit Examples for vector quadword shift by const immediate for POWER9 Compile with gcc -O3 -Wall -S -mcpu=power9 -m

[Bug target/118480] New: Power9 target generates poor code for vector char splat immediate.

2025-01-14 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 Bug ID: 118480 Summary: Power9 target generates poor code for vector char splat immediate. Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-01-14 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #1 from Steven Munroe --- Strangely, the tricks that seem to work for positive immediate values (see test_slqi_char_18_V3 above) fail (generate a .rodata load) for negative values. For example the shift count for 110 (110-128 = -18)

[Bug target/119760] GCC does not implement intrinsics for Vector Multiply-by-10 Unsigned Quadword and variants

2025-04-14 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119760 --- Comment #2 from Steven Munroe --- (In reply to Richard Biener from comment #1) > Likely because GCC doesn't know anything about BCD (no BCD "modes", builtins > or optabs or direct internal functions). As I stated the existing bcdadd/sub bui

[Bug target/119760] New: GCC does not implement intrinsics for Vector Multiply-by-10 Unsigned Quadword and variants

2025-04-12 Thread munroesj at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119760 Bug ID: 119760 Summary: GCC does not implement intrinsics for Vector Multiply-by-10 Unsigned Quadword and variants Product: gcc Version: 14.0 Status: UNCONFIRMED