https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
Bug ID: 100085
Summary: Bad code for union transfer from __float128 to vector
types
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
Steven Munroe changed:
What|Removed |Added
CC||munroesj at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #4 from Steven Munroe ---
I am seeing a similar problem with union transfers from __float128 to
__int128.
static inline unsigned __int128
vec_xfer_bin128_2_int128t (__binary128 f128)
{
__VF_128 vunion;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519
--- Comment #5 from Steven Munroe ---
I would think you need to look at the instruction and the "m" constraint.
In this case lxsd%X1 would need to be converted to plxsd and the "m" constraint
would have to allow @pcrel. I would think a static va
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519
--- Comment #7 from Steven Munroe ---
Then you have a problem, as @pcrel is never valid for an instruction like lxsd%X1.
Seems like you will need a new constraint or modifier specific to @pcrel.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
Bug ID: 99293
Summary: Built-in vec_splat generates sub-optimal code for
-mcpu=power10
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
--- Comment #1 from Steven Munroe ---
Created attachment 50264
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50264&action=edit
Compile test for simplified test case
Download vec_dummy.c and vec_int128_ppc.h into a local directory and compil
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #5 from Steven Munroe ---
Any progress on this?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #13 from Steven Munroe ---
"We want to use plain TImode instead of V1TImode on newer cpus."
Actually I disagree. We have vector __int128 in the ABI and with POWER10 a
complete set of arithmetic operations for 128-bit in VRs.
Also this
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #16 from Steven Munroe ---
Created attachment 52510
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52510&action=edit
Reduced tests for xfers from _float128 to vector or __int128
Cover more types including __int128 and vector _
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
Steven Munroe changed:
What|Removed |Added
Status|RESOLVED|REOPENED
Resolution|FIXED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #21 from Steven Munroe ---
Yes, I was told by Peter Bergner that the fix from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085#c15 had been backported
to AT15.0-1.
But when I ran this test with AT15.0-1 I saw:
:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #23 from Steven Munroe ---
Ok, but I strongly recommend a compiler test that verifies that the compiler is
generating the expected code (for this and other cases).
We have a history of common code changes (accidental or deliberate) ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106755
Bug ID: 106755
Summary: Incorrect code gen for altivec intrinsics with
constant inputs
Product: gcc
Version: 12.2.1
Status: UNCONFIRMED
Severity: blocker
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
Bug ID: 104124
Summary: Poor optimization for vector splat DW with small
consts
Product: gcc
Version: 11.1.1
Status: UNCONFIRMED
Severity: normal
Pri
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
Steven Munroe changed:
What|Removed |Added
CC||munroesj at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
Steven Munroe changed:
What|Removed |Added
Attachment #52236|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795
Bug ID: 110795
Summary: Bad code gen for vector compare booleans
Product: gcc
Version: 13.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795
--- Comment #1 from Steven Munroe ---
Created attachment 55627
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55627&action=edit
Main and unit-test. When compiled and linked with vec_divide.c, it will verify
whether the divide code is correct or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795
--- Comment #2 from Steven Munroe ---
Also fails with gcc11/12. Also fails with Advance Toolchain 10.0 GCC 6.4.1.
It might fail for all versions between GCC 6 and 13.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795
--- Comment #5 from Steven Munroe ---
Thanks, sorry I missed the obvious.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
Bug ID: 111645
Summary: Intrinsics vec_sldb /vec_srdb fail with __vector
unsigned __int128
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
Steven Munroe changed:
What|Removed |Added
Attachment #56018|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
--- Comment #3 from Steven Munroe ---
(In reply to Peter Bergner from comment #1)
> I see that we have created built-in overloads for signed and unsigned vector
> char through vector long long. That said, the rs6000-builtins.def only
> seems to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
--- Comment #4 from Steven Munroe ---
Actually the shift/rotate intrinsics vec_rl, vec_rlmi, vec_rlnm, vec_sl, vec_sr,
and vec_sra
support vector __int128, as required for the PowerISA 3.1 POWER vector
shift/rotate quadword instructions.
But: vec_sld,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
--- Comment #6 from Steven Munroe ---
(In reply to Carl Love from comment #5)
> There are a couple of issues with the test case in the attachment. For
> example one of the tests is:
>
>
> static inline vui64_t
> vec_vsldbi_64 (vui64_t vra, vu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
--- Comment #5 from Steven Munroe ---
Thanks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004
Bug ID: 116004
Summary: PPC64 vector Intrinsic vec_first_mismatch_or_eos_index
generates poor code
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: n
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004
--- Comment #1 from Steven Munroe ---
Compile test code examples:
int
test_intrn_first_mismatch_or_eos_index_PWR9 (vui8_t vra, vui8_t vrb)
{
return vec_first_mismatch_or_eos_index (vra, vrb);
}
int
test_first_mismatch_byte_or_eos_index_PWR9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004
--- Comment #2 from Steven Munroe ---
Actually:
abnez = (vui8_t) vec_cmpnez (vra, vrb);
result = vec_cntlz_lsbb (abnez);
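For reference, the behavior expected from the two-instruction sequence above can be written as a scalar model. This is only a sketch in portable C: the function name is hypothetical (it is not a GCC or PVECLIB API), and it assumes array index 0 corresponds to the first vector element, ignoring the endian-specific choice between counting leading or trailing bytes.

```c
#include <stdint.h>

/* Scalar model (hypothetical helper, for illustration only).
   Returns the index of the first byte position where a[i] != b[i] or
   where either byte is zero (end of string). Returns 16, the number of
   byte elements, when no such position exists. */
static unsigned
first_mismatch_or_eos_index_model (const uint8_t a[16], const uint8_t b[16])
{
  unsigned i;
  for (i = 0; i < 16; i++)
    {
      /* Per-byte condition modeled on "compare not equal or zero". */
      if (a[i] != b[i] || a[i] == 0 || b[i] == 0)
        break;
    }
  return i;
}
```

A model like this can serve as the expected-value oracle in a unit test that compares the intrinsic's result against it on a POWER9 or later target.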
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #4 from Steven Munroe ---
Created attachment 59323
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59323&action=edit
Examples for Vector DW int constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #3 from Steven Munroe ---
I tested the attached example source on GCC 14.0.1 from Ubuntu on powerpc64le.
Seeing the same results. So add GCC 14.0.1 to this list. Actually the last GCC
version that did not have this bug was GCC 7. Lo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
Bug ID: 117007
Summary: Poor optimization for small vector constants needed for
vector shift/rotate/mask generation.
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #10 from Steven Munroe ---
(In reply to Segher Boessenkool from comment #7)
> It is always more and slower code. Yes.
More examples:
vui64_t
test_sld_52_v1 (vui64_t vra)
{
vui32_t shft = vec_splat_u32(52-64);
return vec_vsld (
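The 52-64 idiom in the snippet above relies on the doubleword shift reading only the low-order 6 bits of each doubleword of the shift-count vector, so a 5-bit signed splat immediate of -12 acts as a shift count of 52. A minimal scalar sketch of that arithmetic follows; the helper names are hypothetical and it models only the bit selection, not the vector instructions themselves.

```c
#include <stdint.h>

/* Hypothetical helper: the 64-bit value a doubleword shift sees when a
   32-bit immediate has been splatted into every word of the vector. */
static uint64_t
splat_word_as_doubleword (int32_t imm)
{
  uint64_t w = (uint32_t) imm;  /* two's-complement word image */
  return (w << 32) | w;
}

/* Hypothetical helper: effective shift count, assuming the shift uses
   only the low-order 6 bits of each doubleword of the count vector. */
static unsigned
effective_sld_count (int32_t imm)
{
  return (unsigned) (splat_word_as_doubleword (imm) & 0x3F);
}
```

Here effective_sld_count (52 - 64) evaluates to 52, which is why a single splat-immediate instruction is enough to produce the shift count without a load from .rodata.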
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #9 from Steven Munroe ---
(In reply to Segher Boessenkool from comment #7)
> It is always more and slower code. Yes.
Let's try some specific examples and examine the code generated for power8/9/10
vui32_t
test_slw_23_v0 (vui32_t vr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #6 from Steven Munroe ---
I am starting to see a pattern and wonder if the compiler is confused by assuming
the shift count must match the width/type of the shift/rotate target.
This is implied all the way back to the Altivec-PIM and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
Steven Munroe changed:
What|Removed |Added
Attachment #59323|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #12 from Steven Munroe ---
It seems that even for small values of signed char, vec_splats ((signed char)x)
will sometimes generate 2 instructions where it should only generate a single
xxspltib.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #13 from Steven Munroe ---
It seems that even for small values of signed char, vec_splats ((signed char)x)
for target -mcpu=power9 will sometimes generate 2 instructions where it should
only generate a single xxspltib.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #11 from Steven Munroe ---
Created attachment 59560
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59560&action=edit
Test cases for vec_splats(signed char) on -mcpu=power9
For any valid char value I would expect for example ve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #2 from Steven Munroe ---
Same issues compiled for power9/10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
Bug ID: 117818
Summary: vec_add incorrectly generates vadduwm for vector char
const inputs.
Product: gcc
Version: 13.3.1
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #1 from Steven Munroe ---
May be related to 117007
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #3 from Steven Munroe ---
Tried replacing generic vec_add with specific vec_addubm
(__builtin_vec_add/__builtin_vec_vaddubm).
No joy; the compiler still generates vadduwm and a load from .rodata.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #5 from Steven Munroe ---
I expected compiling for -mcpu=power9 to do a better job generating splats for
small constants.
Given the new instructions like VSX Vector Splat Immediate Byte (xxspltib) and
Vector Extend Sign Byte To Word
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #4 from Steven Munroe ---
Created attachment 59756
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59756&action=edit
Updated Test case Vector shift long with const shift count -mcpu=power9
This is an extension of the original w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718
--- Comment #6 from Steven Munroe ---
Another issue with vector loads from .rodata
Sometimes the compiler will generate this sequence for power8
addis 9,2,.LC69@toc@ha
addi 9,9,.LC69@toc@l
rldicr 9,9,0,59
lxv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #7 from Steven Munroe ---
(In reply to Richard Biener from comment #6)
> is that powerpc64le or powerpc{,64} big endian? (or both)
Definitely powerpc64le because few distros support powerpc targets.
I think the last GCC I have th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
Steven Munroe changed:
What|Removed |Added
Attachment #59291|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #15 from Steven Munroe ---
Found that vec_splat_u32 constant shift counts are handled
differently across the various shift/rotate intrinsics.
Even for the 5-bit shift counts (the easy case) the behavior of the various
s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718
--- Comment #5 from Steven Munroe ---
(In reply to Michael Meissner from comment #3)
> No, the issue is with DQ addressing (i.e. vector load/store with offset), we
> can't guarantee that the external address will be properly aligned with the
> b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480
--- Comment #2 from Steven Munroe ---
Created attachment 60156
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60156&action=edit
Examples for vector quadword shift by const immediate for POWER9
Compile with gcc -O3 -Wall -S -mcpu=power9 -m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480
Bug ID: 118480
Summary: Power9 target generates poor code for vector char
splat immediate.
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480
--- Comment #1 from Steven Munroe ---
Strangely the tricks that seem to work for positive immediate values (see
test_slqi_char_18_V3 above) fail (generate a .rodata load) for negative
values. For example the shift count for 110 (110-128 = -18)
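The 110-128 = -18 arithmetic can be checked with a small scalar model. This sketch assumes the quadword shift is built from a shift-by-octet step that reads a 4-bit byte count plus a shift-by-bits step that reads a 3-bit count, so only the low-order 7 bits of the splatted byte matter; the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: effective quadword shift count for a splatted
   signed char immediate, assuming a 4-bit octet count (bits above the
   low 3) combined with a 3-bit bit count (the low 3 bits). */
static unsigned
quadword_shift_count_model (signed char imm)
{
  uint8_t byte = (uint8_t) imm;         /* two's-complement byte image */
  unsigned octets = (byte >> 3) & 0xF;  /* whole bytes to shift */
  unsigned bits = byte & 0x7;           /* remaining bits to shift */
  return octets * 8 + bits;
}
```

Under these assumptions quadword_shift_count_model (110 - 128) evaluates to 110, so a negative splat immediate of -18 encodes the same shift as 110 in the low 7 bits.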
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119760
--- Comment #2 from Steven Munroe ---
(In reply to Richard Biener from comment #1)
> Likely because GCC doesn't know anything about BCD (no BCD "modes", builtins
> or optabs or direct internal functions).
As I stated the existing bcdadd/sub bui
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119760
Bug ID: 119760
Summary: GCC does not implement intrinsics for Vector
Multiply-by-10 Unsigned Quadword and variants
Product: gcc
Version: 14.0
Status: UNCONFIRMED
56 matches