https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119760
--- Comment #2 from Steven Munroe ---
(In reply to Richard Biener from comment #1)
> Likely because GCC doesn't know anything about BCD (no BCD "modes", builtins
> or optabs or direct internal functions).
As I stated the existing bcdadd/sub bui
Severity: minor
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
POWER8 (PowerISA 2.07) introduced Binary Coded Decimal (BCD) Add/Subtract.
GCC implemented builtins
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #7 from Steven Munroe ---
(In reply to Richard Biener from comment #6)
> is that powerpc64le or powerpc{,64} big endian? (or both)
Definitely powerpc64le because few distros support powerpc targets.
I think the last GCC I have th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480
--- Comment #2 from Steven Munroe ---
Created attachment 60156
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60156&action=edit
Examples for vector quadword shift by const immediate for POWER9
Compile with gcc -O3 -Wall -S -mcpu=power9 -m
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480
--- Comment #1 from Steven Munroe ---
Strangely the tricks that seem to work for positive immediate values (see
test_slqi_char_18_V3 above) fail (generate a .rodata load) for negative
values. For example the shift count for 110 (110-128 = -18)
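The arithmetic in that example can be checked in portable C. This is only a sketch: it assumes (the comment does not say so) that a quadword shift consumes just the low 7 bits of the splatted byte, so that 110 and 110-128 = -18 select the same shift count. The helper name low7_shift_count is hypothetical and is not part of GCC or altivec.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the shift count a 128-bit shift would see,
   ASSUMING only the low 7 bits of the splatted byte are used. */
static unsigned low7_shift_count(int imm)
{
    /* Truncate to one byte as a byte splat would, then keep 7 bits. */
    return (unsigned)(uint8_t)imm & 0x7Fu;
}

/* Self-check of the arithmetic quoted above (110-128 = -18). */
static void check_wraparound(void)
{
    assert(110 - 128 == -18);
    assert(low7_shift_count(-18) == low7_shift_count(110));
}
```

Under that assumption, splatting -18 and splatting 110 are interchangeable as a shift count, which is why the negative form is a candidate when only a small signed immediate is available.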
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
POWER9 (PowerISA 3.0C) adds the VSX Vector Splat Immediate Byte (xxspltib)
instruction that is perfect for generating small
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718
--- Comment #6 from Steven Munroe ---
Another issue with vector loads from .rodata:
Sometimes the compiler will generate this sequence for power8
addis 9,2,.LC69@toc@ha
addi 9,9,.LC69@toc@l
rldicr 9,9,0,59
lxv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #5 from Steven Munroe ---
I expected compiling for -mcpu=power9 to do a better job generating splats for
small constants.
Given the new instructions like VSX Vector Splat Immediate Byte (xxspltib) and
Vector Extend Sign Byte To Word
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #4 from Steven Munroe ---
Created attachment 59756
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59756&action=edit
Updated Test case Vector shift long with const shift count -mcpu=power9
This is an extension of the original w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #3 from Steven Munroe ---
Tried replacing generic vec_add with specific vec_addubm
(__builtin_vec_add/__builtin_vec_vaddubm).
No joy; the compiler still generates vadduwm and a load from .rodata.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #2 from Steven Munroe ---
Same issues when compiled for power9/10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818
--- Comment #1 from Steven Munroe ---
May be related to 117007
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 59731
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59731&action=edit
Test case Vector shi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #15 from Steven Munroe ---
Found where vec_splat_u32 constant shift counts are handled
differently across the various shift/rotate intrinsics.
Even for the 5-bit shift counts (the easy case) the behavior of the various
s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
Steven Munroe changed:
What|Removed |Added
Attachment #59291|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718
--- Comment #5 from Steven Munroe ---
(In reply to Michael Meissner from comment #3)
> No, the issue is with DQ addressing (i.e. vector load/store with offset), we
> can't guarantee that the external address will be properly aligned with the
> b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #13 from Steven Munroe ---
It seems that even for small values of signed char, vec_splats ((signed char)x)
for target -mcpu=power9 will sometimes generate 2 instructions where it should
only generate a single xxspltib.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #12 from Steven Munroe ---
It seems that even for small values of signed char, vec_splats ((signed char)x)
will sometimes generate 2 instructions where it should only generate a single
xxspltib.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #11 from Steven Munroe ---
Created attachment 59560
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59560&action=edit
Test cases for vec_splats(signed char) on -mcpu=power9
For any valid char value I would expect for example ve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #10 from Steven Munroe ---
(In reply to Segher Boessenkool from comment #7)
> It is always more and slower code. Yes.
More examples:
vui64_t
test_sld_52_v1 (vui64_t vra)
{
vui32_t shft = vec_splat_u32(52-64);
return vec_vsld (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #9 from Steven Munroe ---
(In reply to Segher Boessenkool from comment #7)
> It is always more and slower code. Yes.
Let's try some specific examples and examine the code generated for power8/9/10
vui32_t
test_slw_23_v0 (vui32_t vr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #6 from Steven Munroe ---
I am starting to see a pattern and wonder if the compiler is confused by assuming
the shift count must match the width/type of the shift/rotate target.
This is implied all the way back to the Altivec-PIM and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
Steven Munroe changed:
What|Removed |Added
Attachment #59323|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #4 from Steven Munroe ---
Created attachment 59323
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59323&action=edit
Examples for Vector DW int constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007
--- Comment #3 from Steven Munroe ---
I tested the attached example source on GCC 14.0.1 from Ubuntu on powerpc64le.
Seeing the same results. So add GCC 14.0.1 to this list. Actually the last GCC
version that did not have this bug was GCC 7. Lo
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 59291
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59291&action=edit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004
--- Comment #2 from Steven Munroe ---
Actually:
abnez = (vui8_t) vec_cmpnez (vra, vrb);
result = vec_cntlz_lsbb (abnez);
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116004
--- Comment #1 from Steven Munroe ---
Compile test code examples:
int
test_intrn_first_mismatch_or_eos_index_PWR9 (vui8_t vra, vui8_t vrb)
{
return vec_first_mismatch_or_eos_index (vra, vrb);
}
int
test_first_mismatch_byte_or_eos_index_PWR9
: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
GCC13 generates the following code for the intrinsic
vec_first_mismatch_or_eos_index -mcpu=power9 -O3:
00c0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
--- Comment #6 from Steven Munroe ---
(In reply to Carl Love from comment #5)
> There are a couple of issues with the test case in the attachment. For
> example one of the tests is:
>
>
> static inline vui64_t
> vec_vsldbi_64 (vui64_t vra, vu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
--- Comment #4 from Steven Munroe ---
Actually shift/rotate intrinsics: vec_rl, vec_rlmi, vec_rlnm, vec_sl, vec_sr,
vec_sra
Support vector __int128 as required for the PowerISA 3.1 POWER vector
shift/rotate quadword instructions
But: vec_sld,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
--- Comment #3 from Steven Munroe ---
(In reply to Peter Bergner from comment #1)
> I see that we have created built-in overloads for signed and unsigned vector
> char through vector long long. That said, the rs6000-builtins.def only
> seems to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111645
Steven Munroe changed:
What|Removed |Added
Attachment #56018|0 |1
is obsolete|
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 56018
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56018&action=edit
example of the problem.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795
--- Comment #5 from Steven Munroe ---
Thanks, sorry I missed the obvious.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795
--- Comment #2 from Steven Munroe ---
Also fails with gcc11/12. Also fails with Advance Toolchain 10.0 GCC 6.4.1.
It might fail for all versions between GCC 6 and 13.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110795
--- Comment #1 from Steven Munroe ---
Created attachment 55627
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55627&action=edit
Main and unit-test. When compiled and linked with vec_divide.c will verify if
the divide code is correct or not
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 55626
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55626&action=edit
Test examples for vector code combining vector compare combined with log
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
--- Comment #5 from Steven Munroe ---
Thanks
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 53514
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53514&action=edit
Reduced test case for vec_muludq
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #23 from Steven Munroe ---
Ok, but I strongly recommend a compiler test that verifies that the compiler is
generating the expected code (for this and other cases).
We have a history of common code changes (accidental or deliberate) ca
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #21 from Steven Munroe ---
Yes, I was told by Peter Bergner that the fix from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085#c15 had been backported
to AT15.0-1.
But when I ran this test with AT15.0-1 I saw:
:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
Steven Munroe changed:
What|Removed |Added
Status|RESOLVED|REOPENED
Resolution|FIXED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #16 from Steven Munroe ---
Created attachment 52510
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52510&action=edit
Reduced tests for xfers from _float128 to vector or __int128
Cover more types including __int128 and vector _
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
Steven Munroe changed:
What|Removed |Added
Attachment #52236|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
Steven Munroe changed:
What|Removed |Added
CC||munroesj at gcc dot gnu.org
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
It looks to me like the compiler is seeing register pressure caused by loading
all the vector long long constants I need in my code. This
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #13 from Steven Munroe ---
"We want to use plain TImode instead of V1TImode on newer cpus."
Actually I disagree. We have vector __int128 in the ABI and with POWER10 a
complete set of arithmetic operations for 128-bit in VRs.
Also this
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #5 from Steven Munroe ---
Any progress on this?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
--- Comment #4 from Steven Munroe ---
I am seeing a similar problem with union transfers from __float128 to
__int128.
static inline unsigned __int128
vec_xfer_bin128_2_int128t (__binary128 f128)
{
__VF_128 vunion;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100085
Steven Munroe changed:
What|Removed |Added
CC||munroesj at gcc dot gnu.org
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 50595
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50595&action=edit
Reduced example of un
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99293
--- Comment #1 from Steven Munroe ---
Created attachment 50264
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50264&action=edit
Compile test for simplified test case
Download vec_dummy.c and vec_int128_ppc.h into a local directory and compil
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 50263
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50263&action=edit
Simplified test case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519
--- Comment #7 from Steven Munroe ---
Then you have a problem, as @pcrel is never valid for an instruction like lxsd%X1.
Seems like you will need a new constraint or modifier specific to @pcrel.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519
--- Comment #5 from Steven Munroe ---
I would think you need to look at the instruction and the "m" constraint.
In this case lxsd%X1 would need to be converted to plxsd and the "m" constraint
would have to allow @pcrel. I would think a static va
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85830
--- Comment #3 from Steven Munroe ---
(In reply to Carl Love from comment #2)
> Hit the save button a little too fast missed putting in everything I
> intended to put in. Lets try to get it all in.
>
> > In altivec.h they are defined as:
> >
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96139
--- Comment #3 from Steven Munroe ---
(In reply to Bill Schmidt from comment #2)
> Have you tried it for -m32, out of curiosity?
no
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96139
--- Comment #1 from Steven Munroe ---
Created attachment 48851
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48851&action=edit
Test case for printf of vector long long int elements
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
When printing vector element for example:
printf ("%s %016llx,%016llx\n", prefix, val[1], val[0]);
where val is a vector uns
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
Created attachment 44147
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44147&action=edit
compile test case for vec_popcntd.
Altivec.h should define either the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83402
--- Comment #9 from Steven Munroe ---
I suggested fixing the emmintrin.h source for both eventually ...
If you only fix AT11 then sometime later someone will discover the difference and
try to fix it. And likely break it again.
So fix AT immediately
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83402
--- Comment #7 from Steven Munroe ---
Ok it could be that compiler behavior changed.
You were testing gcc-trunk?
Please try the same test with AT11 gcc7. I know I hit this!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83402
--- Comment #5 from Steven Munroe ---
You need to look at the generated asm code and see what the compiler is doing.
Basically it should be generating a vspltisw vr,si for vec_splat_s32.
But if the immediate signed int (si) is greater than 15,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83964
--- Comment #18 from Steven Munroe ---
(In reply to jos...@codesourcery.com from comment #17)
> And, when long is 64-bit, there is no corresponding standard function to
> round to 32-bit integer with "invalid" raised for out-of-range results -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83964
--- Comment #13 from Steven Munroe ---
WTF, which part of the requirement did you not understand?
You should implement the direct moves (to GPRs) to complete the
__builtin_fctid and __builtin_fctiw implementation.
But to just remove them is mis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83964
--- Comment #11 from Steven Munroe ---
The requirement was to reduce the use of (in-line) assembler in libraries. Asm
is error-prone in light of 32/64-bit ABI differences, and the compiler
(usually) generates the correct code for the target.
Flo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84266
--- Comment #10 from Steven Munroe ---
Change this to RESOLVED state now?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84266
--- Comment #9 from Steven Munroe ---
Author: munroesj
Date: Sun Feb 11 21:45:39 2018
New Revision: 257571
URL: https://gcc.gnu.org/viewcvs?rev=257571&root=gcc&view=rev
Log:
Fix PR 84266
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84266
--- Comment #7 from Steven Munroe ---
Created attachment 43388
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43388&action=edit
correct mmintrin.h for power9
2018-02-09 Steven Munroe
* config/rs6000/mmintrin.h (_mm_cmpeq_pi32 [
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84266
Steven Munroe changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84266
--- Comment #4 from Steven Munroe ---
Yup this looks like a pasteo from the pi16 implementation which was not caught
as P9 was rare at the time.
The #if _ARCH_PWR9 clause is an optimization based on better timing for P9 (vs
P8) for GPR <-> VSR t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83402
--- Comment #1 from Steven Munroe ---
Similarly for _mm_slli_epi64 for any const value > 15 and < 32. So:
if (__builtin_constant_p(__B))
{
if (__B < 32)
lshift = (__v2du) vec_splat_s32(__B);
els
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
The rs6000/emmintrin.h implementation of _mm_slli_epi32 reports:
error: argument 1 must be a 5-bit signed literal
For
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81539
Steven Munroe changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
Assignee: unassigned at gcc dot gnu.org
Reporter: munroesj at gcc dot gnu.org
Target Milestone: ---
The online GCC documentation mentions psABI throughout the document, but
section "3.8 Options to Request or Suppress Warnings" does not describe or even
mention -Wno-psabi.
This