Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
unsigned long long add(unsigned long long a, unsigned long long b, unsigned
long long *ovf)
{
return
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
#include
#include
For sizes up to 16 bytes, consider using vector instructions for memcmp.
This is not required for 1, 2, 4, or 8 bytes, but it is for the remaining sizes.
For general
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
I found no way to efficiently check the fp data class on z using the wftcidb (z13) and
wftcisb (z14) instructions.
For PowerPC scalar_test_data_class exists and provides
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
I want to use the z14 vlbr instruction, but I found no builtin for it.
The assembler reports an "unknown" mnemonic for vlbr, but I see the instruction in
the "
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
bool parity(unsigned long long l)
{
return __builtin_parityll(l);
}
bool parity2(unsigned long long l)
{
return
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
bool parityll(unsigned long long x)
{
return __builtin_parityll(x);
}
Code generation for z15 and above is opti
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
void lshift1(unsigned long long *a)
{
a[0] <<= 1;
a[1] <<= 1;
}
Output:
lshift1(unsigned long long*):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119468
--- Comment #2 from Jens Seifert ---
popcnt + parity is slower than just a
64-bit popcount followed by extracting the last bit.
The "missed-optimization" opportunity applies to big endian as well.
Optimal code:
popcntd 3, 3
clrldi 3, 3, 63
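The two-instruction sequence above corresponds to taking the low bit of a 64-bit population count. A minimal C sketch of that idea follows; the function name `parity_via_popcount` is hypothetical, and whether a given GCC version actually emits popcntd + clrldi for it depends on the compiler version and target flags.

```c
#include <stdbool.h>

/* Parity of a 64-bit value computed as the low bit of its population
   count. The name parity_via_popcount is illustrative only. */
static inline bool parity_via_popcount(unsigned long long x)
{
    /* __builtin_popcountll counts the set bits; an odd count means
       odd parity, so only bit 0 of the count is needed. */
    return (__builtin_popcountll(x) & 1) != 0;
}
```

The result should agree with `__builtin_parityll` for every input, which makes the two easy to compare in generated assembly.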
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119468
--- Comment #4 from Jens Seifert ---
clang is emitting extended mnemonics.
On GCC, I can only enforce this by using inline assembly:
unsigned long long parityfast(unsigned long long in)
{
__asm__("popcntd %0,%1":"+r"(in));
return in & 1
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jens.seifert at de dot ibm.com
Target Milestone: ---
Shifts by -1 should be performed with a 0xFF..FF constant, because
PPC has modulo shifts and generating the 0xFF..FF constant requires just 1
instruction.
On Power9 always use
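A minimal sketch of the modulo-shift behaviour the report relies on, assuming it refers to shifts that use only the low-order bits of the count. The helper name `shl_modulo32` and the 32-bit width are illustrative and not taken from the report. Portable C leaves out-of-range shift counts undefined, so the masking is written out explicitly here.

```c
#include <stdint.h>

/* Models a shift whose count is taken modulo the element width (32).
   shl_modulo32 is a hypothetical helper for illustration only. */
static inline uint32_t shl_modulo32(uint32_t x, uint32_t count)
{
    /* Only the low 5 bits of the count are used, so an all-ones
       count (0xFFFFFFFF, i.e. -1) behaves like a shift by 31. */
    return x << (count & 31u);
}
```

Under this model an all-ones count and a count of 31 give the same result, so the all-ones constant can stand in for the exact shift amount.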