[Bug c/92436] New: SIMD integer subtract with constant always becomes add

2019-11-09 Thread zingaburga+gcc at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92436

Bug ID: 92436
   Summary: SIMD integer subtract with constant always becomes add
   Product: gcc
   Version: 9.2.0
    Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zingaburga+gcc at hotmail dot com
  Target Milestone: ---

Firstly, this isn't a bug but rather a missed optimization opportunity (I
presume this is the right place to report these?).


With the optimizer enabled, it seems like SIMD integer subtract, with a
constant, always gets turned into a SIMD integer add with the constant negated.

For a target like x86 SSE, I suppose this may make sense, as the commutative
property of addition gives more flexibility around register placement, but it
isn't always beneficial - for example, if the constant could be re-used
elsewhere.

Example (x86):

_mm_or_si128(
    _mm_sub_epi8(a, _mm_set1_epi8(99)),
    _mm_set1_epi8(99)
);

In this case, the '99' constant can be used in both the subtract and or, but
GCC will always convert the first use to a '-99' constant, meaning that it now
has to deal with two constants: https://godbolt.org/z/gaKAkA

This can have a greater effect when the constants are held in registers, as the
negated constant wastes a register, which can sometimes cause otherwise
unnecessary register spilling elsewhere.
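
To illustrate why the rewrite is legal but costs an extra constant, here is a
minimal scalar sketch of a single 8-bit lane. It is not taken from the report,
and the function names are made up for illustration. Because 8-bit arithmetic
wraps modulo 256, `a - 99` and `a + (-99)` give the same result, but once the
OR is included the add form needs both 99 and its negation (157 as an unsigned
byte):

```c
#include <assert.h>
#include <stdint.h>

/* Subtract form on one 8-bit lane: (a - c) | c. Only the constant c is needed. */
uint8_t sub_then_or(uint8_t a, uint8_t c) {
    return (uint8_t)((uint8_t)(a - c) | c);
}

/* Add form on one 8-bit lane: (a + (-c)) | c. Needs c and its negation. */
uint8_t add_neg_then_or(uint8_t a, uint8_t c) {
    uint8_t neg_c = (uint8_t)(0u - c);   /* second constant, e.g. 157 for c = 99 */
    return (uint8_t)((uint8_t)(a + neg_c) | c);
}

/* Returns 1 if both forms agree for every possible lane value a. */
int forms_agree(uint8_t c) {
    for (unsigned a = 0; a < 256; ++a) {
        if (sub_then_or((uint8_t)a, c) != add_neg_then_or((uint8_t)a, c))
            return 0;
    }
    return 1;
}

/* Small self-check for the constant used in the report. */
void check_forms_agree(void) {
    assert(forms_agree(99));
    assert(sub_then_or(100, 99) == 99);  /* 100 - 99 = 1; 1 | 99 = 99 */
}
```

The two forms are interchangeable in value, so the choice between them only
matters for how many distinct constants have to be materialised.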

The behavior persists with AVX enabled, and I've even seen it for ARM NEON:
https://godbolt.org/z/z3b5mq

---

Perhaps a different issue, but maybe related: I noticed that when the order of
the subtract's arguments is switched, GCC seems to treat the two constants as
different, even though they are identical: https://godbolt.org/z/6fGhGd

For this second example ((99-a)|99), I'd have thought the more appropriate
assembly to be something like:

vmovdqa xmm1, XMMWORD PTR .LC0[rip]
vpsubb  xmm0, xmm1, xmm0
vpor    xmm0, xmm0, xmm1
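
The source for this second case is only available behind the link, so the
following is a reconstruction from the description above rather than a copy of
it. It assumes an x86 target with SSE2, and the helper names are hypothetical.
It shows that a single 99 vector can feed both the subtract and the OR:

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* (99 - a) | 99 per 8-bit lane; the same constant feeds both operations. */
__m128i rsub_then_or(__m128i a) {
    const __m128i c = _mm_set1_epi8(99);
    return _mm_or_si128(_mm_sub_epi8(c, a), c);
}

/* Returns 1 when all 16 byte lanes of x and y are equal. */
int all_lanes_equal(__m128i x, __m128i y) {
    return _mm_movemask_epi8(_mm_cmpeq_epi8(x, y)) == 0xFFFF;
}

/* Small self-check of the lane arithmetic. */
void check_rsub_then_or(void) {
    /* 99 - 1 = 98 (0x62); 0x62 | 0x63 = 0x63 */
    assert(all_lanes_equal(rsub_then_or(_mm_set1_epi8(1)), _mm_set1_epi8(0x63)));
    /* 99 - 100 wraps to 0xFF; 0xFF | 0x63 = 0xFF */
    assert(all_lanes_equal(rsub_then_or(_mm_set1_epi8(100)), _mm_set1_epi8(-1)));
}
```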

[Bug target/92437] New: Unnecessary register duplication of vector constant in x86

2019-11-10 Thread zingaburga+gcc at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92437

Bug ID: 92437
   Summary: Unnecessary register duplication of vector constant in
            x86
   Product: gcc
   Version: 9.2.0
    Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zingaburga+gcc at hotmail dot com
  Target Milestone: ---

Consider the following code example:

#include <immintrin.h>
void fn(__m128i* in, __m128i* out) {
    int i=0;
    const __m128i num = _mm_set1_epi8(99);
    while(i<100) {
        __m128i a = in[i];
        __m128i b = _mm_add_epi8(a, num);
        if(_mm_movemask_epi8(b))
            a = _mm_or_si128(a, num);
        if(_mm_movemask_epi8(a))
            a = _mm_or_si128(a, num);
        out[i] = a;
        i++;
    }
}

The vector `num` is referenced 3 times in the loop, and GCC seems to load it
into 3 separate registers, when 1 would suffice: https://godbolt.org/z/mP22ez
(in this link, the `99` vector is held in xmm2, xmm3 and xmm4).

This seems to be the case regardless of AVX being enabled or not.

I'm not sure what causes this, but the `if` conditions seem to be necessary to
trigger the effect.

[Bug target/92437] Unnecessary register duplication of vector constant in x86

2019-11-10 Thread zingaburga+gcc at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92437

--- Comment #2 from zingaburga+gcc at hotmail dot com ---
Thanks for the info Andrew!
Changing the add to `_mm_add_epi64` does seem to eliminate all instances of the
duplication.
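
For reference, a sketch of the variant described here: the function from the
original report with only the add intrinsic swapped. The name `fn_add64` and
the helper `run_with_fill` are hypothetical, the loop is written as a `for`
for brevity, and an x86 target with SSE2 is assumed. Note that a 64-bit add is
not equivalent to an 8-bit add in general (carries can cross byte
boundaries), so this is a diagnostic experiment rather than a drop-in
replacement:

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Same structure as fn() in the report; only the add intrinsic differs. */
void fn_add64(const __m128i *in, __m128i *out) {
    const __m128i num = _mm_set1_epi8(99);
    for (int i = 0; i < 100; i++) {
        __m128i a = in[i];
        __m128i b = _mm_add_epi64(a, num);  /* was _mm_add_epi8 */
        if (_mm_movemask_epi8(b))
            a = _mm_or_si128(a, num);
        if (_mm_movemask_epi8(a))
            a = _mm_or_si128(a, num);
        out[i] = a;
    }
}

/* Hypothetical helper: fill the input with one byte value, run fn_add64,
   and return the lowest byte of the first output vector. */
unsigned run_with_fill(unsigned char byte) {
    __m128i in[100], out[100];
    for (int i = 0; i < 100; i++)
        in[i] = _mm_set1_epi8((char)byte);
    fn_add64(in, out);
    return (unsigned)_mm_cvtsi128_si32(out[0]) & 0xFFu;
}

/* Small self-check of the function's behaviour. */
void check_fn_add64(void) {
    assert(run_with_fill(0x00) == 0x00);  /* no sign bits set, input unchanged */
    assert(run_with_fill(0x80) == 0xE3);  /* 0x80 | 0x63 = 0xE3 */
}
```

Comparing the generated assembly of this variant against the original is one
way to check whether the duplicated registers disappear on a given GCC
version.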

[Bug target/114069] New: Type punning RISC-V vectors causes ICE at -O1

2024-02-22 Thread zingaburga+gcc at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114069

Bug ID: 114069
   Summary: Type punning RISC-V vectors causes ICE at -O1
   Product: gcc
   Version: 13.2.0
    Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zingaburga+gcc at hotmail dot com
  Target Milestone: ---

Type punning a RISC-V vector causes ICE under RV64 GCC 13.x/trunk:
https://godbolt.org/z/sajcb3T7z

It seems to work at -O0 rather than -O1 on GCC 13.x.


Code:

#include <riscv_vector.h>

vbool8_t f(vuint8m1_t s) {
    // unavailable in GCC 13, available in trunk
    //return __riscv_vreinterpret_v_u8m1_b8(s);

    // causes ICE in GCC 13 + trunk
    return *reinterpret_cast<vbool8_t*>(&s);

    // this seems to work without issue
    vuint8mf8_t f = __riscv_vlmul_trunc_v_u8m1_u8mf8(s);
    return *reinterpret_cast<vbool8_t*>(&f);
}


Compiler options: -march=rv64gcv -O1


Output:

during RTL pass: expand
<source>: In function 'vbool8_t f(vuint8m1_t)':
<source>:8:47: internal compiler error: in convert_move, at expr.cc:219
8 | return *reinterpret_cast<vbool8_t*>(&s);
  |   ^
0x7fb7d5029e3f __libc_start_main
???:0
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.
Compiler returned: 1