https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93449

--- Comment #4 from Jens Seifert <jens.seifert at de dot ibm.com> ---
Power8 has bcdadd, which can only be combined with _Decimal128 if there is some
kind of conversion between BCDs stored in a vector register and _Decimal128.

On Power9 vec_load_len/vec_store_len can be used to load variable-length BCDs.
On Power7/8 I can load variable-length BCDs as well (with more instructions),
but overall it is desirable to be able to convert a vector to _Decimal128 and
vice versa.

I suppose I can survive with inline assembly like below. The assembly works for
p7-p9 with optimal speed.

An inlined memcpy between a vector and _Decimal128 is not optimal for
-mcpu=power7 through power9: it always produces a store followed by a load
(lacking XNOP), which ends up in a load-hit-store issue.
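
For reference, the memcpy approach mentioned above might look roughly like the
sketch below. This is not the inline assembly referred to earlier. It assumes
GCC with decimal floating-point support and the generic vector_size extension;
the names bcd_vec, vec_to_d128 and d128_to_vec are hypothetical and chosen only
for illustration. Whether the compiler turns the copy into a store/load pair
depends on the target and the compiler version.

```c
#include <string.h>

/* Hypothetical 16-byte vector type (GCC generic vector extension). */
typedef unsigned char bcd_vec __attribute__((vector_size(16)));

/* Both types must have the same size for the bit copy to be valid. */
_Static_assert(sizeof(bcd_vec) == sizeof(_Decimal128),
               "vector and _Decimal128 must both be 16 bytes");

/* Reinterpret the 16 bytes of a vector as a _Decimal128 (no value conversion). */
static inline _Decimal128 vec_to_d128(bcd_vec v)
{
    _Decimal128 d;
    memcpy(&d, &v, sizeof d);
    return d;
}

/* Reinterpret the 16 bytes of a _Decimal128 as a vector (no value conversion). */
static inline bcd_vec d128_to_vec(_Decimal128 d)
{
    bcd_vec v;
    memcpy(&v, &d, sizeof v);
    return v;
}
```

Both helpers only move bits between the two types; they do not convert between
BCD and the decimal floating-point encoding.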
