https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114801

--- Comment #33 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That is still a hack, but I guess it can be acceptable for 14.2 and short term
trunk if the ARM maintainers approve it.
But, for GCC 15+, I think if the behavior is that when the predicate
constant/register is used in an instruction, regardless of the element mode it
actually performs per-byte predication, then it should be represented as
V16BImode, not V8BImode or V4BImode.
It is fine if instructions which produce the predicate mask, like comparisons,
produce V8BImode or V4BImode, but whatever consumes it should use a subreg of
that to V16BImode.
At least, that is the case if the behavior is either to perform the operation
on all elements and then, based on the 16 bits in the predicate, choose byte by
byte between the newly computed result and something else, or perhaps to
perform the operation only on elements where at least one predicate bit for
the element is non-zero and then merge.
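
To make the two possible behaviors above concrete, here is a rough C model.
It is only a sketch under assumptions, not a description of what the hardware
or GCC actually does: the vec128 type, both function names, the choice of a
32-bit add as the operation, and the "inactive" fallback vector are all
hypothetical. It only illustrates why a consumer would see the predicate as
16 per-byte bits in either case.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a 128-bit vector as 16 bytes, with a 16-bit
   predicate holding one bit per byte (bit i guards byte i).  */
typedef struct { uint8_t b[16]; } vec128;

/* Behavior 1: compute all four 32-bit elements, then choose byte by
   byte between the new result and the inactive value.  */
static vec128
add32_select_bytes (vec128 x, vec128 y, vec128 inactive, uint16_t pred)
{
  vec128 full, out;
  for (int lane = 0; lane < 4; lane++)
    {
      uint32_t a, c, s;
      memcpy (&a, &x.b[lane * 4], 4);
      memcpy (&c, &y.b[lane * 4], 4);
      s = a + c;
      memcpy (&full.b[lane * 4], &s, 4);
    }
  for (int i = 0; i < 16; i++)
    out.b[i] = ((pred >> i) & 1) ? full.b[i] : inactive.b[i];
  return out;
}

/* Behavior 2: only compute elements with at least one predicate bit
   set, then merge the guarded bytes into the inactive value.  */
static vec128
add32_skip_lanes (vec128 x, vec128 y, vec128 inactive, uint16_t pred)
{
  vec128 out = inactive;
  for (int lane = 0; lane < 4; lane++)
    {
      unsigned lane_bits = (pred >> (lane * 4)) & 0xF;
      if (lane_bits == 0)
        continue;  /* No byte of this element is active.  */
      uint32_t a, c, s;
      uint8_t sb[4];
      memcpy (&a, &x.b[lane * 4], 4);
      memcpy (&c, &y.b[lane * 4], 4);
      s = a + c;
      memcpy (sb, &s, 4);
      for (int k = 0; k < 4; k++)
        if ((lane_bits >> k) & 1)
          out.b[lane * 4 + k] = sb[k];
    }
  return out;
}
```

For a side-effect-free operation like this add, both models give the same
bytes for any 16-bit predicate, including ones where only some of the 4 bits
of an element are set, which is the case V4BImode cannot express.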

I think it would be useful if you pointed at the docs describing exactly how
the instructions work, or tried to explain it here.
