https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447
ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2025-05-27 00:00:00         |2025-05-30
                 CC|                            |ktkachov at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org

--- Comment #4 from ktkachov at gcc dot gnu.org ---
The crash through aarch64_emit_load_store_through_mode happens when doing a
move from:

  (mem/u/c:VNx4QI (reg/f:DI 119) [0 S[4, 4] A32])

to:

  (reg:VNx4QI 115)

through a QImode subreg.  In aarch64_emit_load_store_through_mode we reach:

  if (MEM_P (src))
    {
      rtx tmp = force_reg (new_mode, adjust_address (src, new_mode, 0));
      tmp = force_lowpart_subreg (int_mode, tmp, new_mode);
      emit_move_insn (dest, force_lowpart_subreg (mode, tmp, int_mode));
    }

The line:

  tmp = force_lowpart_subreg (int_mode, tmp, new_mode);

returns NULL_RTX, which then crashes emit_move_insn.

From what I can tell, the validate_subreg check in the eventual callee of the
force_lowpart_subreg chain returns false when validating a VNx4QI subreg of a
reg:QI.  This is because the [4, 4] size of VNx4QI is not ordered_p with
respect to REGMODE_NATURAL_SIZE (QImode), which is 8.

I guess we could do such checks in aarch64_expand_maskloadstore before calling
aarch64_emit_load_store_through_mode, but we should be able to make this
optimisation for QImode, so is there perhaps a way to allow this?
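
To make the ordered_p failure concrete, here is my reading of the failing
comparison as a small (untested) fragment against poly-int.h.  The [4, 4]
size of VNx4QI means 4 + 4*x bytes for the runtime VL factor x, which is
below 8 at x = 0 but above it from x = 2 on, so no single ordering holds:

  /* Size of VNx4QI: 4 + 4*x bytes for unknown runtime factor x.  */
  poly_int64 vnx4qi_size (4, 4);
  /* REGMODE_NATURAL_SIZE (QImode) on aarch64.  */
  poly_int64 natural_size (8);
  /* 4 < 8 at x = 0 but 12 > 8 at x = 2, so neither known_le nor
     known_ge holds for all x, and ordered_p returns false.  */
  gcc_assert (!ordered_p (vnx4qi_size, natural_size));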
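
If we did gate it in aarch64_expand_maskloadstore, I'd imagine something
along these lines (hypothetical and untested; the helper name is made up,
and mode/int_mode stand for the modes that
aarch64_emit_load_store_through_mode would pick):

  /* Hypothetical helper: true if the lowpart subreg between MODE and
     INT_MODE would pass validate_subreg's ordered_p check, i.e. if the
     vector size has a fixed order against the natural register size of
     the integer mode.  */
  static bool
  aarch64_load_store_through_mode_ok_p (machine_mode mode,
                                        machine_mode int_mode)
  {
    return ordered_p (GET_MODE_SIZE (mode),
                      REGMODE_NATURAL_SIZE (int_mode));
  }

But a guard like that would also turn the optimisation off for exactly the
QImode case we want to keep, hence the question above.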