https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118362
--- Comment #1 from Stefan Schulze Frielinghaus <stefansf at gcc dot gnu.org> ---
Yikes, the optimization should only apply to constant vectors which are
supported by the hardware. That means vectors of up to 16 bytes. For
s390_constant_via_vgm_p() and s390_constant_via_vrepi_p() we utilize
s390_constant_via_vgm_vrepi_1() and bail out for any constant vector larger
than 16 bytes. I missed this check in s390_constant_via_vgbm_p(). Something
along the lines of

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 918a2cd6c6d..08acb69de3e 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -2818,7 +2818,7 @@ s390_constant_via_vgbm_p (rtx op, unsigned *mask)
   unsigned tmp_mask = 0;
   int nunit, unit_size;

-  if (GET_CODE (op) == CONST_VECTOR)
+  if (GET_CODE (op) == CONST_VECTOR && GET_MODE_SIZE (GET_MODE (op)) <= 16)
     {
       if (GET_MODE_INNER (GET_MODE (op)) == TImode
           || GET_MODE_INNER (GET_MODE (op)) == TFmode)

should fix it (maybe a power-of-two check should be added, too). I will
revisit this tomorrow. Sorry for the hassle.
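
For illustration only, the size guard combined with the power-of-two check
mentioned above could look roughly like the standalone sketch below. The
helper name vgbm_size_ok is made up for this example and is not part of
s390.cc; the real check would operate on GET_MODE_SIZE (GET_MODE (op))
rather than on a plain size_t.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of s390.cc: return true if a constant
   vector of SIZE bytes fits into a 16-byte vector register and SIZE is
   a power of two.  */
static bool
vgbm_size_ok (size_t size)
{
  /* Larger vectors are not supported by the hardware.  */
  bool fits = size <= 16;
  /* A nonzero value is a power of two iff exactly one bit is set.  */
  bool pow2 = size != 0 && (size & (size - 1)) == 0;
  return fits && pow2;
}
```

With this sketch, sizes 1, 2, 4, 8 and 16 are accepted, while 12 (not a
power of two) and 32 (too large) are rejected.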