https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447
--- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
I could be mistaken, but VNx4QI is a partial vector, so every QI element occupies 32 bits (so we'd use a widening load here). I'm not sure this operation is valid for partial vectors, as it means you're taking a subreg of an unpacked representation. Again, I could be wrong, but I assume you can't do it for partial vectors because you need the zero extends on the elements.
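
To make the layout concern concrete, here is a minimal C model of the situation. It is a sketch, not GCC or SVE code. It assumes a single 128-bit granule holding four 32-bit containers, little-endian byte order within each container, and a widening load that zero-extends each byte into its container. The helper names (load_unpacked, naive_low_bytes, extract_elements) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed model: 4 QI elements, each in its own 32-bit container. */
enum { NELTS = 4, CONTAINER_BYTES = 4, REG_BYTES = NELTS * CONTAINER_BYTES };

/* Model of a widening byte load: element i lands in the low byte of
   container i, and the other three bytes of that container are zeroed. */
static void load_unpacked(uint8_t reg[REG_BYTES], const uint8_t src[NELTS])
{
    assert(reg && src);
    memset(reg, 0, REG_BYTES);
    for (int i = 0; i < NELTS; i++)
        reg[i * CONTAINER_BYTES] = src[i];
}

/* Naive "subreg"-style view: treat the first NELTS bytes of the register
   as if the elements were packed contiguously. */
static void naive_low_bytes(uint8_t out[NELTS], const uint8_t reg[REG_BYTES])
{
    memcpy(out, reg, NELTS);
}

/* Layout-aware extraction: read element i from the low byte of container i. */
static void extract_elements(uint8_t out[NELTS], const uint8_t reg[REG_BYTES])
{
    for (int i = 0; i < NELTS; i++)
        out[i] = reg[i * CONTAINER_BYTES];
}
```

In this model, loading the bytes 0x11, 0x22, 0x33, 0x44 and then taking the naive view yields element 0 followed by the three extension bytes of its container, not the four original elements. Only the container-aware extraction recovers them, which is the reason a plain subreg of the unpacked representation looks suspect.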