https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230
            Bug ID: 93230
           Summary: PowerPC GCC vec_extract of a vector in memory does not
                    fold sign/zero extension into load
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

Created attachment 47634
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47634&action=edit
Example code

While working on some bugs and extensions for -mcpu=future, I noticed that
the code generated for vec_extract is not optimal when extracting an
8/16/32-bit integer from a vector in memory.

In this case, we convert the vec_extract into a load of the scalar value, but
we don't have the proper combine insns to fold the sign extension or zero
extension into the load, so we have to issue a separate conversion
instruction even though the load itself can already extend the value.

For example, consider:

        #include <altivec.h>

        unsigned long
        v8hi_uns_1 (vector unsigned short *p)
        {
          return (unsigned long) vec_extract (*p, 1);
        }

        long
        v8hi_sign_1 (vector short *p)
        {
          return (long) vec_extract (*p, 1);
        }

It generates:

        v8hi_uns_1:
                lhz 3,2(3)
                rlwinm 3,3,0,0xffff
                blr

        v8hi_sign_1:
                lhz 3,2(3)
                extsh 3,3
                blr

It should generate:

        v8hi_uns_1:
                lhz 3,2(3)
                blr

        v8hi_sign_1:
                lha 3,2(3)
                blr
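
As a rough illustration of why the extra instruction is redundant, the
following portable C sketch models the extraction as a plain halfword load at
byte offset 2 * index, with the extension implied by the type of the loaded
value. This is what a zero-extending halfword load (lhz) or a sign-extending
halfword load (lha) does in a single instruction on PowerPC. The type
v8hi_model and the functions model_extract_uns and model_extract_sign are
hypothetical names invented for this sketch; they are not part of altivec.h
or of GCC. The sketch also assumes that the index wraps modulo the element
count.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector of eight 16-bit elements. */
typedef struct { uint16_t elem[8]; } v8hi_model;

/* Zero-extending element load: one halfword read, then widening of an
   unsigned value.  Models the effect of a single lhz.  */
unsigned long
model_extract_uns (const v8hi_model *p, unsigned idx)
{
  uint16_t h;
  /* Assumption: the index wraps modulo the number of elements.  */
  memcpy (&h, (const unsigned char *) p + (idx % 8) * sizeof h, sizeof h);
  return (unsigned long) h;
}

/* Sign-extending element load: the same halfword read, but the value is
   interpreted as signed before widening.  Models the effect of a single
   lha.  */
long
model_extract_sign (const v8hi_model *p, unsigned idx)
{
  int16_t h;
  memcpy (&h, (const unsigned char *) p + (idx % 8) * sizeof h, sizeof h);
  return (long) h;
}
```

In this model, an element holding the bit pattern 0xFFFE is returned as 65534
by model_extract_uns and as -2 by model_extract_sign, with no separate
masking or extension step after the load.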