https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93230

            Bug ID: 93230
           Summary: PowerPC GCC vec_extract of a vector in memory does not
                    fold sign/zero extension into load
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

Created attachment 47634
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47634&action=edit
Example code

While working on some bugs and extensions for -mcpu=future, I noticed that the
code generated for vec_extract is not optimal when extracting an 8-, 16-, or
32-bit integer from a vector in memory.  In this case, we convert the
vec_extract into a load of the scalar value, but we don't have the combine
insns needed to fold the sign or zero extension into the load, so we have to
issue a separate conversion instruction.

For example, consider:

    #include <altivec.h>

    unsigned long
    v8hi_uns_1 (vector unsigned short *p)
    {
      return (unsigned long) vec_extract (*p, 1);
    }

    long
    v8hi_sign_1 (vector short *p)
    {
      return (long) vec_extract (*p, 1);
    }

It generates:

    v8hi_uns_1:
        lhz 3,2(3)
        rlwinm 3,3,0,0xffff
        blr

    v8hi_sign_1:
        lhz 3,2(3)
        extsh 3,3
        blr

It should generate:

    v8hi_uns_1:
        lhz 3,2(3)
        blr

    v8hi_sign_1:
        lha 3,2(3)
        blr
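
For comparison, here is a rough sketch of the scalar equivalents of the two
functions above. The names scalar_uns_1 and scalar_sign_1 are hypothetical and
are not part of the attached example. Since the element is read as an ordinary
halfword, I would expect the extension to be folded into the load here (a
single lhz or lha), which is the code we would like vec_extract from memory to
match.

```c
/* Hypothetical scalar equivalents of the vec_extract cases above.
   The vector is viewed as an array of eight halfwords in memory.  */

unsigned long
scalar_uns_1 (const unsigned short *p)
{
  /* Element 1 as an unsigned halfword: zero extended to unsigned long.  */
  return (unsigned long) p[1];
}

long
scalar_sign_1 (const short *p)
{
  /* Element 1 as a signed halfword: sign extended to long.  */
  return (long) p[1];
}
```

The extension semantics are the same as in the vec_extract versions; only the
way the element is addressed differs.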
