https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122891

            Bug ID: 122891
           Summary: Exploit xxeval instruction for operations of the form
                    (or A, (and B, C))
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jeevitha at gcc dot gnu.org
  Target Milestone: ---
            Target: powerpc64le-linux-gnu

In the following testcase, we have an expression of the form (or A, (and B,
C)), which can be implemented using a single xxeval instruction instead of
multiple logical operations. 

Testcase:

#include <altivec.h>

typedef __vector unsigned int vec_u32_t;

vec_u32_t test1(vec_u32_t a, vec_u32_t b)
{
    vec_u32_t t0 = vec_and(a, vec_splats((unsigned int)0x003f));
    vec_u32_t t1 = vec_and(b, vec_splats((unsigned int)0x3f00));
    return vec_or(t0, t1);
}

Currently, GCC generates the following for Power10:
test1(unsigned int __vector(4), unsigned int __vector(4)):
        xxspltiw 0,16128
        xxspltiw 32,63
        xxlor 33,34,34
        xxland 35,35,0
        vand 2,0,1
        vor 2,2,3

Instead, it could emit the following sequence:
        xxspltiw %vs1, 16128
        xxspltiw %vs0, 63
        xxland  %vs1, %vs35, %vs1
        xxeval  %vs34, %vs1, %vs34, %vs0, 31
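
For reference, the choice of immediate 31 can be checked with a small scalar model. This is only a sketch, assuming the usual truth-table reading of xxeval: at each bit position the three inputs A, B, C form a 3-bit index, and index i selects bit i of the immediate, counting from the most significant bit. The helpers xxeval_model, imm31_is_or_and and test1_model are hypothetical names used for illustration; they are not part of GCC or altivec.h.

```c
#include <stdint.h>

/* Scalar model of xxeval on a single 32-bit lane (assumed semantics:
   index = A:B:C per bit, and index i picks IMM bit i counted from the
   most significant bit of the 8-bit immediate).  */
static uint32_t xxeval_model(uint32_t a, uint32_t b, uint32_t c, unsigned imm)
{
    uint32_t result = 0;
    for (int bit = 0; bit < 32; bit++) {
        unsigned idx = (((a >> bit) & 1u) << 2)
                     | (((b >> bit) & 1u) << 1)
                     |  ((c >> bit) & 1u);
        uint32_t out = (imm >> (7 - idx)) & 1u;
        result |= out << bit;
    }
    return result;
}

/* Returns 1 if immediate 31 agrees with a | (b & c) on all eight
   input combinations, 0 otherwise.  */
static int imm31_is_or_and(void)
{
    for (unsigned idx = 0; idx < 8; idx++) {
        uint32_t a = (idx & 4u) ? 0xFFFFFFFFu : 0u;
        uint32_t b = (idx & 2u) ? 0xFFFFFFFFu : 0u;
        uint32_t c = (idx & 1u) ? 0xFFFFFFFFu : 0u;
        if (xxeval_model(a, b, c, 31) != (a | (b & c)))
            return 0;
    }
    return 1;
}

/* One lane of the proposed sequence: the xxland result is operand A,
   the first argument is operand B, and the 0x3f splat is operand C.  */
static uint32_t test1_model(uint32_t a, uint32_t b)
{
    uint32_t t1 = b & 0x3f00u;                 /* xxland */
    return xxeval_model(t1, a, 0x3fu, 31);     /* xxeval ..., 31 */
}
```

Under that assumption, test1_model(a, b) equals (a & 0x3f) | (b & 0x3f00) for any inputs, which is what the testcase computes per element.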
