Compiling the following testcase:

#define vector __attribute__((__vector_size__(16) ))
float fa[100] __attribute__ ((__aligned__(16)));
vector float foo ()
{
  float f = fa[0];
  vector float vf = {f, f, f, f};
  return vf;
}

...with gcc -O2 -maltivec, we get:

ld      r9,0(r2)
lfs     f0,0(r9)
addi    r9,r1,-16
stfs    f0,-16(r1)
lvewx   v2,r0,r9
vspltw  v2,v2,0
blr

My problem is with the {lfs,stfs,lvewx} sequence: we load a value into f0, and
then store it (with stfs) into an aligned memory location, so that it could be
loaded from there into a vector (with lvewx). However, since the address from
which f0 was loaded is known to be aligned, we could directly do an lvewx from
there, and avoid the extra {lfs,stfs}, so the following should be enough:

ld      r9,0(r2)
lvewx   v2,r0,r9
vspltw  v2,v2,0
blr
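
To make the alignment argument concrete, here is a small software model of the
two vector instructions involved. It is only a sketch: the names vreg,
model_lvewx, model_vspltw, lane_as_float and demo_fa are made up for
illustration, the poison value stands in for elements that lvewx leaves
undefined, and element numbering follows the big-endian AltiVec convention
(element 0 at the lowest address). It shows that the loaded word lands in
element 0, which is the element 'vspltw v2,v2,0' replicates, only when the
address is 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Software model of a 128-bit vector register seen as four 32-bit words. */
typedef struct { uint32_t w[4]; } vreg;

/* Model of lvewx: the low two address bits are ignored, and the word is
   placed in the element selected by the address's offset within its
   16-byte block.  The remaining elements are undefined on real hardware;
   a poison pattern marks them here (an assumption of this model). */
static vreg model_lvewx(const void *addr)
{
    uintptr_t ea = (uintptr_t)addr & ~(uintptr_t)3;
    unsigned elem = (unsigned)((ea >> 2) & 3);
    vreg v = { { 0xDEADBEEFu, 0xDEADBEEFu, 0xDEADBEEFu, 0xDEADBEEFu } };
    memcpy(&v.w[elem], (const void *)ea, sizeof(uint32_t));
    return v;
}

/* Model of vspltw: replicate word element idx into all four elements. */
static vreg model_vspltw(vreg v, unsigned idx)
{
    assert(idx < 4);
    vreg r = { { v.w[idx], v.w[idx], v.w[idx], v.w[idx] } };
    return r;
}

/* Read one element of the modelled register back as a float. */
static float lane_as_float(vreg v, unsigned i)
{
    float f;
    memcpy(&f, &v.w[i], sizeof f);
    return f;
}

/* Hypothetical 16-byte-aligned data standing in for fa[]. */
static _Alignas(16) float demo_fa[4] = { 1.5f, 2.5f, 3.5f, 4.5f };
```

In this model, model_vspltw(model_lvewx(&demo_fa[0]), 0) yields four copies of
demo_fa[0], whereas starting from &demo_fa[1] the value ends up in element 1
and a splat of element 0 replicates the undefined element instead.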

The problem is that rs6000_expand_vector_init doesn't know that f0
originates from an aligned address. It gets the following as vals:

(parallel:V4SF [
        (reg/v:SF 119 [ f ])
        (reg/v:SF 119 [ f ])
        (reg/v:SF 119 [ f ])
        (reg/v:SF 119 [ f ])
    ])

We somehow want to expand 'f = fa[0]' and '{f,f,f,f}' together: if
expand_vector_init could get '{fa[0],fa[0],fa[0],fa[0]}' as vals, it
could see that the original address is aligned.
Alternatively, we could try to get rid of the redundant load and store later
on with some kind of peephole optimization, but the prospects of that don't
seem very high to me. Thoughts?
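
For illustration, the two source forms can be written side by side. This is
only a sketch: foo_direct and the v4sf typedef are names introduced here, and
whether the second form would actually reach rs6000_expand_vector_init as
memory references is an assumption (the optimizers may well canonicalize both
forms to the same thing before expansion). It uses the generic GCC vector
extension rather than anything AltiVec-specific, so it only shows that the two
forms are semantically identical.

```c
#include <assert.h>

/* Same 16-byte vector type as in the testcase, spelled as a typedef. */
typedef float v4sf __attribute__((__vector_size__(16)));
static_assert(sizeof(v4sf) == 16, "v4sf should be a 16-byte vector");

float fa[100] __attribute__((__aligned__(16)));

/* Form from the testcase: the element goes through a scalar temporary,
   so the initializer only mentions 'f'. */
v4sf foo(void)
{
    float f = fa[0];
    v4sf vf = { f, f, f, f };
    return vf;
}

/* Hypothetical variant: the initializer names the aligned memory
   location directly, which is the shape of 'vals' wished for above. */
v4sf foo_direct(void)
{
    v4sf vf = { fa[0], fa[0], fa[0], fa[0] };
    return vf;
}
```

Both functions return the same vector for any value of fa[0]; the difference,
if any, would only be in what the expander gets to see.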

This may be related to PR31334 (though there the issue is about initialization
with constants, so I'm not sure if the idea for a solution proposed there would
help us here).


-- 
           Summary: bad codegen for vector initialization in Altivec
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dorit at il dot ibm dot com
 GCC build triplet: powerpc-linux
  GCC host triplet: powerpc-linux
GCC target triplet: powerpc-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32107
