https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69908

--- Comment #7 from Yuri Gribov <tetra2005 at gmail dot com> ---
(In reply to Marc Glisse from comment #6)
> (In reply to Yuri Gribov from comment #5)
> > Well, as we all know there are a lot of missing optimizations in GCC :) I
> > think the real question is whether it's ever going to be fixed if there's no
> > standard API for this code pattern which we can recognize as builtin.
> > 
> > I believe the answer is "No". ATM GCC does not vectorize even the simplest
> > memcpy equivalent code:
> >   // gcc tmp.c -O3 -mtune=native -ftree-vectorize -o- -S
> >   void memcpy_(char * __restrict a, char * __restrict b, unsigned n) {
> >     unsigned i;
> >     for (i = 0; i < n; ++i)
> >       a[i] = b[i];
> >   }
> 
> Please look again. ldist turns this into a call to memcpy. And if you
> disable ldist, it does get vectorized.

Hm, I've just tried r249806 with both -ftree-loop-distribution and
-fno-tree-loop-distribution on top of the flags above, and the output did not
change. This may depend on revision, flags, or machine. Which ones did you use?
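
A minimal sketch of how the comparison above could be reproduced, in case it
helps pin down the difference. The loop is the one quoted earlier; the output
file names with.s and without.s are hypothetical. One assumption that is not
confirmed anywhere in this thread: the conversion of such loops into a memcpy
call may be governed by -ftree-loop-distribute-patterns rather than by
-ftree-loop-distribution, so the commands in the comment toggle that flag
instead.

```c
/* tmp.c: the memcpy-equivalent loop from the quoted comment.
 *
 * Assumed comparison (output file names are illustrative only):
 *
 *   gcc tmp.c -O3 -ftree-loop-distribute-patterns    -S -o with.s
 *   gcc tmp.c -O3 -fno-tree-loop-distribute-patterns -S -o without.s
 *   grep -c memcpy with.s without.s
 *
 * A non-zero count for with.s would suggest the loop was replaced by a
 * library call; vector instructions in without.s would suggest it was
 * vectorized instead. Results may differ by revision, target, and flags.
 */
void memcpy_(char * __restrict a, char * __restrict b, unsigned n) {
  unsigned i;
  /* Byte-by-byte copy; __restrict promises that a and b do not overlap. */
  for (i = 0; i < n; ++i)
    a[i] = b[i];
}
```

Only the flag pair differs between the two commands, so any difference between
the two assembly files should come from pattern distribution alone.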
