Hi,

> I've been tinkering with the autovectorizer.  It's really cool.
> I particularly like the realignment support.
>
> I've noticed just a few things while tinkering with it (in 4.1.1):
>

thanks a lot for your comments!

>
>
> 1) The definition of the realignment instruction doesn't match hardware for
> instruction sets like ARM WMMX, where aligned amounts shift by 0 bytes
> instead of VECSIZE bytes.  This makes it useless for vector realignment,
> because in the case that the pointer happens to be aligned, we get the
> wrong vector.  Looks like the SPARC realignment hook does the same thing...
> Indeed, it looks like Altivec is the only one to support it, and they do
> some trickery with shifting the wrong (against endianness) way based on the
> two's complement of the source (a very clever trick).  No other machine
> (evidently) can easily meet the description of the current realignment
> mechanism.
>
> Of course, for safety reasons I guess we don't always get the next vector
> (the one at address floor(ptr+VECSIZE)), which would allow us to use the
> shift-style instructions.
>
> So, there may be a few options:
>
(1)
> * Have a flag or hook where we can say it is always OK to read the next
>         element.  This is probably a bad option; everyone who used the
>         vectorizer would have to know that they may need to pad their
>         arrays if they are in a protected memory environment.
>
(2)
> * Conditionally fetch the next bundle, and don't do the fetch of the
>         next data the last time around if it might not be safe.  Probably
>         a bad idea for architectures without conditional execution.
>
(3)
> * Currently we drop out of the loop when there are VEC_ELEMENTS - 1
>         iterations or fewer.  We could drop out when there are VEC_ELEMENTS
>         or fewer, and then we could always fetch the next aligned data.
>
(4)
> * Some other clever trick I don't know about. :-)
>
(5)
> * Or keep it the way it is, and leave out the machines that have the
>         shift-by-zero instead of the shift-by-VECSIZE behavior for
>         an aligned pointer.
>

We should probably implement (2) and (3) (and (4)... :-)) and let each
target choose which alternative to take.
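
To make alternative (3) concrete, here is a rough sketch in plain C. It is
only a model of the idea, not vectorizer or target code: VECSIZE,
realign_shift, copy_realign and check_copy are all made-up names, elements
are single bytes (so VEC_ELEMENTS equals VECSIZE), and memory is modelled as
a byte array whose offset 0 is assumed to be vector-aligned. realign_shift
models the shift-by-0 style instruction, where an aligned pointer selects
the first of the two input vectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VECSIZE = 8 };   /* hypothetical vector width; 1-byte elements */

/* Shift-style realign: VECSIZE bytes starting `shift` bytes into lo:hi.
   An aligned pointer gives shift == 0 and therefore selects lo.  */
static void realign_shift(const uint8_t *lo, const uint8_t *hi,
                          size_t shift, uint8_t *out)
{
    assert(shift < VECSIZE);
    memcpy(out, lo + shift, VECSIZE - shift);
    memcpy(out + (VECSIZE - shift), hi, shift);
}

/* Copy n > 0 bytes starting at offset off of mem into dst.  mem stands in
   for memory whose offset 0 is VECSIZE-aligned.  The start offset of the
   highest aligned vector that was loaded is stored in *last_vec.  */
static void copy_realign(const uint8_t *mem, size_t off, size_t n,
                         uint8_t *dst, size_t *last_vec)
{
    size_t shift = off % VECSIZE;
    size_t base = off - shift;              /* floor(off) */
    size_t i = 0;
    uint8_t lo[VECSIZE], hi[VECSIZE];

    *last_vec = base;
    memcpy(lo, mem + base, VECSIZE);        /* first aligned load */

    /* Alternative (3): leave the loop once VECSIZE or fewer elements remain,
       so the "next" aligned vector always overlaps the array.  */
    while (n - i > VECSIZE) {
        size_t next = base + i + VECSIZE;   /* floor(p + VECSIZE) */
        memcpy(hi, mem + next, VECSIZE);
        *last_vec = next;
        realign_shift(lo, hi, shift, dst + i);
        memcpy(lo, hi, VECSIZE);            /* hi becomes the next lo */
        i += VECSIZE;
    }
    for (; i < n; i++)                      /* scalar epilogue */
        dst[i] = mem[off + i];
}

/* Returns 1 if the copy is correct and every aligned load overlapped the
   array.  Requires n > 0 and off + n <= 56 so the model buffer is large
   enough.  */
static int check_copy(size_t off, size_t n)
{
    uint8_t mem[64], dst[64];
    size_t last_vec, k;

    for (k = 0; k < sizeof mem; k++)
        mem[k] = (uint8_t)k;
    copy_realign(mem, off, n, dst, &last_vec);
    return memcmp(dst, mem + off, n) == 0 && last_vec < off + n;
}
```

If the loop condition were "n - i >= VECSIZE" instead, an aligned source
whose length is a multiple of VECSIZE (say off = 0, n = 16) would load a
vector starting exactly at off + n, i.e. entirely past the array, which is
the safety concern you describe. The price of (3) in this model is that up
to VECSIZE elements, rather than VECSIZE - 1, are left to the scalar
epilogue.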

>
> 2) It seems like there may be some hooks that aren't documented.  For
> instance, there seems to be some kind of support for the "vcond"
> standard name, but I can't seem to find it in the documentation.
>

I'll open PRs for both this (missed documentation) and the above (missed
optimization). I'll try to address these soon.
I've been meaning to add documentation on all the different idioms that one
needs to define in order to enable the different vectorization features for
a certain target.

>
> In general things work quite well, and it seems to play reasonably well with
> things like the modulo scheduler.
>

thanks!

dorit

> Cheers,
>
>   Erich
>
> --
> Why are ``tolerant'' people so intolerant of intolerant people?
