On 04/06/2024 12:50, Richard Biener wrote:
On Tue, 4 Jun 2024, Andre Vieira (lists) wrote:
Hi,
We got a question as to whether GCC had something similar to LLVM's
'#pragma clang loop interleave_count(N)', see
https://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations
I did a quick hack, using 'GCC interleaves N', just as a proof of concept, to
see whether we could connect this to the suggested_unroll_factor in the
vectorizer and to test the waters regarding having something like this
upstream.
For the real thing I'd suggest we use the same pragma syntax as clang's so it's
easier to port code. It is my understanding that the main use for this is
performance tuning of HPC kernels and performance tuning of CPU cost
models.
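For reference, a rough sketch of what the clang syntax looks like in user code.
The saxpy kernel and the count of 4 are made up purely for illustration, and
the pragma is guarded so that compilers which do not know it simply skip it:

```c
#include <stddef.h>

/* Hypothetical kernel, for illustration only; not taken from the patch. */
void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
#ifdef __clang__
  /* Clang loop hint: ask the loop vectorizer to interleave 4 iterations.
     The value 4 is an arbitrary example; the hint is a request, not a
     guarantee.  */
#pragma clang loop interleave_count(4)
#endif
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}
```

The hint only affects code generation, so the loop computes the same result
with or without it.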
This seems to work (TM), though with the move to SLP-only I guess this will
stop working? Though I suspect we will want to have similar capabilities in
SLP, or maybe we have already and I didn't look hard enough.
suggested-unroll-factor also works with SLP, at least I don't see a
reason why it should not.
I think I may have misread what this (see below) was trying to say and
assumed we didn't support it.
/* If the slp decision is false when suggested unroll factor is worked
out, and we are applying suggested unroll factor, we can simply skip
all slp related analyses this time. */
bool slp = !applying_suggested_uf || slp_done_for_suggested_uf;