> On Nov 15, 2019, at 10:01 AM, Szabolcs Nagy <[email protected]> wrote:
>
> On 15/11/2019 15:05, Francesco Petrogalli wrote:
>> Thank you Szabolcs for working on this.
>>
>>> On Nov 14, 2019, at 2:23 PM, Szabolcs Nagy <[email protected]> wrote:
>>>
>>> Sorry v2 had a bug.
>>>
>>> v2: added documentation and tests.
>>> v3: fixed expand_simd_clones so different isa variants are actually
>>> generated.
>>>
>>> GCC currently supports two ways to declare the availability of vector
>>> variants of a scalar function:
>>>
>>> #pragma omp declare simd
>>> void f (void);
>>>
>>> and
>>>
>>> __attribute__ ((simd))
>>> void f (void);
>>>
>>> However these declare a set of symbols that are different simd variants
>>> of f, so a library either provides definitions for all those symbols or
>>> it cannot use these declarations. (The set of declared symbols can be
>>> narrowed down with additional omp clauses, but not enough to allow
>>> declaring a single symbol. OpenMP 5 has a declare variant feature that
>>> allows declaring more specific simd variants, but it is complicated and
>>> still requires gcc or vendor extension for unambiguous declarations.)
>>>
>>
>> It is not just that it is complicated, it is also a good idea to make math
>> function vectorization orthogonal to OpenMP.
>> This is needed din clang too, thank you for shaping a solution. It would be
>> good if we could come up with a common solution!
>>
>>> This patch extends the gcc specific simd attribute such that it can
>>> specify a single vector variant of simple scalar functions (functions
>>> that only take and return scalar integer or floating type values):
>>>
>>>
>>>
>>> where mask is "inbranch" or "notinbranch" like now, simdlen is an int
>>> with the same meaning as in omp declare simd and simdabi is a string
>>> specifying the call ABI (which the intel vector ABI calls ISA). The
>>> name is optional and allows a library to use a different symbol name
>>> than what the vector ABI specifies.
>>>
>>
>> Can we have also handling also the simplest case for the use of linear (just
>> with step = OpenMP default = 1)? It will be useful for `sincos`.
>>
>> OpenMP `linear` uses parameters name, in the attribute, which applies to
>> declaration with unnamed parameters, we could use positions.
>>
>> Also, can we avoid making the attribute a varargs attribute, but requires
>> all arguments to be present? We could use dummy values when the descriptor
>> is not needed (e.g. `no_linear`).
>>
>> I would also require the name to be specified, with the additional
>> requirement to make the declaration of name visible in the same compilation
>> unit.
>>
>> For example, on Arm, mapping `sincos` and `exp` to unmasked vector versions
>> with 2 lanes would be:
>>
>> ```
>> void _ZGVnN2vl8l8_sincos(float64x2_t, double *, double *);
>>
>> void sincos(double, double*, double *)
>> __attribute__(simd(notinbranch,2,”simd”,_ZGVnN2vl8l8_sincos, linear(2,3)));
>>
>> void _ZGVnN2v_exp(float64x2_t);
>>
>> void exp(double) __attribute__(simd(notinbranch,2,”simd”,_ZGVnN2v_exp,
>> no_linear));
>> ```
>
>
> note that exp returns double.
>
Facepalm myself. Sorry.
The examples still holds though, replacing `void` with `double` for `exp`.
> the simdabi is "n" instead of "simd", it is the
> 'ISA character' from the vector abi, to make it clear
> how it is tied to the vector abi name mangling.
>
Good choice, I like it more.
> pre-declaration of the simd symbol is not required,
> since the current simd attr (and omp pragma) works
> that way (e.g. it means simd types need not be visible
> where it is used, which may be tricky if they need
> target specific header inclusion).
>
Fair enough.
> notinbranch is a string in quotes, to avoid macro
> name collision in standard headers.
>
Fine by me.
> the same issue applies to linear, so if we add such
> argument it must not use bare 'linear' identifier, e.g.
> it could be an int list like
>
> simd("notinbranch", 2, "n", {2,3}, "vsincos")
>
I like it. Can it be an empty `{}` when no linear parameters are available?
> or we could just specify the vector abi name and
> reverse engineer the requirements from that:
>
> simd("", "_ZGVnN2vl8l8_sincos", "vsincos")
>
> (first argument is just to disambiguate from the mask form)
>
> i considered this name demangling more difficult to
> implement and error prone,
The demangling is not that difficult to implement:
https://reviews.llvm.org/D66024
But I agree that relying on it at user level is more error prone. The users of
the compiler should stay away from the ABI as much as possible.
> but it unambiguously
> specifies a single vector abi symbol for all future
> vector abi extensions.
>
Future extensions should be covered by a new ISA token, so I think we are good
with your solution.
> but there is a bigger problem with linear: currently
> the vectorizer only considers vector clones if they
> are 'const' or the vectorization of a loop was
> explicitly requested with omp simd pragma. so functions
> with pointer arguments won't be used for vectorization.
>
If the C attribute provides the same semantics of `declare simd` and its
`variant` counterpart, can we assume that the “can be invoked concurrently” and
therefore can be used to vectorize even if no #pragma omp simd is attached to
the loop?
Quoting the standard:
“The use of one or more declare simd directives immediately prior to a function
declaration or definition enables the creation of corresponding SIMD versions
of the associated function that can be used to process multiple arguments from
a single invocation in a SIMD loop concurrently.”
Isn’t that enough for saying that the function can be used for vectorizing a
loop if the function being vectorizable is the only question that is left to be
sorted when all other conditions for vectorizing vectorizing a loop are met?
> is that different in clang? or expected to change somehow?
Upstream clang doesn’t do linear vectorization yet. We have it in arm compiler
for linux (https://godbolt.org/z/DEwEcT).