On 4/16/19 4:50 PM, H.J. Lu wrote:
> On Tue, Apr 16, 2019 at 1:28 AM Martin Liška <[email protected]> wrote:
>>
>> On 4/15/19 5:09 PM, H.J. Lu wrote:
>>> On Mon, Apr 15, 2019 at 12:26 AM Martin Liška <[email protected]> wrote:
>>>>
>>>> On 4/12/19 4:12 PM, H.J. Lu wrote:
>>>>> On Fri, Apr 12, 2019 at 4:41 AM Martin Liška <[email protected]> wrote:
>>>>>>
>>>>>> On 4/11/19 6:30 PM, H.J. Lu wrote:
>>>>>>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi.
>>>>>>>>
>>>>>>>> The patch is adding missing AVX512 ISAs for target and target_clone
>>>>>>>> attributes.
>>>>>>>>
>>>>>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>>>>>
>>>>>>>> Ready to be installed?
>>>>>>>> Thanks,
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> gcc/ChangeLog:
>>>>>>>>
>>>>>>>> 2019-04-10 Martin Liska <[email protected]>
>>>>>>>>
>>>>>>>> PR target/89929
>>>>>>>> * config/i386/i386.c (get_builtin_code_for_version): Add
>>>>>>>> support for missing AVX512 ISAs.
>>>>>>>>
>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>
>>>>>>>> 2019-04-10 Martin Liska <[email protected]>
>>>>>>>>
>>>>>>>> PR target/89929
>>>>>>>> * g++.target/i386/mv28.C: New test.
>>>>>>>> * gcc.target/i386/mvc14.c: New test.
>>>>>>>> ---
>>>>>>>> gcc/config/i386/i386.c | 34 ++++++++++++++++++++++++++-
>>>>>>>> gcc/testsuite/g++.target/i386/mv28.C | 30 +++++++++++++++++++++++
>>>>>>>> gcc/testsuite/gcc.target/i386/mvc14.c | 16 +++++++++++++
>>>>>>>> 3 files changed, 79 insertions(+), 1 deletion(-)
>>>>>>>> create mode 100644 gcc/testsuite/g++.target/i386/mv28.C
>>>>>>>> create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Hi.
>>>>>>
>>>>>>> Since any ISAs beyond AVX512F may be enabled individually, we
>>>>>>> can't simply assign priorities to them. For GFNI, we can have
>>>>>>>
>>>>>>> 1. GFNI
>>>>>>> 2. GFNI + AVX
>>>>>>> 3. GFNI + AVX512F
>>>>>>> 4. GFNI + AVX512F + AVX512VL
>>>>>>
>>>>>> Makes sense to me! I'm considering syntax extension where one would be
>>>>>> able to come up with a priority. Eg.
>>>>>>
>>>>>> __attribute__((target("gfni,avx512bw", priority((3)))))
>>>>>>
>>>>>> Without that the ISA combinations are probably not comparable in a
>>>>>> reasonable way.
>>>>>>
>>>>>>>
>>>>>>> For this code, GFNI + AVX512BW is ignored:
>>>>>>>
>>>>>>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii
>>>>>>> __attribute__((target("gfni")))
>>>>>>> int foo(int i) {
>>>>>>> return 1;
>>>>>>> }
>>>>>>> __attribute__((target("gfni,avx512bw")))
>>>>>>> int foo(int i) {
>>>>>>> return 4;
>>>>>>> }
>>>>>>> __attribute__((target("default")))
>>>>>>> int foo(int i) {
>>>>>>> return 3;
>>>>>>> }
>>>>>>> int bar ()
>>>>>>> {
>>>>>>> return foo(2);
>>>>>>> }
>>>>>>
>>>>>> For 'target' attribute it works for me:
>>>>>>
>>>>>> 1) $ cat z.c && ./xg++ -B. z.c -c
>>>>>> #include <x86intrin.h>
>>>>>> volatile __m512i x1, x2;
>>>>>> volatile __mmask64 m64;
>>>>>>
>>>>>> __attribute__((target("gfni")))
>>>>>> int foo(int i) {
>>>>>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>>>>>> return 1;
>>>>>> }
>>>>>> __attribute__((target("gfni,avx512bw")))
>>>>>> int foo(int i) {
>>>>>> return 4;
>>>>>> }
>>>>>> __attribute__((target("default")))
>>>>>> int foo(int i) {
>>>>>> return 3;
>>>>>> }
>>>>>> int bar ()
>>>>>> {
>>>>>> return foo(2);
>>>>>> }
>>>>>> In file included from ./include/immintrin.h:117,
>>>>>> from ./include/x86intrin.h:32,
>>>>>> from z.c:1:
>>>>>> z.c: In function ‘int foo(int)’:
>>>>>> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa
>>>>>> option -m32 -mgfni -mavx512f
>>>>>> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>>>>>> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>> z.c:7:10: note: the ABI for passing parameters with 64-byte alignment
>>>>>> has changed in GCC 4.6
>>>>>>
>>>>>> 2) $ cat z.c && ./xg++ -B. z.c -c
>>>>>> #include <x86intrin.h>
>>>>>> volatile __m512i x1, x2;
>>>>>> volatile __mmask64 m64;
>>>>>>
>>>>>> __attribute__((target("gfni")))
>>>>>> int foo(int i) {
>>>>>> return 1;
>>>>>> }
>>>>>> __attribute__((target("gfni,avx512bw")))
>>>>>> int foo(int i) {
>>>>>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3);
>>>>>> return 4;
>>>>>> }
>>>>>> __attribute__((target("default")))
>>>>>> int foo(int i) {
>>>>>> return 3;
>>>>>> }
>>>>>> int bar ()
>>>>>> {
>>>>>> return foo(2);
>>>>>> }
>>>>>>
>>>>>> [OK]
>>>>>>
>>>>>> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ?
>>>>>
>>>>> It does look odd.
>>>>
>>>> Then let me take a look at this.
>>>>
>>>>>
>>>>>> Similar applies to target_clone attribute where we'll have to come up
>>>>>> with
>>>>>> a syntax that will allow multiple ISA to be combined. Something like:
>>>>>>
>>>>>> __attribute__((target_clones("gfni+avx512bw")))
>>>>>>
>>>>>> ? Priorities can be maybe implemented by order?
>>>>>>
>>>>>
>>>>> I am thinking -misa=processor which will enable ISAs for
>>>>> processor. It differs from -march=. -misa= doesn't set
>>>>> -mtune.
>>>>>
>>>>
>>>> Well, isn't that what we currently support, e.g.:
>>>>
>>>> $ cat mvc11.c && gcc mvc11.c -c
>>>> __attribute__((target_clones("arch=sandybridge", "arch=cascadelake",
>>>> "default"))) int
>>>> foo (void)
>>>> {
>>>> return 0;
>>>> }
>>>>
>>>> int
>>>> main ()
>>>> {
>>>> foo ();
>>>> }
>>>>
>>>> If so, we can provide a new warning that will tell that for AVX512* on
>>>> should use 'arch=xyz'
>>>> instead?
>>>>
>>>
>>> 1. We don't have one option to enable AVX512F and AVX512CD, which are
>>> common to
>>> Skylake server and KNL.
>>
>> Then arch=axy is the solution for that.
>>
>>> 2. -march= also implies -mtune, which may not be wanted.
>>
>> Why is that not intended. Using target (or target_clone), I would expect the
>> fastest possible version of the function for defined ISA (or ARCH). Then
>> implying -mtune is reasonable for me.
>>
>>> 3. -march=icelake-client enables more than just AVX512XXX.
>>
>> Yes, but if you take a look at number of AVX512 ISA extensions (18), then
>> it looks more easier to me to use arch=xyz as it leads to reasonable number
>> of architectures.
>>
>
> Are you proposing to add a new set of -march=XXX to support the family
> of processors with AVX512? Adding -march=XXX is much more invasive
> that adding -misa=XXX, which is just a combination of -mavx512XXXs.
> Do we really want to do that?
>
No, I'm saying that I would reject all these:
+ {"avx512f", P_AVX512F},
+ {"avx512vl", P_AVX512VL},
+ {"avx512bw", P_AVX512BW},
+ {"avx512dq", P_AVX512DQ},
+ {"avx512cd", P_AVX512CD},
+ {"avx512er", P_AVX512ER},
+ {"avx512pf", P_AVX512PF},
+ {"avx512vbmi", P_AVX512VBMI},
+ {"avx512ifma", P_AVX512IFMA},
+ {"avx5124vnniw", P_AVX5124VNNIW},
+ {"avx5124fmaps", P_AVX5124FMAPS},
+ {"avx512vpopcntdq", P_AVX512VPOPCNTDQ},
+ {"avx512vbmi2", P_AVX512VBMI2},
+ {"gfni", P_GFNI},
+ {"vpclmulqdq", P_VPCLMULQDQ},
+ {"avx512vnni", P_AVX512VNNI},
+ {"avx512bitalg", P_AVX512BITALG}
as arguments to target_clone attribute and instead I would recommend to users
to use
target_clones(arch=skylake,arch=skylake-avx512,arch=cannonlake,arch=icelake-client,arch=icelake-server,
..)
It's because we don't have a syntax for combination of ISAs for a target_clones
attribute (e.g. avx512f+avx512cd+avx512er)
and we don't have a syntax for priorities.
Martin