Fix a minor copyright year typo

2018-09-04 Thread wei xiao
Hi,

According to the commit log, this file was created in 2018, so
the copyright year should be corrected to 2018.
Is the patch below OK to fix this?

Thanks
Wei Xiao

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 62e32c6..416dbd1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2018-09-04  Wei Xiao  
+
+   * config/i386/movdirintrin.h: Update copyright year.
+
 2018-09-03  Richard Biener  

PR tree-optimization/87177
diff --git a/gcc/config/i386/movdirintrin.h b/gcc/config/i386/movdirintrin.h
index 8b4d0b3..75a5552 100644
--- a/gcc/config/i386/movdirintrin.h
+++ b/gcc/config/i386/movdirintrin.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2017 Free Software Foundation, Inc.
+/* Copyright (C) 2018 Free Software Foundation, Inc.

This file is part of GCC.


Re: [PATCH] x86: Add -march=cascadelake

2018-12-12 Thread Wei Xiao
Hi Uros and other reviewers,

I'd like to split the work into 2 parts:
1) Basic processor enabling.
2) Processor type dynamic check.

Part 2 will be implemented in a separate patch.
Part 1 is implemented by the attached patch.
Is it OK for trunk?

Wei

gcc/
  * common/config/i386/i386-common.c (processor_names): Add cascadelake.
  (processor_alias_table): Add cascadelake.
  * config.gcc: Add -march=cascadelake.
  * config/i386/i386-c.c (ix86_target_macros_internal): Handle cascadelake.
  * config/i386/i386.c (m_CASCADELAKE): New.
  (processor_cost_table): Add cascadelake.
  (get_builtin_code_for_version): Handle cascadelake.
  * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
  (PTA_CASCADELAKE): Ditto.
  * doc/invoke.texi: Add -march=cascadelake.

gcc/testsuite/
  * gcc.target/i386/funcspec-56.inc: Handle new march.
On Thu, Nov 29, 2018 at 4:32 PM Wei Xiao wrote:
>
> Hi
>
> Distinguish based on stepping number is not recommended for some reasons:
> 1) Intel doesn't officially disclose stepping information in SDM.
> 2) Stepping can be changing in the future.
>
> We still prefer the conventional distinguish approach based on feature bits.
> I have refined the patch as attached according to all your suggestions.
>
> Wei
>
> gcc/
> * common/config/i386/i386-common.c (processor_names): Add cascadelake.
> (processor_alias_table): Add cascadelake.
> * config.gcc: Add -march=cascadelake.
> * config/i386/driver-i386.c
> (host_detect_local_cpu): Detect cascadelake.
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> cascadelake.
> * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
> (processor_cost_table): Add cascadelake.
> (get_builtin_code_for_version): Handle cascadelake.
> (fold_builtin_cpu): Ditto.
> * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
> (PTA_CASCADELAKE): Ditto.
> * doc/extend.texi: Add cascadelake.
> * doc/invoke.texi: Add -march=cascadelake.
> gcc/testsuite/
> * g++.target/i386/mv16.C: Handle new march.
> * gcc.target/i386/builtin_target.c: Ditto.
> * gcc.target/i386/funcspec-56.inc: Ditto.
> libgcc/
> * config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
> * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
> On Tue, Nov 27, 2018 at 6:40 PM Wei Xiao wrote:
> >
> > Thanks for the helpful information!
> > But I'm still checking with hardware team about the
> > family/model/stepping numbers for Cascadelake which are not officially
> > disclosed by Intel (to my best knowledge).
> >
> > Wei
> > On Mon, Nov 26, 2018 at 10:00 PM Martin Liška wrote:
> > >
> > > On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > > > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> > > >>> For Cascade Lake the model number is the same as Skylake Server,
> > > >>> it can only be distinguished based on the stepping (5 vs 4)
> > > >>
> > > >> Very interesting, probably the first time a distinguish is based on 
> > > >> stepping number?
> > > >
> > > > Wouldn't it be better to distinguish it based on availability of VNNI, 
> > > > like
> > > > we do for unknown family/model?
> > > >
> > > >>> Like gcc -mcpu=native needs to learn about this.
> > > >>
> > > >> I'm attaching patch that does that. Note that it's completely untested 
> > > >> as I don't have
> > > >> access to any of the new machines (Skylake server).
> > >
> > > Would be possible, the only ugly place would be in 
> > > libgcc/config/i386/cpuinfo.c where we
> > > call:
> > >
> > >   get_intel_cpu (family, model, stepping, brand_id);
> > >   /* Find available features. */
> > >   get_available_features (ecx, edx, max_level, &avx512_vnni);
> > >
> > > one would need a feature to distinguish CPU model. Do we really want that?
> > >
> > > Martin
> > >
> > > >
> > > >   Jakub
> > > >
> > >


cascadelake-v4.diff
Description: Binary data


Re: [PATCH] x86: Add -march=cascadelake

2018-12-14 Thread Wei Xiao
Part 2 is implemented by the attached patch.
OK for trunk?

Wei

gcc/
   * config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake.
   * config/i386/i386.c (fold_builtin_cpu): Handle cascadelake.
   * doc/extend.texi: Add cascadelake.

gcc/testsuite/
   * g++.target/i386/mv16.C: Handle new march.
   * gcc.target/i386/builtin_target.c: Ditto.

libgcc/
   * config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
   * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
On Thu, Dec 13, 2018 at 12:46 AM Uros Bizjak wrote:
>
> On Wed, Dec 12, 2018 at 10:48 AM Wei Xiao  wrote:
> >
> > Hi Uros and other reviewers,
> >
> > I'd like to split the work into 2 parts:
> > 1) Basic processor enabling.
> > 2) Processor type dynamic check.
> >
> > Let's use a separate patch to implement the part 2.
> > The part 1 is implemented by attached patch.
> > Is it ok for trunk?
> >
> > Wei
> >
> > gcc/
> >   * common/config/i386/i386-common.c (processor_names): Add cascadelake.
> >   (processor_alias_table): Add cascadelake.
> >   * config.gcc: Add -march=cascadelake.
> >   * config/i386/i386-c.c (ix86_target_macros_internal): Handle 
> > cascadelake.
> >   * config/i386/i386.c (Add m_CASCADELAKE): New.
> >   (processor_cost_table): Add cascadelake.
> >   (get_builtin_code_for_version): Handle cascadelake.
> >   * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
> >   (PTA_CASCADELAKE): Ditto.
> >   * doc/invoke.texi: Add -march=cascadelake.
> >
> > gcc/testsuite/
> >   * gcc.target/i386/funcspec-56.inc: Handle new march.
>
> OK for mainline.
>
> Thanks,
> Uros.
>
> > On Thu, Nov 29, 2018 at 4:32 PM Wei Xiao wrote:
> > >
> > > Hi
> > >
> > > Distinguish based on stepping number is not recommended for some reasons:
> > > 1) Intel doesn't officially disclose stepping information in SDM.
> > > 2) Stepping can be changing in the future.
> > >
> > > We still prefer the conventional distinguish approach based on feature 
> > > bits.
> > > I have refined the patch as attached according to all your suggestions.
> > >
> > > Wei
> > >
> > > gcc/
> > > * common/config/i386/i386-common.c (processor_names): Add 
> > > cascadelake.
> > > (processor_alias_table): Add cascadelake.
> > > * config.gcc: Add -march=cascadelake.
> > > * config/i386/driver-i386.c
> > > (host_detect_local_cpu): Detect cascadelake.
> > > * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> > > cascadelake.
> > > * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
> > > (processor_cost_table): Add cascadelake.
> > > (get_builtin_code_for_version): Handle cascadelake.
> > > (fold_builtin_cpu): Ditto.
> > > * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): 
> > > New.
> > > (PTA_CASCADELAKE): Ditto.
> > > * doc/extend.texi: Add cascadelake.
> > > * doc/invoke.texi: Add -march=cascadelake.
> > > gcc/testsuite/
> > > * g++.target/i386/mv16.C: Handle new march.
> > > * gcc.target/i386/builtin_target.c: Ditto.
> > > * gcc.target/i386/funcspec-56.inc: Ditto.
> > > libgcc/
> > > * config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
> > > * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
> > > > On Tue, Nov 27, 2018 at 6:40 PM Wei Xiao wrote:
> > > >
> > > > Thanks for the helpful information!
> > > > But I'm still checking with hardware team about the
> > > > family/model/stepping numbers for Cascadelake which are not officially
> > > > disclosed by Intel (to my best knowledge).
> > > >
> > > > Wei
> > > > > On Mon, Nov 26, 2018 at 10:00 PM Martin Liška wrote:
> > > > >
> > > > > On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > > > > > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> > > > > >>> For Cascade Lake the model number is the same as Skylake Server,
> > > > > >>> it can only be distinguished based on the stepping (5 vs 4)
> > > > > >>
> > > > > >> Very interesting, probably the first time a distinguish is based 
> > > > > >> on stepping number?
> > > > > >
> > 

Re: [PATCH] x86: Add -march=cascadelake

2018-12-16 Thread Wei Xiao
Thanks for the comments!
Fixed as attached.
Ok for trunk?
On Fri, Dec 14, 2018 at 6:47 PM Jakub Jelinek wrote:
>
> On Fri, Dec 14, 2018 at 06:33:37PM +0800, Wei Xiao wrote:
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -832,8 +832,16 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>   cpu = "skylake";
>   break;
> case 0x55:
> - /* Skylake with AVX-512.  */
> - cpu = "skylake-avx512";
> + if (has_avx512vnni)
> + {
> +   /* Cascade Lake.  */
> +   cpu = "cascadelake";
> + }
> + else
> + {
> +   /* Skylake with AVX-512.  */
> +   cpu = "skylake-avx512";
> + }
>   break;
>
> Just a formatting nit here, if {}s are used, they should be indented
> 2 columns to the right from the if or else and the body of {} should
> be indented by two further columns over {.
> But, in this case, there is another rule, that if the body has a single
> statement, then there shouldn't be {}s around it.  Thus just:
>   if (has_avx512vnni)
>     /* Cascade Lake.  */
>     cpu = "cascadelake";
>   else
>     /* Skylake with AVX-512.  */
>     cpu = "skylake-avx512";
>
> Jakub


cascadelake-v6.diff
Description: Binary data


[PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM

2018-10-30 Thread Wei Xiao
Hi,

The attached patch updates VFIXUPIMM* Intrinsics to align with the
latest Intel® 64 and IA-32 Architectures Software Developer’s Manual
(SDM).
Tested with GCC regression test on x86, no regression.

Is it ok?

Thanks
Wei

gcc/
2018-10-30  Wei Xiao

* config/i386/avx512fintrin.h: Update VFIXUPIMM* intrinsics.
(_mm512_fixupimm_round_pd): Update parameters and builtin.
(_mm512_maskz_fixupimm_round_pd): Ditto.
(_mm512_fixupimm_round_ps): Ditto.
(_mm512_maskz_fixupimm_round_ps): Ditto.
(_mm_fixupimm_round_sd): Ditto.
(_mm_maskz_fixupimm_round_sd): Ditto.
(_mm_fixupimm_round_ss): Ditto.
(_mm_maskz_fixupimm_round_ss): Ditto.
(_mm512_fixupimm_pd): Ditto.
(_mm512_maskz_fixupimm_pd): Ditto.
(_mm512_fixupimm_ps): Ditto.
(_mm512_maskz_fixupimm_ps): Ditto.
(_mm_fixupimm_sd): Ditto.
(_mm_maskz_fixupimm_sd): Ditto.
(_mm_fixupimm_ss): Ditto.
(_mm_maskz_fixupimm_ss): Ditto.
(_mm512_mask_fixupimm_round_pd): Update builtin.
(_mm512_mask_fixupimm_round_ps): Ditto.
(_mm_mask_fixupimm_round_sd): Ditto.
(_mm_mask_fixupimm_round_ss): Ditto.
(_mm512_mask_fixupimm_pd): Ditto.
(_mm512_mask_fixupimm_ps): Ditto.
(_mm_mask_fixupimm_sd): Ditto.
(_mm_mask_fixupimm_ss): Ditto.
* config/i386/avx512vlintrin.h:
(_mm256_fixupimm_pd): Update parameters and builtin.
(_mm256_maskz_fixupimm_pd): Ditto.
(_mm256_fixupimm_ps): Ditto.
(_mm256_maskz_fixupimm_ps): Ditto.
(_mm_fixupimm_pd): Ditto.
(_mm_maskz_fixupimm_pd): Ditto.
(_mm_fixupimm_ps): Ditto.
(_mm_maskz_fixupimm_ps): Ditto.
(_mm256_mask_fixupimm_pd): Update builtin.
(_mm256_mask_fixupimm_ps): Ditto.
(_mm_mask_fixupimm_pd): Ditto.
(_mm_mask_fixupimm_ps): Ditto.
* config/i386/i386-builtin-types.def: Add new builtin types.
* config/i386/i386-builtin.def: Update builtin definitions.
* config/i386/i386.c: Handle new builtin types.
* config/i386/sse.md: Update VFIXUPIMM* patterns.
(_fixupimm_maskz): Update.
(_fixupimm): Update.
(_fixupimm_mask): Update.
(avx512f_sfixupimm_maskz): Update.
(avx512f_sfixupimm): Update.
(avx512f_sfixupimm_mask): Update.
* config/i386/subst.md:
(round_saeonly_sd_mask_operand4): Add new subst_attr.
(round_saeonly_sd_mask_op4): Ditto.
(round_saeonly_expand_operand5): Ditto.
(round_saeonly_expand): Update.

gcc/testsuite
2018-10-30  Wei Xiao

* gcc.target/i386/avx-1.c: Update tests for VFIXUPIMM* intrinsics.
* gcc.target/i386/avx512f-vfixupimmpd-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmpd-2.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmps-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmsd-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmss-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.
* gcc.target/i386/avx512vl-vfixupimmpd-1.c: Ditto.
* gcc.target/i386/avx512vl-vfixupimmps-1.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/testimm-10.c: Ditto.
* gcc.target/i386/testround-1.c: Ditto.


update-vfixupimm.diff
Description: Binary data


Re: [PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM

2018-11-02 Thread Wei Xiao
Hi Uros and HJ,

I have updated the patch according to your remarks as attached.
Ok for trunk?

Thanks
Wei

gcc/
2018-11-02  Wei Xiao

* config/i386/avx512fintrin.h: Update VFIXUPIMM* intrinsics.
(_mm512_fixupimm_round_pd): Update parameters and builtin.
(_mm512_maskz_fixupimm_round_pd): Ditto.
(_mm512_fixupimm_round_ps): Ditto.
(_mm512_maskz_fixupimm_round_ps): Ditto.
(_mm_fixupimm_round_sd): Ditto.
(_mm_maskz_fixupimm_round_sd): Ditto.
(_mm_fixupimm_round_ss): Ditto.
(_mm_maskz_fixupimm_round_ss): Ditto.
(_mm512_fixupimm_pd): Ditto.
(_mm512_maskz_fixupimm_pd): Ditto.
(_mm512_fixupimm_ps): Ditto.
(_mm512_maskz_fixupimm_ps): Ditto.
(_mm_fixupimm_sd): Ditto.
(_mm_maskz_fixupimm_sd): Ditto.
(_mm_fixupimm_ss): Ditto.
(_mm_maskz_fixupimm_ss): Ditto.
(_mm512_mask_fixupimm_round_pd): Update builtin.
(_mm512_mask_fixupimm_round_ps): Ditto.
(_mm_mask_fixupimm_round_sd): Ditto.
(_mm_mask_fixupimm_round_ss): Ditto.
(_mm512_mask_fixupimm_pd): Ditto.
(_mm512_mask_fixupimm_ps): Ditto.
(_mm_mask_fixupimm_sd): Ditto.
(_mm_mask_fixupimm_ss): Ditto.
* config/i386/avx512vlintrin.h:
(_mm256_fixupimm_pd): Update parameters and builtin.
(_mm256_maskz_fixupimm_pd): Ditto.
(_mm256_fixupimm_ps): Ditto.
(_mm256_maskz_fixupimm_ps): Ditto.
(_mm_fixupimm_pd): Ditto.
(_mm_maskz_fixupimm_pd): Ditto.
(_mm_fixupimm_ps): Ditto.
(_mm_maskz_fixupimm_ps): Ditto.
(_mm256_mask_fixupimm_pd): Update builtin.
(_mm256_mask_fixupimm_ps): Ditto.
(_mm_mask_fixupimm_pd): Ditto.
(_mm_mask_fixupimm_ps): Ditto.
* config/i386/i386-builtin-types.def: Add new types and remove
useless ones.
* config/i386/i386-builtin.def: Update builtin definitions.
* config/i386/i386.c: Handle new builtin types and remove useless ones.
* config/i386/sse.md: Update VFIXUPIMM* patterns.
(_fixupimm_maskz): Update.
(_fixupimm): Update.
(_fixupimm_mask): Update.
(avx512f_sfixupimm_maskz): Update.
(avx512f_sfixupimm): Update.
(avx512f_sfixupimm_mask): Update.
* config/i386/subst.md:
(round_saeonly_sd_mask_operand4): Add new subst_attr.
(round_saeonly_sd_mask_op4): Ditto.
(round_saeonly_expand_operand5): Ditto.
(round_saeonly_expand): Update.

gcc/testsuite
2018-11-02  Wei Xiao

* gcc.target/i386/avx-1.c: Update tests for VFIXUPIMM* intrinsics.
* gcc.target/i386/avx512f-vfixupimmpd-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmpd-2.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmps-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmsd-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmss-1.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.
* gcc.target/i386/avx512vl-vfixupimmpd-1.c: Ditto.
* gcc.target/i386/avx512vl-vfixupimmps-1.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/testimm-10.c: Ditto.
* gcc.target/i386/testround-1.c: Ditto.
On Fri, Nov 2, 2018 at 1:27 AM Uros Bizjak wrote:
>
> On Tue, Oct 30, 2018 at 10:12 AM Wei Xiao  wrote:
> >
> > Hi,
> >
> > The attached patch updates VFIXUPIMM* Intrinsics to align with the
> > latest Intel® 64 and IA-32 Architectures Software Developer’s Manual
> > (SDM).
> > Tested with GCC regression test on x86, no regression.
>
> A couple of remarks:
>
> -_mm512_fixupimm_round_pd (__m512d __A, __m512d __B, __m512i __C,
> +_mm512_fixupimm_round_pd (__m512d __B, __m512i __C,
>
>  _mm512_mask_fixupimm_round_pd (__m512d __A, __mmask8 __U, __m512d __B,
> __m512i __C, const int __imm, const int __R)
>
> Some kind of the convention in avx512fintrin.h is that arguments are
> named like this:
>
> [ __m512. __W,] __mmask. __U, __m512x __A, __m512x __B, ..., const int
> _imm, const int __R]. Can we please keep the same approach here? I'm
> mostly concerned that argument names don't start with __A.
>
> -BDESC (OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_fixupimmv4df_mask,
> "__builtin_ia32_fixupimmpd256_mask", IX86_BUILTIN_FIXUPIMMPD256_MASK,
> UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI)
> ...
>
> You are removing the only users of e.g.
> V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI (and other definitions). If there
> are no users left, can you also remove the relevant definitions?
>
> > Is it ok?
>
> Please repost the patch with above remarks addressed. These builtins
> are mostly Intel affair, s

Re: [PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM

2018-11-04 Thread Wei Xiao
> Please also rename these:
>
>  _mm512_mask_fixupimm_round_pd (__m512d __A, __mmask8 __U, __m512d __B,
> __m512i __C, const int __imm, const int __R)
>
>  _mm512_mask_fixupimm_round_ps (__m512 __A, __mmask16 __U, __m512 __B,
> __m512i __C, const int __imm, const int __R)
>
>  _mm_mask_fixupimm_round_sd (__m128d __A, __mmask8 __U, __m128d __B,
>  __m128i __C, const int __imm, const int __R)
>
>  _mm_mask_fixupimm_round_ss (__m128 __A, __mmask8 __U, __m128 __B,
>  __m128i __C, const int __imm, const int __R)
>
>  _mm512_mask_fixupimm_pd (__m512d __A, __mmask8 __U, __m512d __B,
>   __m512i __C, const int __imm)
>
> _mm512_mask_fixupimm_ps (__m512 __A, __mmask16 __U, __m512 __B,
>   __m512i __C, const int __imm)
>
>  _mm_mask_fixupimm_sd (__m128d __A, __mmask8 __U, __m128d __B,
>__m128i __C, const int __imm)
>
>  _mm_mask_fixupimm_ss (__m128 __A, __mmask8 __U, __m128 __B,
>__m128i __C, const int __imm)
>
>  _mm256_mask_fixupimm_pd (__m256d __A, __mmask8 __U, __m256d __B,
>   __m256i __C, const int __imm)
>
>  _mm256_mask_fixupimm_ps (__m256 __A, __mmask8 __U, __m256 __B,
>   __m256i __C, const int __imm)
>
>   _mm_mask_fixupimm_pd (__m128d __A, __mmask8 __U, __m128d __B,
>__m128i __C, const int __imm)
>
>  _mm_mask_fixupimm_ps (__m128 __A, __mmask8 __U, __m128 __B,
>__m128i __C, const int __imm)
>
> Uros.

As attached, I have renamed the above intrinsics according to the
aforementioned convention:

[ __m512. __W,] __mmask. __U, __m512x __A, __m512x __B, ..., const int
_imm, const int __R].

Wei


update-vfixupimm-v3.diff
Description: Binary data


[PATCH] x86: Optimize VFIXUPIMM* patterns with multiple-alternative constraints

2018-11-06 Thread Wei Xiao
Hi maintainers,

The attached patch optimizes the VFIXUPIMM* patterns with
multiple-alternative constraints, combining 4 patterns into 2.
Tested with bootstrap and regression tests on x86_64. No regressions.

Is it OK for trunk?

Thanks,
Wei


opt-vfixupimm-v1.diff
Description: Binary data


Re: [PATCH] x86: Optimize VFIXUPIMM* patterns with multiple-alternative constraints

2018-11-09 Thread Wei Xiao
Hi Uros

Thanks for the remarks!
I have improved the patch as attached to address the issues you mentioned:
1. No changes to substs any more.
2. Adopt the established approach (e.g. "rcp14") to
handle zero masks.

I'd like to explain our motivation for combining the vfixupimm patterns: in
the future there will be many new x86 instructions with both masking and
rounding, like vfixupimm, but we still want to keep the x86 MD as short as
possible and do not want to write 2 patterns for each of these new
instructions, which would also raise the code review cost for maintainers.
With this patch we want to make sure the new pattern paradigm is OK with the
x86 maintainers.

Wei
On Wed, Nov 7, 2018 at 3:24 PM Uros Bizjak wrote:
>
> On Tue, Nov 6, 2018 at 11:16 AM Wei Xiao  wrote:
> >
> > Hi maintainers,
> >
> > The attached patch intends to optimize VFIXUPIMM* patterns with
> > multiple-alternative constraints and
> > 4 patterns are combined into 2 patterns. Tested with bootstrap and
> > regression tests on x86_64. No regressions.
> >
> > Is it OK for trunk?
>
> I'm not convinced that this particular optimization is a good idea.
> Looking at the patch, you have to add a whole bunch of substs just to
> merge two pattern sets. Also, the approach diverges from the established
> approach of handling zero masks. The latter raises maintenance costs
> for no compelling reason.
>
> I'd say to leave these patterns the way they are.
>
> Uros.


combine-vfixupimm-v2.diff
Description: Binary data


[PATCH] x86: Add -march=cascadelake

2018-11-21 Thread Wei Xiao
Hi,

The attached patch adds -march=cascadelake for x86.
Tested with bootstrap and regression tests on x86_64. No regressions.
Is it ok for trunk?

Wei

gcc/
* common/config/i386/i386-common.c (processor_names): Add cascadelake.
(processor_alias_table): Add cascadelake.
* config.gcc: Add -march=cascadelake.
* config/i386/driver-i386.c
(host_detect_local_cpu): Detect cascadelake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
cascadelake.
* config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
(processor_cost_table): Add cascadelake.
(get_builtin_code_for_version): Handle cascadelake.
(fold_builtin_cpu): Ditto.
* config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
(PTA_CASCADELAKE): Ditto.
* doc/invoke.texi: Add -march=cascadelake.
gcc/testsuite/
* g++.target/i386/mv16.C: Handle new march.
* gcc.target/i386/funcspec-56.inc: Ditto.
libgcc/
* config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.


cascadelake.diff
Description: Binary data


Re: [PATCH] x86: Add -march=cascadelake

2018-11-21 Thread Wei Xiao
Jakub,

Thanks for the comments!
I have addressed them as attached.

Wei

gcc/
* common/config/i386/i386-common.c (processor_names): Add cascadelake.
(processor_alias_table): Add cascadelake.
* config.gcc: Add -march=cascadelake.
* config/i386/driver-i386.c
(host_detect_local_cpu): Detect cascadelake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
cascadelake.
* config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
(processor_cost_table): Add cascadelake.
(get_builtin_code_for_version): Handle cascadelake.
(fold_builtin_cpu): Ditto.
* config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
(PTA_CASCADELAKE): Ditto.
* doc/invoke.texi: Add -march=cascadelake.
gcc/testsuite/
* g++.target/i386/mv16.C: Handle new march.
* gcc.target/i386/funcspec-56.inc: Ditto.
libgcc/
* config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
On Wed, Nov 21, 2018 at 7:09 PM Jakub Jelinek wrote:
>
> On Wed, Nov 21, 2018 at 06:23:41PM +0800, Wei Xiao wrote:
> > The attached patch added -march=cascadelake for x86.
> > Tested with bootstrap and regression tests on x86_64. No regressions.
> > Is it ok for trunk?
>
> Not a real review, just nits:
>
> index bff4dfb..f7c1c98 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2018-11-21 Wei Xiao 
>
> Two spaces after date, two spaces before <.
>
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -857,6 +857,9 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>   /* Assume Ice Lake.  */
>   else if (has_gfni)
> cpu = "icelake-client";
> + /* Assume Cascade Lake.  */
> + if (has_avx512vnni)
> +   cpu = "cascadelake";
>   /* Assume Cannon Lake.  */
>   else if (has_avx512vbmi)
> cpu = "cannonlake";
>
> Doesn't this break handling of all the other CPUs?  I mean, it is a large
>   if (cond) ... else if (cond) ... else if (cond) ... else ...
> but you've added if without else before it into the middle.
>
> Jakub
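
The nit about the stray `if` can be illustrated with a reduced sketch. The
struct, the function names and the "generic" fallback below are hypothetical,
and the chain is far shorter than the real one in host_detect_local_cpu; only
the three feature checks are taken from the quoted hunk. Where exactly the
Cascade Lake check belongs in the real chain is not decided here.

```c
/* Hypothetical, reduced model of the feature-based fallback chain.  */
struct cpu_features
{
  int has_gfni;
  int has_avx512vnni;
  int has_avx512vbmi;
};

/* As in the quoted hunk: a plain `if' in the middle splits the chain,
   so the second chain can overwrite the result of the first.  */
static const char *
detect_split_chain (struct cpu_features f)
{
  const char *cpu = "generic";

  if (f.has_gfni)
    cpu = "icelake-client";
  if (f.has_avx512vnni)
    cpu = "cascadelake";
  else if (f.has_avx512vbmi)
    cpu = "cannonlake";
  return cpu;
}

/* With `else if' the chain stays intact and the first match wins.  */
static const char *
detect_single_chain (struct cpu_features f)
{
  const char *cpu = "generic";

  if (f.has_gfni)
    cpu = "icelake-client";
  else if (f.has_avx512vnni)
    cpu = "cascadelake";
  else if (f.has_avx512vbmi)
    cpu = "cannonlake";
  return cpu;
}
```

For a feature set with has_gfni and has_avx512vbmi set but has_avx512vnni
clear, the split chain ends with "cannonlake", while the single chain keeps
"icelake-client".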


cascadelake-v2.diff
Description: Binary data


Re: [PATCH] x86: Add -march=cascadelake

2018-11-27 Thread Wei Xiao
Thanks for the helpful information!
However, I'm still checking with the hardware team about the
family/model/stepping numbers for Cascade Lake, which are not officially
disclosed by Intel (to the best of my knowledge).
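
For context, the family/model/stepping fields under discussion are packed
into the EAX value returned by CPUID leaf 1. A minimal sketch of the
decoding follows; the struct and function names are hypothetical (libgcc's
cpuinfo.c has its own code for this), and the values mentioned in the note
after the code are purely illustrative inputs.

```c
/* Hypothetical helper type; not part of libgcc.  */
struct cpu_signature
{
  unsigned int family;
  unsigned int model;
  unsigned int stepping;
};

/* Decode the EAX value returned by CPUID leaf 1.  */
static struct cpu_signature
decode_signature (unsigned int eax)
{
  struct cpu_signature sig;
  unsigned int ext_model = (eax >> 16) & 0x0f;
  unsigned int ext_family = (eax >> 20) & 0xff;

  sig.stepping = eax & 0x0f;
  sig.model = (eax >> 4) & 0x0f;
  sig.family = (eax >> 8) & 0x0f;

  /* The extended family is added only when the base family is 15;
     the extended model applies to base families 6 and 15.  */
  if (sig.family == 0x0f)
    sig.family += ext_family;
  if (sig.family == 0x06 || sig.family >= 0x0f)
    sig.model += ext_model << 4;
  return sig;
}
```

Under this decoding, the illustrative inputs 0x50654 and 0x50655 both yield
family 6 and model 0x55 and differ only in the stepping field (4 vs 5), which
is the kind of situation described in the quoted discussion.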

Wei
On Mon, Nov 26, 2018 at 10:00 PM Martin Liška wrote:
>
> On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> >>> For Cascade Lake the model number is the same as Skylake Server,
> >>> it can only be distinguished based on the stepping (5 vs 4)
> >>
> >> Very interesting, probably the first time a distinguish is based on 
> >> stepping number?
> >
> > Wouldn't it be better to distinguish it based on availability of VNNI, like
> > we do for unknown family/model?
> >
> >>> Like gcc -mcpu=native needs to learn about this.
> >>
> >> I'm attaching patch that does that. Note that it's completely untested as 
> >> I don't have
> >> access to any of the new machines (Skylake server).
>
> Would be possible, the only ugly place would be in 
> libgcc/config/i386/cpuinfo.c where we
> call:
>
>   get_intel_cpu (family, model, stepping, brand_id);
>   /* Find available features. */
>   get_available_features (ecx, edx, max_level, &avx512_vnni);
>
> one would need a feature to distinguish CPU model. Do we really want that?
>
> Martin
>
> >
> >   Jakub
> >
>


Re: [PATCH] x86: Add -march=cascadelake

2018-11-29 Thread Wei Xiao
Hi

Distinguishing based on the stepping number is not recommended for two reasons:
1) Intel doesn't officially disclose stepping information in the SDM.
2) The stepping may change in the future.

We still prefer the conventional approach of distinguishing based on feature
bits.
I have refined the patch as attached according to all your suggestions.
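
As a rough sketch of the feature-bit approach: AVX512_VNNI is reported in
CPUID.(EAX=7,ECX=0):ECX bit 11, so a model 0x55 part can be classified from
that bit alone, without looking at the stepping. The macro and function names
below are hypothetical and this is not the code from the attached patch; the
two -march names are the ones used in this thread.

```c
/* AVX512_VNNI is reported in CPUID.(EAX=7,ECX=0):ECX bit 11.  */
#define LEAF7_ECX_AVX512VNNI (1u << 11)

/* Hypothetical helper: pick the -march name for a model 0x55 part
   from the ECX value of CPUID leaf 7, subleaf 0.  */
static const char *
name_for_model_0x55 (unsigned int leaf7_ecx)
{
  if (leaf7_ecx & LEAF7_ECX_AVX512VNNI)
    return "cascadelake";
  return "skylake-avx512";
}
```

The caller is assumed to have already established that the CPU is family 6,
model 0x55, and to pass in the raw ECX value obtained from CPUID.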

Wei

gcc/
* common/config/i386/i386-common.c (processor_names): Add cascadelake.
(processor_alias_table): Add cascadelake.
* config.gcc: Add -march=cascadelake.
* config/i386/driver-i386.c
(host_detect_local_cpu): Detect cascadelake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
cascadelake.
* config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
(processor_cost_table): Add cascadelake.
(get_builtin_code_for_version): Handle cascadelake.
(fold_builtin_cpu): Ditto.
* config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
(PTA_CASCADELAKE): Ditto.
* doc/extend.texi: Add cascadelake.
* doc/invoke.texi: Add -march=cascadelake.
gcc/testsuite/
* g++.target/i386/mv16.C: Handle new march.
* gcc.target/i386/builtin_target.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Ditto.
libgcc/
* config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
* config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
On Tue, Nov 27, 2018 at 6:40 PM Wei Xiao wrote:
>
> Thanks for the helpful information!
> But I'm still checking with hardware team about the
> family/model/stepping numbers for Cascadelake which are not officially
> disclosed by Intel (to my best knowledge).
>
> Wei
> On Mon, Nov 26, 2018 at 10:00 PM Martin Liška wrote:
> >
> > On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> > >>> For Cascade Lake the model number is the same as Skylake Server,
> > >>> it can only be distinguished based on the stepping (5 vs 4)
> > >>
> > >> Very interesting, probably the first time a distinguish is based on 
> > >> stepping number?
> > >
> > > Wouldn't it be better to distinguish it based on availability of VNNI, 
> > > like
> > > we do for unknown family/model?
> > >
> > >>> Like gcc -mcpu=native needs to learn about this.
> > >>
> > >> I'm attaching patch that does that. Note that it's completely untested 
> > >> as I don't have
> > >> access to any of the new machines (Skylake server).
> >
> > Would be possible, the only ugly place would be in 
> > libgcc/config/i386/cpuinfo.c where we
> > call:
> >
> >   get_intel_cpu (family, model, stepping, brand_id);
> >   /* Find available features. */
> >   get_available_features (ecx, edx, max_level, &avx512_vnni);
> >
> > one would need a feature to distinguish CPU model. Do we really want that?
> >
> > Martin
> >
> > >
> > >   Jakub
> > >
> >


cascadelake-v3.diff
Description: Binary data


[PATCH] x86: Revert patches to fix PR target/88794

2019-01-15 Thread Wei Xiao
Hi,

It turns out that the description of the fixupimm intrinsics in the Intel 64
and IA-32 Architectures Software Developer's Manual (SDM) is incorrect, so we
need to revert the 3 patches related to it: r265827, r266026 and r267160.
Sorry for the inconvenience.

Is it ok?

Wei


Re: [PATCH] x86: Revert patches to fix PR target/88794

2019-01-16 Thread Wei Xiao
> > Yes, but please test the compiler after the revert. Please also create
> > a runtime testcase out of the testcase in the PR.
Yes, we have tested it, but the current runtime testcases do not cover the
corner case that exposes the error in the SDM. We will add some after the
revert.

> For r267160, I'd expect you want to revert just the config/i386/ part and
> keep the testcases, they should work even with the changes reverted, right?
>
The testcase part also needs to be reverted, since we changed the testcases
according to the incorrect intrinsic list in the SDM.

On Tue, Jan 15, 2019 at 11:20 PM Jakub Jelinek wrote:
>
> On Tue, Jan 15, 2019 at 04:14:06PM +0100, Uros Bizjak wrote:
> > On Tue, Jan 15, 2019 at 3:40 PM Wei Xiao  wrote:
> > >
> > > Hi,
> > >
> > > It turns out that the Intel 64 and IA-32 Architectures Software Developer
> > > Manuals (SDM) description about the fixupimm intrinsic is incorrect. So 
> > > we need
> > > to revert 3 patches related to it: r265827, r266026 and r267160.
> > > Sorry for the inconvenience.
> > >
> > > Is it ok?
> >
> > Yes, but please test the compiler after the revert. Please also create
> > a runtime testcase out of the testcase in the PR.
>
> For r267160, I'd expect you want to revert just the config/i386/ part and
> keep the testcases, they should work even with the changes reverted, right?
>
> Jakub


Re: [PATCH] x86: Revert patches to fix PR target/88794

2019-01-16 Thread Wei Xiao
The original runtime testcases are incorrect and I have fixed them as attached.
Is it ok to do the revert and fix the testcases for trunk?

Wei

2019-01-16  Wei Xiao  

* gcc.target/i386/avx512f-vfixupimmpd-2.c: Fix the test cases for
VFIXUPIMM* intrinsics.
* gcc.target/i386/avx512f-vfixupimmps-2.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
* gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.


fixupimm_testcases.diff
Description: Binary data


Re: [PATCH] Readd avx512vl-vfixupimms*-2.c testcases (PR target/88489)

2019-01-17 Thread Wei Xiao
> > > For r267160, I'd expect you want to revert just the config/i386/ part and
> > > keep the testcases, they should work even with the changes reverted, 
> > > right?
> > >
> > The testcase part also need to be reverted since we have changed them
> > according to the incorrect intrinsic list in SDM.
>
> I don't really understand this.
>
> The testcases succeed just fine for me in the current trunk with all the
> reversions and test something the current state of the testsuite doesn't
> check normally, in particular that the testcases run correctly even when
> -mavx512vl is used.  As that misbehaved in the past, we should make sure we
> don't break that again.
>

You're right. The testcases need to be kept to prevent regression.

> Uros, is it ok to reapply this to current trunk?