Fix a minor copyright year typo
Hi,

According to the commit log, this file was created in 2018, hence the copyright year should be corrected to 2018.
Is the patch below OK to fix this problem?

Thanks
Wei Xiao

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 62e32c6..416dbd1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,7 @@
+2018-09-04 Wei Xiao
+
+ * gcc/config/i386/movdirintrin.h: Update copyright year.
+
 2018-09-03 Richard Biener

 PR tree-optimization/87177
diff --git a/gcc/config/i386/movdirintrin.h b/gcc/config/i386/movdirintrin.h
index 8b4d0b3..75a5552 100644
--- a/gcc/config/i386/movdirintrin.h
+++ b/gcc/config/i386/movdirintrin.h
@@ -1,4 +1,4 @@
-/* Copyright (C) 2017 Free Software Foundation, Inc.
+/* Copyright (C) 2018 Free Software Foundation, Inc.

 This file is part of GCC.
Re: [PATCH] x86: Add -march=cascadelake
Hi Uros and other reviewers,

I'd like to split the work into 2 parts:
1) Basic processor enabling.
2) Processor type dynamic check.

Let's use a separate patch to implement part 2.
Part 1 is implemented by the attached patch.
Is it OK for trunk?

Wei

gcc/
        * common/config/i386/i386-common.c (processor_names): Add cascadelake.
        (processor_alias_table): Add cascadelake.
        * config.gcc: Add -march=cascadelake.
        * config/i386/i386-c.c (ix86_target_macros_internal): Handle cascadelake.
        * config/i386/i386.c (Add m_CASCADELAKE): New.
        (processor_cost_table): Add cascadelake.
        (get_builtin_code_for_version): Handle cascadelake.
        * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
        (PTA_CASCADELAKE): Ditto.
        * doc/invoke.texi: Add -march=cascadelake.

gcc/testsuite/
        * gcc.target/i386/funcspec-56.inc: Handle new march.

Wei Xiao wrote on Thu, Nov 29, 2018 at 4:32 PM:
>
> Hi
>
> Distinguish based on stepping number is not recommended for some reasons:
> 1) Intel doesn't officially disclose stepping information in SDM.
> 2) Stepping can be changing in the future.
>
> We still prefer the conventional distinguish approach based on feature bits.
> I have refined the patch as attached according to all your suggestions.
>
> Wei
>
> gcc/
> * common/config/i386/i386-common.c (processor_names): Add cascadelake.
> (processor_alias_table): Add cascadelake.
> * config.gcc: Add -march=cascadelake.
> * config/i386/driver-i386.c
> (host_detect_local_cpu): Detect cascadelake.
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> cascadelake.
> * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
> (processor_cost_table): Add cascadelake.
> (get_builtin_code_for_version): Handle cascadelake.
> (fold_builtin_cpu): Ditto.
> * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
> (PTA_CASCADELAKE): Ditto.
> * doc/extend.texi: Add cascadelake.
> * doc/invoke.texi: Add -march=cascadelake.
> gcc/testsuite/
> * g++.target/i386/mv16.C: Handle new march.
> * gcc.target/i386/builtin_target.c: Ditto.
> * gcc.target/i386/funcspec-56.inc: Ditto.
> libgcc/
> * config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
> * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
> Wei Xiao wrote on Tue, Nov 27, 2018 at 6:40 PM:
> >
> > Thanks for the helpful information!
> > But I'm still checking with hardware team about the
> > family/model/stepping numbers for Cascadelake which are not officially
> > disclosed by Intel (to my best knowledge).
> >
> > Wei
> > Martin Liška wrote on Mon, Nov 26, 2018 at 10:00 PM:
> > >
> > > On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > > > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> > > >>> For Cascade Lake the model number is the same as Skylake Server,
> > > >>> it can only be distinguished based on the stepping (5 vs 4)
> > > >>
> > > >> Very interesting, probably the first time a distinguish is based on
> > > >> stepping number?
> > > >
> > > > Wouldn't it be better to distinguish it based on availability of VNNI,
> > > > like we do for unknown family/model?
> > > >
> > > >>> Like gcc -mcpu=native needs to learn about this.
> > > >>
> > > >> I'm attaching patch that does that. Note that it's completely untested
> > > >> as I don't have access to any of the new machines (Skylake server).
> > >
> > > Would be possible, the only ugly place would be in
> > > libgcc/config/i386/cpuinfo.c where we call:
> > >
> > > get_intel_cpu (family, model, stepping, brand_id);
> > > /* Find available features. */
> > > get_available_features (ecx, edx, max_level, &avx512_vnni);
> > >
> > > one would need a feature to distinguish CPU model. Do we really want that?
> > >
> > > Martin
> > >
> > > >
> > > > Jakub
> > > >

cascadelake-v4.diff Description: Binary data
Re: [PATCH] x86: Add -march=cascadelake
Part 2 is implemented by the attached patch.
OK for trunk?

Wei

gcc/
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake.
        * config/i386/i386.c (fold_builtin_cpu): Handle cascadelake.
        * doc/extend.texi: Add cascadelake.

gcc/testsuite/
        * g++.target/i386/mv16.C: Handle new march.
        * gcc.target/i386/builtin_target.c: Ditto.

libgcc/
        * config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
        * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.

Uros Bizjak wrote on Thu, Dec 13, 2018 at 12:46 AM:
>
> On Wed, Dec 12, 2018 at 10:48 AM Wei Xiao wrote:
> >
> > Hi Uros and other reviewers,
> >
> > I'd like to split the work into 2 parts:
> > 1) Basic processor enabling.
> > 2) Processor type dynamic check.
> >
> > Let's use a separate patch to implement the part 2.
> > The part 1 is implemented by attached patch.
> > Is it ok for trunk?
> >
> > Wei
> >
> > gcc/
> > * common/config/i386/i386-common.c (processor_names): Add cascadelake.
> > (processor_alias_table): Add cascadelake.
> > * config.gcc: Add -march=cascadelake.
> > * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> > cascadelake.
> > * config/i386/i386.c (Add m_CASCADELAKE): New.
> > (processor_cost_table): Add cascadelake.
> > (get_builtin_code_for_version): Handle cascadelake.
> > * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
> > (PTA_CASCADELAKE): Ditto.
> > * doc/invoke.texi: Add -march=cascadelake.
> >
> > gcc/testsuite/
> > * gcc.target/i386/funcspec-56.inc: Handle new march.
>
> OK for mainline.
>
> Thanks,
> Uros.
>
> > Wei Xiao wrote on Thu, Nov 29, 2018 at 4:32 PM:
> > >
> > > Hi
> > >
> > > Distinguish based on stepping number is not recommended for some reasons:
> > > 1) Intel doesn't officially disclose stepping information in SDM.
> > > 2) Stepping can be changing in the future.
> > >
> > > We still prefer the conventional distinguish approach based on feature
> > > bits.
> > > I have refined the patch as attached according to all your suggestions.
> > >
> > > Wei
> > >
> > > gcc/
> > > * common/config/i386/i386-common.c (processor_names): Add
> > > cascadelake.
> > > (processor_alias_table): Add cascadelake.
> > > * config.gcc: Add -march=cascadelake.
> > > * config/i386/driver-i386.c
> > > (host_detect_local_cpu): Detect cascadelake.
> > > * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> > > cascadelake.
> > > * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
> > > (processor_cost_table): Add cascadelake.
> > > (get_builtin_code_for_version): Handle cascadelake.
> > > (fold_builtin_cpu): Ditto.
> > > * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE):
> > > New.
> > > (PTA_CASCADELAKE): Ditto.
> > > * doc/extend.texi: Add cascadelake.
> > > * doc/invoke.texi: Add -march=cascadelake.
> > > gcc/testsuite/
> > > * g++.target/i386/mv16.C: Handle new march.
> > > * gcc.target/i386/builtin_target.c: Ditto.
> > > * gcc.target/i386/funcspec-56.inc: Ditto.
> > > libgcc/
> > > * config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
> > > * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
> > > Wei Xiao wrote on Tue, Nov 27, 2018 at 6:40 PM:
> > > >
> > > > Thanks for the helpful information!
> > > > But I'm still checking with hardware team about the
> > > > family/model/stepping numbers for Cascadelake which are not officially
> > > > disclosed by Intel (to my best knowledge).
> > > >
> > > > Wei
> > > > Martin Liška wrote on Mon, Nov 26, 2018 at 10:00 PM:
> > > > >
> > > > > On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > > > > > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> > > > > >>> For Cascade Lake the model number is the same as Skylake Server,
> > > > > >>> it can only be distinguished based on the stepping (5 vs 4)
> > > > > >>
> > > > > >> Very interesting, probably the first time a distinguish is based
> > > > > >> on stepping number?
> > > > > >
Re: [PATCH] x86: Add -march=cascadelake
Thanks for the comments! Fixed as attached.
OK for trunk?

Jakub Jelinek wrote on Fri, Dec 14, 2018 at 6:47 PM:
>
> On Fri, Dec 14, 2018 at 06:33:37PM +0800, Wei Xiao wrote:
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -832,8 +832,16 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>            cpu = "skylake";
>            break;
>          case 0x55:
> -          /* Skylake with AVX-512. */
> -          cpu = "skylake-avx512";
> +          if (has_avx512vnni)
> +          {
> +            /* Cascade Lake. */
> +            cpu = "cascadelake";
> +          }
> +          else
> +          {
> +            /* Skylake with AVX-512. */
> +            cpu = "skylake-avx512";
> +          }
>            break;
>
> Just a formatting nit here, if {}s are used, they should be indented
> 2 columns to the right from the if or else and the body of {} should
> be indented by two further columns over {.
> But, in this case, there is another rule, that if the body has a single
> statement, then there shouldn't be {}s around it. Thus just:
>           if (has_avx512vnni)
>             /* Cascade Lake. */
>             cpu = "cascadelake";
>           else
>             /* Skylake with AVX-512. */
>             cpu = "skylake-avx512";
>
> Jakub

cascadelake-v6.diff Description: Binary data
[PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM
Hi,

The attached patch updates VFIXUPIMM* Intrinsics to align with the latest Intel® 64 and IA-32 Architectures Software Developer’s Manual (SDM).
Tested with GCC regression test on x86, no regression.

Is it ok?

Thanks
Wei

gcc/
2018-10-30 Wei Xiao

        *config/i386/avx512fintrin.h: Update VFIXUPIMM* intrinsics.
        (_mm512_fixupimm_round_pd): Update parameters and builtin.
        (_mm512_maskz_fixupimm_round_pd): Ditto.
        (_mm512_fixupimm_round_ps): Ditto.
        (_mm512_maskz_fixupimm_round_ps): Ditto.
        (_mm_fixupimm_round_sd): Ditto.
        (_mm_maskz_fixupimm_round_sd): Ditto.
        (_mm_fixupimm_round_ss): Ditto.
        (_mm_maskz_fixupimm_round_ss): Ditto.
        (_mm512_fixupimm_pd): Ditto.
        (_mm512_maskz_fixupimm_pd): Ditto.
        (_mm512_fixupimm_ps): Ditto.
        (_mm512_maskz_fixupimm_ps): Ditto.
        (_mm_fixupimm_sd): Ditto.
        (_mm_maskz_fixupimm_sd): Ditto.
        (_mm_fixupimm_ss): Ditto.
        (_mm_maskz_fixupimm_ss): Ditto.
        (_mm512_mask_fixupimm_round_pd): Update builtin.
        (_mm512_mask_fixupimm_round_ps): Ditto.
        (_mm_mask_fixupimm_round_sd): Ditto.
        (_mm_mask_fixupimm_round_ss): Ditto.
        (_mm512_mask_fixupimm_pd): Ditto.
        (_mm512_mask_fixupimm_ps): Ditto.
        (_mm_mask_fixupimm_sd): Ditto.
        (_mm_mask_fixupimm_ss): Ditto.
        *config/i386/avx512vlintrin.h:
        (_mm256_fixupimm_pd): Update parameters and builtin.
        (_mm256_maskz_fixupimm_pd): Ditto.
        (_mm256_fixupimm_ps): Ditto.
        (_mm256_maskz_fixupimm_ps): Ditto.
        (_mm_fixupimm_pd): Ditto.
        (_mm_maskz_fixupimm_pd): Ditto.
        (_mm_fixupimm_ps): Ditto.
        (_mm_maskz_fixupimm_ps): Ditto.
        (_mm256_mask_fixupimm_pd): Update builtin.
        (_mm256_mask_fixupimm_ps): Ditto.
        (_mm_mask_fixupimm_pd): Ditto.
        (_mm_mask_fixupimm_ps): Ditto.
        *config/i386/i386-builtin-types.def: Add new builtin types.
        *config/i386/i386-builtin.def: Update builtin definitions.
        *config/i386/i386.c: Handle new builtin types.
        *config/i386/sse.md: Update VFIXUPIMM* patterns.
        (_fixupimm_maskz): Update.
        (_fixupimm): Update.
        (_fixupimm_mask): Update.
        (avx512f_sfixupimm_maskz): Update.
        (avx512f_sfixupimm): Update.
        (avx512f_sfixupimm_mask): Update.
        *config/i386/subst.md:
        (round_saeonly_sd_mask_operand4): Add new subst_attr.
        (round_saeonly_sd_mask_op4): Ditto.
        (round_saeonly_expand_operand5): Ditto.
        (round_saeonly_expand): Update.

gcc/testsuite
2018-10-30 Wei Xiao

        *gcc.target/i386/avx-1.c: Update tests for VFIXUPIMM* intrinsics.
        *gcc.target/i386/avx512f-vfixupimmpd-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmpd-2.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmps-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmsd-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmss-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.
        *gcc.target/i386/avx512vl-vfixupimmpd-1.c: Ditto.
        *gcc.target/i386/avx512vl-vfixupimmps-1.c: Ditto.
        *gcc.target/i386/sse-13.c: Ditto.
        *gcc.target/i386/sse-14.c: Ditto.
        *gcc.target/i386/sse-22.c: Ditto.
        *gcc.target/i386/sse-23.c: Ditto.
        *gcc.target/i386/testimm-10.c: Ditto.
        *gcc.target/i386/testround-1.c: Ditto.

update-vfixupimm.diff Description: Binary data
Re: [PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM
Hi Uros and HJ,

I have updated the patch according to your remarks as attached.
Ok for trunk?

Thanks
Wei

gcc/
2018-11-2 Wei Xiao

        *config/i386/avx512fintrin.h: Update VFIXUPIMM* intrinsics.
        (_mm512_fixupimm_round_pd): Update parameters and builtin.
        (_mm512_maskz_fixupimm_round_pd): Ditto.
        (_mm512_fixupimm_round_ps): Ditto.
        (_mm512_maskz_fixupimm_round_ps): Ditto.
        (_mm_fixupimm_round_sd): Ditto.
        (_mm_maskz_fixupimm_round_sd): Ditto.
        (_mm_fixupimm_round_ss): Ditto.
        (_mm_maskz_fixupimm_round_ss): Ditto.
        (_mm512_fixupimm_pd): Ditto.
        (_mm512_maskz_fixupimm_pd): Ditto.
        (_mm512_fixupimm_ps): Ditto.
        (_mm512_maskz_fixupimm_ps): Ditto.
        (_mm_fixupimm_sd): Ditto.
        (_mm_maskz_fixupimm_sd): Ditto.
        (_mm_fixupimm_ss): Ditto.
        (_mm_maskz_fixupimm_ss): Ditto.
        (_mm512_mask_fixupimm_round_pd): Update builtin.
        (_mm512_mask_fixupimm_round_ps): Ditto.
        (_mm_mask_fixupimm_round_sd): Ditto.
        (_mm_mask_fixupimm_round_ss): Ditto.
        (_mm512_mask_fixupimm_pd): Ditto.
        (_mm512_mask_fixupimm_ps): Ditto.
        (_mm_mask_fixupimm_sd): Ditto.
        (_mm_mask_fixupimm_ss): Ditto.
        *config/i386/avx512vlintrin.h:
        (_mm256_fixupimm_pd): Update parameters and builtin.
        (_mm256_maskz_fixupimm_pd): Ditto.
        (_mm256_fixupimm_ps): Ditto.
        (_mm256_maskz_fixupimm_ps): Ditto.
        (_mm_fixupimm_pd): Ditto.
        (_mm_maskz_fixupimm_pd): Ditto.
        (_mm_fixupimm_ps): Ditto.
        (_mm_maskz_fixupimm_ps): Ditto.
        (_mm256_mask_fixupimm_pd): Update builtin.
        (_mm256_mask_fixupimm_ps): Ditto.
        (_mm_mask_fixupimm_pd): Ditto.
        (_mm_mask_fixupimm_ps): Ditto.
        *config/i386/i386-builtin-types.def: Add new types and remove useless ones.
        *config/i386/i386-builtin.def: Update builtin definitions.
        *config/i386/i386.c: Handle new builtin types and remove useless ones.
        *config/i386/sse.md: Update VFIXUPIMM* patterns.
        (_fixupimm_maskz): Update.
        (_fixupimm): Update.
        (_fixupimm_mask): Update.
        (avx512f_sfixupimm_maskz): Update.
        (avx512f_sfixupimm): Update.
        (avx512f_sfixupimm_mask): Update.
        *config/i386/subst.md:
        (round_saeonly_sd_mask_operand4): Add new subst_attr.
        (round_saeonly_sd_mask_op4): Ditto.
        (round_saeonly_expand_operand5): Ditto.
        (round_saeonly_expand): Update.

gcc/testsuite
2018-11-2 Wei Xiao

        *gcc.target/i386/avx-1.c: Update tests for VFIXUPIMM* intrinsics.
        *gcc.target/i386/avx512f-vfixupimmpd-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmpd-2.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmps-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmsd-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmss-1.c: Ditto.
        *gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.
        *gcc.target/i386/avx512vl-vfixupimmpd-1.c: Ditto.
        *gcc.target/i386/avx512vl-vfixupimmps-1.c: Ditto.
        *gcc.target/i386/sse-13.c: Ditto.
        *gcc.target/i386/sse-14.c: Ditto.
        *gcc.target/i386/sse-22.c: Ditto.
        *gcc.target/i386/sse-23.c: Ditto.
        *gcc.target/i386/testimm-10.c: Ditto.
        *gcc.target/i386/testround-1.c: Ditto.

Uros Bizjak wrote on Fri, Nov 2, 2018 at 1:27 AM:
>
> On Tue, Oct 30, 2018 at 10:12 AM Wei Xiao wrote:
> >
> > Hi,
> >
> > The attached patch updates VFIXUPIMM* Intrinsics to align with the
> > latest Intel® 64 and IA-32 Architectures Software Developer’s Manual
> > (SDM).
> > Tested with GCC regression test on x86, no regression.
>
> A couple of remarks:
>
> -_mm512_fixupimm_round_pd (__m512d __A, __m512d __B, __m512i __C,
> +_mm512_fixupimm_round_pd (__m512d __B, __m512i __C,
>
> _mm512_mask_fixupimm_round_pd (__m512d __A, __mmask8 __U, __m512d __B,
> __m512i __C, const int __imm, const int __R)
>
> Some kind of the convention in avx512fintrin.h is that arguments are
> named like this:
>
> [ __m512. __W,] __mmask. __U, __m512x __A, __m512x __B, ..., const int
> _imm, const int __R]. Can we please keep the same approach here? I'
> mostly concerned that argument names don't start with __A.
>
> -BDESC (OPTION_MASK_ISA_AVX512VL, CODE_FOR_avx512vl_fixupimmv4df_mask,
> "__builtin_ia32_fixupimmpd256_mask", IX86_BUILTIN_FIXUPIMMPD256_MASK,
> UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI)
> ...
>
> You are removing the only users of e.g.
> V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI (and other definitions). If there
> are no users left, can you also remove the relevant definitions?
>
> > Is it ok?
>
> Please repost the patch with above remarks addressed. These builtins
> are mostly Intel affair, s
Re: [PATCH] x86: Update VFIXUPIMM* Intrinsics to align with the latest Intel SDM
> Please also rename these:
>
> _mm512_mask_fixupimm_round_pd (__m512d __A, __mmask8 __U, __m512d __B,
> __m512i __C, const int __imm, const int __R)
>
> _mm512_mask_fixupimm_round_ps (__m512 __A, __mmask16 __U, __m512 __B,
> __m512i __C, const int __imm, const int __R)
>
> _mm_mask_fixupimm_round_sd (__m128d __A, __mmask8 __U, __m128d __B,
> __m128i __C, const int __imm, const int __R)
>
> _mm_mask_fixupimm_round_ss (__m128 __A, __mmask8 __U, __m128 __B,
> __m128i __C, const int __imm, const int __R)
>
> _mm512_mask_fixupimm_pd (__m512d __A, __mmask8 __U, __m512d __B,
> __m512i __C, const int __imm)
>
> _mm512_mask_fixupimm_ps (__m512 __A, __mmask16 __U, __m512 __B,
> __m512i __C, const int __imm)
>
> _mm_mask_fixupimm_sd (__m128d __A, __mmask8 __U, __m128d __B,
> __m128i __C, const int __imm)
>
> _mm_mask_fixupimm_ss (__m128 __A, __mmask8 __U, __m128 __B,
> __m128i __C, const int __imm)
>
> _mm256_mask_fixupimm_pd (__m256d __A, __mmask8 __U, __m256d __B,
> __m256i __C, const int __imm)
>
> _mm256_mask_fixupimm_ps (__m256 __A, __mmask8 __U, __m256 __B,
> __m256i __C, const int __imm)
>
> _mm_mask_fixupimm_pd (__m128d __A, __mmask8 __U, __m128d __B,
> __m128i __C, const int __imm)
>
> _mm_mask_fixupimm_ps (__m128 __A, __mmask8 __U, __m128 __B,
> __m128i __C, const int __imm)
>
> Uros.

As attached, I have renamed the above intrinsics according to the aforementioned convention:

[ __m512. __W,] __mmask. __U, __m512x __A, __m512x __B, ..., const int _imm, const int __R].

Wei

update-vfixupimm-v3.diff Description: Binary data
[PATCH] x86: Optimize VFIXUPIMM* patterns with multiple-alternative constraints
Hi maintainers,

The attached patch optimizes the VFIXUPIMM* patterns with multiple-alternative constraints, combining 4 patterns into 2.
Tested with bootstrap and regression tests on x86_64. No regressions.

Is it OK for trunk?

Thanks,
Wei

opt-vfixupimm-v1.diff Description: Binary data
Re: [PATCH] x86: Optimize VFIXUPIMM* patterns with multiple-alternative constraints
Hi Uros,

Thanks for the remarks! I have improved the patch as attached to address the issues you mentioned:
1. No changes to substs any more.
2. Adopt the established approach (e.g. "rcp14") to handle zero masks.

I'd like to explain our motivation for combining the vfixupimm patterns: in the future there will be many new x86 instructions with both masking and rounding, like vfixupimm, but we still want to keep the x86 MD as short as possible and don't want to write 2 patterns for each of these new instructions, which would also raise the code review cost for maintainers. Through this patch, we want to make sure the new pattern paradigm is OK with the x86 maintainers.

Wei

Uros Bizjak wrote on Wed, Nov 7, 2018 at 3:24 PM:
>
> On Tue, Nov 6, 2018 at 11:16 AM Wei Xiao wrote:
> >
> > Hi maintainers,
> >
> > The attached patch intends to optimize VFIXUPIMM* patterns with
> > multiple-alternative constraints and
> > 4 patterns are combined into 2 patterns. Tested with bootstrap and
> > regression tests on x86_64. No regressions.
> >
> > Is it OK for trunk?
>
> I'm not convinced that this particular optimization is a good idea.
> Looking at the patch, you have to add a whole bunch of substs just to
> merge two pattern sets. Also, the approach diverges from established
> approach of handling zero masks. The later raises maintenance costs
> for no compelling reason.
>
> I'd say to leave these patterns the way they are.
>
> Uros.

combine-vfixupimm-v2.diff Description: Binary data
[PATCH] x86: Add -march=cascadelake
Hi,

The attached patch adds -march=cascadelake for x86.
Tested with bootstrap and regression tests on x86_64. No regressions.
Is it OK for trunk?

Wei

gcc/
        * common/config/i386/i386-common.c (processor_names): Add cascadelake.
        (processor_alias_table): Add cascadelake.
        * config.gcc: Add -march=cascadelake.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake.
        * config/i386/i386-c.c (ix86_target_macros_internal): Handle cascadelake.
        * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
        (processor_cost_table): Add cascadelake.
        (get_builtin_code_for_version): Handle cascadelake.
        (fold_builtin_cpu): Ditto.
        * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
        (PTA_CASCADELAKE): Ditto.
        * doc/invoke.texi: Add -march=cascadelake.

gcc/testsuite/
        * g++.target/i386/mv16.C: Handle new march.
        * gcc.target/i386/funcspec-56.inc: Ditto.

libgcc/
        * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.

cascadelake.diff Description: Binary data
Re: [PATCH] x86: Add -march=cascadelake
Jakub,

Thanks for the comments! I have addressed them as attached.

Wei

gcc/
        * common/config/i386/i386-common.c (processor_names): Add cascadelake.
        (processor_alias_table): Add cascadelake.
        * config.gcc: Add -march=cascadelake.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake.
        * config/i386/i386-c.c (ix86_target_macros_internal): Handle cascadelake.
        * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
        (processor_cost_table): Add cascadelake.
        (get_builtin_code_for_version): Handle cascadelake.
        (fold_builtin_cpu): Ditto.
        * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
        (PTA_CASCADELAKE): Ditto.
        * doc/invoke.texi: Add -march=cascadelake.

gcc/testsuite/
        * g++.target/i386/mv16.C: Handle new march.
        * gcc.target/i386/funcspec-56.inc: Ditto.

libgcc/
        * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.

Jakub Jelinek wrote on Wed, Nov 21, 2018 at 7:09 PM:
>
> On Wed, Nov 21, 2018 at 06:23:41PM +0800, Wei Xiao wrote:
> > The attached patch added -march=cascadelake for x86.
> > Tested with bootstrap and regression tests on x86_64. No regressions.
> > Is it ok for trunk?
>
> Not a real review, just nits:
>
> index bff4dfb..f7c1c98 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2018-11-21 Wei Xiao
>
> Two spaces after date, two spaces before <.
>
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -857,6 +857,9 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>            /* Assume Ice Lake. */
>            else if (has_gfni)
>              cpu = "icelake-client";
> +          /* Assume Cascade Lake. */
> +          if (has_avx512vnni)
> +            cpu = "cascadelake";
>            /* Assume Cannon Lake. */
>            else if (has_avx512vbmi)
>              cpu = "cannonlake";
>
> Doesn't this break handling of all the other CPUs? I mean, it is a large
> if (cond) ... else if (cond) ... else if (cond) ... else ...
> but you've added if without else before it into the middle.
>
> Jakub

cascadelake-v2.diff Description: Binary data
Re: [PATCH] x86: Add -march=cascadelake
Thanks for the helpful information!
But I'm still checking with the hardware team about the family/model/stepping numbers for Cascade Lake, which are not officially disclosed by Intel (to the best of my knowledge).

Wei

Martin Liška wrote on Mon, Nov 26, 2018 at 10:00 PM:
>
> On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> >>> For Cascade Lake the model number is the same as Skylake Server,
> >>> it can only be distinguished based on the stepping (5 vs 4)
> >>
> >> Very interesting, probably the first time a distinguish is based on
> >> stepping number?
> >
> > Wouldn't it be better to distinguish it based on availability of VNNI, like
> > we do for unknown family/model?
> >
> >>> Like gcc -mcpu=native needs to learn about this.
> >>
> >> I'm attaching patch that does that. Note that it's completely untested as
> >> I don't have access to any of the new machines (Skylake server).
>
> Would be possible, the only ugly place would be in
> libgcc/config/i386/cpuinfo.c where we call:
>
> get_intel_cpu (family, model, stepping, brand_id);
> /* Find available features. */
> get_available_features (ecx, edx, max_level, &avx512_vnni);
>
> one would need a feature to distinguish CPU model. Do we really want that?
>
> Martin
>
> >
> > Jakub
> >
Re: [PATCH] x86: Add -march=cascadelake
Hi,

Distinguishing based on the stepping number is not recommended, for two reasons:
1) Intel doesn't officially disclose stepping information in the SDM.
2) The stepping can change in the future.

We still prefer the conventional approach of distinguishing based on feature bits.
I have refined the patch as attached according to all your suggestions.

Wei

gcc/
        * common/config/i386/i386-common.c (processor_names): Add cascadelake.
        (processor_alias_table): Add cascadelake.
        * config.gcc: Add -march=cascadelake.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake.
        * config/i386/i386-c.c (ix86_target_macros_internal): Handle cascadelake.
        * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
        (processor_cost_table): Add cascadelake.
        (get_builtin_code_for_version): Handle cascadelake.
        (fold_builtin_cpu): Ditto.
        * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
        (PTA_CASCADELAKE): Ditto.
        * doc/extend.texi: Add cascadelake.
        * doc/invoke.texi: Add -march=cascadelake.

gcc/testsuite/
        * g++.target/i386/mv16.C: Handle new march.
        * gcc.target/i386/builtin_target.c: Ditto.
        * gcc.target/i386/funcspec-56.inc: Ditto.

libgcc/
        * config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
        * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.

Wei Xiao wrote on Tue, Nov 27, 2018 at 6:40 PM:
>
> Thanks for the helpful information!
> But I'm still checking with hardware team about the
> family/model/stepping numbers for Cascadelake which are not officially
> disclosed by Intel (to my best knowledge).
>
> Wei
> Martin Liška wrote on Mon, Nov 26, 2018 at 10:00 PM:
> >
> > On 11/26/18 12:18 PM, Jakub Jelinek wrote:
> > > On Mon, Nov 26, 2018 at 12:03:53PM +0100, Martin Liška wrote:
> > >>> For Cascade Lake the model number is the same as Skylake Server,
> > >>> it can only be distinguished based on the stepping (5 vs 4)
> > >>
> > >> Very interesting, probably the first time a distinguish is based on
> > >> stepping number?
> > >
> > > Wouldn't it be better to distinguish it based on availability of VNNI,
> > > like we do for unknown family/model?
> > >
> > >>> Like gcc -mcpu=native needs to learn about this.
> > >>
> > >> I'm attaching patch that does that. Note that it's completely untested
> > >> as I don't have access to any of the new machines (Skylake server).
> >
> > Would be possible, the only ugly place would be in
> > libgcc/config/i386/cpuinfo.c where we call:
> >
> > get_intel_cpu (family, model, stepping, brand_id);
> > /* Find available features. */
> > get_available_features (ecx, edx, max_level, &avx512_vnni);
> >
> > one would need a feature to distinguish CPU model. Do we really want that?
> >
> > Martin
> >
> > >
> > > Jakub
> > >

cascadelake-v3.diff Description: Binary data
[PATCH] x86: Revert patches to fix PR target/88794
Hi,

It turns out that the description of the fixupimm intrinsic in the Intel 64 and IA-32 Architectures Software Developer's Manual (SDM) is incorrect. So we need to revert the 3 patches related to it: r265827, r266026 and r267160.
Sorry for the inconvenience.

Is it OK?

Wei
Re: [PATCH] x86: Revert patches to fix PR target/88794
> > Yes, but please test the compiler after the revert. Please also create
> > a runtime testcase out of the testcase in the PR.

Yes, we have tested it, but the current runtime testcases can't cover the corner case that exposes the incorrectness of the SDM. We will add some after the revert.

> For r267160, I'd expect you want to revert just the config/i386/ part and
> keep the testcases, they should work even with the changes reverted, right?
>

The testcase part also needs to be reverted, since we changed the testcases according to the incorrect intrinsic list in the SDM.

Jakub Jelinek wrote on Tue, Jan 15, 2019 at 11:20 PM:
>
> On Tue, Jan 15, 2019 at 04:14:06PM +0100, Uros Bizjak wrote:
> > On Tue, Jan 15, 2019 at 3:40 PM Wei Xiao wrote:
> > >
> > > Hi,
> > >
> > > It turns out that the Intel 64 and IA-32 Architectures Software Developer
> > > Manuals (SDM) description about the fixupimm intrinsic is incorrect. So
> > > we need
> > > to revert 3 patches related to it: r265827, r266026 and r267160.
> > > Sorry for the inconvenience.
> > >
> > > Is it ok?
> >
> > Yes, but please test the compiler after the revert. Please also create
> > a runtime testcase out of the testcase in the PR.
>
> For r267160, I'd expect you want to revert just the config/i386/ part and
> keep the testcases, they should work even with the changes reverted, right?
>
> Jakub
Re: [PATCH] x86: Revert patches to fix PR target/88794
The original runtime testcases are incorrect and I have fixed them as attached.
Is it OK to do the revert and fix the testcases for trunk?

Wei

2019-01-16 Wei Xiao

        * gcc.target/i386/avx512f-vfixupimmpd-2.c: Fix the test cases for VFIXUPIMM* intrinsics.
        * gcc.target/i386/avx512f-vfixupimmps-2.c: Ditto.
        * gcc.target/i386/avx512f-vfixupimmsd-2.c: Ditto.
        * gcc.target/i386/avx512f-vfixupimmss-2.c: Ditto.

fixupimm_testcases.diff Description: Binary data
Re: [PATCH] Read avx512vl-vfixupimms*-2.c testcases (PR target/88489)
> > > For r267160, I'd expect you want to revert just the config/i386/ part and
> > > keep the testcases, they should work even with the changes reverted,
> > > right?
> > >
> > The testcase part also need to be reverted since we have changed them
> > according to the incorrect intrinsic list in SDM.
>
> I don't really understand this.
>
> The testcases succeed just fine for me in the current trunk with all the
> reversions and test something the current state of the testsuite doesn't
> check normally, in particular that the testcases run correctly even when
> -mavx512vl is used. As that misbehaved in the past, we should make sure we
> don't break that again.
>

You're right. The testcases need to be kept to prevent regression.

> Uros, is it ok to reapply this to current trunk?