Richard,
I've benchmarked your patch on Skylake with SPEC CPU 20[06|17][fp|int]rate
and other smaller benchmark suites. I found that it doesn't regress
any benchmark beyond noise and improves 525.x264 by 1.8%, 526.blender by 1.9% and
465.tonto by 3.2%.
I think this is a good reason to merge the p
, February 7, 2018 2:15 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; Peryt, Sebastian ;
Ivchenko, Alexander ; Kirill Yukhin
Subject: Re: [PATCH, i386] Fix ix86_multiplication_cost for SKX
On Wed, Feb 7, 2018 at 2:02 PM, Shalnov, Sergey
wrote:
> Hi,
> This patch is one of the set of patc
Hi,
This patch contains a cost model change for SKX and closes PR target/83008
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008)
It gives the following geomean performance changes:
SPEC CPU2017 intrate +0.6%
SPEC CPU2017 fprate +1.5%
SPEC 2006 [int|fp] no changes beyond noise
I found a regressi
Hi,
This patch is one of a set of patches to fix SKX costs.
I think the multiplication cost calculation needs to be adjusted in the
ix86_multiplication_cost() function in gcc/config/i386/i386.c.
For TARGET_AVX512DQ, emulation is not used and a single vpmullq instruction
is emitted.
I think we have t
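As a minimal illustration (not taken from the patch), a loop of the following
shape multiplies 64-bit elements and is the kind of code whose vector multiply
cost matters here. The function name mul_u64 is hypothetical, and whether the
vectorizer actually emits vpmullq depends on the compiler version and on
options such as -O3 -march=skylake-avx512.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: element-wise 64-bit multiply.  With AVX512DQ
   available, a vectorizer may map the multiply to a single vpmullq
   instead of an emulation sequence built from 32-bit multiplies.  */
void mul_u64(uint64_t *restrict dst, const uint64_t *restrict a,
             const uint64_t *restrict b, size_t n)
{
    /* Precondition: non-empty input requires valid pointers.  */
    assert(n == 0 || (dst && a && b));
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] * b[i];
}
```

Inspecting the assembly produced with -S is one way to check which
instruction was selected for a given compiler and option set.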
Hi,
Should we use vector instructions if the scalar and vector costs in SLP are
equal?
According to the comment already present in the source code, we should not
use vector instructions in this case.
I propose using scalar instructions when the costs are equal.
Sergey
2017-12-27 Sergey Sha
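The proposed tie-break can be sketched as follows. This is an illustration
only: slp_profitable and its parameters are hypothetical names, not GCC's
actual code, and the real cost model works on richer data than two integers.

```c
#include <stdbool.h>

/* Hypothetical helper: decide whether an SLP candidate should be
   vectorized.  A strict comparison means a tie goes to scalar code.  */
static bool slp_profitable(unsigned vec_cost, unsigned scalar_cost)
{
    return vec_cost < scalar_cost;
}
```

With a strict "<" comparison, equal costs keep the scalar version, which is
the behavior proposed in the message above.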
Hi,
I found that MODE_XI is wrongly used in movdi_internal, which causes zmm
register generation when the "-march=skylake-avx512 -mprefer-vector-width=128"
options are set. This patch fixes the mode and register type but keeps using
the AVX512 instruction set.
2017-11-28 Sergey Shalnov
gcc/
* config/i386/i386.md: Fix A
Hi,
I found that a wrong form of the vpcmpeqd instruction is generated when the
"-march=skylake-avx512 -mprefer-vector-width=128" options are set.
The following error is emitted at compile stage:
Error: invalid register operand for `vpcmpeqd'
because the following was generated:
vpcmpeqd %xmm16, %xmm
Hi,
I found that wrong ymm registers are generated when the
"-march=skylake-avx512 -mprefer-vector-width=128" options are set.
The code looks like:
movq %r11, 64(%rbx)
vpxord %ymm0, %ymm0, %ymm0
vmovdqa64 %xmm0, 32(%rbx)
movq %r11, 15584(%rbx)
where MODE_TI
: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Wednesday, November 22, 2017 9:18 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia
; Senkevich, Andrew ; Peryt,
Sebastian ; Ivchenko, Alexander
; Joseph Myers
Subject: Re: [PATCH, i386] Fix behavior for
Hi,
This patch makes the -mprefer-vector-width= option inclusive. This means that
-mprefer-vector-width=128 should set both TARGET_PREFER_AVX128=ON
and TARGET_PREFER_AVX256=ON.
It is a minor change that generates "xmm" registers with
-mprefer-vector-width=128 on a platform with "zmm" registers.
Sergey
2
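The "inclusive" behavior described above can be sketched like this. The enum
and function names are hypothetical and chosen for illustration; they are not
copied from the GCC sources.

```c
#include <stdbool.h>

/* Hypothetical representation of the -mprefer-vector-width= setting.  */
enum prefer_width { PREFER_NONE, PREFER_128, PREFER_256, PREFER_512 };

/* True only when 128-bit vectors are requested.  */
static bool prefer_avx128(enum prefer_width w)
{
    return w == PREFER_128;
}

/* Inclusive check: a 128-bit preference also implies the 256-bit limit,
   so code guarded by this predicate avoids 512-bit registers too.  */
static bool prefer_avx256(enum prefer_width w)
{
    return w == PREFER_256 || w == PREFER_128;
}
```

Under this sketch, a 128-bit preference satisfies both predicates, so any
code path that avoids zmm registers for the 256-bit preference also avoids
them for the 128-bit one.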
Uros,
Yes, please. Thank you for your proposals and comments.
Please commit as you proposed.
Sergey
-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Tuesday, November 21, 2017 6:13 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval
acceptable.
Thank you
Sergey
-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Tuesday, November 14, 2017 7:57 AM
To: Joseph Myers
Cc: Shalnov, Sergey ; gcc-patches@gcc.gnu.org;
kirill.yuk...@gmail.com; Koval, Julia ; Senkevich,
Andrew ; Peryt, Sebastian
; Ivchenko
Hi,
Modern architectures provide wider and wider vector registers. This patch
implements a common (in the i386 arch) option to set the preferred vector
register width for the vectorizer.
Currently, GCC has the "-mprefer-avx128" and "-mprefer-avx256" options to limit
the maximum vector register width in the vectorizer. To av
Hi,
This patch makes the "prefer-avx256" option the default tuning for
"skylake-avx512".
This is because 256-bit code performs better in some cases. For
Skylake Server, the Optimization Manual states the following: "Since port 0
and port 1 are 256-bits wide,
Intel AVX-512 operations that
alnov, Sergey
Cc: Jakub Jelinek ; 'gcc-patches@gcc.gnu.org'
; 'ubiz...@gmail.com' ; Senkevich,
Andrew ; Ivchenko, Alexander
; Peryt, Sebastian
Subject: Re: [PATCH, i386] Avoid 512-bit mode MOV for prefer-avx256 option in
Intel AVX512 configuration
Hello Sergey,
On 06 Oct 14:
Uros,
Is this patch (the second one, fixed the way Jakub proposed) OK for
trunk?
Could you please merge it?
Sergey
-----Original Message-----
From: Shalnov, Sergey
Sent: Friday, October 6, 2017 4:20 PM
To: Jakub Jelinek
Cc: 'gcc-patches@gcc.gnu.org' ; 'ub
case of TARGET_PREFER_AVX256.
I propose merging this patch as a temporary solution.
Sergey
-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On
Behalf Of Jakub Jelinek
Sent: Friday, October 6, 2017 11:58 AM
To: Shalnov, Sergey
Cc:
Hi,
GCC uses a full 512-bit register when moving an SF/DF value between two
registers.
The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used.
2017-10-06 Sergey Shalnov
gcc/
* config/i386/i386.md (*movsf_internal, *movdf_internal):
Avoid 512-bit AVX modes for
Sorry. The patch is changed as you proposed.
-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Thursday, September 28, 2017 3:17 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Senkevich, Andrew
; Ivchenko, Alexander
; Peryt
Hi,
GCC uses a full 512-bit register to return a constant from a function.
The patch avoids 512-bit register usage if the "-mprefer-avx256" option is used.
2017-09-28 Sergey Shalnov
gcc/
* config/i386/i386.md (*movsf_internal, *movdf_internal):
Return 256-bit AVX modes for TARGET_PREF
sage.
-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Thursday, September 21, 2017 3:54 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia
; Senkevich, Andrew ;
Ivchenko, Alexander
Subject: Re: [PATCH, i386] Avoid fixed 512-bit v
Hi,
GCC uses a full 512-bit register to hold the constant. The constant is used
later in the code, but with a 128-bit vector length.
The patch avoids using a fixed large vector length.
For this simple code:
void my_test(short *table)
{
  for (int i = 0; i < 128; ++i) {
    table[i] = -1;
  }
}
It generat
("prefer-avx256"))) works
* gcc.target/i386/avx512f-prefer.c: New test. Avoid 512bit
registers if -mprefer-avx256 used
-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Wednesday, September 20, 2017 4:25 PM
To: Shalnov, Sergey
Cc: gcc-patches@g
ros Bizjak [mailto:ubiz...@gmail.com]
Sent: Wednesday, September 20, 2017 3:51 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia
; Senkevich, Andrew
Subject: Re: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512
configuration
On Wed, Sep 20, 201
Uros,
Could you please merge the patch into mainline?
Thank you
Sergey
-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Tuesday, September 19, 2017 6:17 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; kirill.yuk...@gmail.com; Koval, Julia
; Senkevich, Andrew
tyle of changes" in previous message.
If you would like to change the "ix86_autovectorize_vector_sizes" function
algorithmically, I propose doing
this in a separate patch.
Sergey
-----Original Message-----
From: Uros Bizjak [mailto:ubiz...@gmail.com]
Sent: Monday, September 18, 20
f Of Jakub Jelinek
Sent: Thursday, September 14, 2017 2:36 PM
To: Shalnov, Sergey
Cc: gcc-patches@gcc.gnu.org; ubiz...@gmail.com; kirill.yuk...@gmail.com; Koval,
Julia ; Senkevich, Andrew
Subject: Re: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512
configuration
On Thu, Sep 1
Hi,
GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead of
256-bit AVX registers in the auto-vectorizer.
This patch enables the command line option "mprefer-avx256", which reduces
512-bit register usage in "march=skylake-avx512" mode.
This is the initial implementation of the