date:20190107

Re: Enabling vectorization at -O2 for x86 generic, core and zen tuning

2019-01-07 Thread Richard Biener

On Sun, 6 Jan 2019, Jan Hubicka wrote:

> Hello,
> while running benchmarks for inliner tuning I also run benchmarks
> comparing -O2 and -O2 -ftree-vectorize -ftree-slp-vectorize using Martin
> Liska's LNT setup (https://lnt.opensuse.org/).  The results are
> summarized below but you can also see the colorful table produced
> by Martin's LNT magic
> 
> https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?num_runs=3&min_percentage_change=0.02&revisions=746f%2C55f&fbclid=IwAR1EhvEnavV5Fg5g404cTrguOXG2cW7b3mRZZvtYn1qy93zihyAanZ7AiWQ
> https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?num_runs=10&min_percentage_change=0.02&revisions=746f%2C55f
> 
> Overall we got following SPECrate improvements:
> 
>  SPECfp2k6   kabylake generic  +7.15%
>  SPECfp2k6   kabylake native   +9.36%
>  SPECfp2k17  kabylake generic  +5.36%
>  SPECfp2k17  kabylake native   +6.03%
>  SPECint2k17 kabylake generic  +4.13%
> 
>  SPECfp2k6   zen  generic  +9.98%
>  SPECfp2k6   zen  native   +7.04%
>  SPECfp2k17  zen  generic  +6.11%
>  SPECfp2k17  zen  native   +5.46%
>  SPECint2k17 zen  generic  +3.61%
>  SPECint2k17 zen  native   +5.18%
> 
> The performance results seem surprisingly strongly in favor of
> vectorization.  Martin's setup also checks code size, which goes up
> by as much as 26% on leslie3d, but since many of the benchmarks are small,
> this is not very representative of the overall code size/compile time costs
> of vectorization.
> 
> I measured compile time/size on larger programs I have available with
> notable changes on DealII, but otherwise sub 1% increases.  I also
> benchmarked Firefox but there are no significant differences because
> build system already uses -O3 for places where it matters (graphics
> library etc.)

Well, just as the compile-time/size numbers of SPEC are not representative,
neither are the performance improvements.

>                     Compile time   Code segment size
> Firefox mainline    in noise       0.8%
> gcc from spec2k6    0.5%           0.6%
> gdb                 0.8%           0.3%
> crafty              0%             0%
> DealII              3.2%           4%
> 
> Note that I benchmarked -ftree-slp-vectorize separately before and
> results were hit/miss, so perhaps enabling only -ftree-vectorize would
> give better compile time tradeoffs. I was worried about partial memory
> stalls, but I will benchmark it and also benchmark difference between
> cost models.
>
> There are some performance regressions, most notably in SPEC
>  - exchange (all settings),
>  - gamess (all settings),
>  - calculix (Zen native only),
>  - bwaves (zen native) 
> and induct2 on all settings and ffft2 zen only from Polyhedron. Botan
> seems very noisy, but it is rather special code.
> 
> Exchange can be fixed by adding a heuristic that it is a bad idea to
> vectorize within a loop nest of 10 containing a recursive call. I believe
> gamess and calculix are understood and I can look into the remaining
> cases.
> 
> Overall I am surprised how many improvements vectorization at -O2 can bring
> - clearly more parallel CPUs depend on it.  In my experience
> from analyzing regressions of gcc -O2 compared to clang -O2 builds,
> vectorization is one of the most common reasons. Having gcc -O2 produce
> lower SPEC scores and comparably large binaries to clang -O2 does not
> feel OK and I think the problem is not limited just to artificial
> benchmarks.
> 
> Even though it is late in the release cycle I wonder if we can do that for
> GCC 9?  Performance of vectorization is very architecture specific, so I
> would propose enabling vectorization for Zen, Core-based chips and
> generic in x86-64. I can also run benchmarks on Bulldozer. I can then
> tune down the cheap model to avoid some of the more expensive
> transformations.

I'd rather not do this now, it's _way_ too late (also considering
you are again doing inliner tuning so late).

See our last attempts at this btw.

Richard.
 
> Honza
> 
> 
> Kabylake Spec2k6, generic tuning
> 
>   improvements:
> SPEC2006/FP/481.wrf        -31.33%
> SPEC2006/FP/436.cactusADM  -28.17%
> SPEC2006/FP/437.leslie3d   -17.21%
> SPEC2006/FP/434.zeusmp     -12.90%
> SPEC2006/FP/454.calculix   -6.44%
> SPEC2006/FP/433.milc       -6.03%
> SPEC2006/FP/459.GemsFDTD   -4.65%
> SPEC2006/FP/450.soplex     -2.11%
> SPEC2006/INT/403.gcc       -6.54%
> SPEC2006/INT/456.hmmer     -5.45%
> SPEC2006/INT/464.h264ref   -2.23%
>   regressions:
> SPEC2006/FP/416.gamess     8.51%
> SPEC2006/FP/447.dealII     2.73%
> 
> Kabylake spec2k6 -march=native
> 
>   improvements:
> SPEC2006/FP/436.cactusADM  -45.52%
> SPEC2006/FP/481.wrf        -34.13%
> SPEC2006/FP/434.zeusmp     -20.25%
> SPEC2006/FP/437.leslie3d   -1

Re:4G WiFi cameras

2019-01-07 Thread Janson Zhan




4G camera !!!
Hi dear gcc,

How are you?

This is ZYsecurity co.,ltd new 4G Wireless WiFi Bullet cameras.
Support:
* Audio/SD card/Wireless WiFi/4G/Reset bullon/APP:CamHi

If you have interested in this products welcome to contact me to get more 
details.

Looking forward to hear from you soon.

Thanks & Regards

Janson Zhan




Re: Enabling vectorization at -O2 for x86 generic, core and zen tuning

2019-01-07 Thread Eric Botcazou

> Note that I benchmarked -ftree-slp-vectorize separately before and
> results were hit/miss, so perhaps enabling only -ftree-vectorize would
> give better compile time tradeoffs. I was worried about partial memory
> stalls, but I will benchmark it and also benchmark difference between
> cost models.

; Alias to enable both -ftree-loop-vectorize and -ftree-slp-vectorize.
ftree-vectorize
Common Report Optimization
Enable vectorization on trees.

-- 
Eric Botcazou
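
The options excerpt above distinguishes two vectorizers that -ftree-vectorize
turns on together. As a rough illustration only (the function names below are
hypothetical and not taken from any benchmark in this thread), the first
function is the kind of loop -ftree-loop-vectorize targets, and the second is
the kind of straight-line code -ftree-slp-vectorize targets. Whether either
is actually vectorized depends on the GCC version, the target and the cost
model.

```c
#include <stddef.h>

/* Loop vectorizer candidate: iterations are independent, and restrict
   tells the compiler that x and y do not overlap.  */
void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* SLP vectorizer candidate: four isomorphic statements on adjacent
   elements that could be combined into one vector add.  */
void add4(float *restrict out, const float *restrict a,
          const float *restrict b)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```

Compiling once with "gcc -O2 -ftree-loop-vectorize -fopt-info-vec -c" and
once with "gcc -O2 -ftree-slp-vectorize -fopt-info-vec -c" should report
which of the two, if any, each pass picked up.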

Re: Enabling vectorization at -O2 for x86 generic, core and zen tuning

2019-01-07 Thread Jan Hubicka

> > Note that I benchmarked -ftree-slp-vectorize separately before and
> > results were hit/miss, so perhaps enabling only -ftree-vectorize would
> > give better compile time tradeoffs. I was worried about partial memory
> > stalls, but I will benchmark it and also benchmark difference between
> > cost models.
> 
> ; Alias to enable both -ftree-loop-vectorize and -ftree-slp-vectorize.
> ftree-vectorize
> Common Report Optimization
> Enable vectorization on trees.

Thanks! I would probably have fallen into that trap and run the same set of
benchmarks again.

Honza
> 
> -- 
> Eric Botcazou

Re: Enabling vectorization at -O2 for x86 generic, core and zen tuning

2019-01-07 Thread Segher Boessenkool

On Mon, Jan 07, 2019 at 09:29:09AM +0100, Richard Biener wrote:
> On Sun, 6 Jan 2019, Jan Hubicka wrote:
> > Even though it is late in release cycle I wonder if we can do that for
> > GCC 9?  Performance of vectorization is very architecture specific, I
> > would propose enabling vectorization for Zen, core based chips and
> > generic in x86-64. I can also run benchmarks on Bulldozer. I can then
> > tune down the cheap model to avoid some of more expensive
> > transformations.
> 
> I'd rather not do this now, it's _way_ too late (also considering
> you are again doing inliner tuning so late).

This probably should be more generic than just x86 really, we have similar
problems on Power (-O3 is almost always faster than -O2, which is bad).
Likely other archs have the same problems.

But yes, too late for GCC 9.


Segher

Re: Enabling vectorization at -O2 for x86 generic, core and zen tuning

2019-01-07 Thread Jan Hubicka

> On Mon, Jan 07, 2019 at 09:29:09AM +0100, Richard Biener wrote:
> > On Sun, 6 Jan 2019, Jan Hubicka wrote:
> > > Even though it is late in release cycle I wonder if we can do that for
> > > GCC 9?  Performance of vectorization is very architecture specific, I
> > > would propose enabling vectorization for Zen, core based chips and
> > > generic in x86-64. I can also run benchmarks on Bulldozer. I can then
> > > tune down the cheap model to avoid some of more expensive
> > > transformations.
> > 
> > I'd rather not do this now, it's _way_ too late (also considering
> > you are again doing inliner tuning so late).
> 
> This probably should be more generic than just x86 really, we have similar
> problems on Power (-O3 is almost always faster than -O2, which is bad).
> Likely other archs have the same problems.
> 
> But yes, too late for GCC 9.

Yep, I guessed so, still wanted to ask :)
I think this is similar to schedule-insns(2), where it is subtarget specific
whether it is a win or not. So I think it is good to leave it up to the target
to enable the pass - we probably have fewer targets that do want
vectorization than those that don't.

I would suggest enabling it on x86 early next stage1 and try to do
similar benchmarks on ppc and arm.  We can then try to tune the code
size/speed tradeoffs.

Honza
> 
> 
> Segher

GCC 9 Status report (2019-01-07), trunk in regression and documentation fixes mode

2019-01-07 Thread Richard Biener



Status
======

Stage 3 is done now.

Changes of GCC trunk should now be restricted to regression and documentation
fixes.  That is, it is in the same mode as the open release branches we have.
As soon as the count of P1 bugs drops to zero (and un-categorized, aka P3
bugs have been categorized) you can expect trunk to branch and stage 1 open
for general development of GCC 10.  Do not hold your breath though, history
suggests you'll have to wait until mid of April for that to happen.

You can make it happen faster by fixing regressions.

Please also give your favorite target production-level quality testing
and make sure to file bugs about regressions you encounter.


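Quality Data
============

Priority          #   Change from GCC 8 stage3 -> stage4 transition
--------        ---   ---------------------------------------------
P1               42   +   6
P2              187   +  54
P3               47   -  10
P4              182   +  24
P5               25   -   2
--------        ---   ---------------------------------------------
Total P1-P3     276   +  50
Total           483   +  72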


Previous Report
===============

https://gcc.gnu.org/ml/gcc/2018-11/msg00067.html

Patch Resend

2019-01-07 Thread nick

Greetings All,

I was wondering as I sent a patch before the holidays if I should resend it 
as I did not get any replies.

Thanks,

Nick

Re: Patch Resend

2019-01-07 Thread Jonathan Wakely

On Mon, 7 Jan 2019 at 15:42, nick wrote:
>
> Greetings All,
>
> I was wondering as I sent a patch before the holidays if I should resend it
> as I did not get any replies.

Which patch? I don't see any patch from you that didn't get some replies.

Re: Patch Resend

2019-01-07 Thread nick




On 2019-01-07 10:44 a.m., Jonathan Wakely wrote:
> On Mon, 7 Jan 2019 at 15:42, nick wrote:
>>
>> Greetings All,
>>
>> I was wondering as I sent a patch before the holidays if I should resend it
>> as I did not get any replies.
> 
> Which patch? I don't see any patch from you that didn't get some replies.
> 
Sorry, this is what I was talking about; it's a fix for a bad patch:

This fixes bug id 71176 by using the proper print format
specifier, %lu rather than %d, for a size_t value,
which is considered best practice for print statements.

Signed-off-by: Nicholas Krause 
---
 fixincludes/fixincl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fixincludes/fixincl.c b/fixincludes/fixincl.c
index 6dba2f6e830..5b8b77a77f0 100644
--- a/fixincludes/fixincl.c
+++ b/fixincludes/fixincl.c
@@ -158,11 +158,11 @@ main (int argc, char** argv)
   if (VLEVEL( VERB_PROGRESS )) {
 tSCC zFmt[] =
   "\
-Processed %5d files containing %d bytes\n\
+Processed %5d files containing %lu bytes\n\
 Applying  %5d fixes to %d files\n\
 Altering  %5d of them\n";
 
-fprintf (stderr, zFmt, process_ct, ttl_data_size, apply_ct,
+fprintf (stderr, zFmt, process_ct, (unsigned int long) ttl_data_size, apply_ct,
  fixed_ct, altered_ct);
   }
 #endif /* DO_STATS */
-- 
2.17.1



Nick
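
For readers following the patch above, here is a minimal sketch of the two
usual ways to print a size_t, assuming (as the patch description states) that
ttl_data_size is a size_t. The helper names are hypothetical and do not
appear in fixincl.c. The first casts to unsigned long to match %lu, as the
patch does; the second uses the C99 %zu conversion, which needs no cast but
may not be available with pre-C99 C libraries.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: cast the size_t so the argument type matches %lu.
   The cast could truncate on targets where size_t is wider than
   unsigned long.  */
int format_stats_cast(char *buf, size_t buflen, int process_ct,
                      size_t ttl_data_size)
{
    return snprintf(buf, buflen,
                    "Processed %5d files containing %lu bytes\n",
                    process_ct, (unsigned long) ttl_data_size);
}

/* Hypothetical helper: C99 %zu takes a size_t directly.  */
int format_stats_c99(char *buf, size_t buflen, int process_ct,
                     size_t ttl_data_size)
{
    return snprintf(buf, buflen,
                    "Processed %5d files containing %zu bytes\n",
                    process_ct, ttl_data_size);
}
```

Passing a size_t where the format string says %d is undefined behavior on
targets where the two types differ in size, which is the mismatch the patch
addresses.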

Re: Patch Resend

2019-01-07 Thread Jonathan Wakely

On Mon, 7 Jan 2019 at 15:51, nick  wrote:
>
>
>
> On 2019-01-07 10:44 a.m., Jonathan Wakely wrote:
> > On Mon, 7 Jan 2019 at 15:42, nick wrote:
> >>
> >> Greetings All,
> >>
> >> I was wondering as I sent a patch before the holidays if I should resend it
> >> as I did not get any replies.
> >
> > Which patch? I don't see any patch from you that didn't get some replies.
> >
> Sorry this is what I was talking about it's a fix for a bad patch:

Ah yes thanks, I see it now, at
https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01511.html

LLVM/GCC social in Nanjing China: Jan 19, 2019

2019-01-07 Thread 吴伟

Hi all,

The 5th LLVM/GCC social in Nanjing will happen on Jan 19, 2019.

Everyone interested in LLVM/GCC/Toolchain/IDE related projects is
invited to join.
Event details are at https://mp.weixin.qq.com/s/7jupkPiRrlxjYEuglMbvFA

BoF style. Presentations are welcome :-)

Looking forward to meeting you!

-- 
Best wishes,
Wei Wu (吴伟)
