[Bug c/70686] New: -fprofile-generate (not fprofile-use) somehow produces much faster binary

2016-04-15 Thread alekshs at hotmail dot com
: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: alekshs at hotmail dot com Target Milestone: --- I have this small benchmark that does 100 million loops of 20 divisions by 2. Periodically it bumps up the values so that it continues
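The benchmark source is not included in this snippet, so the following is only a rough sketch of a loop with the shape described above. The function name `run_halving_benchmark`, the use of `double`, the threshold test, and the per-element "bump" are all assumptions for illustration, not the reporter's code.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VALUES 20  /* "20 divisions by 2" per loop, as in the report */

/* Hypothetical reconstruction: halve 20 values on every iteration and
 * reset ("bump up") any value that has dropped below a threshold.
 * Returns the sum of the values so the work cannot be optimized away. */
double run_halving_benchmark(long iterations, double bump_threshold,
                             double bump_value)
{
    assert(iterations >= 0);

    double values[NUM_VALUES];
    for (size_t i = 0; i < NUM_VALUES; i++)
        values[i] = bump_value;

    for (long n = 0; n < iterations; n++) {
        for (size_t i = 0; i < NUM_VALUES; i++) {
            values[i] /= 2.0;
            /* Conditional inside the hot loop (assumed form). */
            if (values[i] < bump_threshold)
                values[i] = bump_value;
        }
    }

    double sum = 0.0;
    for (size_t i = 0; i < NUM_VALUES; i++)
        sum += values[i];
    return sum;
}
```

Calling `run_halving_benchmark(100000000L, 1.0, 1024.0)` would approximate the 100 million loops mentioned in the report; timing the same source built with and without `-fprofile-generate` is the comparison the bug title refers to.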

[Bug tree-optimization/70686] GIMPLE if-conversion slows down code

2016-04-18 Thread alekshs at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70686 --- Comment #2 from alekshs at hotmail dot com --- (In reply to Richard Biener from comment #1) > It's not so mind-blowing - it's simply that -fprofile-generate makes our > GIMPLE level if-conversion no longer apply. Without -f
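For readers unfamiliar with the term, if-conversion replaces a conditional branch with an unconditional computation that selects between two values. The sketch below is a hand-written illustration of the two forms, not GCC's actual GIMPLE output and not code from the bug report; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Branchy form: the store happens only when the condition holds. */
void clamp_branchy(int *a, size_t n, int limit)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (a[i] > limit)
            a[i] = limit;
    }
}

/* If-converted form (written by hand): every iteration stores a value
 * chosen by a select, so the loop body contains no control flow. */
void clamp_if_converted(int *a, size_t n, int limit)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int v = a[i];
        a[i] = (v > limit) ? limit : v;
    }
}
```

Both functions produce the same array contents. The branch-free form tends to be easier to vectorize, but it does the select and the store on every iteration, which may be slower than a well-predicted branch that is rarely taken.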

[Bug tree-optimization/70686] GIMPLE if-conversion slows down code

2016-04-18 Thread alekshs at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70686 --- Comment #4 from alekshs at hotmail dot com --- I would be somewhat understanding in the context of -O2/-O3 (compiler guessing) but not in the context of PGO (compiler understands the flow after a run - so it should be able to understand that

[Bug c/81467] New: AVX-512 support for inline assembly

2017-07-17 Thread alekshs at hotmail dot com
Assignee: unassigned at gcc dot gnu.org Reporter: alekshs at hotmail dot com Target Milestone: --- I'm trying some AVX-512 code with inline assembly. 1) Clobbering xmm16-31 and k-type registers won't work. I guess it won't work for ymm16-31 or zmm16-31 either:
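The reporter's code is cut off here, so the following is only a sketch of what a clobber list naming the upper vector registers and a mask register can look like in GCC extended asm. It assumes that these register names are accepted only when AVX-512 is enabled at compile time (for example via `-mavx512f` or a suitable `-march`), which is why the asm is guarded by `__AVX512F__`. The function name is hypothetical.

```c
/* Returns 1 if the AVX-512 asm path was compiled in, 0 otherwise.
 * Assumption: "xmm16" and "k1" are only valid clobber names when the
 * compiler targets AVX-512, so the asm is guarded by __AVX512F__. */
int touch_high_registers(void)
{
#if defined(__AVX512F__)
    __asm__ volatile (
        "vpxord %%zmm16, %%zmm16, %%zmm16\n\t"  /* zero zmm16 */
        "kxnorw %%k1, %%k1, %%k1"               /* set all bits of k1 */
        : /* no outputs */
        : /* no inputs */
        : "xmm16", "k1"                         /* registers the asm overwrites */
    );
    return 1;
#else
    return 0;
#endif
}
```

Built without AVX-512 enabled, the function compiles to the fallback and returns 0. A binary built with AVX-512 enabled should only be run on hardware that supports it.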

[Bug target/81467] AVX-512 support for inline assembly

2017-07-18 Thread alekshs at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81467 --- Comment #3 from alekshs at hotmail dot com --- Aha, ok thanks for the clarification. It was pretty helpful. Regarding clobbering, I was compiling on a Skylake Xeon which has avx512f avx512dq avx512cd avx512bw avx512vl using -march=native