[Bug middle-end/51017] New: GCC 4.6 performance regression (vs. 4.4/4.5)

2011-11-07 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 Bug #: 51017 Summary: GCC 4.6 performance regression (vs. 4.4/4.5) Classification: Unclassified Product: gcc Version: 4.6.2 Status: UNCONFIRMED Severity: normal Prior

[Bug middle-end/51017] GCC 4.6 performance regression (vs. 4.4/4.5)

2011-11-07 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #1 from Alexander Peslyak 2011-11-08 00:47:49 UTC --- (In reply to comment #0) > [...] Similar behavior > is seen with current CVS version of John the Ripper, even though it has OpenMP > support for DES heavily revised and integrated

[Bug middle-end/51017] GCC 4.6 performance regression (vs. 4.4/4.5)

2011-11-07 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #2 from Alexander Peslyak 2011-11-08 00:56:47 UTC --- The affected code is in DES_bs_b.c: DES_bs_crypt_25(). (Sorry, I should have mentioned that right away.)

[Bug web/51019] New: unclear documentation on -fomit-frame-pointer default for -Os and different platforms

2011-11-07 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51019 Bug #: 51019 Summary: unclear documentation on -fomit-frame-pointer default for -Os and different platforms Classification: Unclassified Product: gcc Version: 4.6.2 S

[Bug target/13822] enable -fomit-frame-pointer or at least -momit-frame-pointer by default on x86/dwarf2 platforms

2011-11-07 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13822 Alexander Peslyak changed: What: CC; Removed: (none); Added: solar-gcc at openwall dot

[Bug middle-end/51017] GCC 4.6 performance regression (vs. 4.4/4.5)

2012-01-02 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #4 from Alexander Peslyak 2012-01-03 04:45:43 UTC --- (In reply to comment #3) > It might be interesting to get numbers for the trunk. There have been some > register allocator fixes which might have improved this. I've just tested

[Bug middle-end/51017] GCC 4.6 performance regression (vs. 4.4/4.5)

2012-01-04 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #5 from Alexander Peslyak 2012-01-04 19:39:26 UTC --- I wrote and ran some scripts to test many versions/snapshots of gcc. It turns out that 4.6-20100703 (oldest 4.6 snapshot available for FTP) was already affected by this regression

[Bug middle-end/51017] GCC 4.6 performance regression (vs. 4.4/4.5)

2012-01-04 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #7 from Alexander Peslyak 2012-01-04 23:00:24 UTC --- (I ran the tests below and wrote this comment before seeing Jakub's. Then I thought I'd post it anyway.) Here are some numbers for gcc releases: 4.0.0 - 383K c/s, 71879 bytes (t

[Bug target/54349] _mm_cvtsi128_si64 unnecessary stores value at stack

2016-02-26 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349 Alexander Peslyak changed: What: CC; Removed: (none); Added: solar-gcc at openwall dot com

[Bug target/54349] _mm_cvtsi128_si64 unnecessary stores value at stack

2016-02-26 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349 --- Comment #11 from Alexander Peslyak --- Turns out that gcc 4.6.x to 4.8.x generating "movd" instead of "movq" is actually a deliberate hack, to support binutils older than 2.17 ("movq" support committed in 2005, released in 2006) and (presumab

[Bug tree-optimization/65427] New: ICE in emit_move_insn with wide vector types

2015-03-14 Thread solar-gcc at openwall dot com
: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: solar-gcc at openwall dot com Created attachment 35037 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35037&action=edit testcase GCC 4.7.0 through at least 4.9.2 and 5.0 20150215 snapshot (I haven

[Bug middle-end/51017] GCC 4.6 performance regression (vs. 4.4/4.5)

2015-02-15 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #9 from Alexander Peslyak --- (In reply to Andrew Pinski from comment #8) > Can you try GCC 4.9? Yes. Bad news: things mostly became even worse. Same machine, same JtR version, same test script as in my previous comment: 4.9.2 - 1

[Bug middle-end/51017] GCC 4.6 performance regression (vs. 4.4/4.5)

2015-02-15 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #10 from Alexander Peslyak --- I decided to take a look at the generated code. Compared to 4.6.2, GCC 4.9.2 started generating lots of xorps, orps, andps, andnps where it previously generated pxor, por, pand, pandn. Changing those w
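
For readers unfamiliar with the instructions named in this comment: bitslice code of this kind consists almost entirely of bitwise XOR, OR, AND and AND-NOT on wide integer vectors. The sketch below is not taken from John the Ripper; it uses the GCC generic vector extension, and the names `vtype`, `bitslice_gate` and `bitslice_gate_lane0` are hypothetical. On SSE2 the integer forms of these operations are pxor, por, pand and pandn, while xorps, orps, andps and andnps are the floating-point-domain forms that compute the same bits; which form is emitted is up to the compiler.

```c
#include <stdint.h>

/* Hypothetical 128-bit vector of four 32-bit integers
 * (GCC/Clang generic vector extension). */
typedef uint32_t vtype __attribute__((vector_size(16)));

/* A made-up gate that uses XOR, OR, AND and AND-NOT once each:
 * returns (a ^ b) | (c & ~d), computed lane by lane. */
vtype bitslice_gate(vtype a, vtype b, vtype c, vtype d)
{
    return (a ^ b) | (c & ~d);
}

/* Helper for checking: broadcast scalar inputs to all lanes,
 * apply the gate, and return lane 0 of the result. */
uint32_t bitslice_gate_lane0(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    vtype va = { a, a, a, a };
    vtype vb = { b, b, b, b };
    vtype vc = { c, c, c, c };
    vtype vd = { d, d, d, d };
    vtype r = bitslice_gate(va, vb, vc, vd);
    return r[0];
}
```

Compiling a file like this with `gcc -O2 -S` on x86-64 and searching the assembly for the mnemonics above is one way to see which form a given GCC version chooses.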

[Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure

2015-02-16 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #12 from Alexander Peslyak --- (In reply to Richard Biener from comment #11) > I wonder if you could share the exact CPU type you are using? This is on (dual) Xeon E5420 (using only one core for these benchmarks), but there was simil

[Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure

2015-02-16 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #13 from Alexander Peslyak --- (In reply to Richard Biener from comment #11) > We are putting quite heavy register-pressure on the thing by means of > partial redundancy elimination, thus disabling PRE using -fno-tree-pre > might help

[Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure

2015-02-16 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #14 from Alexander Peslyak --- For completeness, here are the results for 4.7.x, 4.8.x, and 4.9.0: 4.7.0o - 2142K c/s, 29692 bytes, 1267 movaps, 465 movups 4.7.0h - 2823K c/s, 29692 bytes, 1732 movaps, 0 movups 4.7.4o - 2144K c/s, 29

[Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure

2015-02-17 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #17 from Alexander Peslyak --- (In reply to Richard Biener from comment #16) > I'm completely confused now as to what the original regression was reported > against. I'm sorry, I should have re-read my original description of the reg

[Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure

2015-02-17 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #18 from Alexander Peslyak --- (In reply to Richard Biener from comment #11) > Note that we have to use movups because DES_bs_all is not aligned as seen > from DES_bs_b.c (it's defined in DES_bs.c and only there annotated with > CC_CA
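
The point quoted here is that an alignment annotation only helps in translation units where it is visible. A minimal sketch, assuming a hypothetical macro `ALIGN16` and hypothetical names `struct bs_state` and `bs_all` (these are not the identifiers used in John the Ripper): putting the attribute on the `extern` declaration in a shared header, and not only on the definition, lets every file that uses the object assume the alignment.

```c
#include <stdint.h>

/* Hypothetical alignment macro; stands in for whatever the real project uses. */
#define ALIGN16 __attribute__((aligned(16)))

/* Hypothetical state structure. */
struct bs_state {
    uint32_t words[64];
};

/* What a shared header would carry: the declaration itself is annotated,
 * so every file that includes it may assume 16-byte alignment. */
extern struct bs_state bs_all ALIGN16;

/* What the defining source file would carry. */
struct bs_state bs_all ALIGN16;

/* Report whether the object really sits on a 16-byte boundary. */
int bs_all_is_aligned(void)
{
    return ((uintptr_t)&bs_all % 16) == 0;
}
```

Whether a particular GCC version then switches from unaligned to aligned loads for accesses to such an object is something to confirm in the generated assembly; the sketch only shows where the annotation would need to appear.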

[Bug tree-optimization/59124] [4.8/4.9/5 Regression] Wrong warnings "array subscript is above array bounds"

2015-02-17 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124 Alexander Peslyak changed: What: CC; Removed: (none); Added: solar-gcc at openwall dot com

[Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure

2015-02-17 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017 --- Comment #19 from Alexander Peslyak --- (In reply to Alexander Peslyak from comment #17) > Should we create a new bug for the unnecessary and non-optional use of > unaligned load instructions for source code like this, or is this considered >

[Bug tree-optimization/59124] [4.8/4.9/5 Regression] Wrong warnings "array subscript is above array bounds"

2015-02-17 Thread solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124 --- Comment #9 from Alexander Peslyak --- (In reply to Alexander Peslyak from comment #8) > $ gcc -S -Wall -O2 -funroll-loops testcase.c > testcase.c: In function 'DES_std_set_key': > testcase.c:14:17: warning: array subscript is above array bou

[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-08-24 Thread solar-gcc at openwall dot com
--- Comment #17 from solar-gcc at openwall dot com 2010-08-24 11:07 --- (In reply to comment #16) > I would really like to see this bug tackled. I second that. > Fixing it is easily done by lowering the spin count as proposed. Otherwise, > please show cases where a low s

[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-08-24 Thread solar-gcc at openwall dot com
--- Comment #19 from solar-gcc at openwall dot com 2010-08-24 12:18 --- (In reply to comment #18) > Then, at the start of the spinning libgomp could initialize that flag and > check > it from time to time (say every few hundred or thousand iterations) whether it > has

[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-09-05 Thread solar-gcc at openwall dot com
--- Comment #22 from solar-gcc at openwall dot com 2010-09-05 11:37 --- (In reply to comment #20) > Maybe we could agree on a compromise for a start. Alexander, what are the > corresponding results for GOMP_SPINCOUNT=10? Unfortunately, I no longer have access to the dual

[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-11-09 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706 --- Comment #23 from Alexander Peslyak 2010-11-09 16:32:53 UTC --- (In reply to comment #20) > Maybe we could agree on a compromise for a start. Alexander, what are the > corresponding results for GOMP_SPINCOUNT=10? I reproduced slowdown of
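
For context, `GOMP_SPINCOUNT` is a libgomp environment variable that limits how long a waiting thread busy-waits before it blocks. A minimal sketch of the kind of OpenMP loop whose run time could be compared under different settings follows; the function `sum_squares` is a hypothetical stand-in, not the workload measured in this bug.

```c
#include <stdint.h>

/* Hypothetical workload: sum of squares of 0..n-1, split across OpenMP
 * threads. Without -fopenmp the pragma is ignored and the loop runs
 * serially, giving the same result. */
uint64_t sum_squares(int n)
{
    uint64_t sum = 0;
    int i;
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += (uint64_t)i * (uint64_t)i;
    return sum;
}
```

Built with `gcc -O2 -fopenmp`, a program that calls this function repeatedly could be timed with, for example, `GOMP_SPINCOUNT=10` set in the environment and again with the default, on a machine that has some competing load, to look for the kind of slowdown discussed in this thread.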

[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-11-12 Thread solar-gcc at openwall dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706 --- Comment #25 from Alexander Peslyak 2010-11-12 11:19:13 UTC --- (In reply to comment #24) > If only one out of 35 tests becomes slower, You might have misread what I wrote. I did not mention "35 tests"; I mentioned that a test became slower

[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-07-01 Thread solar-gcc at openwall dot com
--- Comment #14 from solar-gcc at openwall dot com 2010-07-02 01:39 --- We're also seeing this problem on OpenMP-using code built with gcc 4.5.0 release on linux-x86_64. Here's a user's report (400x slowdown on an 8-core system when there's a single other proc