https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349
--- Comment #11 from Alexander Peslyak ---
Turns out that gcc 4.6.x to 4.8.x generating "movd" instead of "movq" is
actually a deliberate hack, to support binutils older than 2.17 ("movq" support
committed in 2005, released in 2006) and (presumab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349
Alexander Peslyak changed:
What    |Removed |Added
CC      |        |solar-gcc at openwall dot com
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: solar-gcc at openwall dot com
Created attachment 35037
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35037&action=edit
testcase
GCC 4.7.0 through at least 4.9.2 and 5.0 20150215 snapshot (I haven
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124
--- Comment #9 from Alexander Peslyak ---
(In reply to Alexander Peslyak from comment #8)
> $ gcc -S -Wall -O2 -funroll-loops testcase.c
> testcase.c: In function 'DES_std_set_key':
> testcase.c:14:17: warning: array subscript is above array bou
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #19 from Alexander Peslyak ---
(In reply to Alexander Peslyak from comment #17)
> Should we create a new bug for the unnecessary and non-optional use of
> unaligned load instructions for source code like this, or is this considered
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124
Alexander Peslyak changed:
What    |Removed |Added
CC      |        |solar-gcc at openwall dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #18 from Alexander Peslyak ---
(In reply to Richard Biener from comment #11)
> Note that we have to use movups because DES_bs_all is not aligned as seen
> from DES_bs_b.c (it's defined in DES_bs.c and only there annotated with
> CC_CA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #17 from Alexander Peslyak ---
(In reply to Richard Biener from comment #16)
> I'm completely confused now as to what the original regression was reported
> against.
I'm sorry, I should have re-read my original description of the reg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #14 from Alexander Peslyak ---
For completeness, here are the results for 4.7.x, 4.8.x, and 4.9.0:
4.7.0o - 2142K c/s, 29692 bytes, 1267 movaps, 465 movups
4.7.0h - 2823K c/s, 29692 bytes, 1732 movaps, 0 movups
4.7.4o - 2144K c/s, 29
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #13 from Alexander Peslyak ---
(In reply to Richard Biener from comment #11)
> We are putting quite heavy register-pressure on the thing by means of
> partial redundancy elimination, thus disabling PRE using -fno-tree-pre
> might help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #12 from Alexander Peslyak ---
(In reply to Richard Biener from comment #11)
> I wonder if you could share the exact CPU type you are using?
This is on (dual) Xeon E5420 (using only one core for these benchmarks), but
there was simil
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #10 from Alexander Peslyak ---
I decided to take a look at the generated code. Compared to 4.6.2, GCC 4.9.2
started generating lots of xorps, orps, andps, andnps where it previously
generated pxor, por, pand, pandn. Changing those w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #9 from Alexander Peslyak ---
(In reply to Andrew Pinski from comment #8)
> Can you try GCC 4.9?
Yes. Bad news: things mostly became even worse. Same machine, same JtR
version, same test script as in my previous comment:
4.9.2 - 1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #7 from Alexander Peslyak
2012-01-04 23:00:24 UTC ---
(I ran the tests below and wrote this comment before seeing Jakub's. Then I
thought I'd post it anyway.)
Here are some numbers for gcc releases:
4.0.0 - 383K c/s, 71879 bytes (t
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #5 from Alexander Peslyak
2012-01-04 19:39:26 UTC ---
I wrote and ran some scripts to test many versions/snapshots of gcc. It turns
out that 4.6-20100703 (oldest 4.6 snapshot available for FTP) was already
affected by this regression
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #4 from Alexander Peslyak
2012-01-03 04:45:43 UTC ---
(In reply to comment #3)
> It might be interesting to get numbers for the trunk. There have been some
> register allocator fixes which might have improved this.
I've just tested
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13822
Alexander Peslyak changed:
What    |Removed |Added
CC      |        |solar-gcc at openwall dot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51019
Bug #: 51019
Summary: unclear documentation on -fomit-frame-pointer default
for -Os and different platforms
Classification: Unclassified
Product: gcc
Version: 4.6.2
S
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #2 from Alexander Peslyak
2011-11-08 00:56:47 UTC ---
The affected code is in DES_bs_b.c: DES_bs_crypt_25(). (Sorry, I should have
mentioned that right away.)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
--- Comment #1 from Alexander Peslyak
2011-11-08 00:47:49 UTC ---
(In reply to comment #0)
> [...] Similar behavior
> is seen with current CVS version of John the Ripper, even though it has OpenMP
> support for DES heavily revised and integrated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017
Bug #: 51017
Summary: GCC 4.6 performance regression (vs. 4.4/4.5)
Classification: Unclassified
Product: gcc
Version: 4.6.2
Status: UNCONFIRMED
Severity: normal
Prior
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706
--- Comment #25 from Alexander Peslyak
2010-11-12 11:19:13 UTC ---
(In reply to comment #24)
> If only one out of 35 tests becomes slower,
You might have misread what I wrote. I did not mention "35 tests"; I mentioned
that a test became slower
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706
--- Comment #23 from Alexander Peslyak
2010-11-09 16:32:53 UTC ---
(In reply to comment #20)
> Maybe we could agree on a compromise for a start. Alexander, what are the
> corresponding results for GOMP_SPINCOUNT=10?
I reproduced slowdown of
--- Comment #22 from solar-gcc at openwall dot com 2010-09-05 11:37 ---
(In reply to comment #20)
> Maybe we could agree on a compromise for a start. Alexander, what are the
> corresponding results for GOMP_SPINCOUNT=10?
Unfortunately, I no longer have access to the dual
--- Comment #19 from solar-gcc at openwall dot com 2010-08-24 12:18 ---
(In reply to comment #18)
> Then, at the start of the spinning libgomp could initialize that flag and
> check
> it from time to time (say every few hundred or thousand iterations) whether it
> has
--- Comment #17 from solar-gcc at openwall dot com 2010-08-24 11:07 ---
(In reply to comment #16)
> I would really like to see this bug tackled.
I second that.
> Fixing it is easily done by lowering the spin count as proposed. Otherwise,
> please show cases where a low s
--- Comment #14 from solar-gcc at openwall dot com 2010-07-02 01:39 ---
We're also seeing this problem on OpenMP-using code built with gcc 4.5.0
release on linux-x86_64. Here's a user's report (400x slowdown on an 8-core
system when there's a single other proc
27 matches