https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Ma Lin changed:
What|Removed |Added
CC||malincns at 163 dot com
--- Comment #18 from Ma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Andrew Senkevich changed:
What|Removed |Added
CC||andrew.n.senkevich at gmail dot co
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #16 from Travis Downs ---
Also, this is fixed for Skylake for tzcnt and lzcnt but not popcnt.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Travis Downs changed:
What|Removed |Added
CC||travis.downs at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Uroš Bizjak changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Uroš Bizjak changed:
What|Removed |Added
Target Milestone|--- |4.9.2
--- Comment #13 from Uroš Bizjak --
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #12 from uros at gcc dot gnu.org ---
Author: uros
Date: Thu Aug 21 18:03:49 2014
New Revision: 214279
URL: https://gcc.gnu.org/viewcvs?rev=214279&root=gcc&view=rev
Log:
Backport from mainline
2014-08-19 H.J. Lu
* confi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #11 from uros at gcc dot gnu.org ---
Author: uros
Date: Mon Aug 18 18:00:52 2014
New Revision: 214112
URL: https://gcc.gnu.org/viewcvs?rev=214112&root=gcc&view=rev
Log:
PR target/62011
* config/i386/x86-tune.def (X86_TUNE_AVOI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Richard Biener changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #9 from Yuri Rumyantsev ---
This is not the u32 version but the u64 one. The first loop (u32) version looks like:
.L23:
        leal    1(%rdx), %ecx
        xorq    %rax, %rax
        popcntq (%rbx,%rax,8), %rax
        leal    2(%rdx), %r8d
        xorq
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #8 from finis at in dot tum.de ---
@Yuri: Note, however, that the result of your fixed u32 version seems to be
wrong.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #7 from Yuri Rumyantsev ---
Please ignore my previous comment. If we zero the destination
register before each popcnt (and lzcnt), performance is restored:
original test results:
unsigned 8388663 0.848533
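The workaround described in this comment can be illustrated by hand. The following is only a minimal sketch, not the change that went into GCC: it assumes an x86-64 target built with POPCNT enabled (for example with -mpopcnt) and otherwise falls back to the compiler builtin. The names popcount64_nodep and popcount_buffer are hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the set bits of one 64-bit word.
   On x86-64 with POPCNT enabled, the destination register is zeroed
   with xor right before popcnt, so popcnt does not have to wait on
   whatever instruction last wrote that register (the false dependency
   discussed in this thread). Hypothetical helper, for illustration. */
static inline uint64_t popcount64_nodep(uint64_t x)
{
#if defined(__x86_64__) && defined(__POPCNT__)
    uint64_t r;
    /* "=&r": early-clobber output, because r is written by xor before
       the input operand is read by popcnt. */
    __asm__("xorl %k0, %k0\n\t"
            "popcntq %1, %0"
            : "=&r"(r)
            : "rm"(x)
            : "cc");
    return r;
#else
    /* Portable fallback: let the compiler choose the instruction. */
    return (uint64_t)__builtin_popcountll(x);
#endif
}

/* Sum the population counts of n 64-bit words. */
uint64_t popcount_buffer(const uint64_t *buf, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += popcount64_nodep(buf[i]);
    return total;
}
```

Calling popcount_buffer on an array holding 0xFF, 0 and 0x8000000000000001 returns 10. Whether the extra xor helps depends on the CPU; on parts without the false dependency it is one extra cheap instruction per popcnt.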
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Yuri Rumyantsev changed:
What|Removed |Added
CC||ysrumyan at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #5 from finis at in dot tum.de ---
Maybe there are a lot more instructions with such a false dependency. popcnt
may only be the tip of the iceberg. I don't think Intel only got this
operation wrong and all other SSE/AVX/... instructio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
finis at in dot tum.de changed:
What|Removed |Added
CC||finis at in dot tum.de
--- Comme
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
--- Comment #3 from Andev ---
This seems to be specific to some recent Intel CPUs. I am not sure which other
CPUs are affected. There is no official erratum for this behavior AFAIK.
As Alexander suggested, it would be a great idea to have a work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
--- Comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62011
Richard Biener changed:
What|Removed |Added
Keywords||missed-optimization
Target|