https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
Bug ID: 99728
Summary: code pessimization when using wrapper classes around SIMD types
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #1 from Martin Reinecke ---
Created attachment 50457
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50457&action=edit
generated assembler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #2 from Martin Reinecke ---
Created attachment 50458
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50458&action=edit
additional test case by Alexander Monakov
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #5 from Martin Reinecke ---
(In reply to Matthias Kretz (Vir) from comment #4)
> FWIW, using std::experimental::native_simd also does not hoist the
> stores out of the loop. However, if you pass d by value and return d, the
> issue go
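
The quoted remark contrasts passing the accumulator d by reference with passing it by value and returning it. A minimal sketch of the two calling styles follows. It is not the attached test case: the `Vec` wrapper, the `v2df` typedef and both function names are hypothetical, and it assumes the GCC/Clang vector extensions rather than std::experimental::native_simd. With the by-reference form the compiler may have to keep d's memory current on every iteration, since it cannot easily rule out that d aliases a[] or b[]; with the by-value form d can live in a register for the whole loop.

```cpp
#include <cstddef>

// Hypothetical minimal wrapper around a GCC/Clang vector-extension type
// (2 doubles, 16 bytes). Stands in for a "poor man's" SIMD class.
typedef double v2df __attribute__((vector_size(16)));

struct Vec {
    v2df v;
    Vec &operator+=(const Vec &o) { v += o.v; return *this; }
    Vec operator*(const Vec &o) const { return Vec{v * o.v}; }
};

// Accumulator passed by reference: d might alias a[i] or b[i], so the
// compiler may store d back to memory inside the loop.
void accumulate_ref(Vec &d, const Vec *a, const Vec *b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        d += a[i] * b[i];
}

// Accumulator passed by value and returned: d is a local object that
// cannot alias the inputs, so it can stay in a register until the return.
Vec accumulate_val(Vec d, const Vec *a, const Vec *b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        d += a[i] * b[i];
    return d;
}
```

Both functions compute the same result; any difference would only show up in the generated assembler, which depends on compiler version and flags.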
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #7 from Martin Reinecke ---
Thanks!
(BTW, I'm aware of your code and will immediately switch to it once it lands in
gcc! But for the time being I try to make do with my poor man's version to
avoid the external dependency.)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97564
Bug ID: 97564
Summary: [11.0 regression] pybind11 compilation failure
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516
Bug ID: 98516
Summary: Wrong code generated by tree vectorizer
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optim
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516
--- Comment #1 from Martin Reinecke ---
Minimal set of flags to trigger the problem seems to be
g++ -std=c++17 -O1 -ftree-vectorize -fno-signed-zeros bug.cc
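
As background on the last flag: -fno-signed-zeros tells GCC it may ignore the sign of zero, which allows rewrites such as x + 0.0 -> x. The small sketch below is not the bug's test case (the function name is made up for illustration); it only shows why that rewrite is not valid under default IEEE semantics.

```cpp
#include <cmath>

// Returns true when adding +0.0 leaves the sign bit of x unchanged.
// The volatile zero keeps the compiler from folding the addition away.
bool add_zero_keeps_sign(double x) {
    volatile double zero = 0.0;
    return std::signbit(x + zero) == std::signbit(x);
}
```

Compiled without -fno-signed-zeros and in the default rounding mode, this returns false for x = -0.0, because (-0.0) + (+0.0) is +0.0. With the flag, the compiler is allowed to treat the two zeros as interchangeable.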
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516
--- Comment #9 from Martin Reinecke ---
Thanks, this fixes the reduced test case for me as well!
Unfortunately there seems to be more where this one came from, since my
comprehensive test suite still fails ... I'll try to produce test cases and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544
Bug ID: 98544
Summary: [11 regression] Wrong code generated by tree vectorizer
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priorit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544
--- Comment #1 from Martin Reinecke ---
Problem seems to be related to the use of __restrict__.
If I remove the DUCC0_RESTRICT from the function definitions of "radb3",
"radb4" etc., the problem goes away.
However I don't see where I'm violatin
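
For readers unfamiliar with the qualifier, here is a hedged sketch of what __restrict__ promises. The function below is purely illustrative and is not taken from radb3 or radb4; it also assumes, without confirmation from the report, that DUCC0_RESTRICT expands to __restrict__.

```cpp
#include <cstddef>

// Illustrative only. __restrict__ is a promise by the caller that, for the
// duration of the call, memory written through `out` is not also accessed
// through `in`. The compiler may then reorder or vectorize the loads and
// stores freely.
void scale_add(const double *__restrict__ in, double *__restrict__ out,
               std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] += 2.0 * in[i];
}
```

A call with two distinct arrays honors the promise. A call such as scale_add(buf, buf + 1, n) would break it, and the result would then be undefined.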
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544
--- Comment #13 from Martin Reinecke ---
> What kind of shape (w/o too much guessing) is the function expecting for its
> input arrays?
For radb the size of the cc and ch arrays is l1*ido*x.
Size of wa is (x-1)*ido.
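
The size rules above can be written down as a small helper. This is only a sketch of the stated arithmetic; the struct and function names are hypothetical and do not come from the library.

```cpp
#include <cstddef>

// Hypothetical helper: element counts of the arrays a radb pass expects,
// per the description above (x is the radix of the pass).
struct RadbSizes {
    std::size_t cc_ch;  // size of both cc and ch: l1*ido*x
    std::size_t wa;     // size of wa: (x-1)*ido
};

RadbSizes radb_sizes(std::size_t l1, std::size_t ido, std::size_t x) {
    return RadbSizes{l1 * ido * x, (x - 1) * ido};
}
```

For example, l1 = 1, ido = 5, x = 3 gives 15 elements for cc and ch, and 10 for wa.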
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544
--- Comment #15 from Martin Reinecke ---
"Problem at length N" means that the FFT of length N is computed incorrectly.
Also, N==l1*ido*x.
For an FFT of length N, the computation is broken down into several passes.
Let's take N=15.
First the prom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98544
--- Comment #22 from Martin Reinecke ---
Brilliant, thank you very much for tracking this one down!
My FFT library now works correctly again with all optimizations enabled, which
is a great relief. The scipy maintainers will be happy that they wo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103805
Bug ID: 103805
Summary: Inconsistent exception specifications
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103805
--- Comment #4 from Martin Reinecke ---
Sorry if I specified the wrong version. My local (Debian unstable) g++ reports
martin@marvin:~/codes/ducc$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103805
--- Comment #6 from Martin Reinecke ---
Ouch. That reminds me of when Red Hat(?) did the same many years ago and caused no
end of confusion. Anyway, sorry for the noise!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850
Bug ID: 103850
Summary: missed optimization in AVX code
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850
--- Comment #2 from Martin Reinecke ---
Thanks! This flag indeed causes both kernels to have the same speed, but (at
least for me) it's slower than both original versions...
slow kernel version: 29.027915 GFlops/s
fast kernel version: 29.008313
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850
--- Comment #3 from Martin Reinecke ---
Just for completeness, this is the CPU I'm running on:
vendor_id : AuthenticAMD
cpu family : 23
model : 96
model name : AMD Ryzen 7 4800H with Radeon Graphics
stepping: 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103850
--- Comment #6 from Martin Reinecke ---
I would have expected that this does not make a significant difference,
assuming that speculative execution works and the branch predictor takes the
jump backwards at the loop's end. In that picture both v
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #12 from Martin Reinecke ---
Any hope of addressing this for gcc 12?
I have a real-world test case where this effect causes roughly 15-20% slowdown,
and I expect that with the wider availability of std::simd types more people
will enc