[Bug target/115325] RVV vmulh and vmulhu unknown without -march, but vmul is known

2024-06-04 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115325 --- Comment #2 from Jan Wassenberg --- Thanks, we are equipped to use pragma GCC target as soon as it is ready. Is there any bug/tracker to which I could subscribe for updates on that?

[Bug c++/115325] New: RVV vmulh and vmulhu unknown without -march, but vmul is known

2024-06-03 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115325 Bug ID: 115325 Summary: RVV vmulh and vmulhu unknown without -march, but vmul is known Product: gcc Version: 14.1.0 Status: UNCONFIRMED Severity: normal

[Bug target/115115] [12/13/14/15 Regression] highway-1.0.7 wrong _mm_cvttps_epi32() constant fold

2024-05-20 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115115 --- Comment #12 from Jan Wassenberg --- Thanks for continuing to look into this and filing the issues. It seems like the Highway tests are examples of nontrivial vector code that are detecting some regressions. Would it be useful to add them to

[Bug target/115115] [12/13/14/15 Regression] highway-1.0.7 wrong _mm_cvttps_epi32() constant fold

2024-05-17 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115115 --- Comment #10 from Jan Wassenberg --- We have a workaround. I changed the ConvertTo (round to nearest int) code to const auto overflow = RebindMask(di, Ge(v, Set(df, 2147483648.0f))); return IfThenElse(overflow, Set(di, LimitsMax()), Convert
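
The workaround in the comment above is cut off, so the following is only a scalar sketch of the same idea, not the Highway code itself. It checks for positive overflow against 2147483648.0f (2^31) before converting and substitutes the largest int32_t in that case. The function name `ConvertSaturatingSketch`, the truncating in-range conversion, and the handling of negative overflow and NaN are assumptions made for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical scalar analogue of the vector workaround; not the Highway API.
inline int32_t ConvertSaturatingSketch(float v) {
  // Same threshold as in the comment: 2^31 is the first float above INT32_MAX.
  const bool overflow = v >= 2147483648.0f;
  if (overflow) return std::numeric_limits<int32_t>::max();

  // Assumption: negative overflow and NaN map to INT32_MIN, which is the
  // value the x86 truncating conversion produces for out-of-range inputs.
  // The negated comparison is also true for NaN.
  if (!(v >= -2147483648.0f)) return std::numeric_limits<int32_t>::min();

  // In range here, so the cast is well defined (truncates toward zero).
  return static_cast<int32_t>(v);
}
```

Doing the range check first means the conversion itself never sees an out-of-range input, so the result does not depend on how the compiler folds that case.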

[Bug target/115115] [12/13/14/15 Regression] highway-1.0.7 wrong _mm_cvttps_epi32() constant fold

2024-05-17 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115115 --- Comment #9 from Jan Wassenberg --- On second thought, we are actually trying to convert out-of-bounds values to the closest representable. We use the documented behavior of the instruction, as mentioned in #5, and then correct the result aft

[Bug target/115115] [12/13/14/15 Regression] highway-1.0.7 wrong _mm_cvttps_epi32() constant fold

2024-05-17 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115115 Jan Wassenberg changed: What|Removed |Added CC||jan.wassenberg at gmail dot com --- Co

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-16 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828 --- Comment #4 from Jan Wassenberg --- I understand the slippery slope concern. But the empty asm string is a special case, we and others use it (with +r output and memory clobber) to prevent optimizing variables out e.g. during tests. It seems
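
For readers unfamiliar with the idiom mentioned above, here is a minimal sketch of an empty asm statement with a "+r" operand and a memory clobber, assuming GCC or Clang extended asm. The name `PreventElisionSketch` is hypothetical and the body is not copied from Highway.

```cpp
// Hypothetical helper illustrating the empty-asm idiom (GCC/Clang only).
inline void PreventElisionSketch(int& value) {
  // The asm string is empty, so no instruction is emitted. The "+r" operand
  // tells the compiler the value is both read and possibly modified, and the
  // "memory" clobber tells it that memory may have changed, so it cannot
  // optimize the variable away or assume its value afterwards.
  asm volatile("" : "+r"(value) : : "memory");
}
```

A test would typically call this on a computed result, for example `int x = 42; PreventElisionSketch(x);`, so that the computation producing `x` is kept even when the result is otherwise unused.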

[Bug target/111366] error: inlining failed in call to 'always_inline' 'hwy::PreventElision(int&)void': target specific option mismatch

2023-09-14 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111366 --- Comment #18 from Jan Wassenberg --- Ah, got it. We'll change the pragma to ",htm" as you suggest. Thank you :)

[Bug target/111366] error: inlining failed in call to 'always_inline' 'hwy::PreventElision(int&)void': target specific option mismatch

2023-09-11 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111366 --- Comment #6 from Jan Wassenberg --- Thinking about this more, the LTO means more opportunity for inlining and thus for the compiler to hit the legit "don't want to inline POWER9 into POWER8" error. Interestingly this does not happen on x86 -

[Bug target/111366] error: inlining failed in call to 'always_inline' 'hwy::PreventElision(int&)void': target specific option mismatch

2023-09-11 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111366 Jan Wassenberg changed: What|Removed |Added CC||jan.wassenberg at gmail dot com --- Co

[Bug c++/111117] New: Crash in vector code

2023-08-23 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111117 Bug ID: 111117 Summary: Crash in vector code Product: gcc Version: 8.5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: un

[Bug tree-optimization/109175] error: 'void* __builtin_memset(void*, int, long unsigned int)' writing 4 or more bytes into a region of size 0 overflows the destination [-Werror=stringop-overflow=]

2023-03-24 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109175 --- Comment #7 from Jan Wassenberg --- Thanks, I will be changing the code to add a nullptr check.

[Bug target/109228] warning: implicit declaration of function '__riscv_vlenb'

2023-03-22 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109228 --- Comment #6 from Jan Wassenberg --- Nice, thank you Mathieu, Kito and JuzheZhong!

[Bug target/109173] incorrect intrinsic signature for _mm_srai_epi64

2023-03-17 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109173 --- Comment #5 from Jan Wassenberg --- Thanks, Mathieu, for raising this. Note that clang has changed their intrinsic to require an unsigned arg: https://github.com/google/highway/commit/45b1fac0b1c404e6573c2f182b36c245af6503e0 I understand t

[Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)

2022-07-08 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187 --- Comment #23 from Jan Wassenberg --- Thanks for having a look. For casting, we CopyBytes between the two representations, which boils down to __builtin_memcpy (https://github.com/google/highway/blob/master/hwy/base.h#L819). Is there some othe
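
As an illustration of the casting approach described above (copying bytes between two representations), here is a minimal sketch. It uses `std::memcpy` in place of `__builtin_memcpy` for portability, and the name `BitCastSketch` is hypothetical rather than Highway's actual `CopyBytes`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterprets the bytes of `from` as type To by
// copying them, which avoids the aliasing problems of pointer casts.
template <typename To, typename From>
inline To BitCastSketch(const From& from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}
```

For example, `BitCastSketch<uint32_t>(1.0f)` yields the IEEE 754 bit pattern 0x3F800000 on platforms with 32-bit IEEE floats.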

[Bug target/106187] armhf: Miscompilation at O2 level (O0 / O1 are working)

2022-07-05 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106187 --- Comment #7 from Jan Wassenberg --- The easiest way to reduce the amount of code in the binary is to comment out from mul_test.cc all the HWY_EXPORT_AND_TEST_P except the one with TestAllMulEven. The actual miscompilation is probably happeni

[Bug rtl-optimization/106041] [12/13 Regression] infinite loop in fast_dce at -O1 with aarch64

2022-06-21 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106041 --- Comment #7 from Jan Wassenberg --- Sure, added GCC11 .ii. It does indeed still freeze GCC 12.

[Bug rtl-optimization/106041] [12/13 Regression] infinite loop in fast_dce at -O1 with aarch64

2022-06-21 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106041 --- Comment #6 from Jan Wassenberg --- Created attachment 53181 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53181&action=edit GCC11 zipped preprocessed source

[Bug rtl-optimization/106041] infinite loop in fast_dce at -O1 with aarch64

2022-06-21 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106041 --- Comment #4 from Jan Wassenberg --- Thanks for having a look! BTW forgot to mention: version 11.0 does not have this issue.

[Bug c++/106041] New: Long/infinite compile time for Arm SIMD -O1 or -O2 but not -O0

2022-06-20 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106041 Bug ID: 106041 Summary: Long/infinite compile time for Arm SIMD -O1 or -O2 but not -O0 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal