[PATCH] Add missing expander for vector float_extend and float_truncate [PR target/95125]
Bootstrap is ok, regression test on i386/x86-64 backend is ok.

gcc/ChangeLog
        PR target/95125
        * config/i386/sse.md (sf2dfmode_lower): New mode attribute.
        (trunc2) New expander.
        (extend2): Ditto.

gcc/testsuite/ChangeLog
        * gcc.target/i386/pr95125-avx.c: New test.
        * gcc.target/i386/pr95125-avx512f.c: Ditto.

--
BR,
Hongtao

Attachment: 0001-Add-missing-expander-for-vector-float_extend-and-flo.patch
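For readers less familiar with the RTL names in the subject line, the scalar loops below show the two conversions in source form. This is only an assumed illustration, not the testcase attached to the PR; the function names are made up, and whether GCC vectorizes such loops depends on the target and options.

```cpp
#include <cstddef>

// Widen each float to double: the per-element form of float_extend.
void extend_floats(const float* src, double* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<double>(src[i]);
}

// Narrow each double to float: the per-element form of float_truncate.
void truncate_doubles(const double* src, float* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<float>(src[i]);
}
```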
Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]
On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote:
>
> On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote:
> >
> > Hi:
> >   This patch fix non-conforming expander for
> > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > refer to PR95211, PR95256.
> >   bootstrap ok, regression test on i386/x86-64 backend is ok.
> >
> > gcc/ChangeLog:
> >         PR target/95211 PR target/95256
>
Changed.
> Please put every PR reference in a separate line.
>
> >         * config/i386/sse.md v2div2sf2): New expander.
> >         (fix_truncv2sfv2di2): Ditto.
> >         (floatv2div2sf2_internal): Renaming from
> >         floatv2div2sf2.
> >         (fix_truncv2sfv2di2_internal):
>
> The convention throughout sse,md is to prefix a standard pattern that
> is used through builtins with avx512_ instead of suffixing
> the pattern name with _internal.
>
Changed.
> >         Renaming from fix_truncv2sfv2di2.
> >         (vec_pack_float_): Adjust icode name.
> >         (vec_unpack_fix_trunc_lo_): Ditto.
> >         * config/i386/i386-builtin.def: Ditto.
>
> Uros.

Update patch.

gcc/ChangeLog:
        PR target/95211
        PR target/95256
        * config/i386/sse.md v2div2sf2): New expander.
        (fix_truncv2sfv2di2): Ditto.
        (avx512dq_floatv2div2sf2): Renaming from
        floatv2div2sf2.
        (avx512dq_fix_truncv2sfv2di2): Renaming from
        fix_truncv2sfv2di2.
        (vec_pack_float_): Adjust icode name.
        (vec_unpack_fix_trunc_lo_): Ditto.
        (vec_unpack_fix_trunc_hi_): Ditto.
        * config/i386/i386-builtin.def: Ditto.

gcc/testsuite/ChangeLog
        * gcc.target/i386/pr95211.c: New test.

--
BR,
Hongtao

Attachment: 0001-Fix-non-comforming-expander-for_V2.patch
Re: [PATCH] x86: Handle -mavx512vpopcntdq for -march=native
On Sat, May 23, 2020 at 5:07 PM H.J. Lu wrote:
>
> On Fri, May 22, 2020 at 12:42 AM Uros Bizjak wrote:
> >
> > On Thu, May 21, 2020 at 2:54 PM H.J. Lu wrote:
> > >
> > > Add -mavx512vpopcntdq for -march=native if AVX512VPOPCNTDQ is available.
> > >
> > > PR target/95258
> > > * config/i386/driver-i386.c (host_detect_local_cpu): Detect
> > > AVX512VPOPCNTDQ.
> >
> > OK.
> >
>
> OK for backports?

OK.

Thanks,
Uros.

> Thanks.
>
> >
> > > ---
> > >  gcc/config/i386/driver-i386.c | 9 ++---
> > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
> > > index 7612ddfb846..3a816400729 100644
> > > --- a/gcc/config/i386/driver-i386.c
> > > +++ b/gcc/config/i386/driver-i386.c
> > > @@ -420,6 +420,7 @@ const char *host_detect_local_cpu (int argc, const
> > > char **argv)
> > >    unsigned int has_avx5124fmaps = 0, has_avx5124vnniw = 0;
> > >    unsigned int has_gfni = 0, has_avx512vbmi2 = 0;
> > >    unsigned int has_avx512bitalg = 0;
> > > +  unsigned int has_avx512vpopcntdq = 0;
> > >    unsigned int has_shstk = 0;
> > >    unsigned int has_avx512vnni = 0, has_vaes = 0;
> > >    unsigned int has_vpclmulqdq = 0;
> > > @@ -528,6 +529,7 @@ const char *host_detect_local_cpu (int argc, const
> > > char **argv)
> > >        has_vaes = ecx & bit_VAES;
> > >        has_vpclmulqdq = ecx & bit_VPCLMULQDQ;
> > >        has_avx512bitalg = ecx & bit_AVX512BITALG;
> > > +      has_avx512vpopcntdq = ecx & bit_AVX512VPOPCNTDQ;
> > >        has_movdiri = ecx & bit_MOVDIRI;
> > >        has_movdir64b = ecx & bit_MOVDIR64B;
> > >        has_enqcmd = ecx & bit_ENQCMD;
> > > @@ -1189,6 +1191,7 @@ const char *host_detect_local_cpu (int argc, const
> > > char **argv)
> > >        const char *avx512vp2intersect = has_avx512vp2intersect ? "
> > > -mavx512vp2intersect" : " -mno-avx512vp2intersect";
> > >        const char *tsxldtrk = has_tsxldtrk ? " -mtsxldtrk " : "
> > > -mno-tsxldtrk";
> > >        const char *avx512bitalg = has_avx512bitalg ? " -mavx512bitalg" :
> > > " -mno-avx512bitalg";
> > > +      const char *avx512vpopcntdq = has_avx512vpopcntdq ? "
> > > -mavx512vpopcntdq" : " -mno-avx512vpopcntdq";
> > >        const char *movdiri = has_movdiri ? " -mmovdiri" : " -mno-movdiri";
> > >        const char *movdir64b = has_movdir64b ? " -mmovdir64b" : "
> > > -mno-movdir64b";
> > >        const char *enqcmd = has_enqcmd ? " -menqcmd" : " -mno-enqcmd";
> > > @@ -1210,9 +1213,9 @@ const char *host_detect_local_cpu (int argc, const
> > > char **argv)
> > >                         avx512ifma, avx512vbmi, avx5124fmaps,
> > > avx5124vnniw,
> > >                         clwb, mwaitx, clzero, pku, rdpid, gfni, shstk,
> > >                         avx512vbmi2, avx512vnni, vaes, vpclmulqdq,
> > > -                       avx512bitalg, movdiri, movdir64b, waitpkg,
> > > cldemote,
> > > -                       ptwrite, avx512bf16, enqcmd, avx512vp2intersect,
> > > -                       serialize, tsxldtrk, NULL);
> > > +                       avx512bitalg, avx512vpopcntdq, movdiri, movdir64b,
> > > +                       waitpkg, cldemote, ptwrite, avx512bf16, enqcmd,
> > > +                       avx512vp2intersect, serialize, tsxldtrk, NULL);
> > >      }
> > >
> > >  done:
> > > --
> > > 2.26.2
> > >
> >
>
> --
> H.J.
[pushed] Darwin: Make sanitizer local vars linker-visible.
Hi

Another case where we need linker-visible symbols in order to
preserve the ld64 atom model.

If these symbols are emitted as 'local' the linker cannot see that
they are separate from any global weak entry that precedes them.

This will cause the linker to complain that there is (apparently)
direct access to such a weak global, preventing it from being
replaced.

This is a short-term fix for the problem - we need generic handling
for relevant cases (that also does not pessimise objects by emitting
unnecessary symbols and relocations).

tested on x86_64-darwin16,
applied to master and 10.2, so far.
thanks
Iain

gcc/ChangeLog:

2020-05-23  Iain Sandoe

        * config/darwin.h (ASM_GENERATE_INTERNAL_LABEL): Make
        ubsan_{data,type},ASAN symbols linker-visible.
---
 gcc/ChangeLog       | 5 +
 gcc/config/darwin.h | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7a7b599ff93..ede1f15eb7a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2020-05-23  Iain Sandoe
+
+        * config/darwin.h (ASM_GENERATE_INTERNAL_LABEL): Make
+        ubsan_{data,type},ASAN symbols linker-visible.
+
 2020-05-22  Jan Hubicka

         * lto-streamer-out.c (DFS::DFS): Silence warning.
diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 27665b34a18..f528b1766bf 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -808,6 +808,12 @@ extern GTY(()) section * darwin_sections[NUM_DARWIN_SECTIONS];
   do {                                                  \
     if (strcmp ("LC", PREFIX) == 0)                     \
       sprintf (LABEL, "*%s%ld", "lC", (long)(NUM));     \
+    else if (strcmp ("Lubsan_data", PREFIX) == 0)       \
+      sprintf (LABEL, "*%s%ld", "lubsan_data", (long)(NUM));\
+    else if (strcmp ("Lubsan_type", PREFIX) == 0)       \
+      sprintf (LABEL, "*%s%ld", "lubsan_type", (long)(NUM));\
+    else if (strcmp ("LASAN", PREFIX) == 0)             \
+      sprintf (LABEL, "*%s%ld", "lASAN", (long)(NUM));\
     else                                                \
       sprintf (LABEL, "*%s%ld", PREFIX, (long)(NUM));   \
   } while (0)
--
2.24.1
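The effect of the macro change can be sketched as an ordinary function. This is a hypothetical stand-in written for illustration (the name darwin_internal_label and the table are not part of GCC); it only mirrors the prefix rewriting visible in the diff, where selected "L" prefixes are emitted with a lower-case "l" so that the labels stay visible to the linker.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for ASM_GENERATE_INTERNAL_LABEL on Darwin.
// Prefixes in the table get "l" in place of the leading "L"; all other
// prefixes are passed through unchanged.
std::string darwin_internal_label(const char* prefix, long num) {
    static const char* const linker_visible[] = {
        "LC", "Lubsan_data", "Lubsan_type", "LASAN"};
    char buf[64];
    for (const char* p : linker_visible) {
        if (std::strcmp(p, prefix) == 0) {
            // Skip the leading 'L' and substitute 'l'.
            std::snprintf(buf, sizeof buf, "*l%s%ld", prefix + 1, num);
            return buf;
        }
    }
    std::snprintf(buf, sizeof buf, "*%s%ld", prefix, num);
    return buf;
}
```

For example, darwin_internal_label("LASAN", 3) yields "*lASAN3", while a prefix outside the table, such as "LFB", keeps its capital "L".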
[PATCH v1 1/2][PPC64] [PR88877]
Here is a discussion we had some time ago regarding the defect:
https://gcc.gnu.org/pipermail/gcc/2019-January/227834.html

Please see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88877 for testcase behavior.

We are incorporating Jakub's suggestion, quoted below, in this patch series.

Jakub wrote:
""
Yeah, all the callers of emit_library_call* would need to be changed to pass
triplets rtx, machine_mode, int/bool /*unsignedp*/, instead of just
rtx_mode_t pair.
""

In this patch series trying to address same by creating a struct Tuple
which bundles existing rtx and machine_mode and added one more
bool member which store unsigned_p which by default is false.
This patch does not change underlying behavior yet. This will be done in
follow up patches.

ChangeLog Entry:

2020-05-24  Kamlesh Kumar

        * rtl.h (Tuple): Defined and typedefed to rtx_mode_t.
        (emit_library_call): Added default arg unsigned_p.
        (emit_library_call_value): Added default arg unsigned_p.
---
 gcc/rtl.h | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/gcc/rtl.h b/gcc/rtl.h
index b0b1aac..ee42de7 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2238,10 +2238,20 @@ struct address_info {
   enum rtx_code base_outer_code;
 };

-/* This is used to bundle an rtx and a mode together so that the pair
-   can be used with the wi:: routines.  If we ever put modes into rtx
-   integer constants, this should go away and then just pass an rtx in.  */
-typedef std::pair rtx_mode_t;
+/* This is used to bundle an rtx and a mode and unsignedness together so
+   that the tuple can be used with the wi:: routines.  If we ever put modes
+   into rtx integer constants, this should go away and then just pass an rtx in.  */
+typedef struct Tuple {
+  rtx first;
+  machine_mode second;
+  /* unsigned_p */
+  bool third;
+  Tuple (rtx f, machine_mode s, bool t = false) {
+    first = f;
+    second = s;
+    third = t;
+  }
+} rtx_mode_t;

 namespace wi
 {
@@ -4176,9 +4186,9 @@ emit_library_call (rtx fun, libcall_type fn_type, machine_mode outmode)

 inline void
 emit_library_call (rtx fun, libcall_type fn_type, machine_mode outmode,
-                   rtx arg1, machine_mode arg1_mode)
+                   rtx arg1, machine_mode arg1_mode, bool unsigned_p = false)
 {
-  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode) };
+  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode, unsigned_p) };
   emit_library_call_value_1 (0, fun, NULL_RTX, fn_type, outmode, 1, args);
 }

@@ -4238,9 +4248,9 @@ emit_library_call_value (rtx fun, rtx value, libcall_type fn_type,

 inline rtx
 emit_library_call_value (rtx fun, rtx value, libcall_type fn_type,
                          machine_mode outmode,
-                         rtx arg1, machine_mode arg1_mode)
+                         rtx arg1, machine_mode arg1_mode, bool unsigned_p = false)
 {
-  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode) };
+  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode, unsigned_p) };
   return emit_library_call_value_1 (1, fun, value, fn_type, outmode, 1, args);
 }

--
2.7.4
Re: [PATCH] Extend std::copy/std::copy_n char* overload to deque iterator
Now tested in C++98 mode, there was indeed a small problem.

I even wonder if I shouldn't have extended the std::copy overload to any call with a deque iterator as the output, so that it is transformed into an output to a pointer.

Ok to commit ?

François

On 23/05/20 6:37 pm, Jonathan Wakely wrote:
On 22/05/20 22:57 +0200, François Dumont via Libstdc++ wrote:
On 21/05/20 2:17 pm, Jonathan Wakely wrote:

Why is the optimization not done for C++03 mode?

I did it this way because the new std::copy overload relies on std::copy_n implementation details, and std::copy_n is a C++11 algo.

It looks like the uses of 'auto' can be replaced easily, and __enable_if_t<> can be replaced with __gnu_cxx::__enable_if<>::__type.

But yes, we can indeed provide those implementation details in pre-C++11. This is what I've done in this new version.

Tested under Linux x86_64 in default c++ mode. I tried to use CXXFLAGS=-std=c++03 but it doesn't seem to work even if I do see the option in build logs. I remember you advised a different approach, can you tell me again ?

See the documentation: https://gcc.gnu.org/onlinedocs/libstdc++/manual/test.html#test.run.permutations diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc index e773f32b256..d7dbe64f3e1 100644 --- a/libstdc++-v3/include/bits/deque.tcc +++ b/libstdc++-v3/include/bits/deque.tcc @@ -1065,6 +1065,57 @@ _GLIBCXX_END_NAMESPACE_CONTAINER return __result; } + template +typename __gnu_cxx::__enable_if< + __is_char<_CharT>::__value, + _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> >::__type +__copy_move_a2( + istreambuf_iterator<_CharT, char_traits<_CharT> > __first, + istreambuf_iterator<_CharT, char_traits<_CharT> > __last, + _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> __result) +{ + if (__first == __last) + return __result; + + for (;;) + { + const std::ptrdiff_t __len = __result._M_last - __result._M_cur; + const std::ptrdiff_t __nb + = std::__copy_n_a(__first, __len, __result._M_cur, false) + - __result._M_cur; + __result += __nb; + + if (__nb != __len) + break; + } + + return __result; +} + + template +typename __gnu_cxx::__enable_if< + __is_char<_CharT>::__value, + _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> >::__type +__copy_n_a( + istreambuf_iterator<_CharT, char_traits<_CharT> > __it, _Size __size, + _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> __result, + bool __strict) +{ + if (__size == 0) + return __result; + + do + { + const _Size __len + = std::min<_Size>(__result._M_last - __result._M_cur, __size); + std::__copy_n_a(__it, __len, __result._M_cur, __strict); + __result += __len; + __size -= __len; + } + while (__size != 0); + return __result; +} + template _OI diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h index 932ece55529..70d8232aece 100644 --- a/libstdc++-v3/include/bits/stl_algo.h +++ b/libstdc++-v3/include/bits/stl_algo.h @@ -705,31 +705,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return __result; } - template -_GLIBCXX20_CONSTEXPR 
-_OutputIterator -__copy_n_a(_InputIterator __first, _Size __n, _OutputIterator __result) -{ - if (__n > 0) - { - while (true) - { - *__result = *__first; - ++__result; - if (--__n > 0) - ++__first; - else - break; - } - } - return __result; -} - - template -__enable_if_t<__is_char<_CharT>::__value, _CharT*> -__copy_n_a(istreambuf_iterator<_CharT, char_traits<_CharT>>, - _Size, _CharT*); - template _GLIBCXX20_CONSTEXPR _OutputIterator @@ -738,7 +713,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { return std::__niter_wrap(__result, __copy_n_a(__first, __n, - std::__niter_base(__result))); + std::__niter_base(__result), true)); } template::value) { return __it; } + template +_Ite +__niter_base(const ::__gnu_debug::_Safe_iterator<_Ite, _Seq, + std::random_access_iterator_tag>&); + // Reverse the __niter_base transformation to get a // __normal_iterator back again (this assumes that __normal_iterator // is only used to wrap random access iterators, like pointers). @@ -466,6 +471,15 @@ _GLIBCXX_END_NAMESPACE_CONTAINER __copy_move_a2(istreambuf_iterator<_CharT, char_traits<_CharT> >, istreambuf_iterator<_CharT, char_traits<_CharT> >, _CharT*); + template +typename __gnu_cxx::__enable_if< + __is_char<_CharT>::__value, + _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> >::__type +__copy_move_a2( + istreambuf_iterator<_CharT, char_traits<_CharT> >, + istreambuf_iterator<_CharT, char_traits<_CharT> >, + _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*>); + template _GLIBCXX20_CONSTEXPR inline _OI @@ -539,6 +553,41 @@ _
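The idea behind the deque overloads in the patch (fill the space left in the current node, then move to the next node) can be sketched without libstdc++ internals. The SegmentedBuffer type, kChunkSize, and copy_n_chunked below are hypothetical names invented for this illustration; they stand in for the deque node layout and are not the data structure or functions used by the patch.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t kChunkSize = 4;  // assumed, deliberately tiny node size

// Hypothetical segmented buffer standing in for std::deque's nodes.
struct SegmentedBuffer {
    std::vector<std::array<char, kChunkSize>> chunks;
    std::size_t chunk = 0;   // index of the node being filled
    std::size_t offset = 0;  // write position inside that node
};

// Copy n characters with one bulk std::copy_n per node instead of one
// assignment per character.
void copy_n_chunked(const char* src, std::size_t n, SegmentedBuffer& out) {
    while (n != 0) {
        if (out.chunk == out.chunks.size())
            out.chunks.emplace_back();  // allocate the next node
        const std::size_t room = kChunkSize - out.offset;
        const std::size_t len = std::min(room, n);
        std::copy_n(src, len, out.chunks[out.chunk].data() + out.offset);
        src += len;
        n -= len;
        out.offset += len;
        if (out.offset == kChunkSize) {  // node full: advance to the next
            ++out.chunk;
            out.offset = 0;
        }
    }
}
```

Copying ten characters with a node size of four takes three bulk copies (4, 4, and 2 characters) and leaves the write position at offset 2 of the third node.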
Re: [PATCH v1 1/2][PPC64] [PR88877]
Hi!

On Sun, May 24, 2020 at 07:03:13PM +0530, Kamlesh Kumar wrote:
> In this patch series trying to address same by creating a struct Tuple
> which bundles existing rtx and machine_mode and added one more
> bool member which store unsigned_p which by default is false.

The idea is good. However, you cannot call something as specific as
this "tuple", in a header file that is used everywhere even. (We also
do not have a "leading caps on types" convention).

> This patch does not change underlying behavior yet. This will be done in
> follow up patches.

Thanks :-)

> * rtl.h (Tuple): Defined and typedefed to rtx_mode_t.

It's the other way around: rtx_mode_t is typedeffed to struct Tuple, so
rtx_mode_t should be listed to the left of a : as well. OTOH, you don't
need to name Tuple at all... It should not *have* a constructor, since
you declared it as class... But you can just use std::tuple here?

> (emit_library_call): Added default arg unsigned_p.
> (emit_library_call_value): Added default arg unsigned_p.

Yeah, eww. Default arguments have all the problems you had before,
except now it is hidden and much more surprising.

Those functions really should take rtx_mode_t arguments?

Thanks again for working on this,


Segher
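One possible reading of these review comments, sketched under stated assumptions: the bundle is a plain aggregate with no "Tuple" name, no constructor, and no default for unsigned_p, and functions receive the bundle itself. The rtx and machine_mode stand-ins, the field names, and count_unsigned_args are hypothetical and exist only so the sketch compiles outside GCC; this is not the patch that was eventually proposed.

```cpp
#include <cstddef>
#include <vector>

// Stand-ins for GCC's real types, for illustration only.
struct rtx_def {};
using rtx = rtx_def*;
enum machine_mode { QImode_stub, SImode_stub };

// Plain aggregate: every caller has to spell out the signedness, so a
// forgotten argument is a compile error rather than a silent 'false'.
struct rtx_mode_t {
    rtx value;
    machine_mode mode;
    bool unsigned_p;
};

// Hypothetical consumer that takes the bundled arguments directly.
std::size_t count_unsigned_args(const std::vector<rtx_mode_t>& args) {
    std::size_t n = 0;
    for (const rtx_mode_t& a : args)
        if (a.unsigned_p)
            ++n;
    return n;
}
```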
Re: [PATCH] Add support for C++20 barriers
This time with 100% more patch…

Attachment: 0001-Add-support-for-C-20-barriers_f.patch

> On May 23, 2020, at 3:58 PM, Thomas Rodgers wrote:
>
> This patch requires the patch for atomic::wait/notify to be applied first.
>
> This implementation is based on the libc++ implementation, but excludes the
> alternative “central barrier” implementation for now as there is no standard
> way to switch between the two.
>
>    * include/Makefile.am (std_headers): Add new header.
>    * include/Makefile.in: Regenerate.
>    * include/std/barrier: New file.
>    * testsuite/30_thread/barrier/1.cc: New test.
>    * testsuite/30_thread/barrier/2.cc: Likewise.
>    * testsuite/30_thread/barrier/arrive_and_drop.cc: Likewise.
>    * testsuite/30_thread/barrier/arrive_and_wait.cc: Likewise.
>    * testsuite/30_thread/barrier/arrive.cc: Likewise.
>    * testsuite/30_thread/barrier/completion.cc: Likewise.
>    * testsuite/30_thread/barrier/max.cc: Likewise.
>
Re: [PATCH] Add support for C++20 barriers
* Thomas Rodgers:

> +    static __gthread_t
> +    _S_get_tid() noexcept
> +    {
> +#ifdef __GLIBC__
> +      // For the GNU C library pthread_self() is usable without linking to
> +      // libpthread.so but returns 0, so we cannot use it in single-threaded
> +      // programs, because this_thread::get_id() != thread::id{} must be true.
> +      // We know that pthread_t is an integral type in the GNU C library.
> +      if (!__gthread_active_p())
> +        return 1;
> +#endif
> +      return __gthread_self();
> +    }

This comment seems outdated or incomplete. pthread_self returns a
proper pointer since glibc 2.27, I believe.

I'm also not sure how the difference is observable for the libstdc++
implementation. Late loading of libpthread isn't quite supported.
Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]
On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote:
>
> On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote:
> >
> > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote:
> > >
> > > Hi:
> > >   This patch fix non-conforming expander for
> > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > > refer to PR95211, PR95256.
> > >   bootstrap ok, regression test on i386/x86-64 backend is ok.
> > >
> > > gcc/ChangeLog:
> > >         PR target/95211 PR target/95256
> >
> Changed.
> > Please put every PR reference in a separate line.
> >
> > >         * config/i386/sse.md v2div2sf2): New expander.
> > >         (fix_truncv2sfv2di2): Ditto.
> > >         (floatv2div2sf2_internal): Renaming from
> > >         floatv2div2sf2.
> > >         (fix_truncv2sfv2di2_internal):
> >
> > The convention throughout sse,md is to prefix a standard pattern that
> > is used through builtins with avx512_ instead of suffixing
> > the pattern name with _internal.
> >
> Changed.
> > >         Renaming from fix_truncv2sfv2di2.
> > >         (vec_pack_float_): Adjust icode name.
> > >         (vec_unpack_fix_trunc_lo_): Ditto.
> > >         * config/i386/i386-builtin.def: Ditto.
> >
> > Uros.
>
> Update patch.

The patch is wrong, and the correct way to fix these patterns is more complex:

a) the pattern should not access register in mode, narrower than 128
bits, as this implies MMX register in non-TARGET-MMX-WITH-SSE targets.
So, the correct way to define insn with narrow mode is to use
vec_select, something like:

(define_insn "sse4_1_v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
    (any_extend:V8HI
      (vec_select:V8QI
        (match_operand:V16QI 1 "register_operand" "Yr,*x,v")
        (parallel [(const_int 0) (const_int 1)
               (const_int 2) (const_int 3)
               (const_int 4) (const_int 5)
               (const_int 6) (const_int 7)]]

The instruction accesses the memory in the correct mode, so the memory
operand is:

(define_insn "*sse4_1_v8qiv8hi2_1"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
    (any_extend:V8HI
      (match_operand:V8QI 1 "memory_operand" "m,m,m")))]

and a pre-reload split has to be introduced to convert insn from
register form to memory form, when memory gets propagated to the insn:

(define_insn_and_split "*sse4_1_v8qiv8hi2_2"
  [(set (match_operand:V8HI 0 "register_operand")
    (any_extend:V8HI
      (vec_select:V8QI
        (subreg:V16QI
          (vec_concat:V2DI
            (match_operand:DI 1 "memory_operand")
            (const_int 0)) 0)
        (parallel [(const_int 0) (const_int 1)
               (const_int 2) (const_int 3)
               (const_int 4) (const_int 5)
               (const_int 6) (const_int 7)]]

For a middle end to use this insn, an expander is used:

(define_expand "v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand")
    (any_extend:V8HI
      (match_operand:V8QI 1 "nonimmediate_operand")))]

b) Similar approach is used when an output is narrower than 128 bits:

(define_insn "*floatv2div2sf2"
  [(set (match_operand:V4SF 0 "register_operand" "=v")
    (vec_concat:V4SF
        (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
        (match_operand:V2SF 2 "const0_operand" "C")))]

In your concrete case,

(define_insn "fix_truncv2sfv2di2"
  [(set (match_operand:V2DI 0 "register_operand" "=v")
    (any_fix:V2DI
      (vec_select:V2SF
        (match_operand:V4SF 1 "nonimmediate_operand" "vm")
        (parallel [(const_int 0) (const_int 1)]]

is already _NOT_ defined in a correct way as far as memory operand is
concerned, see a) above. But, , we will apparently have to live
with that.

The problem is, that it is named as a standard named pattern, so
middle-end discovers it and tries to use it. It should be renamed with
avx512dq_... prefix.

Let's give middle-end something correct, similar to:

(define_expand "v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand")
    (any_extend:V8HI
      (match_operand:V8QI 1 "nonimmediate_operand")))]
  "TARGET_SSE4_1"
{
  if (!MEM_P (operands[1]))
    {
      operands[1] = simplify_gen_subreg (V16QImode, operands[1], V8QImode, 0);
      emit_insn (gen_sse4_1_v8qiv8hi2 (operands[0], operands[1]));
      DONE;
    }
})

The second case is with v2sf output, less than 128 bits wide:

(define_insn "*floatv2div2sf2"
  [(set (match_operand:V4SF 0 "register_operand" "=v")
    (vec_concat:V4SF
        (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
        (match_operand:V2SF 2 "const0_operand" "C")))]

The above insn pattern is OK, we access the output register with
128bit access, so we are sure no MMX reg will be generated. The
problem is with the existing expander

(define_expand "floatv2div2sf2"
  [(set (match_operand:V4SF 0 "register_operand" "=v")
    (vec_concat:V4SF
        (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
        (match_dup 2)))]
  "TARGET_AVX512DQ && TARGET_AVX512VL"
  "o
Re: [PATCH] Add missing expander for vector float_extend and float_truncate [PR target/95125]
On Sun, May 24, 2020 at 9:20 AM Hongtao Liu wrote:
>
> Bootstrap is ok, regression test on i386/x86-64 backend is ok.
>
> gcc/ChangeLog
>         PR target/95125
>         * config/i386/sse.md (sf2dfmode_lower): New mode attribute.
>         (trunc2) New expander.
>         (extend2): Ditto.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/pr95125-avx.c: New test.
>         * gcc.target/i386/pr95125-avx512f.c: Ditto.

OK.

Thanks,
Uros.
Re: [PATCH] Add support for C++20 barriers
On Sun, 24 May 2020 at 18:55, Florian Weimer wrote:
>
> * Thomas Rodgers:
>
> > +    static __gthread_t
> > +    _S_get_tid() noexcept
> > +    {
> > +#ifdef __GLIBC__
> > +      // For the GNU C library pthread_self() is usable without linking to
> > +      // libpthread.so but returns 0, so we cannot use it in single-threaded
> > +      // programs, because this_thread::get_id() != thread::id{} must be
> > true.
> > +      // We know that pthread_t is an integral type in the GNU C library.
> > +      if (!__gthread_active_p())
> > +        return 1;
> > +#endif
> > +      return __gthread_self();
> > +    }
>
> This comment seems outdated or incomplete. pthread_self returns a
> proper pointer since glibc 2.27, I believe.

The comment is copied from the header, and dates from 2015.

> I'm also not sure how the difference is observable for the libstdc++
> implementation. Late loading of libpthread isn't quite supported.

It's nothing to do with late loading. A single threaded program that
doesn't create any threads and doesn't link to libpthread can still
expect std::this_thread::get_id() != std::thread::id() to be true in
the main (and only) thread. If pthread_self() returns 0, and
thread::id() default constructs with a value of 0, then we can't
distinguish "the main thread" from "not a thread".

But I do see a non-zero value from glibc now, which is great. I'll add
it to my TODO list to remove that workaround from .
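The requirement described above can be written as a small check. The helper name is made up for this sketch; it uses only the standard std::this_thread::get_id() and the default-constructed std::thread::id, which represents "not a thread".

```cpp
#include <thread>

// True when the calling thread's id differs from the default-constructed
// "not a thread" id. Per the discussion above this must hold even in a
// program that never creates a second thread.
bool current_thread_has_id() {
    return std::this_thread::get_id() != std::thread::id();
}
```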
Re: [PATCH] Add support for C++20 barriers
* Jonathan Wakely:

> On Sun, 24 May 2020 at 18:55, Florian Weimer wrote:
>>
>> * Thomas Rodgers:
>>
>> > +    static __gthread_t
>> > +    _S_get_tid() noexcept
>> > +    {
>> > +#ifdef __GLIBC__
>> > +      // For the GNU C library pthread_self() is usable without linking to
>> > +      // libpthread.so but returns 0, so we cannot use it in
>> > single-threaded
>> > +      // programs, because this_thread::get_id() != thread::id{} must be
>> > true.
>> > +      // We know that pthread_t is an integral type in the GNU C library.
>> > +      if (!__gthread_active_p())
>> > +        return 1;
>> > +#endif
>> > +      return __gthread_self();
>> > +    }
>>
>> This comment seems outdated or incomplete. pthread_self returns a
>> proper pointer since glibc 2.27, I believe.
>
> The comment is copied from the header, and dates from 2015.
>
>> I'm also not sure how the difference is observable for the libstdc++
>> implementation. Late loading of libpthread isn't quite supported.
>
> It's nothing to do with late loading. A single threaded program that
> doesn't create any threads and doesn't link to libpthread can still
> expect std::this_thread::get_id() != std::thread::id() to be true in
> the main (and only) thread. If pthread_self() returns 0, and
> thread::id() default constructs with a value of 0, then we can't
> distinguish "the main thread" from "not a thread".

Ahh. Yes, the POSIX interface does not have any “not a thread” value
for pthread_t, so I can see how it's difficult to implement something
on top of it that meets the C++ requirements.
[patch, fortran] Fix memory leaks for finalized types
Hello world,

this patch fixes an 8/9/10/11 regression, where finalized types
were not finalized (and deallocated), which led to memory leaks.

Once the offending commit was identified (thanks, Harald!) error
analysis was rather straightforward. The central idea was that
it is the expression that should not be finalized twice, not the
component (which is shared).

Less straightforward was writing a meaningful test case; why I
could not get

! { dg-final { scan-tree-dump-times "__builtin_free.*dat" 2 "original" } }

to work (dejagnu always complained about finding it once) I don't know.

Anyway, here is the patch. I have regression-tested it and made sure
that the size part of PR 87352 did not go up again through the roof.
I have also tested all affected finalize test cases with valgrind
and made sure they are still valid and do not leak.

Once this is in, it will be interesting to see if any other finalizer
bugs are affected.

So, OK for trunk and for backporting to all affected branches?

Regards

        Thomas

Finalization depends on the expression, not on the component.

gcc/fortran/ChangeLog:

2020-05-24  Thomas Koenig

        PR fortran/94361
        * class.c (finalize_component): Use expr->finalized instead of
        comp->finalized.
        * gfortran.h (gfc_component): Remove finalized member.
        (gfc_expr): Add it here instead.

gcc/testsuite/ChangeLog:

2020-05-24  Thomas Koenig

        PR fortran/94361
        * gfortran.dg/finalize_28.f90: Adjusted free counts.
        * gfortran.dg/finalize_33.f90: Likewise.
        * gfortran.dg/finalize_34.f90: Likewise.
        * gfortran.dg/finalize_35.f90: New test.

diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 9aa3eb7282c..b5a1edae27f 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -911,7 +911,7 @@ finalize_component (gfc_expr *expr, gfc_symbol *derived, gfc_component *comp,
   if (!comp_is_finalizable (comp))
     return;

-  if (comp->finalized)
+  if (expr->finalized)
     return;

   e = gfc_copy_expr (expr);
@@ -1002,6 +1002,7 @@ finalize_component (gfc_expr *expr, gfc_symbol *derived, gfc_component *comp,
         }
       else
         (*code) = cond;
+
     }
   else if (comp->ts.type == BT_DERIVED
             && comp->ts.u.derived->f2k_derived
@@ -1041,7 +1042,7 @@ finalize_component (gfc_expr *expr, gfc_symbol *derived, gfc_component *comp,
                             sub_ns);
       gfc_free_expr (e);
     }
-  comp->finalized = true;
+  expr->finalized = 1;
 }


diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 7094791e871..5af44847f9b 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1107,7 +1107,6 @@ typedef struct gfc_component
   struct gfc_typebound_proc *tb;
   /* When allocatable/pointer and in a coarray the associated token.  */
   tree caf_token;
-  bool finalized;
 }
 gfc_component;

@@ -2218,6 +2217,9 @@ typedef struct gfc_expr
   /* Set this if the expression came from expanding an array constructor.  */
   unsigned int from_constructor : 1;

+  /* Set this if the expression has already been finalized.  */
+  unsigned int finalized : 1;
+
   /* If an expression comes from a Hollerith constant or compile-time
      evaluation of a transfer statement, it may have a prescribed target-
      memory representation, and these cannot always be backformed from
diff --git a/gcc/testsuite/gfortran.dg/finalize_28.f90 b/gcc/testsuite/gfortran.dg/finalize_28.f90
index 597413b2dd3..f0c9665252f 100644
--- a/gcc/testsuite/gfortran.dg/finalize_28.f90
+++ b/gcc/testsuite/gfortran.dg/finalize_28.f90
@@ -21,4 +21,4 @@ contains
     integer, intent(out) :: edges(:,:)
   end subroutine coo_dump_edges
 end module coo_graphs
-! { dg-final { scan-tree-dump-times "__builtin_free" 5 "original" } }
+! { dg-final { scan-tree-dump-times "__builtin_free" 6 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/finalize_33.f90 b/gcc/testsuite/gfortran.dg/finalize_33.f90
index 2205f9eed7f..3857e4485ee 100644
--- a/gcc/testsuite/gfortran.dg/finalize_33.f90
+++ b/gcc/testsuite/gfortran.dg/finalize_33.f90
@@ -116,4 +116,4 @@ contains
 ! (iii) mci_template
 end program main_ut
 ! { dg-final { scan-tree-dump-times "__builtin_malloc" 17 "original" } }
-! { dg-final { scan-tree-dump-times "__builtin_free" 19 "original" } }
+! { dg-final { scan-tree-dump-times "__builtin_free" 20 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/finalize_34.f90 b/gcc/testsuite/gfortran.dg/finalize_34.f90
index e2f02a5c51c..fef7dac6d89 100644
--- a/gcc/testsuite/gfortran.dg/finalize_34.f90
+++ b/gcc/testsuite/gfortran.dg/finalize_34.f90
@@ -22,4 +22,4 @@ program main
   use testmodule
   type(evtlist_type), dimension(10) :: a
 end program main
-! { dg-final { scan-tree-dump-times "__builtin_free" 8 "original" } }
+! { dg-final { scan-tree-dump-times "__builtin_free" 12 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/finalize_35.f90 b/gcc/testsuite/gfortran.dg/finalize_35.f90
new f
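Why the location of the flag matters can be shown with a toy model. Component and Expr below are hypothetical stand-ins for gfc_component and gfc_expr, and the two functions are invented for this sketch; it models only the "shared component versus per-expression flag" idea stated in the message, not the gfortran front end itself.

```cpp
#include <vector>

// Stand-in for gfc_component: one object shared by every use of the type.
struct Component {
    bool finalized = false;
};

// Stand-in for gfc_expr: one object per expression referring to the component.
struct Expr {
    explicit Expr(Component* c) : comp(c) {}
    Component* comp;
    bool finalized = false;
};

// Old scheme: the flag sits on the shared component, so the first
// expression marks it and every later expression is skipped.
int finalize_with_component_flag(std::vector<Expr>& exprs) {
    int finalized = 0;
    for (Expr& e : exprs) {
        if (e.comp->finalized)
            continue;
        ++finalized;
        e.comp->finalized = true;
    }
    return finalized;
}

// New scheme: the flag sits on the expression, so each expression is
// finalized exactly once.
int finalize_with_expr_flag(std::vector<Expr>& exprs) {
    int finalized = 0;
    for (Expr& e : exprs) {
        if (e.finalized)
            continue;
        ++finalized;
        e.finalized = true;
    }
    return finalized;
}
```

With three expressions that share one component, the first function finalizes only one of them while the second finalizes all three, which corresponds to the leak described in the message.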
Re: [Patch] PR fortran/95106 - truncation of long symbol names with EQUIVALENCE
Hi Harald,

> OK for master?

The patch is OK.

Regarding the test case - I think it should be OK. If not, expect
to hear from people soon, you could then still restrict it to Linux
(or something else along those lines).

Regards

        Thomas
PR libfortran/95195 - improve runtime error for namelist i/o to unformatted file
Without the patch below, an attempted namelist write to an unformatted file -
which is prohibited by the standard - would generate the following runtime error:

At line 12 of file pr95195.f90 (unit = 10, file = 'test.dat')
Fortran runtime error: End of record

followed by some backtrace.

The patch attempts to generate an error pointing the user to the real issue.

Regtested on x86_64-pc-linux-gnu.

OK for master?

Thanks,
Harald


PR libfortran/95195 - improve runtime error for namelist i/o to unformatted file

Namelist input/output to unformatted files is prohibited.
Generate useful runtime errors instead of misleading ones.

libgfortran/

2020-05-24  Harald Anlauf

        PR fortran/95195
        * io/transfer.c (finalize_transfer): Generate runtime error for
        namelist input/output to unformatted file.

gcc/testsuite/

2020-05-24  Harald Anlauf

        PR fortran/95195
        * gfortran.dg/namelist_97.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/namelist_97.f90 b/gcc/testsuite/gfortran.dg/namelist_97.f90
new file mode 100644
index 000..4907e46b46a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/namelist_97.f90
@@ -0,0 +1,14 @@
+! { dg-do run }
+! { dg-output "At line 12 .*" }
+! { dg-shouldfail "Fortran runtime error: Namelist formatting .* FORM='UNFORMATTED'" }
+!
+! PR95195 - improve runtime error when writing a namelist to an unformatted file
+
+program test
+  character(len=11) :: my_form = 'unformatted'
+  integer :: i = 1, j = 2, k = 3
+  namelist /nml1/ i, j, k
+  open (unit=10, file='test.dat', form=my_form)
+  write (unit=10, nml=nml1)
+  close (unit=10, status='delete')
+end program test
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index b8db47dbff9..d071c1ce915 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -4123,6 +4123,14 @@ finalize_transfer (st_parameter_dt *dtp)

   if ((dtp->u.p.ionml != NULL)
       && (cf & IOPARM_DT_HAS_NAMELIST_NAME) != 0)
     {
+       if (dtp->u.p.current_unit->flags.form == FORM_UNFORMATTED)
+         {
+           generate_error (&dtp->common, LIBERROR_OPTION_CONFLICT,
+                           "Namelist formatting for unit connected "
+                           "with FORM='UNFORMATTED");
+           return;
+         }
+
        dtp->u.p.namelist_mode = 1;
        if ((cf & IOPARM_DT_NAMELIST_READ_MODE) != 0)
          namelist_read (dtp);
[PATCH] Port libgccjit to Windows.
Hello gcc devs.

I have ported libgccjit to Windows. I have tested it with the native-compilation branch of Emacs so I'm confident that it works well.

The work is not finished though, I could use some help with these two points:

- I have had to concede defeat to libtool and Automake. I could not get libgccjit to create a dll and put it in the correct directories. So for now we'll have to copy lib/libgccjit.so to bin/libgccjit.dll.

- It is not necessary to use --enable-host-shared in Windows (I tested it), but I don't know the proper way to disable that check.

Nicolas

0001-Incomplete-port-of-libgccjit-to-Windows.patch
Description: Binary data
[RFC PATCH] i386: Remove broadcasts from TARGET_MMX_WITH_SSE vec_dup insn patterns
XMM broadcast instructions broadcast a value from a general register to all elements of the vector. This is not allowed for TARGET_MMX_WITH_SSE, where the bits outside the lower 64 bits are expected to load or retain a zero value.

The following testcases expect a broadcast and are thus invalid:

FAIL: gcc.target/i386/sse2-mmx-18b.c scan-assembler-not movd
FAIL: gcc.target/i386/sse2-mmx-18b.c scan-assembler-times pbroadcastd 1
FAIL: gcc.target/i386/sse2-mmx-19b.c scan-assembler-not movd
FAIL: gcc.target/i386/sse2-mmx-19b.c scan-assembler-times pbroadcastw 1
FAIL: gcc.target/i386/sse2-mmx-19d.c scan-assembler-times pbroadcastw 1
FAIL: gcc.target/i386/sse2-mmx-19e.c scan-assembler-times pbroadcastw 1

These testcases will be fixed or removed entirely.

(The patch is a prerequisite for implementing support for generic v2sf/v2si/v4hi shuffles.)

2020-05-24  Uroš Bizjak

gcc/ChangeLog:

        * config/i386/mmx.md (*vec_dupv2sf): Redefine as define_insn.
        (mmx_pshufw_1): Change Yv constraint to xYw.  Correct type attribute.
        (*vec_dupv4hi): Redefine as define_insn.
        Remove alternative with general register input.
        (*vec_dupv2si): Ditto.

Uros.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 5deef683b0b..b5564711aa4 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -947,27 +947,22 @@ (set_attr "prefix_extra" "1") (set_attr "mode" "V2SF")]) -(define_insn_and_split "*vec_dupv2sf" +(define_insn "*vec_dupv2sf" [(set (match_operand:V2SF 0 "register_operand" "=y,x,Yv") (vec_duplicate:V2SF (match_operand:SF 1 "register_operand" "0,0,Yv")))] "TARGET_MMX || TARGET_MMX_WITH_SSE" "@ punpckldq\t%0, %0 - # - #" - "TARGET_SSE && reload_completed - && SSE_REGNO_P (REGNO (operands[0]))" - [(set (match_dup 0) - (vec_duplicate:V4SF (match_dup 1)))] -{ - operands[0] = lowpart_subreg (V4SFmode, operands[0], - GET_MODE (operands[0])); -} - [(set_attr "isa" "*,sse_noavx,avx") + shufps\t{$0xe0, %0, %0|%0, %0, 0xe0} + %vmovsldup\t{%1, %0|%0, %1}" + [(set_attr "isa" "*,sse_noavx,sse3") (set_attr "mmx_isa" "native,*,*") - (set_attr "type" "mmxcvt,ssemov,ssemov") - (set_attr "mode" "DI,TI,TI")]) + (set_attr "type" "mmxcvt,sseshuf1,sse") + (set_attr "length_immediate" "*,1,*") + (set_attr "prefix_rep" "*,*,1") + (set_attr "prefix" "*,orig,maybe_vex") + (set_attr "mode" "DI,V4SF,V4SF")]) (define_insn "*mmx_concatv2sf" [(set (match_operand:V2SF 0 "register_operand" "=y,y") @@ -1960,9 +1955,9 @@ }) (define_insn "mmx_pshufw_1" - [(set (match_operand:V4HI 0 "register_operand" "=y,Yv") + [(set (match_operand:V4HI 0 "register_operand" "=y,xYw") (vec_select:V4HI - (match_operand:V4HI 1 "register_mmxmem_operand" "ym,Yv") + (match_operand:V4HI 1 "register_mmxmem_operand" "ym,xYw") (parallel [(match_operand 2 "const_0_to_3_operand") (match_operand 3 "const_0_to_3_operand") (match_operand 4 "const_0_to_3_operand") @@ -1989,7 +1984,7 @@ } [(set_attr "isa" "*,sse2") (set_attr "mmx_isa" "native,*") - (set_attr "type" "mmxcvt,sselog") + (set_attr "type" "mmxcvt,sselog1") (set_attr "length_immediate" "1") (set_attr "mode" "DI,TI")]) @@ -2004,77 +1999,37 @@ (set_attr "prefix_extra" "1") (set_attr "mode" "DI")]) 
-(define_insn_and_split "*vec_dupv4hi" - [(set (match_operand:V4HI 0 "register_operand" "=y,xYw,Yw") +(define_insn "*vec_dupv4hi" + [(set (match_operand:V4HI 0 "register_operand" "=y,xYw") (vec_duplicate:V4HI (truncate:HI - (match_operand:SI 1 "register_operand" "0,xYw,r"] + (match_operand:SI 1 "register_operand" "0,xYw"] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)" "@ pshufw\t{$0, %0, %0|%0, %0, 0} - # - #" - "TARGET_SSE2 && reload_completed - && SSE_REGNO_P (REGNO (operands[0]))" - [(const_int 0)] -{ - rtx op; - operands[0] = lowpart_subreg (V8HImode, operands[0], - GET_MODE (operands[0])); - if (TARGET_AVX2) -{ - operands[1] = lowpart_subreg (HImode, operands[1], - GET_MODE (operands[1])); - op = gen_rtx_VEC_DUPLICATE (V8HImode, operands[1]); -} - else -{ - operands[1] = lowpart_subreg (V8HImode, operands[1], - GET_MODE (operands[1])); - rtx mask = gen_rtx_PARALLEL (VOIDmode, - gen_rtvec (8, - GEN_INT (0), - GEN_INT (0), - GEN_INT (0), - GEN_INT (0), - GEN_INT (4), - GEN_INT (5), - GEN_INT (6), -
Re: [PATCH] Port libgccjit to Windows.
On Sun, 2020-05-24 at 17:02 -0300, Nicolas Bértolo via Gcc-patches wrote: > Hello gcc devs. Hi Nicolas. > I have ported libgccjit to Windows. I have tested it with the > native-compilation branch of Emacs so I'm confident that it works well. Excellent - thanks for doing this work. Do you have copyright assignment paperwork on file? https://gcc.gnu.org/contribute.html#legal > The work is not finished though, I could use some help with these two > points: > > I have had to concede defeat to libtool and Automake. I could not get > libgccjit > to create a dll and put it in the correct directories. So for now we'll > have to > copy lib/libgccjit.so to bin/libgccjit.dll. The autotools are not my strongest suit. In a previous life I was a Windows developer, but I think it's been about 18 years since I've done any coding on Windows, so I'm going to have to trust your Windows expertise. > It is not necessary to use --enable-host-shared in Windows (I tested it), > but I > don't know the proper way to disable that check. (I'm not sure here) Various comments inline below... > Nicolas > > From 8644b979cf732e0b4d57c8281229fc3dcc9dc739 Mon Sep 17 00:00:00 2001 > From: =?UTF-8?q?Nicol=C3=A1s=20B=C3=A9rtolo?= > Date: Fri, 22 May 2020 17:54:41 -0300 > Subject: [PATCH] Incomplete port of libgccjit to Windows. > > * gcc/Makefile.in: don't look for libiberty in the "pic" subdirectory when > building for Mingw. Add dependency on xgcc with the proper extension. > * gcc/c/Make-lang.in: Remove extra slash. > * gcc/jit/Make-lang.in: Remove extra slash. > * gcc/jit/jit-playback.c: Do not chmod files in Windows. Use LoadLibrary, > FreeLibrary and GetProcAddress instead of libdl. > * gcc/jit/jit-tempdir.c: Do not use mkdtemp() in Windows. Get a filename with > GetTempFileName. 
> --- > gcc/Makefile.in| 10 +--- > gcc/c/Make-lang.in | 2 +- > gcc/jit/Make-lang.in | 10 > gcc/jit/jit-playback.c | 25 +-- > gcc/jit/jit-result.c | 46 ++ > gcc/jit/jit-tempdir.c | 56 ++ > 6 files changed, 132 insertions(+), 17 deletions(-) > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index 0fe2ba241..e6dd9f59e 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1046,10 +1046,12 @@ ALL_LINKERFLAGS = $(ALL_CXXFLAGS) > > # Build and host support libraries. > > -# Use the "pic" build of libiberty if --enable-host-shared. > +# Use the "pic" build of libiberty if --enable-host-shared, unless we are > +# building for mingw. > +LIBIBERTY_PICDIR=$(if $(findstring mingw,$(build)),,pic) > ifeq ($(enable_host_shared),yes) > -LIBIBERTY = ../libiberty/pic/libiberty.a > -BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/pic/libiberty.a > +LIBIBERTY = ../libiberty/$(LIBIBERTY_PICDIR)/libiberty.a > +BUILD_LIBIBERTY = > $(build_libobjdir)/libiberty/$(LIBIBERTY_PICDIR)/libiberty.a > else > LIBIBERTY = ../libiberty/libiberty.a > BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a > @@ -1726,7 +1728,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h > insn-codes.h \ > # This symlink makes the full installation name of the driver be available > # from within the *build* directory, for use when running the JIT library > # from there (e.g. when running its testsuite). > -$(FULL_DRIVER_NAME): ./xgcc > +$(FULL_DRIVER_NAME): ./xgcc$(exeext) > rm -f $@ > $(LN_S) $< $@ > > diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in > index 8944b9b9f..7efc7c2c3 100644 > --- a/gcc/c/Make-lang.in > +++ b/gcc/c/Make-lang.in > @@ -162,7 +162,7 @@ c.install-plugin: installdirs > # Install import library. 
> ifeq ($(plugin_implib),yes) > $(mkinstalldirs) $(DESTDIR)$(plugin_resourcesdir) > - $(INSTALL_DATA) cc1$(exeext).a > $(DESTDIR)/$(plugin_resourcesdir)/cc1$(exeext).a > + $(INSTALL_DATA) cc1$(exeext).a > $(DESTDIR)$(plugin_resourcesdir)/cc1$(exeext).a > endif > > c.uninstall: > diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in > index 38ddfad28..24f37c98b 100644 > --- a/gcc/jit/Make-lang.in > +++ b/gcc/jit/Make-lang.in > @@ -277,17 +277,17 @@ selftest-jit: > # Install hooks: > jit.install-common: installdirs > $(INSTALL_PROGRAM) $(LIBGCCJIT_FILENAME) \ > - $(DESTDIR)/$(libdir)/$(LIBGCCJIT_FILENAME) > + $(DESTDIR)$(libdir)/$(LIBGCCJIT_FILENAME) > ln -sf \ > $(LIBGCCJIT_FILENAME) \ > - $(DESTDIR)/$(libdir)/$(LIBGCCJIT_SONAME_SYMLINK) > + $(DESTDIR)$(libdir)/$(LIBGCCJIT_SONAME_SYMLINK) > ln -sf \ > $(LIBGCCJIT_SONAME_SYMLINK)\ > - $(DESTDIR)/$(libdir)/$(LIBGCCJIT_LINKER_NAME_SYMLINK) > + $(DESTDIR)$(libdir)/$(LIBGCCJIT_LINKER_NAME_SYMLINK) > $(INSTALL_DATA) $(srcdir)/jit/libgccjit.h \ > - $(DESTDIR)/$(includedir)/libgccjit.h > + $(DESTDIR)$(includedir)/libgccjit.h > $(INSTALL_DATA) $(srcdir)/jit/libgccjit++.h \ > - $(DESTDIR)/$(includedir)/libgccjit++.h > + $(DESTDIR)$(includedir)/l
Re: [PATCH] Add support for C++20 barriers
> On May 24, 2020, at 11:11 AM, Jonathan Wakely wrote: > > On Sun, 24 May 2020 at 18:55, Florian Weimer wrote: >> >> * Thomas Rodgers: >> >>> + static __gthread_t >>> + _S_get_tid() noexcept >>> + { >>> +#ifdef __GLIBC__ >>> + // For the GNU C library pthread_self() is usable without linking to >>> + // libpthread.so but returns 0, so we cannot use it in single-threaded >>> + // programs, because this_thread::get_id() != thread::id{} must be >>> true. >>> + // We know that pthread_t is an integral type in the GNU C library. >>> + if (!__gthread_active_p()) >>> + return 1; >>> +#endif >>> + return __gthread_self(); >>> + } >> >> This comment seems outdated or incomplete. pthread_self returns a >> proper pointer since glibc 2.27, I believe. > > The comment is copied from the header, and dates from 2015. Yes, this comes from to avoid pulling in all of to just get a hash from the current thread identity. I’m now using it in two places, is this worth splitting out somewhere? > >> I'm also not sure how the difference is observable for the libstdc++ >> implementation. Late loading of libpthread isn't quite supported. > > It's nothing to do with late loading. A single threaded program that > doesn't create any threads and doesn't link to libpthread can still > expect std::this_thread::get_id() != std::thread::id() to be true in > the main (and only) thread. If pthread_self() returns 0, and > thread::id() default constructs with a value of 0, then we can't > distinguish "the main thread" from "not a thread". > > But I do see a non-zero value from glibc now, which is great. I'll add > it to my TODO list to remove that workaround from .
[PATCH] diagnostics: Add function call parens matching to c_parser.
The C++ parser already tracks function call parens matching, but the C parser doesn't. This adds the same functionality to the C parser and adds a testcase showing the C++ and C parser matching function call parens in an error message. gcc/c/ChangeLog: * c-parser.c (c_parser_postfix_expression_after_primary): Add scope with matching_parens after CPP_OPEN_PAREN. gcc/testsuite/ChangeLog: * c-c++-common/missing-close-func-paren.c: New test. --- gcc/c/c-parser.c | 32 --- .../c-c++-common/missing-close-func-paren.c | 40 +++ 2 files changed, 57 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/missing-close-func-paren.c diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 5d11e7e73c16..23d6fa22b685 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -10458,21 +10458,23 @@ c_parser_postfix_expression_after_primary (c_parser *parser, break; case CPP_OPEN_PAREN: /* Function call. */ - c_parser_consume_token (parser); - for (i = 0; i < 3; i++) - { - sizeof_arg[i] = NULL_TREE; - sizeof_arg_loc[i] = UNKNOWN_LOCATION; - } - literal_zero_mask = 0; - if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN)) - exprlist = NULL; - else - exprlist = c_parser_expr_list (parser, true, false, &origtypes, - sizeof_arg_loc, sizeof_arg, - &arg_loc, &literal_zero_mask); - c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, -"expected %<)%>"); + { + matching_parens parens; + parens.consume_open (parser); + for (i = 0; i < 3; i++) + { + sizeof_arg[i] = NULL_TREE; + sizeof_arg_loc[i] = UNKNOWN_LOCATION; + } + literal_zero_mask = 0; + if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN)) + exprlist = NULL; + else + exprlist = c_parser_expr_list (parser, true, false, &origtypes, +sizeof_arg_loc, sizeof_arg, +&arg_loc, &literal_zero_mask); + parens.skip_until_found_close (parser); + } orig_expr = expr; mark_exp_read (expr.value); if (warn_sizeof_pointer_memaccess) diff --git a/gcc/testsuite/c-c++-common/missing-close-func-paren.c 
b/gcc/testsuite/c-c++-common/missing-close-func-paren.c new file mode 100644 index ..3177e250e1c3 --- /dev/null +++ b/gcc/testsuite/c-c++-common/missing-close-func-paren.c @@ -0,0 +1,40 @@ +/* { dg-options "-fdiagnostics-show-caret" } */ + +/* Verify that the C/C++ frontends show the pertinent opening symbol when + a closing symbol is missing for a function call. */ + +/* Verify that, when they are on the same line, that the opening symbol is + shown as a secondary range within the main diagnostic. */ + +extern int __attribute__((const)) foo (int a, int b, int c); + +void single_func () +{ + int single = +foo (1, (1 + 2), (1 + 2 + 3):); /* { dg-error "expected '\\)' before ':' token" } */ + /* { dg-begin-multiline-output "" } + foo (1, (1 + 2), (1 + 2 + 3):); + ~ ^ + ) + { dg-end-multiline-output "" } */ +} + +/* Verify that, when they are on different lines, that the opening symbol is + shown via a secondary diagnostic. */ + +void multi_func () +{ + int multi = +foo (1, /* { dg-message "to match this '\\('" } */ + (1 + 2), + (1 + 2 + 3):); /* { dg-error "expected '\\)' before ':' token" } */ + /* { dg-begin-multiline-output "" } + (1 + 2 + 3):); + ^ + ) + { dg-end-multiline-output "" } */ + /* { dg-begin-multiline-output "" } + foo (1, + ^ + { dg-end-multiline-output "" } */ +} -- 2.20.1
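The idea behind the patch above can be shown outside the compiler. What follows is only a toy sketch in Python, not the c-parser.c code: the class MatchingParens, its methods consume_open and require_close, the function parse_call and the message wording are all hypothetical names invented for this illustration. The point is simply that remembering where the '(' was consumed lets a later "expected ')'" diagnostic point back at it.

```python
class MatchingParens:
    """Toy helper (hypothetical): remembers where '(' was consumed."""

    def __init__(self):
        self.open_pos = None

    def consume_open(self, tokens, pos):
        # Record the position of the opening paren before moving past it.
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        self.open_pos = pos
        return pos + 1

    def require_close(self, tokens, pos):
        # On success return the next position and no error; on failure
        # return an error that also names the remembered opening paren.
        if pos < len(tokens) and tokens[pos] == ")":
            return pos + 1, None
        if pos < len(tokens):
            found = "'%s' token" % tokens[pos]
        else:
            found = "end of input"
        msg = "expected ')' before %s; to match '(' at token %d" % (
            found, self.open_pos)
        return pos, msg


def parse_call(tokens):
    """Parse NAME '(' [ARG {',' ARG}] ')' over a flat token list."""
    parens = MatchingParens()
    pos = parens.consume_open(tokens, 1)  # tokens[0] is the callee name
    if pos < len(tokens) and tokens[pos] != ")":
        pos += 1  # first argument
        while pos + 1 < len(tokens) and tokens[pos] == ",":
            pos += 2  # the comma plus the following argument
    return parens.require_close(tokens, pos)


good = parse_call(["foo", "(", "1", ",", "2", ")"])
bad = parse_call(["foo", "(", "1", ",", "2", ":", ")", ";"])
```

Here `bad` carries both the unexpected token and the index of the opening paren, which is the extra information the C++ front end already reports and the patch brings to the C front end.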
Re: [PATCH] contrib/gen_autofdo_event.py: Allow for it to work if there are more than 3 hyphens in Family-model
Javad Karabi via Gcc-patches writes:
>
> diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py
> index c97460c61c6..cd77a8686d9 100755
> --- a/contrib/gen_autofdo_event.py
> +++ b/contrib/gen_autofdo_event.py
> @@ -94,7 +94,7 @@ for j in u:
>      n = j.rstrip().split(',')
>      if len(n) >= 4 and (args.all or n[0] == cpu) and n[3] == "core":
>          if args.all:
> -            vendor, fam, model = n[0].split("-")
> +            vendor, fam, model = n[0].split("-")[:3]

That doesn't fix the problem, we really need to match the stepping too. You turned a visible failure into a silent one.

-Andi
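The objection can be reproduced in isolation. This is a minimal sketch, not code from gen_autofdo_event.py: the two keys are hypothetical entries that share family and model and differ only in their stepping field, and both helper functions are names made up for the example. Slicing with [:3] makes the unpacking succeed, but the two entries become indistinguishable; splitting at most three times keeps the stepping as a fourth field.

```python
def family_model_truncated(key):
    """Mirror the patched line: keep only vendor, family and model."""
    vendor, fam, model = key.split("-")[:3]
    return vendor, fam, model


def family_model_stepping(key):
    """Split at most three times so the stepping survives as a fourth field."""
    parts = key.split("-", 3)
    if len(parts) < 3:
        raise ValueError("unexpected CPU key: %r" % key)
    vendor, fam, model = parts[:3]
    stepping = parts[3] if len(parts) == 4 else None
    return vendor, fam, model, stepping


# Hypothetical keys: same family and model, different stepping ranges.
keys = ["GenuineIntel-6-55-[01234]", "GenuineIntel-6-55-[56789ABCDEF]"]

truncated = {family_model_truncated(k) for k in keys}
full = {family_model_stepping(k) for k in keys}
```

With these inputs `truncated` holds a single entry while `full` holds two, which is the silent loss of information described above. Whether the script should then match the stepping as a pattern or as a literal is left open here.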
[PATCH] Adjust wait logic to limit spurious evaluation of wait predicate.
* include/bits/atomic_wait.h (__waiters::_M_do_wait): adjust wakeup logic.
[PATCH] Remove binary_semaphore implementation from stop_token
* include/std/stop_token: Remove local binary_semaphore implementation. (_Stop_state_t::_M_do_try_lock): Use __thread_yield() from bits/atomic_wait.h. 0001-Remove-binary_semaphore-implementation-from-stop_tok.patch Description: Binary data
Re: [PATCH] Adjust wait logic to limit spurious evaluation of wait predicate.
And this time, with patch. wake_up_fix.patch Description: Binary data > On May 24, 2020, at 3:06 PM, Thomas Rodgers wrote: > > * include/bits/atomic_wait.h (__waiters::_M_do_wait): adjust wakeup > logic.
RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero
Hi, > -Original Message- > From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] > Sent: Saturday, May 23, 2020 10:57 PM > To: Yangfei (Felix) > Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) > Subject: Re: [PATCH PR94026] combine missed opportunity to simplify > comparisons with zero > > Hi! > > Sorry this is taking so long. > > On Wed, May 06, 2020 at 08:57:52AM +, Yangfei (Felix) wrote: > > > On Tue, Mar 24, 2020 at 06:30:12AM +, Yangfei (Felix) wrote: > > > > I modified combine emitting a simple AND operation instead of > > > > making one > > > zero_extract for this scenario. > > > > Attached please find the new patch. Hope this solves both of our > concerns. > > > > > > This looks promising. I'll try it out, see what it does on other > > > targets. (It will have to wait for GCC 11 stage 1, of course). > > It creates better code on all targets :-) A quite small improvement, but not > entirely trivial. Thanks for the effort. It's great to hear that :- ) Attached please find the v3 patch. Rebased on the latest trunk. Bootstrapped and tested on aarch64-linux-gnu. Could you please help install it? > > > p.s. Please use a correct mime type? application/octet-stream > > > isn't something I can reply to. Just text/plain is fine :-) > > > > I have using plain text now, hope that works for you. :-) > > Nope: > > [-- Attachment #2: pr94026-v2.diff --] > [-- Type: application/octet-stream, Encoding: base64, Size: 5.9K --] This time I switched to use UUEncode type for the attachment. Does it work? I am using Outlook and I didn't find the place to change the MIME type : - ( Felix pr94026-v3.diff Description: pr94026-v3.diff
Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote: > > On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote: > > > > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > > > > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > > > > > Hi: > > > > This patch fix non-conforming expander for > > > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di, > > > > refer to PR95211, PR95256. > > > > bootstrap ok, regression test on i386/x86-64 backend is ok. > > > > > > > > gcc/ChangeLog: > > > > PR target/95211 PR target/95256 > > > > > Changed. > > > Please put every PR reference in a separate line. > > > > > > > * config/i386/sse.md v2div2sf2): New expander. > > > > (fix_truncv2sfv2di2): Ditto. > > > > (floatv2div2sf2_internal): Renaming from > > > > floatv2div2sf2. > > > > (fix_truncv2sfv2di2_internal): > > > > > > The convention throughout sse,md is to prefix a standard pattern that > > > is used through builtins with avx512_ instead of suffixing > > > the pattern name with _internal. > > > > > Changed. > > > > Renaming from fix_truncv2sfv2di2. > > > > (vec_pack_float_): Adjust icode name. > > > > (vec_unpack_fix_trunc_lo_): Ditto. > > > > * config/i386/i386-builtin.def: Ditto. > > > > > > Uros. > > > > Update patch. > > The patch is wrong, and the correct way to fix these patterns is more complex: > > a) the pattern should not access register in mode, narrower than 128 > bits, as this implies MMX register in non-TARGET-MMX-WITH-SSE targets. 
> So, the correct way to define insn with narrow mode is to use > vec_select, something like: > > (define_insn "sse4_1_v8qiv8hi2" > [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") > (any_extend:V8HI > (vec_select:V8QI > (match_operand:V16QI 1 "register_operand" "Yr,*x,v") > (parallel [(const_int 0) (const_int 1) >(const_int 2) (const_int 3) >(const_int 4) (const_int 5) >(const_int 6) (const_int 7)]] > > The instruction accesses the memory in the correct mode, so the memory > operand is: > > (define_insn "*sse4_1_v8qiv8hi2_1" > [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") > (any_extend:V8HI > (match_operand:V8QI 1 "memory_operand" "m,m,m")))] > > and a pre-reload split has to be introduced to convert insn from > register form to memory form, when memory gets propagated to the insn: > > (define_insn_and_split "*sse4_1_v8qiv8hi2_2" > [(set (match_operand:V8HI 0 "register_operand") > (any_extend:V8HI > (vec_select:V8QI > (subreg:V16QI > (vec_concat:V2DI > (match_operand:DI 1 "memory_operand") > (const_int 0)) 0) > (parallel [(const_int 0) (const_int 1) >(const_int 2) (const_int 3) >(const_int 4) (const_int 5) >(const_int 6) (const_int 7)]] > > For a middle end to use this insn, an expander is used: > > (define_expand "v8qiv8hi2" > [(set (match_operand:V8HI 0 "register_operand") > (any_extend:V8HI > (match_operand:V8QI 1 "nonimmediate_operand")))] > > b) Similar approach is used when an output is narrower than 128 bits: > > (define_insn "*floatv2div2sf2" > [(set (match_operand:V4SF 0 "register_operand" "=v") > (vec_concat:V4SF > (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm")) > (match_operand:V2SF 2 "const0_operand" "C")))] > > In your concrete case, > > (define_insn "fix_truncv2sfv2di2" > [(set (match_operand:V2DI 0 "register_operand" "=v") > (any_fix:V2DI > (vec_select:V2SF > (match_operand:V4SF 1 "nonimmediate_operand" "vm") > (parallel [(const_int 0) (const_int 1)]] > > is already _NOT_ defined in a correct way as far as 
memory operand is > concerned, see a) above. But, , we will apparently have to live > with that. The problem is, that it is named as a standard named > pattern, so middle-end discovers it and tries to use it. It should be > renamed with avx512dq_... prefix. Let's give middle-end something > correct, similar to: > > (define_expand "v8qiv8hi2" > [(set (match_operand:V8HI 0 "register_operand") > (any_extend:V8HI > (match_operand:V8QI 1 "nonimmediate_operand")))] > "TARGET_SSE4_1" > { > if (!MEM_P (operands[1])) > { > operands[1] = simplify_gen_subreg (V16QImode, operands[1], V8QImode, 0); > emit_insn (gen_sse4_1_v8qiv8hi2 (operands[0], operands[1])); > DONE; > } > }) > > The second case is with v2sf output, less than 128 bits wide: > > (define_insn "*floatv2div2sf2" > [(set (match_operand:V4SF 0 "register_operand" "=v") > (vec_concat:V4SF > (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm")) > (match_operand:V2SF 2 "const0_operand" "C")))] > > The above insn pattern is OK, we access the output register with > 128bit access, so we are sure no MMX reg will be ge
Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]
On Mon, May 25, 2020 at 7:53 AM Hongtao Liu wrote:

> > We have to introduce a new expander, that will have conforming mode of
> > output operand (V2SF) and will produce RTX that will match
> > *floatv2div2sf2. A paradoxical output subreg from
> > V2SFmode to V4SFmode is needed, generated by simplify_gen_subreg as is
> > the case with paradoxical input subreg.
>
> Problem is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> will return NULL since
>
> 948 /* Subregs involving floating point modes are not allowed to
> 949 change size. Therefore (subreg:DI (reg:DF) 0) is fine, but
> 950 (subreg:SI (reg:DF) 0) isn't. */

But we are not changing size; we are still operating with SFmode. It looks to me that this limitation is too strict; the intention is to not expand scalar SFmode to DFmode.

Let's ask experts.

Uros.
Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]
On Mon, May 25, 2020 at 1:55 AM Uros Bizjak wrote: > > On Sun, May 24, 2020 at 9:26 AM Hongtao Liu wrote: > > > > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak wrote: > > > > > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu wrote: > > > > > > > > Hi: > > > > This patch fix non-conforming expander for > > > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di, > > > > refer to PR95211, PR95256. > > > > bootstrap ok, regression test on i386/x86-64 backend is ok. > > > > > > > > gcc/ChangeLog: > > > > PR target/95211 PR target/95256 > > > > > Changed. > > > Please put every PR reference in a separate line. > > > > > > > * config/i386/sse.md v2div2sf2): New expander. > > > > (fix_truncv2sfv2di2): Ditto. > > > > (floatv2div2sf2_internal): Renaming from > > > > floatv2div2sf2. > > > > (fix_truncv2sfv2di2_internal): > > > > > > The convention throughout sse,md is to prefix a standard pattern that > > > is used through builtins with avx512_ instead of suffixing > > > the pattern name with _internal. > > > > > Changed. > > > > Renaming from fix_truncv2sfv2di2. > > > > (vec_pack_float_): Adjust icode name. > > > > (vec_unpack_fix_trunc_lo_): Ditto. > > > > * config/i386/i386-builtin.def: Ditto. > > > > > > Uros. > > > > Update patch. > > The patch is wrong, and the correct way to fix these patterns is more complex: > > a) the pattern should not access register in mode, narrower than 128 > bits, as this implies MMX register in non-TARGET-MMX-WITH-SSE targets. It seems there are some patterns in sse.md not obey this rule. 
i.e: (define_insn "sse_storehps" [(set (match_operand:V2SF 0 "nonimmediate_operand" "=m,v,v") (vec_select:V2SF (match_operand:V4SF 1 "nonimmediate_operand" "v,v,o") (parallel [(const_int 2) (const_int 3)])))] "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ %vmovhps\t{%1, %0|%q0, %1} %vmovhlps\t{%1, %d0|%d0, %1} %vmovlps\t{%H1, %d0|%d0, %H1}" [(set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "V2SF,V4SF,V2SF")]) (define_insn "sse_storelps" [(set (match_operand:V2SF 0 "nonimmediate_operand" "=m,v,v") (vec_select:V2SF (match_operand:V4SF 1 "nonimmediate_operand" " v,v,m") (parallel [(const_int 0) (const_int 1)])))] "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ %vmovlps\t{%1, %0|%q0, %1} %vmovaps\t{%1, %0|%0, %1} %vmovlps\t{%1, %d0|%d0, %q1}" [(set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "V2SF,V4SF,V2SF")]) Should they be restricted under TARGET_MMX_WITH_SSE or is there anything i missed? > So, the correct way to define insn with narrow mode is to use > vec_select, something like: > > (define_insn "sse4_1_v8qiv8hi2" > [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") > (any_extend:V8HI > (vec_select:V8QI > (match_operand:V16QI 1 "register_operand" "Yr,*x,v") > (parallel [(const_int 0) (const_int 1) >(const_int 2) (const_int 3) >(const_int 4) (const_int 5) >(const_int 6) (const_int 7)]] > > The instruction accesses the memory in the correct mode, so the memory > operand is: > > (define_insn "*sse4_1_v8qiv8hi2_1" > [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v") > (any_extend:V8HI > (match_operand:V8QI 1 "memory_operand" "m,m,m")))] > > and a pre-reload split has to be introduced to convert insn from > register form to memory form, when memory gets propagated to the insn: > > (define_insn_and_split "*sse4_1_v8qiv8hi2_2" > [(set (match_operand:V8HI 0 "register_operand") > (any_extend:V8HI > (vec_select:V8QI > (subreg:V16QI > (vec_concat:V2DI > (match_operand:DI 
1 "memory_operand") > (const_int 0)) 0) > (parallel [(const_int 0) (const_int 1) >(const_int 2) (const_int 3) >(const_int 4) (const_int 5) >(const_int 6) (const_int 7)]] > > For a middle end to use this insn, an expander is used: > > (define_expand "v8qiv8hi2" > [(set (match_operand:V8HI 0 "register_operand") > (any_extend:V8HI > (match_operand:V8QI 1 "nonimmediate_operand")))] > > b) Similar approach is used when an output is narrower than 128 bits: > > (define_insn "*floatv2div2sf2" > [(set (match_operand:V4SF 0 "register_operand" "=v") > (vec_concat:V4SF > (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm")) > (match_operand:V2SF 2 "const0_operand" "C")))] > > In your concrete case, > > (define_insn "fix_truncv2sfv2di2" > [(set (match_operand:V2DI 0 "register_operand" "=v") > (any_fix:V2DI > (vec_select:V2SF > (match_operand:V4SF 1 "nonimmediate_operand" "vm") > (parallel [(const_int 0) (const_int 1)]] > > is already _NOT_ defined in a correct way a
Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]
On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak wrote:
>On Mon, May 25, 2020 at 7:53 AM Hongtao Liu wrote:
>
>> > We have to introduce a new expander, that will have conforming mode of
>> > output operand (V2SF) and will produce RTX that will match
>> > *floatv2div2sf2. A paradoxical output subreg from
>> > V2SFmode to V4SFmode is needed, generated by simplify_gen_subreg as is
>> > the case with paradoxical input subreg.
>>
>> Problem is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
>> will return NULL since
>>
>> 948 /* Subregs involving floating point modes are not allowed to
>> 949 change size. Therefore (subreg:DI (reg:DF) 0) is fine, but
>> 950 (subreg:SI (reg:DF) 0) isn't. */
>
>But, we are not changing size, we are still operating with SFmode. It
>looks to me that this limitation is too strict, the intention is to
>not expand scalar SFmode to DFmode.

I guess so. The test probably wants to test the component mode.

>Let's ask experts.
>
>Uros.
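The two readings of the quoted comment can be compared with a toy model. This is a sketch in Python under loud assumptions, not GCC's subreg validation: the Mode tuple, total_bits, strict_ok and component_ok are all invented here, and real machine modes carry far more information. strict_ok follows the comment literally (a float mode may not change total size), while component_ok follows the suggestion to test the component mode instead.

```python
from collections import namedtuple

# Toy mode description (hypothetical): element kind, bits per element,
# number of elements.
Mode = namedtuple("Mode", "name is_float unit_bits nunits")

SF = Mode("SF", True, 32, 1)
DF = Mode("DF", True, 64, 1)
SI = Mode("SI", False, 32, 1)
DI = Mode("DI", False, 64, 1)
V2SF = Mode("V2SF", True, 32, 2)
V4SF = Mode("V4SF", True, 32, 4)


def total_bits(mode):
    return mode.unit_bits * mode.nunits


def strict_ok(outer, inner):
    """Literal reading: a subreg involving a float mode keeps its total size."""
    if outer.is_float or inner.is_float:
        return total_bits(outer) == total_bits(inner)
    return True


def component_ok(outer, inner):
    """Suggested relaxation: compare the element size, not the total size."""
    if outer.is_float or inner.is_float:
        return outer.unit_bits == inner.unit_bits
    return True


# Rows are (outer, inner, strict result, relaxed result).
cases = [(o.name, i.name, strict_ok(o, i), component_ok(o, i))
         for o, i in [(DI, DF), (SI, DF), (V4SF, V2SF)]]
```

In this model both predicates agree on the two scalar examples from the comment, and they differ only on the V4SF-over-V2SF case that the expander needs. Whether that relaxation is safe in the compiler itself is exactly the question being put to the experts.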