date:20200524

[PATCH] Add missing expander for vector float_extend and float_truncate [PR target/95125]

2020-05-24 Thread Hongtao Liu via Gcc-patches

  Bootstrap is ok, regression test on i386/x86-64 backend is ok.

gcc/ChangeLog
PR target/95125
* config/i386/sse.md (sf2dfmode_lower): New mode attribute.
(trunc2): New expander.
(extend2): Ditto.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr95125-avx.c: New test.
* gcc.target/i386/pr95125-avx512f.c: Ditto.

--
BR,
Hongtao


0001-Add-missing-expander-for-vector-float_extend-and-flo.patch
Description: Binary data

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches

On Sat, May 23, 2020 at 6:11 PM Uros Bizjak  wrote:
>
> On Sat, May 23, 2020 at 9:25 AM Hongtao Liu  wrote:
> >
> > Hi:
> >   This patch fixes the non-conforming expanders for
> > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > refer to PR95211, PR95256.
> >   bootstrap ok, regression test on i386/x86-64 backend is ok.
> >
> > gcc/ChangeLog:
> > PR target/95211 PR target/95256
>
Changed.
> Please put every PR reference in a separate line.
>
> > * config/i386/sse.md v2div2sf2): New expander.
> > (fix_truncv2sfv2di2): Ditto.
> > (floatv2div2sf2_internal): Renaming from
> > floatv2div2sf2.
> > (fix_truncv2sfv2di2_internal):
>
> The convention throughout sse.md is to prefix a standard pattern that
> is used through builtins with avx512_ instead of suffixing
> the pattern name with _internal.
>
Changed.
> > Renaming from fix_truncv2sfv2di2.
> > (vec_pack_float_): Adjust icode name.
> > (vec_unpack_fix_trunc_lo_): Ditto.
> > * config/i386/i386-builtin.def: Ditto.
>
> Uros.

Update patch.

gcc/ChangeLog:
PR target/95211
PR target/95256
* config/i386/sse.md v2div2sf2): New expander.
(fix_truncv2sfv2di2): Ditto.
(avx512dq_floatv2div2sf2): Renaming from
floatv2div2sf2.
(avx512dq_fix_truncv2sfv2di2):
Renaming from fix_truncv2sfv2di2.
(vec_pack_float_): Adjust icode name.
(vec_unpack_fix_trunc_lo_): Ditto.
(vec_unpack_fix_trunc_hi_): Ditto.
* config/i386/i386-builtin.def: Ditto.

gcc/testsuite/ChangeLog
* gcc.target/i386/pr95211.c: New test.

-- 
BR,
Hongtao


0001-Fix-non-comforming-expander-for_V2.patch
Description: Binary data

Re: [PATCH] x86: Handle -mavx512vpopcntdq for -march=native

2020-05-24 Thread Uros Bizjak via Gcc-patches

On Sat, May 23, 2020 at 5:07 PM H.J. Lu  wrote:
>
> On Fri, May 22, 2020 at 12:42 AM Uros Bizjak  wrote:
> >
> > On Thu, May 21, 2020 at 2:54 PM H.J. Lu  wrote:
> > >
> > > Add -mavx512vpopcntdq for -march=native if AVX512VPOPCNTDQ is available.
> > >
> > > PR target/95258
> > > * config/i386/driver-i386.c (host_detect_local_cpu): Detect
> > > AVX512VPOPCNTDQ.
> >
> > OK.
> >
>
> OK for backports?

OK.

Thanks,
Uros.

> Thanks.
>
> >
> > > ---
> > >  gcc/config/i386/driver-i386.c | 9 ++---
> > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
> > > index 7612ddfb846..3a816400729 100644
> > > --- a/gcc/config/i386/driver-i386.c
> > > +++ b/gcc/config/i386/driver-i386.c
> > > @@ -420,6 +420,7 @@ const char *host_detect_local_cpu (int argc, const 
> > > char **argv)
> > >unsigned int has_avx5124fmaps = 0, has_avx5124vnniw = 0;
> > >unsigned int has_gfni = 0, has_avx512vbmi2 = 0;
> > >unsigned int has_avx512bitalg = 0;
> > > +  unsigned int has_avx512vpopcntdq = 0;
> > >unsigned int has_shstk = 0;
> > >unsigned int has_avx512vnni = 0, has_vaes = 0;
> > >unsigned int has_vpclmulqdq = 0;
> > > @@ -528,6 +529,7 @@ const char *host_detect_local_cpu (int argc, const 
> > > char **argv)
> > >has_vaes = ecx & bit_VAES;
> > >has_vpclmulqdq = ecx & bit_VPCLMULQDQ;
> > >has_avx512bitalg = ecx & bit_AVX512BITALG;
> > > +  has_avx512vpopcntdq = ecx & bit_AVX512VPOPCNTDQ;
> > >has_movdiri = ecx & bit_MOVDIRI;
> > >has_movdir64b = ecx & bit_MOVDIR64B;
> > >has_enqcmd = ecx & bit_ENQCMD;
> > > @@ -1189,6 +1191,7 @@ const char *host_detect_local_cpu (int argc, const 
> > > char **argv)
> > >const char *avx512vp2intersect = has_avx512vp2intersect ? " 
> > > -mavx512vp2intersect" : " -mno-avx512vp2intersect";
> > >const char *tsxldtrk = has_tsxldtrk ? " -mtsxldtrk " : " 
> > > -mno-tsxldtrk";
> > >const char *avx512bitalg = has_avx512bitalg ? " -mavx512bitalg" : 
> > > " -mno-avx512bitalg";
> > > +  const char *avx512vpopcntdq = has_avx512vpopcntdq ? " 
> > > -mavx512vpopcntdq" : " -mno-avx512vpopcntdq";
> > >const char *movdiri = has_movdiri ? " -mmovdiri" : " -mno-movdiri";
> > >const char *movdir64b = has_movdir64b ? " -mmovdir64b" : " 
> > > -mno-movdir64b";
> > >const char *enqcmd = has_enqcmd ? " -menqcmd" : " -mno-enqcmd";
> > > @@ -1210,9 +1213,9 @@ const char *host_detect_local_cpu (int argc, const 
> > > char **argv)
> > > avx512ifma, avx512vbmi, avx5124fmaps, 
> > > avx5124vnniw,
> > > clwb, mwaitx, clzero, pku, rdpid, gfni, shstk,
> > > avx512vbmi2, avx512vnni, vaes, vpclmulqdq,
> > > -   avx512bitalg, movdiri, movdir64b, waitpkg, 
> > > cldemote,
> > > -   ptwrite, avx512bf16, enqcmd, avx512vp2intersect,
> > > -   serialize, tsxldtrk, NULL);
> > > +   avx512bitalg, avx512vpopcntdq, movdiri, movdir64b,
> > > +   waitpkg, cldemote, ptwrite, avx512bf16, enqcmd,
> > > +   avx512vp2intersect, serialize, tsxldtrk, NULL);
> > >  }
> > >
> > >  done:
> > > --
> > > 2.26.2
> > >
>
>
>
> --
> H.J.
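
A minimal sketch of the detection step in the patch above, written so that it builds on any host. It assumes that AVX512VPOPCNTDQ is reported in bit 14 of ECX for CPUID leaf 7, subleaf 0 (the value GCC's cpuid.h gives bit_AVX512VPOPCNTDQ); the helper name vpopcntdq_option and the constant name are hypothetical and not part of driver-i386.c.

```cpp
#include <string>

// Assumption: same value as GCC's bit_AVX512VPOPCNTDQ, redefined here so
// the sketch does not depend on <cpuid.h> or on an x86 host.
const unsigned kBitAvx512Vpopcntdq = 1u << 14;

// Hypothetical helper: maps the raw ECX value from CPUID leaf 7, subleaf 0
// to the option string that -march=native would append.
std::string vpopcntdq_option(unsigned ecx) {
  const bool has_avx512vpopcntdq = (ecx & kBitAvx512Vpopcntdq) != 0;
  return has_avx512vpopcntdq ? " -mavx512vpopcntdq" : " -mno-avx512vpopcntdq";
}
```

In the real driver the ECX value comes from a CPUID query; here it is passed in so the mapping can be checked in isolation.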

[pushed] Darwin: Make sanitizer local vars linker-visible.

2020-05-24 Thread Iain Sandoe

Hi

Another case where we need linker-visible symbols in order to
preserve the ld64 atom model.  If these symbols are emitted as
'local' the linker cannot see that they are separate from any
global weak entry that precedes them.  This will cause the linker
to complain that there is (apparently) direct access to such a
weak global, preventing it from being replaced.

This is a short-term fix for the problem - we need generic
handling for relevant cases (that also does not pessimise objects
by emitting unnecessary symbols and relocations).

tested on x86_64-darwin16,
applied to master and 10.2, so far.
thanks
Iain

gcc/ChangeLog:

2020-05-23  Iain Sandoe  

* config/darwin.h (ASM_GENERATE_INTERNAL_LABEL):
Make ubsan_{data,type},ASAN symbols linker-visible.
---
 gcc/ChangeLog   | 5 +
 gcc/config/darwin.h | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7a7b599ff93..ede1f15eb7a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2020-05-23 Iain Sandoe 
+
+   * config/darwin.h (ASM_GENERATE_INTERNAL_LABEL): Make
+   ubsan_{data,type},ASAN symbols linker-visible.
+
 2020-05-22  Jan Hubicka  
 
* lto-streamer-out.c (DFS::DFS): Silence warning.
diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 27665b34a18..f528b1766bf 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -808,6 +808,12 @@ extern GTY(()) section * darwin_sections[NUM_DARWIN_SECTIONS];
   do { \
 if (strcmp ("LC", PREFIX) == 0)\
   sprintf (LABEL, "*%s%ld", "lC", (long)(NUM));\
+else if (strcmp ("Lubsan_data", PREFIX) == 0)  \
+  sprintf (LABEL, "*%s%ld", "lubsan_data", (long)(NUM));\
+else if (strcmp ("Lubsan_type", PREFIX) == 0)  \
+  sprintf (LABEL, "*%s%ld", "lubsan_type", (long)(NUM));\
+else if (strcmp ("LASAN", PREFIX) == 0)\
+  sprintf (LABEL, "*%s%ld", "lASAN", (long)(NUM));\
 else   \
   sprintf (LABEL, "*%s%ld", PREFIX, (long)(NUM));  \
   } while (0)
-- 
2.24.1
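
The prefix mapping in the macro above can be sketched as an ordinary function. This is only an illustration of the renaming rule shown in the diff ("L" becomes "l" for the listed prefixes, so the label is no longer assembler-local); the function name internal_label is hypothetical, and std::string stands in for the sprintf into LABEL.

```cpp
#include <string>

// Hypothetical stand-in for ASM_GENERATE_INTERNAL_LABEL: builds the label
// text for a given prefix and number.
std::string internal_label(const std::string& prefix, long num) {
  std::string out_prefix = prefix;
  // Prefixes listed in the patch get a lower-case leading 'l', which makes
  // the symbol linker-visible; every other prefix is passed through.
  if (prefix == "LC" || prefix == "Lubsan_data" ||
      prefix == "Lubsan_type" || prefix == "LASAN")
    out_prefix[0] = 'l';
  return "*" + out_prefix + std::to_string(num);
}
```

A prefix that is not in the list, such as a hypothetical "LFB", keeps its upper-case "L" and so stays local to the assembler.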

[PATCH v1 1/2][PPC64] [PR88877]

2020-05-24 Thread Kamlesh Kumar via Gcc-patches

Here is a discussion we had some time ago regarding the defect.
https://gcc.gnu.org/pipermail/gcc/2019-January/227834.html
Please see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88877 for the testcase
behavior.

We are incorporating Jakub's suggestion below in this patch series.

Jakub wrote:
""
Yeah, all the callers of emit_library_call* would need to be changed to pass
triplets rtx, machine_mode, int/bool /*unsignedp*/, instead of just
rtx_mode_t pair.
""


This patch series tries to address this by creating a struct Tuple
that bundles the existing rtx and machine_mode and adds one more
bool member, which stores unsigned_p and is false by default.
This patch does not change the underlying behavior yet. This will be done in
follow-up patches.

ChangeLog Entry:

2020-05-24 Kamlesh Kumar 

* rtl.h (Tuple): Defined and typedefed to rtx_mode_t.
(emit_library_call): Added default arg unsigned_p.
(emit_library_call_value): Added default arg unsigned_p.
---
 gcc/rtl.h | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/gcc/rtl.h b/gcc/rtl.h
index b0b1aac..ee42de7 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -2238,10 +2238,20 @@ struct address_info {
   enum rtx_code base_outer_code;
 };
 
-/* This is used to bundle an rtx and a mode together so that the pair
-   can be used with the wi:: routines.  If we ever put modes into rtx
-   integer constants, this should go away and then just pass an rtx in.  */
-typedef std::pair <rtx, machine_mode> rtx_mode_t;
+/* This is used to bundle an rtx and a mode and unsignedness together so
+   that the tuple can be used with the wi:: routines.  If we ever put modes
+   into rtx integer constants, this should go away and then just pass an rtx in.  */
+typedef struct Tuple {
+  rtx first;
+  machine_mode second;
+  /* unsigned_p  */
+  bool third;
+  Tuple (rtx f, machine_mode s, bool t = false) {
+first = f;
+second = s;
+third = t;
+  }
+} rtx_mode_t;
 
 namespace wi
 {
@@ -4176,9 +4186,9 @@ emit_library_call (rtx fun, libcall_type fn_type, machine_mode outmode)
 
 inline void
 emit_library_call (rtx fun, libcall_type fn_type, machine_mode outmode,
-  rtx arg1, machine_mode arg1_mode)
+  rtx arg1, machine_mode arg1_mode, bool unsigned_p = false)
 {
-  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode) };
+  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode, unsigned_p) };
   emit_library_call_value_1 (0, fun, NULL_RTX, fn_type, outmode, 1, args);
 }
 
@@ -4238,9 +4248,9 @@ emit_library_call_value (rtx fun, rtx value, libcall_type fn_type,
 inline rtx
 emit_library_call_value (rtx fun, rtx value, libcall_type fn_type,
 machine_mode outmode,
-rtx arg1, machine_mode arg1_mode)
+rtx arg1, machine_mode arg1_mode, bool unsigned_p = false)
 {
-  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode) };
+  rtx_mode_t args[] = { rtx_mode_t (arg1, arg1_mode, unsigned_p) };
   return emit_library_call_value_1 (1, fun, value, fn_type, outmode, 1, args);
 }
 
-- 
2.7.4

Re: [PATCH] Extend std::copy/std::copy_n char* overload to deque iterator

2020-05-24 Thread François Dumont via Gcc-patches


Now tested in C++98 mode, there was indeed a small problem.

I even wonder if I shouldn't have extended the std::copy overload to any 
call with a deque iterator as the output, so that it is transformed into an 
output to a pointer.


Ok to commit ?

François

On 23/05/20 6:37 pm, Jonathan Wakely wrote:

On 22/05/20 22:57 +0200, François Dumont via Libstdc++ wrote:

On 21/05/20 2:17 pm, Jonathan Wakely wrote:


Why is the optimization not done for C++03 mode?

I did it this way because the new std::copy overload relies on 
std::copy_n implementation details, and std::copy_n is a C++11 algorithm.




It looks like the uses of 'auto' can be replaced easily, and
__enable_if_t<> can be replaced with __gnu_cxx::__enable_if<>::__type.

But yes, we can indeed provide those implementation details in 
pre-C++11.


This is what I've done in this new version.

Tested under Linux x86_64 in default c++ mode.

I tried to use CXXFLAGS=-std=c++03 but it doesn't seem to work even 
if I do see the option in the build logs. I remember you advised a 
different approach, can you tell me again ?


See the documentation:
https://gcc.gnu.org/onlinedocs/libstdc++/manual/test.html#test.run.permutations 






diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc
index e773f32b256..d7dbe64f3e1 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -1065,6 +1065,57 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
   return __result;
 }
 
+  template
+typename __gnu_cxx::__enable_if<
+  __is_char<_CharT>::__value,
+  _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> >::__type
+__copy_move_a2(
+	istreambuf_iterator<_CharT, char_traits<_CharT> > __first,
+	istreambuf_iterator<_CharT, char_traits<_CharT> > __last,
+	_GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> __result)
+{
+  if (__first == __last)
+	return __result;
+
+  for (;;)
+	{
+	  const std::ptrdiff_t __len = __result._M_last - __result._M_cur;
+	  const std::ptrdiff_t __nb
+	= std::__copy_n_a(__first, __len, __result._M_cur, false)
+	- __result._M_cur;
+	  __result += __nb;
+
+	  if (__nb != __len)
+	break;
+	}
+
+  return __result;
+}
+
+  template
+typename __gnu_cxx::__enable_if<
+  __is_char<_CharT>::__value,
+  _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> >::__type
+__copy_n_a(
+  istreambuf_iterator<_CharT, char_traits<_CharT> > __it, _Size __size,
+  _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> __result,
+  bool __strict)
+{
+  if (__size == 0)
+	return __result;
+
+  do
+	{
+	  const _Size __len
+	= std::min<_Size>(__result._M_last - __result._M_cur, __size);
+	  std::__copy_n_a(__it, __len, __result._M_cur, __strict);
+	  __result += __len;
+	  __size -= __len;
+	}
+  while (__size != 0);
+  return __result;
+}
+
   template
 _OI
diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 932ece55529..70d8232aece 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -705,31 +705,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return __result;
 }
 
-  template
-_GLIBCXX20_CONSTEXPR
-_OutputIterator
-__copy_n_a(_InputIterator __first, _Size __n, _OutputIterator __result)
-{
-  if (__n > 0)
-	{
-	  while (true)
-	{
-	  *__result = *__first;
-	  ++__result;
-	  if (--__n > 0)
-		++__first;
-	  else
-		break;
-	}
-	}
-  return __result;
-}
- 
-  template
-__enable_if_t<__is_char<_CharT>::__value, _CharT*>
-__copy_n_a(istreambuf_iterator<_CharT, char_traits<_CharT>>,
-	   _Size, _CharT*);
-
   template
 _GLIBCXX20_CONSTEXPR
 _OutputIterator
@@ -738,7 +713,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   return std::__niter_wrap(__result,
 			   __copy_n_a(__first, __n,
-	  std::__niter_base(__result)));
+	  std::__niter_base(__result), true));
 }
 
   template::value)
 { return __it; }
 
+  template
+_Ite
+__niter_base(const ::__gnu_debug::_Safe_iterator<_Ite, _Seq,
+		 std::random_access_iterator_tag>&);
+
   // Reverse the __niter_base transformation to get a
   // __normal_iterator back again (this assumes that __normal_iterator
   // is only used to wrap random access iterators, like pointers).
@@ -466,6 +471,15 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
 __copy_move_a2(istreambuf_iterator<_CharT, char_traits<_CharT> >,
 		   istreambuf_iterator<_CharT, char_traits<_CharT> >, _CharT*);
 
+  template
+typename __gnu_cxx::__enable_if<
+  __is_char<_CharT>::__value,
+  _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*> >::__type
+__copy_move_a2(
+	istreambuf_iterator<_CharT, char_traits<_CharT> >,
+	istreambuf_iterator<_CharT, char_traits<_CharT> >,
+	_GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*>);
+
   template
 _GLIBCXX20_CONSTEXPR
 inline _OI
@@ -539,6 +553,41 @@ _
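
For orientation, here is a user-level call of the kind the new overloads target: std::copy_n reading from an istreambuf_iterator and writing through a deque iterator. This is a sketch using only the public API; the helper name read_prefix is hypothetical, and whether a given libstdc++ build dispatches to the optimized path is not something the example can show.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: copies at most n characters of text into a deque
// via an istreambuf_iterator source and a deque iterator destination.
std::deque<char> read_prefix(const std::string& text, std::size_t n) {
  // Clamp so that copy_n never reads past the end of the stream buffer.
  n = std::min(n, text.size());
  std::istringstream in(text);
  std::deque<char> out(n);
  std::copy_n(std::istreambuf_iterator<char>(in), n, out.begin());
  return out;
}
```

The result is the same with or without the optimization; the patch only changes how the characters are transferred, in chunks per deque node rather than one at a time.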

Re: [PATCH v1 1/2][PPC64] [PR88877]

2020-05-24 Thread Segher Boessenkool

Hi!

On Sun, May 24, 2020 at 07:03:13PM +0530, Kamlesh Kumar wrote:
> This patch series tries to address this by creating a struct Tuple
> that bundles the existing rtx and machine_mode and adds one more
> bool member, which stores unsigned_p and is false by default.

The idea is good.  However, you cannot call something as specific as this
"tuple", in a header file that is used everywhere even.  (We also do not
have a "leading caps on types" convention).

> This patch does not change the underlying behavior yet. This will be done in
> follow-up patches.

Thanks :-)

> * rtl.h (Tuple): Defined and typedefed to rtx_mode_t.

It's the other way around: rtx_mode_t is typedeffed to struct Tuple, so
rtx_mode_t should be listed to the left of a : as well.

OTOH, you don't need to name Tuple at all...  It should not *have* a
constructor, since you declared it as class...  But you can just use
std::tuple here?

> (emit_library_call): Added default arg unsigned_p.
> (emit_library_call_value): Added default arg unsigned_p.

Yeah, eww.  Default arguments have all the problems you had before,
except now it is hidden and much more surprising.

Those functions really should take rtx_mode_t arguments?

Thanks again for working on this,

Segher
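
One possible reading of the std::tuple suggestion above, as a sketch only. The types rtx and machine_mode are replaced by stand-ins so the example compiles outside GCC, and the names rtx_mode_sign_t and make_libcall_arg are hypothetical. Unsignedness is passed explicitly rather than through a default argument, in line with the objection to defaults.

```cpp
#include <tuple>

// Stand-ins for GCC's real types, for illustration only.
struct rtx_def;
typedef rtx_def* rtx;
enum machine_mode { VOIDmode, SImode, DImode };

// Hypothetical name: an (rtx, mode, unsigned_p) triple built on std::tuple
// instead of a hand-written struct.
typedef std::tuple<rtx, machine_mode, bool> rtx_mode_sign_t;

// Hypothetical helper: callers must state unsigned_p; there is no default.
inline rtx_mode_sign_t make_libcall_arg(rtx x, machine_mode mode,
                                        bool unsigned_p) {
  return rtx_mode_sign_t(x, mode, unsigned_p);
}
```

The members are then read with std::get<0>, std::get<1> and std::get<2>, which is less self-describing than named fields; that trade-off is part of what the review is weighing.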

Re: [PATCH] Add support for C++20 barriers

2020-05-24 Thread Thomas Rodgers

This time with 100% more patch…



0001-Add-support-for-C-20-barriers_f.patch
Description: Binary data


> On May 23, 2020, at 3:58 PM, Thomas Rodgers  wrote:
> 
> This patch requires the patch for atomic::wait/notify to be applied first.
> 
> This implementation is based on the libc++ implementation, but excludes the 
> alternative “central barrier” implementation for now as there is no standard 
> way to switch between the two.
> 
>* include/Makefile.am (std_headers): Add new header.
>* include/Makefile.in: Regenerate.
>* include/std/barrier: New file.
>* testsuite/30_thread/barrier/1.cc: New test.
>* testsuite/30_thread/barrier/2.cc: Likewise.
>* testsuite/30_thread/barrier/arrive_and_drop.cc: Likewise.
>* testsuite/30_thread/barrier/arrive_and_wait.cc: Likewise.
>* testsuite/30_thread/barrier/arrive.cc: Likewise.
>* testsuite/30_thread/barrier/completion.cc: Likewise.
>* testsuite/30_thread/barrier/max.cc: Likewise.
>

Re: [PATCH] Add support for C++20 barriers

2020-05-24 Thread Florian Weimer

* Thomas Rodgers:

> +  static __gthread_t
> +  _S_get_tid() noexcept
> +  {
> +#ifdef __GLIBC__
> + // For the GNU C library pthread_self() is usable without linking to
> + // libpthread.so but returns 0, so we cannot use it in single-threaded
> + // programs, because this_thread::get_id() != thread::id{} must be true.
> + // We know that pthread_t is an integral type in the GNU C library.
> + if (!__gthread_active_p())
> +   return 1;
> +#endif
> + return __gthread_self();
> +  }

This comment seems outdated or incomplete.  pthread_self returns a
proper pointer since glibc 2.27, I believe.

I'm also not sure how the difference is observable for the libstdc++
implementation.  Late loading of libpthread isn't quite supported.

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Uros Bizjak via Gcc-patches

On Sun, May 24, 2020 at 9:26 AM Hongtao Liu  wrote:
>
> On Sat, May 23, 2020 at 6:11 PM Uros Bizjak  wrote:
> >
> > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu  wrote:
> > >
> > > Hi:
> > >   This patch fixes the non-conforming expanders for
> > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > > refer to PR95211, PR95256.
> > >   bootstrap ok, regression test on i386/x86-64 backend is ok.
> > >
> > > gcc/ChangeLog:
> > > PR target/95211 PR target/95256
> >
> Changed.
> > Please put every PR reference in a separate line.
> >
> > > * config/i386/sse.md v2div2sf2): New expander.
> > > (fix_truncv2sfv2di2): Ditto.
> > > (floatv2div2sf2_internal): Renaming from
> > > floatv2div2sf2.
> > > (fix_truncv2sfv2di2_internal):
> >
> > The convention throughout sse.md is to prefix a standard pattern that
> > is used through builtins with avx512_ instead of suffixing
> > the pattern name with _internal.
> >
> Changed.
> > > Renaming from fix_truncv2sfv2di2.
> > > (vec_pack_float_): Adjust icode name.
> > > (vec_unpack_fix_trunc_lo_): Ditto.
> > > * config/i386/i386-builtin.def: Ditto.
> >
> > Uros.
>
> Update patch.

The patch is wrong, and the correct way to fix these patterns is more complex:

a) the pattern should not access a register in a mode narrower than 128
bits, as this implies an MMX register on non-TARGET-MMX-WITH-SSE targets.
So, the correct way to define insn with narrow mode is to use
vec_select, something like:

(define_insn "sse4_1_v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
  (vec_select:V8QI
(match_operand:V16QI 1 "register_operand" "Yr,*x,v")
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]]

The instruction accesses the memory in the correct mode, so the memory
operand is:

(define_insn "*sse4_1_v8qiv8hi2_1"
  [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
(any_extend:V8HI
  (match_operand:V8QI 1 "memory_operand" "m,m,m")))]

and a pre-reload split has to be introduced to convert insn from
register form to memory form, when memory gets propagated to the insn:

(define_insn_and_split "*sse4_1_v8qiv8hi2_2"
  [(set (match_operand:V8HI 0 "register_operand")
(any_extend:V8HI
  (vec_select:V8QI
(subreg:V16QI
  (vec_concat:V2DI
(match_operand:DI 1 "memory_operand")
(const_int 0)) 0)
(parallel [(const_int 0) (const_int 1)
   (const_int 2) (const_int 3)
   (const_int 4) (const_int 5)
   (const_int 6) (const_int 7)]]

For a middle end to use this insn, an expander is used:

(define_expand "v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand")
(any_extend:V8HI
  (match_operand:V8QI 1 "nonimmediate_operand")))]

b) A similar approach is used when an output is narrower than 128 bits:

(define_insn "*floatv2div2sf2"
  [(set (match_operand:V4SF 0 "register_operand" "=v")
(vec_concat:V4SF
(any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
(match_operand:V2SF 2 "const0_operand" "C")))]

In your concrete case,

(define_insn "fix_truncv2sfv2di2"
  [(set (match_operand:V2DI 0 "register_operand" "=v")
(any_fix:V2DI
  (vec_select:V2SF
(match_operand:V4SF 1 "nonimmediate_operand" "vm")
(parallel [(const_int 0) (const_int 1)]]

is already _NOT_ defined in a correct way as far as memory operand is
concerned, see a) above. But we will apparently have to live
with that. The problem is, that it is named as a standard named
pattern, so middle-end discovers it and tries to use it. It should be
renamed with avx512dq_... prefix. Let's give middle-end something
correct, similar to:

(define_expand "v8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand")
(any_extend:V8HI
  (match_operand:V8QI 1 "nonimmediate_operand")))]
  "TARGET_SSE4_1"
{
  if (!MEM_P (operands[1]))
{
  operands[1] = simplify_gen_subreg (V16QImode, operands[1], V8QImode, 0);
  emit_insn (gen_sse4_1_v8qiv8hi2 (operands[0], operands[1]));
  DONE;
}
})
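
As a plain C++ model of what the register form in a) computes: the whole 16-byte register is read, the vec_select picks bytes 0 to 7, and each is widened to 16 bits. This is a sketch of the semantics only, not of how GCC represents RTL; the function name extend_low_half is hypothetical, and only the sign-extending variant of any_extend is modelled.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Models sign_extend:V8HI (vec_select:V8QI (reg:V16QI) [0..7]).
// The input is always a full 128-bit value; the upper eight bytes are
// simply not selected.
std::array<std::int16_t, 8> extend_low_half(
    const std::array<std::int8_t, 16>& v) {
  std::array<std::int16_t, 8> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = v[i];  // implicit widening conversion preserves the sign
  return out;
}
```

The point of the 128-bit input type is the same as in the pattern: no operand is ever narrower than a full SSE register, so no 64-bit (MMX-sized) register access is implied.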

The second case is with v2sf output, less than 128 bits wide:

(define_insn "*floatv2div2sf2"
  [(set (match_operand:V4SF 0 "register_operand" "=v")
(vec_concat:V4SF
(any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
(match_operand:V2SF 2 "const0_operand" "C")))]

The above insn pattern is OK, we access the output register with
128bit access, so we are sure no MMX reg will be generated. The
problem is with the existing expander

(define_expand "floatv2div2sf2"
  [(set (match_operand:V4SF 0 "register_operand" "=v")
(vec_concat:V4SF
(any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
(match_dup 2)))]
  "TARGET_AVX512DQ && TARGET_AVX512VL"
  "o

Re: [PATCH] Add missing expander for vector float_extend and float_truncate [PR target/95125]

2020-05-24 Thread Uros Bizjak via Gcc-patches

On Sun, May 24, 2020 at 9:20 AM Hongtao Liu  wrote:
>
>   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
>
> gcc/ChangeLog
> PR target/95125
> * config/i386/sse.md (sf2dfmode_lower): New mode attribute.
> (trunc2): New expander.
> (extend2): Ditto.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/pr95125-avx.c: New test.
> * gcc.target/i386/pr95125-avx512f.c: Ditto.

OK.

Thanks,
Uros.

Re: [PATCH] Add support for C++20 barriers

2020-05-24 Thread Jonathan Wakely via Gcc-patches

On Sun, 24 May 2020 at 18:55, Florian Weimer wrote:
>
> * Thomas Rodgers:
>
> > +  static __gthread_t
> > +  _S_get_tid() noexcept
> > +  {
> > +#ifdef __GLIBC__
> > + // For the GNU C library pthread_self() is usable without linking to
> > + // libpthread.so but returns 0, so we cannot use it in single-threaded
> > + // programs, because this_thread::get_id() != thread::id{} must be 
> > true.
> > + // We know that pthread_t is an integral type in the GNU C library.
> > + if (!__gthread_active_p())
> > +   return 1;
> > +#endif
> > + return __gthread_self();
> > +  }
>
> This comment seems outdated or incomplete.  pthread_self returns a
> proper pointer since glibc 2.27, I believe.

The comment is copied from the  header, and dates from 2015.

> I'm also not sure how the difference is observable for the libstdc++
> implementation.  Late loading of libpthread isn't quite supported.

It's nothing to do with late loading. A single threaded program that
doesn't create any threads and doesn't link to libpthread can still
expect std::this_thread::get_id() != std::thread::id() to be true in
the main (and only) thread. If pthread_self() returns 0, and
thread::id() default constructs with a value of 0, then we can't
distinguish "the main thread" from "not a thread".

But I do see a non-zero value from glibc now, which is great. I'll add
it to my TODO list to remove that workaround from .
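
The requirement described above can be written down as a one-line check. This is a sketch using only the standard API; the function name main_thread_has_id is hypothetical.

```cpp
#include <thread>

// True when the calling thread has an id distinct from the
// default-constructed "not a thread" id, as the C++ standard requires for
// any thread of execution, including main in a single-threaded program.
bool main_thread_has_id() {
  return std::this_thread::get_id() != std::thread::id();
}
```

The check creates no threads, so it exercises exactly the single-threaded case in which a pthread_self() result of 0 would have been indistinguishable from thread::id().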

Re: [PATCH] Add support for C++20 barriers

2020-05-24 Thread Florian Weimer

* Jonathan Wakely:

> On Sun, 24 May 2020 at 18:55, Florian Weimer wrote:
>>
>> * Thomas Rodgers:
>>
>> > +  static __gthread_t
>> > +  _S_get_tid() noexcept
>> > +  {
>> > +#ifdef __GLIBC__
>> > + // For the GNU C library pthread_self() is usable without linking to
>> > + // libpthread.so but returns 0, so we cannot use it in 
>> > single-threaded
>> > + // programs, because this_thread::get_id() != thread::id{} must be 
>> > true.
>> > + // We know that pthread_t is an integral type in the GNU C library.
>> > + if (!__gthread_active_p())
>> > +   return 1;
>> > +#endif
>> > + return __gthread_self();
>> > +  }
>>
>> This comment seems outdated or incomplete.  pthread_self returns a
>> proper pointer since glibc 2.27, I believe.
>
> The comment is copied from the  header, and dates from 2015.
>
>> I'm also not sure how the difference is observable for the libstdc++
>> implementation.  Late loading of libpthread isn't quite supported.
>
> It's nothing to do with late loading. A single threaded program that
> doesn't create any threads and doesn't link to libpthread can still
> expect std::this_thread::get_id() != std::thread::id() to be true in
> the main (and only) thread. If pthread_self() returns 0, and
> thread::id() default constructs with a value of 0, then we can't
> distinguish "the main thread" from "not a thread".

Ahh.  Yes, the POSIX interface does not have any “not a thread” value
for pthread_t, so I can see how it's difficult to implement something
on top of it that meets the C++ requirements.

[patch, fortran] Fix memory leaks for finalized types

2020-05-24 Thread Thomas Koenig via Gcc-patches


Hello world,

this patch fixes an 8/9/10/11 regression, where finalized types
were not finalized (and deallocated), which led to memory
leaks.

Once the offending commit was identified (thanks, Harald!) error
analysis was rather straightforward.  The central idea was that
it is the expression that should not be finalized twice, not the
component (which is shared).

Less straightforward was writing a meaningful test case; why I could not
get

! { dg-final  { scan-tree-dump-times "__builtin_free.*dat" 2 "original"
} }

to work (dejagnu always complained about finding it once) I don't know.

Anyway, here is the patch.  I have regression-tested it and made sure
that the size part of PR 87352 did not go up again through the roof.
I have also tested all affected finalize test cases with valgrind and
made sure they are still valid and do not leak.

Once this is in, it will be interesting to see if any other finalizer
bugs are affected.

So, OK for trunk and for backporting to all affected branches?

Regards

Thomas

Finalization depends on the expression, not on the component.

gcc/fortran/ChangeLog:

2020-05-24  Thomas Koenig  

PR fortran/94361
* class.c (finalize_component): Use expr->finalized instead of
comp->finalized.
* gfortran.h (gfc_component): Remove finalized member.
(gfc_expr): Add it here instead.

gcc/testsuite/ChangeLog:

2020-05-24  Thomas Koenig  

PR fortran/94361
* gfortran.dg/finalize_28.f90: Adjusted free counts.
* gfortran.dg/finalize_33.f90: Likewise.
* gfortran.dg/finalize_34.f90: Likewise.
* gfortran.dg/finalize_35.f90: New test.


diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 9aa3eb7282c..b5a1edae27f 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -911,7 +911,7 @@ finalize_component (gfc_expr *expr, gfc_symbol *derived, gfc_component *comp,
   if (!comp_is_finalizable (comp))
 return;
 
-  if (comp->finalized)
+  if (expr->finalized)
 return;
 
   e = gfc_copy_expr (expr);
@@ -1002,6 +1002,7 @@ finalize_component (gfc_expr *expr, gfc_symbol *derived, gfc_component *comp,
 	}
   else
 	(*code) = cond;
+
 }
   else if (comp->ts.type == BT_DERIVED
 	&& comp->ts.u.derived->f2k_derived
@@ -1041,7 +1042,7 @@ finalize_component (gfc_expr *expr, gfc_symbol *derived, gfc_component *comp,
 			sub_ns);
   gfc_free_expr (e);
 }
-  comp->finalized = true;
+  expr->finalized = 1;
 }
 
 
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 7094791e871..5af44847f9b 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1107,7 +1107,6 @@ typedef struct gfc_component
   struct gfc_typebound_proc *tb;
   /* When allocatable/pointer and in a coarray the associated token.  */
   tree caf_token;
-  bool finalized;
 }
 gfc_component;
 
@@ -2218,6 +2217,9 @@ typedef struct gfc_expr
   /* Set this if the expression came from expanding an array constructor.  */
   unsigned int from_constructor : 1;
 
+  /* Set this if the expression has already been finalized.  */
+  unsigned int finalized : 1;
+
   /* If an expression comes from a Hollerith constant or compile-time
  evaluation of a transfer statement, it may have a prescribed target-
  memory representation, and these cannot always be backformed from
diff --git a/gcc/testsuite/gfortran.dg/finalize_28.f90 b/gcc/testsuite/gfortran.dg/finalize_28.f90
index 597413b2dd3..f0c9665252f 100644
--- a/gcc/testsuite/gfortran.dg/finalize_28.f90
+++ b/gcc/testsuite/gfortran.dg/finalize_28.f90
@@ -21,4 +21,4 @@ contains
 integer, intent(out) :: edges(:,:)
   end subroutine coo_dump_edges
 end module coo_graphs
-! { dg-final { scan-tree-dump-times "__builtin_free" 5 "original" } }
+! { dg-final { scan-tree-dump-times "__builtin_free" 6 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/finalize_33.f90 b/gcc/testsuite/gfortran.dg/finalize_33.f90
index 2205f9eed7f..3857e4485ee 100644
--- a/gcc/testsuite/gfortran.dg/finalize_33.f90
+++ b/gcc/testsuite/gfortran.dg/finalize_33.f90
@@ -116,4 +116,4 @@ contains
! (iii) mci_template
 end program main_ut
 ! { dg-final { scan-tree-dump-times "__builtin_malloc" 17 "original" } }
-! { dg-final { scan-tree-dump-times "__builtin_free" 19 "original" } }
+! { dg-final { scan-tree-dump-times "__builtin_free" 20 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/finalize_34.f90 b/gcc/testsuite/gfortran.dg/finalize_34.f90
index e2f02a5c51c..fef7dac6d89 100644
--- a/gcc/testsuite/gfortran.dg/finalize_34.f90
+++ b/gcc/testsuite/gfortran.dg/finalize_34.f90
@@ -22,4 +22,4 @@ program main
   use testmodule
   type(evtlist_type), dimension(10) :: a
 end program main
-! { dg-final  { scan-tree-dump-times "__builtin_free" 8 "original" } }
+! { dg-final  { scan-tree-dump-times "__builtin_free" 12 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/finalize_35.f90 b/gcc/testsuite/gfortran.dg/finalize_35.f90
new f
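
The central idea of the fix, that the "already finalized" flag belongs to the expression and not to the shared component, can be sketched outside the compiler as follows. The structs are simplified stand-ins for gfc_component and gfc_expr, and the function name finalize_once is hypothetical.

```cpp
// Simplified stand-ins, for illustration only: one component description
// is shared by every expression that refers to it.
struct Component {
  const char* name;
};

struct Expr {
  const Component* comp;
  bool finalized;  // per expression, as in the patched gfc_expr
};

// Returns true when finalization code would be emitted for this expression.
bool finalize_once(Expr& e) {
  if (e.finalized)
    return false;  // this expression was already handled
  e.finalized = true;
  return true;
}
```

With the flag on the component instead, the first expression to be finalized would suppress finalization of every other expression that shares the component, which matches the leak described above.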

Re: [Patch] PR fortran/95106 - truncation of long symbol names with EQUIVALENCE

2020-05-24 Thread Thomas Koenig via Gcc-patches


Hi Harald,


OK for master?


The patch is OK.

Regarding the test case - I think it should be OK.  If not,
expect to hear from people soon, you could then still restrict
it to Linux (or something else along those lines).

Regards

Thomas

PR libfortran/95195 - improve runtime error for namelist i/o to unformatted file

2020-05-24 Thread Harald Anlauf

Without the patch below, an attempted namelist write to an unformatted file -
which is prohibited by the standard - would generate the following runtime 
error:

At line 12 of file pr95195.f90 (unit = 10, file = 'test.dat')
Fortran runtime error: End of record

followed by some backtrace.  The patch attempts to generate an error pointing
the user to the real issue.

Regtested on x86_64-pc-linux-gnu.

OK for master?

Thanks,
Harald


PR libfortran/95195 - improve runtime error for namelist i/o to unformatted file

Namelist input/output to unformatted files is prohibited.
Generate useful runtime errors instead of misleading ones.

libgfortran/

2020-05-24  Harald Anlauf  

PR fortran/95195
* io/transfer.c (finalize_transfer): Generate runtime error for
namelist input/output to unformatted file.

gcc/testsuite/

2020-05-24  Harald Anlauf  

PR fortran/95195
* gfortran.dg/namelist_97.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/namelist_97.f90 b/gcc/testsuite/gfortran.dg/namelist_97.f90
new file mode 100644
index 000..4907e46b46a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/namelist_97.f90
@@ -0,0 +1,14 @@
+! { dg-do run }
+! { dg-output "At line 12 .*" }
+! { dg-shouldfail "Fortran runtime error: Namelist formatting .* FORM='UNFORMATTED'" }
+!
+! PR95195 - improve runtime error when writing a namelist to an unformatted file
+
+program test
+  character(len=11) :: my_form = 'unformatted'
+  integer   :: i = 1, j = 2, k = 3
+  namelist /nml1/ i, j, k
+  open  (unit=10, file='test.dat', form=my_form)
+  write (unit=10, nml=nml1)
+  close (unit=10, status='delete')
+end program test
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index b8db47dbff9..d071c1ce915 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -4123,6 +4123,14 @@ finalize_transfer (st_parameter_dt *dtp)
   if ((dtp->u.p.ionml != NULL)
   && (cf & IOPARM_DT_HAS_NAMELIST_NAME) != 0)
 {
+   if (dtp->u.p.current_unit->flags.form == FORM_UNFORMATTED)
+	 {
+	   generate_error (&dtp->common, LIBERROR_OPTION_CONFLICT,
+			   "Namelist formatting for unit connected "
+			   "with FORM='UNFORMATTED'");
+	   return;
+	 }
+
dtp->u.p.namelist_mode = 1;
if ((cf & IOPARM_DT_NAMELIST_READ_MODE) != 0)
 	 namelist_read (dtp);

[PATCH] Port libgccjit to Windows.

2020-05-24 Thread Nicolas Bértolo via Gcc-patches

Hello gcc devs.

I have ported libgccjit to Windows. I have tested it with the
native-compilation branch of Emacs so I'm confident that it works well.

The work is not finished though, I could use some help with these two
points:

I have had to concede defeat to libtool and Automake. I could not get libgccjit
to create a dll and put it in the correct directories. So for now we'll have to
copy lib/libgccjit.so to bin/libgccjit.dll.

It is not necessary to use --enable-host-shared in Windows (I tested it), but I
don't know the proper way to disable that check.

Nicolas


0001-Incomplete-port-of-libgccjit-to-Windows.patch
Description: Binary data

[RFC PATCH] i386: Remove broadcasts from TARGET_MMX_WITH_SSE vec_dup insn patterns

2020-05-24 Thread Uros Bizjak via Gcc-patches

XMM broadcast instructions broadcast a value from a general register to all
elements of the vector.  This is not allowed for TARGET_MMX_WITH_SSE,
where bits outside the lower 64 bits are expected to load or retain
a zero value.  The following testcases expect a broadcast, and are thus invalid:

FAIL: gcc.target/i386/sse2-mmx-18b.c scan-assembler-not movd
FAIL: gcc.target/i386/sse2-mmx-18b.c scan-assembler-times pbroadcastd 1
FAIL: gcc.target/i386/sse2-mmx-19b.c scan-assembler-not movd
FAIL: gcc.target/i386/sse2-mmx-19b.c scan-assembler-times pbroadcastw 1
FAIL: gcc.target/i386/sse2-mmx-19d.c scan-assembler-times pbroadcastw 1
FAIL: gcc.target/i386/sse2-mmx-19e.c scan-assembler-times pbroadcastw 1

These testcases will be fixed or removed entirely.

(The patch is a prerequisite for implementing support for generic
v2sf/v2si/v4hi shuffles.)

2020-05-24  Uroš Bizjak  

gcc/ChangeLog:
* config/i386/mmx.md (*vec_dupv2sf): Redefine as define_insn.
(mmx_pshufw_1): Change Yv constraint to xYw.  Correct type attribute.
(*vec_dupv4hi): Redefine as define_insn.
Remove alternative with general register input.
(*vec_dupv2si): Ditto.

Uros.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 5deef683b0b..b5564711aa4 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -947,27 +947,22 @@
(set_attr "prefix_extra" "1")
(set_attr "mode" "V2SF")])
 
-(define_insn_and_split "*vec_dupv2sf"
+(define_insn "*vec_dupv2sf"
   [(set (match_operand:V2SF 0 "register_operand" "=y,x,Yv")
(vec_duplicate:V2SF
  (match_operand:SF 1 "register_operand" "0,0,Yv")))]
   "TARGET_MMX || TARGET_MMX_WITH_SSE"
   "@
punpckldq\t%0, %0
-   #
-   #"
-  "TARGET_SSE && reload_completed
-   && SSE_REGNO_P (REGNO (operands[0]))"
-  [(set (match_dup 0)
-   (vec_duplicate:V4SF (match_dup 1)))]
-{
-  operands[0] = lowpart_subreg (V4SFmode, operands[0],
-   GET_MODE (operands[0]));
-}
-  [(set_attr "isa" "*,sse_noavx,avx")
+   shufps\t{$0xe0, %0, %0|%0, %0, 0xe0}
+   %vmovsldup\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "*,sse_noavx,sse3")
(set_attr "mmx_isa" "native,*,*")
-   (set_attr "type" "mmxcvt,ssemov,ssemov")
-   (set_attr "mode" "DI,TI,TI")])
+   (set_attr "type" "mmxcvt,sseshuf1,sse")
+   (set_attr "length_immediate" "*,1,*")
+   (set_attr "prefix_rep" "*,*,1")
+   (set_attr "prefix" "*,orig,maybe_vex")
+   (set_attr "mode" "DI,V4SF,V4SF")])
 
 (define_insn "*mmx_concatv2sf"
   [(set (match_operand:V2SF 0 "register_operand" "=y,y")
@@ -1960,9 +1955,9 @@
 })
 
 (define_insn "mmx_pshufw_1"
-  [(set (match_operand:V4HI 0 "register_operand" "=y,Yv")
+  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
 (vec_select:V4HI
-  (match_operand:V4HI 1 "register_mmxmem_operand" "ym,Yv")
+  (match_operand:V4HI 1 "register_mmxmem_operand" "ym,xYw")
   (parallel [(match_operand 2 "const_0_to_3_operand")
  (match_operand 3 "const_0_to_3_operand")
  (match_operand 4 "const_0_to_3_operand")
@@ -1989,7 +1984,7 @@
 }
   [(set_attr "isa" "*,sse2")
(set_attr "mmx_isa" "native,*")
-   (set_attr "type" "mmxcvt,sselog")
+   (set_attr "type" "mmxcvt,sselog1")
(set_attr "length_immediate" "1")
(set_attr "mode" "DI,TI")])
 
@@ -2004,77 +1999,37 @@
(set_attr "prefix_extra" "1")
(set_attr "mode" "DI")])
 
-(define_insn_and_split "*vec_dupv4hi"
-  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw,Yw")
+(define_insn "*vec_dupv4hi"
+  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw")
(vec_duplicate:V4HI
  (truncate:HI
-   (match_operand:SI 1 "register_operand" "0,xYw,r"))))]
+   (match_operand:SI 1 "register_operand" "0,xYw"))))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE)
&& (TARGET_SSE || TARGET_3DNOW_A)"
   "@
pshufw\t{$0, %0, %0|%0, %0, 0}
-   #
-   #"
-  "TARGET_SSE2 && reload_completed
-   && SSE_REGNO_P (REGNO (operands[0]))"
-  [(const_int 0)]
-{
-  rtx op;
-  operands[0] = lowpart_subreg (V8HImode, operands[0],
-   GET_MODE (operands[0]));
-  if (TARGET_AVX2)
-{
-  operands[1] = lowpart_subreg (HImode, operands[1],
-   GET_MODE (operands[1]));
-  op = gen_rtx_VEC_DUPLICATE (V8HImode, operands[1]);
-}
-  else
-{
-  operands[1] = lowpart_subreg (V8HImode, operands[1],
-   GET_MODE (operands[1]));
-  rtx mask = gen_rtx_PARALLEL (VOIDmode,
-  gen_rtvec (8,
- GEN_INT (0),
- GEN_INT (0),
- GEN_INT (0),
- GEN_INT (0),
- GEN_INT (4),
- GEN_INT (5),
- GEN_INT (6),
-

Re: [PATCH] Port libgccjit to Windows.

2020-05-24 Thread David Malcolm via Gcc-patches

On Sun, 2020-05-24 at 17:02 -0300, Nicolas Bértolo via Gcc-patches wrote:

> Hello gcc devs.

Hi Nicolas.

> I have ported libgccjit to Windows. I have tested it with the
> native-compilation branch of Emacs so I'm confident that it works well.

Excellent - thanks for doing this work.

Do you have copyright assignment paperwork on file?
https://gcc.gnu.org/contribute.html#legal

> The work is not finished though, I could use some help with these two
> points:
> 
> I have had to concede defeat to libtool and Automake. I could not get libgccjit
> to create a dll and put it in the correct directories. So for now we'll have to
> copy lib/libgccjit.so to bin/libgccjit.dll.

The autotools are not my strongest suit.

In a previous life I was a Windows developer, but I think it's been
about 18 years since I've done any coding on Windows, so I'm going to
have to trust your Windows expertise.

> It is not necessary to use --enable-host-shared in Windows (I tested it), but I
> don't know the proper way to disable that check.

(I'm not sure here)

Various comments inline below...

> Nicolas
> 
> From 8644b979cf732e0b4d57c8281229fc3dcc9dc739 Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Nicol=C3=A1s=20B=C3=A9rtolo?= 
> Date: Fri, 22 May 2020 17:54:41 -0300
> Subject: [PATCH] Incomplete port of libgccjit to Windows.
> 
> * gcc/Makefile.in: don't look for libiberty in the "pic" subdirectory when
> building for Mingw. Add dependency on xgcc with the proper extension.
> * gcc/c/Make-lang.in: Remove extra slash.
> * gcc/jit/Make-lang.in: Remove extra slash.
> * gcc/jit/jit-playback.c: Do not chmod files in Windows. Use LoadLibrary,
> FreeLibrary and GetProcAddress instead of libdl.
> * gcc/jit/jit-tempdir.c: Do not use mkdtemp() in Windows. Get a filename with
> GetTempFileName.
> ---
>  gcc/Makefile.in| 10 +---
>  gcc/c/Make-lang.in |  2 +-
>  gcc/jit/Make-lang.in   | 10 
>  gcc/jit/jit-playback.c | 25 +--
>  gcc/jit/jit-result.c   | 46 ++
>  gcc/jit/jit-tempdir.c  | 56 ++
>  6 files changed, 132 insertions(+), 17 deletions(-)
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 0fe2ba241..e6dd9f59e 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1046,10 +1046,12 @@ ALL_LINKERFLAGS = $(ALL_CXXFLAGS)
>  
>  # Build and host support libraries.
>  
> -# Use the "pic" build of libiberty if --enable-host-shared.
> +# Use the "pic" build of libiberty if --enable-host-shared, unless we are
> +# building for mingw.
> +LIBIBERTY_PICDIR=$(if $(findstring mingw,$(build)),,pic)
>  ifeq ($(enable_host_shared),yes)
> -LIBIBERTY = ../libiberty/pic/libiberty.a
> -BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/pic/libiberty.a
> +LIBIBERTY = ../libiberty/$(LIBIBERTY_PICDIR)/libiberty.a
> +BUILD_LIBIBERTY = 
> $(build_libobjdir)/libiberty/$(LIBIBERTY_PICDIR)/libiberty.a
>  else
>  LIBIBERTY = ../libiberty/libiberty.a
>  BUILD_LIBIBERTY = $(build_libobjdir)/libiberty/libiberty.a
> @@ -1726,7 +1728,7 @@ MOSTLYCLEANFILES = insn-flags.h insn-config.h 
> insn-codes.h \
>  # This symlink makes the full installation name of the driver be available
>  # from within the *build* directory, for use when running the JIT library
>  # from there (e.g. when running its testsuite).
> -$(FULL_DRIVER_NAME): ./xgcc
> +$(FULL_DRIVER_NAME): ./xgcc$(exeext)
>   rm -f $@
>   $(LN_S) $< $@
>  
> diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
> index 8944b9b9f..7efc7c2c3 100644
> --- a/gcc/c/Make-lang.in
> +++ b/gcc/c/Make-lang.in
> @@ -162,7 +162,7 @@ c.install-plugin: installdirs
>  # Install import library.
>  ifeq ($(plugin_implib),yes)
>   $(mkinstalldirs) $(DESTDIR)$(plugin_resourcesdir)
> - $(INSTALL_DATA) cc1$(exeext).a 
> $(DESTDIR)/$(plugin_resourcesdir)/cc1$(exeext).a
> + $(INSTALL_DATA) cc1$(exeext).a 
> $(DESTDIR)$(plugin_resourcesdir)/cc1$(exeext).a
>  endif
>  
>  c.uninstall:
> diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
> index 38ddfad28..24f37c98b 100644
> --- a/gcc/jit/Make-lang.in
> +++ b/gcc/jit/Make-lang.in
> @@ -277,17 +277,17 @@ selftest-jit:
>  # Install hooks:
>  jit.install-common: installdirs
>   $(INSTALL_PROGRAM) $(LIBGCCJIT_FILENAME) \
> -   $(DESTDIR)/$(libdir)/$(LIBGCCJIT_FILENAME)
> +   $(DESTDIR)$(libdir)/$(LIBGCCJIT_FILENAME)
>   ln -sf \
> $(LIBGCCJIT_FILENAME) \
> -   $(DESTDIR)/$(libdir)/$(LIBGCCJIT_SONAME_SYMLINK)
> +   $(DESTDIR)$(libdir)/$(LIBGCCJIT_SONAME_SYMLINK)
>   ln -sf \
> $(LIBGCCJIT_SONAME_SYMLINK)\
> -   $(DESTDIR)/$(libdir)/$(LIBGCCJIT_LINKER_NAME_SYMLINK)
> +   $(DESTDIR)$(libdir)/$(LIBGCCJIT_LINKER_NAME_SYMLINK)
>   $(INSTALL_DATA) $(srcdir)/jit/libgccjit.h \
> -   $(DESTDIR)/$(includedir)/libgccjit.h
> +   $(DESTDIR)$(includedir)/libgccjit.h
>   $(INSTALL_DATA) $(srcdir)/jit/libgccjit++.h \
> -   $(DESTDIR)/$(includedir)/libgccjit++.h
> +   $(DESTDIR)$(includedir)/l

Re: [PATCH] Add support for C++20 barriers

2020-05-24 Thread Thomas Rodgers




> On May 24, 2020, at 11:11 AM, Jonathan Wakely  wrote:
> 
> On Sun, 24 May 2020 at 18:55, Florian Weimer wrote:
>> 
>> * Thomas Rodgers:
>> 
>>> +  static __gthread_t
>>> +  _S_get_tid() noexcept
>>> +  {
>>> +#ifdef __GLIBC__
>>> + // For the GNU C library pthread_self() is usable without linking to
>>> + // libpthread.so but returns 0, so we cannot use it in single-threaded
>>> + // programs, because this_thread::get_id() != thread::id{} must be 
>>> true.
>>> + // We know that pthread_t is an integral type in the GNU C library.
>>> + if (!__gthread_active_p())
>>> +   return 1;
>>> +#endif
>>> + return __gthread_self();
>>> +  }
>> 
>> This comment seems outdated or incomplete.  pthread_self returns a
>> proper pointer since glibc 2.27, I believe.
> 
> The comment is copied from the  header, and dates from 2015.

Yes, this comes from  to avoid pulling in all of  to just get a 
hash from the current thread identity. I’m now using it in two places, is this 
worth splitting out somewhere?

> 
>> I'm also not sure how the difference is observable for the libstdc++
>> implementation.  Late loading of libpthread isn't quite supported.
> 
> It's nothing to do with late loading. A single threaded program that
> doesn't create any threads and doesn't link to libpthread can still
> expect std::this_thread::get_id() != std::thread::id() to be true in
> the main (and only) thread. If pthread_self() returns 0, and
> thread::id() default constructs with a value of 0, then we can't
> distinguish "the main thread" from "not a thread".
> 
> But I do see a non-zero value from glibc now, which is great. I'll add
> it to my TODO list to remove that workaround from .
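
The collision described above can be modelled in a few lines. This is only a toy sketch in Python, not libstdc++ code: `DEFAULT_ID`, `get_tid`, `raw_self` and `threads_active` are hypothetical names standing in for `thread::id{}`, `_S_get_tid()`, the value returned by `pthread_self()` and the result of `__gthread_active_p()` respectively.

```python
# Toy model of the workaround discussed above; all names are hypothetical.
DEFAULT_ID = 0  # models a default-constructed thread::id ("not a thread")


def get_tid(raw_self, threads_active):
    """Return an id for the calling thread that never equals DEFAULT_ID.

    raw_self models the value of pthread_self(); threads_active models
    __gthread_active_p().  If no thread support is active, raw_self may be 0,
    which would be indistinguishable from DEFAULT_ID, so a fixed non-zero
    value is returned instead.
    """
    if not threads_active:
        return 1
    return raw_self


# Single-threaded program where the raw id is 0: the main thread still
# compares unequal to "not a thread".
main_id = get_tid(raw_self=0, threads_active=False)
```

If the raw value is already non-zero in single-threaded programs, as reported above for newer glibc, the special case no longer changes the outcome of the comparison, which is why the workaround can eventually be dropped.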

[PATCH] diagnostics: Add function call parens matching to c_parser.

2020-05-24 Thread Mark Wielaard

The C++ parser already tracks function call parens matching, but the C
parser doesn't. This adds the same functionality to the C parser and adds
a testcase showing the C++ and C parser matching function call parens
in an error message.

gcc/c/ChangeLog:

* c-parser.c (c_parser_postfix_expression_after_primary): Add
scope with matching_parens after CPP_OPEN_PAREN.

gcc/testsuite/ChangeLog:

* c-c++-common/missing-close-func-paren.c: New test.
---
 gcc/c/c-parser.c  | 32 ---
 .../c-c++-common/missing-close-func-paren.c   | 40 +++
 2 files changed, 57 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/missing-close-func-paren.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 5d11e7e73c16..23d6fa22b685 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -10458,21 +10458,23 @@ c_parser_postfix_expression_after_primary (c_parser 
*parser,
  break;
case CPP_OPEN_PAREN:
  /* Function call.  */
- c_parser_consume_token (parser);
- for (i = 0; i < 3; i++)
-   {
- sizeof_arg[i] = NULL_TREE;
- sizeof_arg_loc[i] = UNKNOWN_LOCATION;
-   }
- literal_zero_mask = 0;
- if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
-   exprlist = NULL;
- else
-   exprlist = c_parser_expr_list (parser, true, false, &origtypes,
-  sizeof_arg_loc, sizeof_arg,
-  &arg_loc, &literal_zero_mask);
- c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
-"expected %<)%>");
+ {
+   matching_parens parens;
+   parens.consume_open (parser);
+   for (i = 0; i < 3; i++)
+ {
+   sizeof_arg[i] = NULL_TREE;
+   sizeof_arg_loc[i] = UNKNOWN_LOCATION;
+ }
+   literal_zero_mask = 0;
+   if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
+ exprlist = NULL;
+   else
+ exprlist = c_parser_expr_list (parser, true, false, &origtypes,
+sizeof_arg_loc, sizeof_arg,
+&arg_loc, &literal_zero_mask);
+   parens.skip_until_found_close (parser);
+ }
  orig_expr = expr;
  mark_exp_read (expr.value);
  if (warn_sizeof_pointer_memaccess)
diff --git a/gcc/testsuite/c-c++-common/missing-close-func-paren.c 
b/gcc/testsuite/c-c++-common/missing-close-func-paren.c
new file mode 100644
index ..3177e250e1c3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/missing-close-func-paren.c
@@ -0,0 +1,40 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+/* Verify that the C/C++ frontends show the pertinent opening symbol when
+   a closing symbol is missing for a function call.  */
+
+/* Verify that, when they are on the same line, that the opening symbol is
+   shown as a secondary range within the main diagnostic.  */
+
+extern int __attribute__((const)) foo (int a, int b, int c);
+
+void single_func ()
+{
+  int single =
+foo (1, (1 + 2), (1 + 2 + 3):); /* { dg-error "expected '\\)' before ':' 
token" } */
+  /* { dg-begin-multiline-output "" }
+ foo (1, (1 + 2), (1 + 2 + 3):);
+ ~   ^
+ )
+ { dg-end-multiline-output "" } */
+}
+
+/* Verify that, when they are on different lines, that the opening symbol is
+   shown via a secondary diagnostic.  */
+
+void multi_func ()
+{
+  int multi =
+foo (1, /* { dg-message "to match this '\\('" } */
+ (1 + 2),
+ (1 + 2 + 3):); /* { dg-error "expected '\\)' before ':' token" } */
+  /* { dg-begin-multiline-output "" }
+  (1 + 2 + 3):);
+ ^
+ )
+ { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+ foo (1,
+ ^
+ { dg-end-multiline-output "" } */
+}
-- 
2.20.1
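
The idea behind the patch can be sketched outside of GCC. The following Python model is an illustration under stated assumptions, not the c_parser implementation: the class and method names merely mirror `matching_parens`, `consume_open` and `skip_until_found_close` from the diff, tokens are plain strings, every argument is a single token, and nested parentheses are not handled. It always emits a separate note for the opening paren, whereas the testcase above shows GCC folding it into the main diagnostic when both are on the same line.

```python
class MatchingParens:
    """Remembers where the opening paren was consumed (hypothetical model)."""

    def __init__(self):
        self.open_pos = None

    def consume_open(self, tokens, pos):
        assert tokens[pos] == "("
        self.open_pos = pos  # keep the location for later diagnostics
        return pos + 1

    def skip_until_found_close(self, tokens, pos, diagnostics):
        if pos < len(tokens) and tokens[pos] == ")":
            return pos + 1
        found = tokens[pos] if pos < len(tokens) else "end of input"
        diagnostics.append(("error", pos, f"expected ')' before '{found}' token"))
        # The extra information the patch is about: point back at the '('.
        diagnostics.append(("note", self.open_pos, "to match this '('"))
        # Error recovery: skip ahead to the next ')' if there is one.
        while pos < len(tokens) and tokens[pos] != ")":
            pos += 1
        return pos + 1 if pos < len(tokens) else pos


def parse_call(tokens, pos, diagnostics):
    """Parse '(' [arg {',' arg}] ')' starting at tokens[pos]."""
    parens = MatchingParens()
    pos = parens.consume_open(tokens, pos)
    if pos < len(tokens) and tokens[pos] != ")":
        while True:
            pos += 1  # consume one single-token argument
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1
                continue
            break
    return parens.skip_until_found_close(tokens, pos, diagnostics)


diags = []
end = parse_call(["foo", "(", "1", ",", "2", ":", ")", ";"], 1, diags)
```

Here `diags` ends up holding an error at the stray `:` and a note at index 1, the position of the opening paren. A well-formed call produces no diagnostics.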

Re: [PATCH] contrib/gen_autofdo_event.py: Allow for it to work if there are more than 3 hyphens in Family-model

2020-05-24 Thread Andi Kleen via Gcc-patches

Javad Karabi via Gcc-patches  writes:
>
> diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py
> index c97460c61c6..cd77a8686d9 100755
> --- a/contrib/gen_autofdo_event.py
> +++ b/contrib/gen_autofdo_event.py
> @@ -94,7 +94,7 @@ for j in u:
>  n = j.rstrip().split(',')
>  if len(n) >= 4 and (args.all or n[0] == cpu) and n[3] == "core":
>  if args.all:
> -vendor, fam, model = n[0].split("-")
> +vendor, fam, model = n[0].split("-")[:3]

That doesn't fix the problem, we really need to match
the stepping too. You turned a visible failure into a silent one.

-Andi
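
The objection can be illustrated with a small Python sketch. The key strings and the helper names below are hypothetical and only mimic the `vendor-family-model[-stepping]` shape implied by the thread; this is not the actual contents of the event map, nor a proposed fix for the script.

```python
def parse_strict(key):
    # Original behaviour: raises ValueError if there are not exactly 3 fields.
    vendor, fam, model = key.split("-")
    return vendor, fam, model


def parse_truncating(key):
    # Behaviour of the proposed change: extra fields are silently dropped.
    vendor, fam, model = key.split("-")[:3]
    return vendor, fam, model


def parse_with_stepping(key):
    # One possible alternative (an assumption): keep the 4th field so that
    # entries differing only in stepping stay distinguishable.
    parts = key.split("-")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected CPU key: {key!r}")
    vendor, fam, model = parts[:3]
    stepping = parts[3] if len(parts) == 4 else None
    return vendor, fam, model, stepping


a = "GenuineIntel-6-55-4"  # hypothetical key with a stepping field
b = "GenuineIntel-6-55-7"  # same family/model, different stepping

same_after_truncation = parse_truncating(a) == parse_truncating(b)
distinct_with_stepping = parse_with_stepping(a) != parse_with_stepping(b)
```

With truncation the two keys collapse to the same tuple, so the visible unpacking error becomes a silent mismatch; keeping the stepping field preserves the distinction.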

[PATCH] Adjust wait logic to limit spurious evaluation of wait predicate.

2020-05-24 Thread Thomas Rodgers

* include/bits/atomic_wait.h (__waiters::_M_do_wait): adjust wakeup 
logic.

[PATCH] Remove binary_semaphore implementation from stop_token

2020-05-24 Thread Thomas Rodgers

 * include/std/stop_token: Remove local binary_semaphore implementation.
   (_Stop_state_t::_M_do_try_lock): Use __thread_yield() from
   bits/atomic_wait.h.



0001-Remove-binary_semaphore-implementation-from-stop_tok.patch
Description: Binary data

Re: [PATCH] Adjust wait logic to limit spurious evaluation of wait predicate.

2020-05-24 Thread Thomas Rodgers

And this time, with patch.



wake_up_fix.patch
Description: Binary data


> On May 24, 2020, at 3:06 PM, Thomas Rodgers  wrote:
> 
>   * include/bits/atomic_wait.h (__waiters::_M_do_wait): adjust wakeup 
> logic.

RE: [PATCH PR94026] combine missed opportunity to simplify comparisons with zero

2020-05-24 Thread Yangfei (Felix)

Hi,

> -Original Message-
> From: Segher Boessenkool [mailto:seg...@kernel.crashing.org]
> Sent: Saturday, May 23, 2020 10:57 PM
> To: Yangfei (Felix) 
> Cc: gcc-patches@gcc.gnu.org; Zhanghaijian (A) 
> Subject: Re: [PATCH PR94026] combine missed opportunity to simplify
> comparisons with zero
> 
> Hi!
> 
> Sorry this is taking so long.
> 
> On Wed, May 06, 2020 at 08:57:52AM +, Yangfei (Felix) wrote:
> > > On Tue, Mar 24, 2020 at 06:30:12AM +, Yangfei (Felix) wrote:
> > > > I modified combine emitting a simple AND operation instead of
> > > > making one
> > > zero_extract for this scenario.
> > > > Attached please find the new patch.  Hope this solves both of our
> concerns.
> > >
> > > This looks promising.  I'll try it out, see what it does on other
> > > targets.  (It will have to wait for GCC 11 stage 1, of course).
> 
> It creates better code on all targets :-)  A quite small improvement, but not
> entirely trivial.

Thanks for the effort.  It's great to hear that :- )
Attached please find the v3 patch.  Rebased on the latest trunk. 
Bootstrapped and tested on aarch64-linux-gnu.  Could you please help install it?

> > > p.s.  Please use a correct mime type?  application/octet-stream
> > > isn't something I can reply to.  Just text/plain is fine :-)
> >
> > I have using plain text now, hope that works for you.  :-)
> 
> Nope:
> 
> [-- Attachment #2: pr94026-v2.diff --]
> [-- Type: application/octet-stream, Encoding: base64, Size: 5.9K --]

This time I switched to use UUEncode type for the attachment.  Does it work?
I am using Outlook and I didn't find the place to change the MIME type : - (

Felix


pr94026-v3.diff
Description: pr94026-v3.diff

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches

On Mon, May 25, 2020 at 1:55 AM Uros Bizjak  wrote:
>
> On Sun, May 24, 2020 at 9:26 AM Hongtao Liu  wrote:
> >
> > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak  wrote:
> > >
> > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu  wrote:
> > > >
> > > > Hi:
> > > >   This patch fix non-conforming expander for
> > > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > > > refer to PR95211, PR95256.
> > > >   bootstrap ok, regression test on i386/x86-64 backend is ok.
> > > >
> > > > gcc/ChangeLog:
> > > > PR target/95211 PR target/95256
> > >
> > Changed.
> > > Please put every PR reference in a separate line.
> > >
> > > > * config/i386/sse.md v2div2sf2): New expander.
> > > > (fix_truncv2sfv2di2): Ditto.
> > > > (floatv2div2sf2_internal): Renaming from
> > > > floatv2div2sf2.
> > > > (fix_truncv2sfv2di2_internal):
> > >
> > > The convention throughout sse.md is to prefix a standard pattern that
> > > is used through builtins with avx512_ instead of suffixing
> > > the pattern name with _internal.
> > >
> > Changed.
> > > > Renaming from fix_truncv2sfv2di2.
> > > > (vec_pack_float_): Adjust icode name.
> > > > (vec_unpack_fix_trunc_lo_): Ditto.
> > > > * config/i386/i386-builtin.def: Ditto.
> > >
> > > Uros.
> >
> > Update patch.
>
> The patch is wrong, and the correct way to fix these patterns is more complex:
>
> a) the pattern should not access register in mode, narrower than 128
> bits, as this implies MMX register in non-TARGET-MMX-WITH-SSE targets.
> So, the correct way to define insn with narrow mode is to use
> vec_select, something like:
>
> (define_insn "sse4_1_v8qiv8hi2"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
>   (vec_select:V8QI
> (match_operand:V16QI 1 "register_operand" "Yr,*x,v")
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)])))]
>
> The instruction accesses the memory in the correct mode, so the memory
> operand is:
>
> (define_insn "*sse4_1_v8qiv8hi2_1"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
>   (match_operand:V8QI 1 "memory_operand" "m,m,m")))]
>
> and a pre-reload split has to be introduced to convert insn from
> register form to memory form, when memory gets propagated to the insn:
>
> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (vec_select:V8QI
> (subreg:V16QI
>   (vec_concat:V2DI
> (match_operand:DI 1 "memory_operand")
> (const_int 0)) 0)
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)])))]
>
> For a middle end to use this insn, an expander is used:
>
> (define_expand "v8qiv8hi2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (match_operand:V8QI 1 "nonimmediate_operand")))]
>
> b) Similar approach is used when an output is narrower than 128 bits:
>
> (define_insn "*floatv2div2sf2"
>   [(set (match_operand:V4SF 0 "register_operand" "=v")
> (vec_concat:V4SF
> (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
> (match_operand:V2SF 2 "const0_operand" "C")))]
>
> In your concrete case,
>
> (define_insn "fix_truncv2sfv2di2"
>   [(set (match_operand:V2DI 0 "register_operand" "=v")
> (any_fix:V2DI
>   (vec_select:V2SF
> (match_operand:V4SF 1 "nonimmediate_operand" "vm")
> (parallel [(const_int 0) (const_int 1)])))]
>
> is already _NOT_ defined in a correct way as far as memory operand is
> concerned, see a) above. But, , we will apparently have to live
> with that. The problem is, that it is named as a standard named
> pattern, so middle-end discovers it and tries to use it. It should be
> renamed with avx512dq_... prefix. Let's give middle-end something
> correct, similar to:
>
> (define_expand "v8qiv8hi2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (match_operand:V8QI 1 "nonimmediate_operand")))]
>   "TARGET_SSE4_1"
> {
>   if (!MEM_P (operands[1]))
> {
>   operands[1] = simplify_gen_subreg (V16QImode, operands[1], V8QImode, 0);
>   emit_insn (gen_sse4_1_v8qiv8hi2 (operands[0], operands[1]));
>   DONE;
> }
> })
>
> The second case is with v2sf output, less than 128 bits wide:
>
> (define_insn "*floatv2div2sf2"
>   [(set (match_operand:V4SF 0 "register_operand" "=v")
> (vec_concat:V4SF
> (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
> (match_operand:V2SF 2 "const0_operand" "C")))]
>
> The above insn pattern is OK, we access the output register with
> 128bit access, so we are sure no MMX reg will be ge

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Uros Bizjak via Gcc-patches

On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:

> > We have to introduce a new expander, that will have conforming mode of
> > output operand (V2SF) and will produce RTX that will match
> > *floatv2div2sf2. A paradoxical output subreg from
> > V2SFmode V4SFmode is needed, generated by simplify_gen_subreg as is
> > the case with paradoxical input subreg.
>
> Problem is simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
> will return NULL since
> 
> 948  /* Subregs involving floating point modes are not allowed to
> 949 change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
> 950 (subreg:SI (reg:DF) 0) isn't.  */

But, we are not changing size, we are still operating with SFmode. It
looks to me that this limitation is too strict, the intention is to
not expand scalar SFmode to DFmode.

Let's ask experts.

Uros.

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Hongtao Liu via Gcc-patches

On Mon, May 25, 2020 at 1:55 AM Uros Bizjak  wrote:
>
> On Sun, May 24, 2020 at 9:26 AM Hongtao Liu  wrote:
> >
> > On Sat, May 23, 2020 at 6:11 PM Uros Bizjak  wrote:
> > >
> > > On Sat, May 23, 2020 at 9:25 AM Hongtao Liu  wrote:
> > > >
> > > > Hi:
> > > >   This patch fix non-conforming expander for
> > > > floatv2div2sf2,floatunsv2div2sf2,fix_truncv2sfv2di,fixuns_truncv2sfv2di,
> > > > refer to PR95211, PR95256.
> > > >   bootstrap ok, regression test on i386/x86-64 backend is ok.
> > > >
> > > > gcc/ChangeLog:
> > > > PR target/95211 PR target/95256
> > >
> > Changed.
> > > Please put every PR reference in a separate line.
> > >
> > > > * config/i386/sse.md v2div2sf2): New expander.
> > > > (fix_truncv2sfv2di2): Ditto.
> > > > (floatv2div2sf2_internal): Renaming from
> > > > floatv2div2sf2.
> > > > (fix_truncv2sfv2di2_internal):
> > >
> > > The convention throughout sse.md is to prefix a standard pattern that
> > > is used through builtins with avx512_ instead of suffixing
> > > the pattern name with _internal.
> > >
> > Changed.
> > > > Renaming from fix_truncv2sfv2di2.
> > > > (vec_pack_float_): Adjust icode name.
> > > > (vec_unpack_fix_trunc_lo_): Ditto.
> > > > * config/i386/i386-builtin.def: Ditto.
> > >
> > > Uros.
> >
> > Update patch.
>
> The patch is wrong, and the correct way to fix these patterns is more complex:
>
> a) the pattern should not access register in mode, narrower than 128
> bits, as this implies MMX register in non-TARGET-MMX-WITH-SSE targets.

It seems some patterns in sse.md do not obey this rule,
e.g.:
(define_insn "sse_storehps"
  [(set (match_operand:V2SF 0 "nonimmediate_operand" "=m,v,v")
(vec_select:V2SF
  (match_operand:V4SF 1 "nonimmediate_operand" "v,v,o")
  (parallel [(const_int 2) (const_int 3)])))]
  "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
  "@
   %vmovhps\t{%1, %0|%q0, %1}
   %vmovhlps\t{%1, %d0|%d0, %1}
   %vmovlps\t{%H1, %d0|%d0, %H1}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "maybe_vex")
   (set_attr "mode" "V2SF,V4SF,V2SF")])

(define_insn "sse_storelps"
  [(set (match_operand:V2SF 0 "nonimmediate_operand"   "=m,v,v")
(vec_select:V2SF
  (match_operand:V4SF 1 "nonimmediate_operand" " v,v,m")
  (parallel [(const_int 0) (const_int 1)])))]
  "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
  "@
   %vmovlps\t{%1, %0|%q0, %1}
   %vmovaps\t{%1, %0|%0, %1}
   %vmovlps\t{%1, %d0|%d0, %q1}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "maybe_vex")
   (set_attr "mode" "V2SF,V4SF,V2SF")])

Should they be restricted under TARGET_MMX_WITH_SSE, or is there
anything I missed?

> So, the correct way to define insn with narrow mode is to use
> vec_select, something like:
>
> (define_insn "sse4_1_v8qiv8hi2"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
>   (vec_select:V8QI
> (match_operand:V16QI 1 "register_operand" "Yr,*x,v")
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)])))]
>
> The instruction accesses the memory in the correct mode, so the memory
> operand is:
>
> (define_insn "*sse4_1_v8qiv8hi2_1"
>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> (any_extend:V8HI
>   (match_operand:V8QI 1 "memory_operand" "m,m,m")))]
>
> and a pre-reload split has to be introduced to convert insn from
> register form to memory form, when memory gets propagated to the insn:
>
> (define_insn_and_split "*sse4_1_v8qiv8hi2_2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (vec_select:V8QI
> (subreg:V16QI
>   (vec_concat:V2DI
> (match_operand:DI 1 "memory_operand")
> (const_int 0)) 0)
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)])))]
>
> For a middle end to use this insn, an expander is used:
>
> (define_expand "v8qiv8hi2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (match_operand:V8QI 1 "nonimmediate_operand")))]
>
> b) Similar approach is used when an output is narrower than 128 bits:
>
> (define_insn "*floatv2div2sf2"
>   [(set (match_operand:V4SF 0 "register_operand" "=v")
> (vec_concat:V4SF
> (any_float:V2SF (match_operand:V2DI 1 "nonimmediate_operand" "vm"))
> (match_operand:V2SF 2 "const0_operand" "C")))]
>
> In your concrete case,
>
> (define_insn "fix_truncv2sfv2di2"
>   [(set (match_operand:V2DI 0 "register_operand" "=v")
> (any_fix:V2DI
>   (vec_select:V2SF
> (match_operand:V4SF 1 "nonimmediate_operand" "vm")
> (parallel [(const_int 0) (const_int 1)])))]
>
> is already _NOT_ defined in a correct way a

Re: [PATCH] Fix non-conforming expander [PR target/95211, PR target/95256]

2020-05-24 Thread Richard Biener

On May 25, 2020 8:12:12 AM GMT+02:00, Uros Bizjak  wrote:
>On Mon, May 25, 2020 at 7:53 AM Hongtao Liu  wrote:
>
>> > We have to introduce a new expander that has a conforming output
>> > operand mode (V2SF) and produces RTX that matches
>> > *floatv2div2sf2. A paradoxical output subreg from
>> > V2SFmode to V4SFmode is needed, generated by simplify_gen_subreg as is
>> > the case with the paradoxical input subreg.
>>
>> The problem is that simplify_gen_subreg (V4SFmode, operands[0], V2SFmode, 0)
>> will return NULL since
>>
>>   /* Subregs involving floating point modes are not allowed to
>>      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
>>      (subreg:SI (reg:DF) 0) isn't.  */
>
>But we are not changing the element size; we are still operating on SFmode. It
>looks to me that this limitation is too strict; the intention is to
>not expand scalar SFmode to DFmode.

I guess so. The test probably wants to test the component mode.

>Let's ask experts.
>
>Uros.
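
The disagreement above can be illustrated with a toy model. The sketch below is not GCC code: struct toy_mode, the TOY_* constants, and both predicate functions are hypothetical names invented for illustration. strict_rule_allows follows the quoted comment literally (any floating-point mode involved means the total size must not change), while relaxed_rule_allows is one possible reading of the "test the component mode" remark, under the assumption that a size change is acceptable when both modes are floating-point and their element sizes match.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a machine mode.  Hypothetical; not GCC's machine_mode.  */
struct toy_mode {
  const char *name;
  bool is_float;       /* scalar float or vector of floats */
  unsigned size;       /* total size in bytes */
  unsigned unit_size;  /* size of one element in bytes (== size for scalars) */
};

static const struct toy_mode TOY_SI   = { "SI",   false,  4, 4 };
static const struct toy_mode TOY_DI   = { "DI",   false,  8, 8 };
static const struct toy_mode TOY_DF   = { "DF",   true,   8, 8 };
static const struct toy_mode TOY_V2SF = { "V2SF", true,   8, 4 };
static const struct toy_mode TOY_V4SF = { "V4SF", true,  16, 4 };

/* Literal reading of the quoted comment: if either mode is a float mode,
   the subreg must not change the total size.  */
static bool
strict_rule_allows (struct toy_mode outer, struct toy_mode inner)
{
  if (outer.is_float || inner.is_float)
    return outer.size == inner.size;
  return true;
}

/* Assumed relaxation: additionally accept a size change when both modes
   are float modes with the same element size (e.g. V2SF inside V4SF).  */
static bool
relaxed_rule_allows (struct toy_mode outer, struct toy_mode inner)
{
  assert (outer.unit_size > 0 && inner.unit_size > 0);
  if (strict_rule_allows (outer, inner))
    return true;
  return outer.is_float && inner.is_float
	 && outer.unit_size == inner.unit_size;
}
```

In this model, strict_rule_allows (TOY_V4SF, TOY_V2SF) is false, which mirrors simplify_gen_subreg returning NULL in the case discussed, while relaxed_rule_allows accepts that pair and still rejects the scalar case relaxed_rule_allows (TOY_SI, TOY_DF).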

