from:"Marc Glisse"

Re: [PATCH] Don't fold nextafter/nexttoward if -ftrapping-math or -fmath-errno if they produce denormal results (PR c/86420)

2018-07-07 Thread Marc Glisse


On Sat, 7 Jul 2018, Jakub Jelinek wrote:


2018-07-07  Jakub Jelinek  

PR c/86420
* real.c (real_nextafter): Return true if result is denormal.


I have a question on the side: would it be hard / useful, in cases where 
nextafter may set errno or some exception flag, to fold the result to a 
constant while keeping the function call (ignoring the value it returns)? 
To clarify, I mean replace


_2 = nextafter(DBL_DENORM_MIN, 0);

with

nextafter(DBL_DENORM_MIN, 0);
_2 = 0;

I think we already do that for some other calls, although I can't remember 
where. The point would be that we have the value of _2 and can keep 
folding its uses.


--
Marc Glisse

Re: [PATCH] Don't fold nextafter/nexttoward if -ftrapping-math or -fmath-errno if they produce denormal results (PR c/86420)

2018-07-07 Thread Marc Glisse


On Sat, 7 Jul 2018, Jakub Jelinek wrote:


On Sat, Jul 07, 2018 at 11:55:17AM +0200, Marc Glisse wrote:

On Sat, 7 Jul 2018, Jakub Jelinek wrote:


2018-07-07  Jakub Jelinek  

PR c/86420
* real.c (real_nextafter): Return true if result is denormal.


I have a question on the side: would it be hard / useful, in cases where
nextafter may set errno or some exception flag, to fold the result to a
constant while keeping the function call (ignoring the value it returns)? To
clarify, I mean replace

_2 = nextafter(DBL_DENORM_MIN, 0);

with

nextafter(DBL_DENORM_MIN, 0);
_2 = 0;

I think we already do that for some other calls, although I can't remember
where. The point would be that we have the value of _2 and can keep folding
its uses.


For errno purposes alone that would be possible, but the function is marked
#define ATTR_MATHFN_ERRNO (flag_errno_math ? \
   ATTR_NOTHROW_LEAF_LIST : ATTR_CONST_NOTHROW_LEAF_LIST)
and thus with -ftrapping-math -fno-math-errno I'm afraid we'd immediately
DCE the call in the second form (without lhs).


That looks like a problem we'll have to fix eventually. But not as part of 
this patch indeed.


--
Marc Glisse

Re: Fold pointer range checks with equal spans

2018-07-20 Thread Marc Glisse


On Fri, 20 Jul 2018, Richard Sandiford wrote:


--- gcc/match.pd2018-07-18 18:44:22.565914281 +0100
+++ gcc/match.pd2018-07-20 11:24:33.692045585 +0100
@@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (inverse_conditions_p (@0, @2)
&& element_precision (type) == element_precision (op_type))
(view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1)))
+
+/* For pointers @0 and @2 and unsigned constant offset @1, look for
+   expressions like:
+
+   A: (@0 + @1 < @2) | (@2 + @1 < @0)
+   B: (@0 + @1 <= @2) | (@2 + @1 <= @0)
+
+   If pointers are known not to wrap, B checks whether @1 bytes starting at
+   @0 and @2 do not overlap, while A tests the same thing for @1 + 1 bytes.
+   A is more efficiently tested as:
+
+   ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2)
+
+   as long as @1 * 2 doesn't overflow.  B is the same with @1 replaced
+   with @1 - 1.  */
+(for ior (truth_orif truth_or bit_ior)
+ (for cmp (le lt)
+  (simplify
+   (ior (cmp (pointer_plus:s @0 INTEGER_CST@1) @2)
+   (cmp (pointer_plus:s @2 @1) @0))


Do you want :c on cmp, in case it appears as @2 > @0 + @1 ? (may need some 
care with "cmp == LE_EXPR" below)

Do you want :s on cmp as well?


+   (if (!flag_trapv && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))


Don't you want TYPE_OVERFLOW_UNDEFINED?


+/* Convert the B form to the A form.  */
+(with { offset_int off = wi::to_offset (@1) - (cmp == LE_EXPR ? 1 : 0); }
+ /* Always fails for negative values.  */
+ (if (wi::min_precision (off, UNSIGNED) * 2 <= TYPE_PRECISION (sizetype))
+  /* It doesn't matter whether we use @2 - @0 or @0 - @2, so let
+tree_swap_operands_p pick a canonical order.  */
+  (with { tree ptr1 = @0, ptr2 = @2;
+ if (tree_swap_operands_p (ptr1, ptr2))
+   std::swap (ptr1, ptr2); }
+   (gt (plus (minus (convert:sizetype { ptr1; })
+   (convert:sizetype { ptr2; }))
+{ wide_int_to_tree (sizetype, off); })
+      { wide_int_to_tree (sizetype, off * 2); }


--
Marc Glisse

optimize std::vector move assignment

2018-07-25 Thread Marc Glisse


Hello,

I talked about this last year 
(https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01063.html and thread), 
this tweaks std::vector move assignment to help gcc generate better code 
for it.


For this code

#include 
#include 
typedef std::vector V;
void f(V&a,V&b){a=std::move(b);}

with -O2 -fdump-tree-optimized on powerpc64le-unknown-linux-gnu, we 
currently have


  _5 = MEM[(int * &)a_3(D)];
  MEM[(int * &)a_3(D)] = 0B;
  MEM[(int * &)a_3(D) + 8] = 0B;
  MEM[(int * &)a_3(D) + 16] = 0B;
  _6 = MEM[(int * &)b_2(D)];
  MEM[(int * &)a_3(D)] = _6;
  MEM[(int * &)b_2(D)] = 0B;
  _7 = MEM[(int * &)a_3(D) + 8];
  _8 = MEM[(int * &)b_2(D) + 8];
  MEM[(int * &)a_3(D) + 8] = _8;
  MEM[(int * &)b_2(D) + 8] = _7;
  _9 = MEM[(int * &)a_3(D) + 16];
  _10 = MEM[(int * &)b_2(D) + 16];
  MEM[(int * &)a_3(D) + 16] = _10;
  MEM[(int * &)b_2(D) + 16] = _9;
  if (_5 != 0B)

with the patch, we go down to

  _3 = MEM[(const struct _Vector_impl_data &)a_4(D)]._M_start;
  _5 = MEM[(const struct _Vector_impl_data &)b_2(D)]._M_start;
  MEM[(struct _Vector_impl_data *)a_4(D)]._M_start = _5;
  _6 = MEM[(const struct _Vector_impl_data &)b_2(D)]._M_finish;
  MEM[(struct _Vector_impl_data *)a_4(D)]._M_finish = _6;
  _7 = MEM[(const struct _Vector_impl_data &)b_2(D)]._M_end_of_storage;
  MEM[(struct _Vector_impl_data *)a_4(D)]._M_end_of_storage = _7;
  MEM[(struct _Vector_impl_data *)b_2(D)]._M_start = 0B;
  MEM[(struct _Vector_impl_data *)b_2(D)]._M_finish = 0B;
  MEM[(struct _Vector_impl_data *)b_2(D)]._M_end_of_storage = 0B;
  if (_3 != 0B)

removing 2 reads and 3 writes. At -O3 we also get more vectorization.

The main idea is to give the compiler more type information. Currently, 
the only type information the compiler cares about is the type used for a 
memory read/write. Using std::swap as before, the reads/writes are done on 
int&. Doing it directly, they are done on _Vector_impl_data::_M_start, a 
more precise information. Maybe some day the compiler will get stricter 
and be able to deduce the same information, but not yet.


The second point is to rotate { new, old, tmp } in an order that's simpler 
for the compiler. I was going to remove the swaps and use _M_copy_data 
directly, but since doing the swaps in a different order works...


_M_copy_data is not really needed, we could add a defaulted assignment 
operator, or remove the move constructor (and call a new _M_clear() from 
the 2 users), etc. However, it seemed the least intrusive, the least 
likely to have weird consequences.


I didn't add a testcase because I don't see any dg-final scan-tree-dump in 
the libstdc++ testsuite. The closest would be g++.dg/pr83239.C, 
g++.dg/vect/pr64410.cc, g++.dg/vect/pr33426-ivdep-4.cc that include 
, but from previous experience (already with vector), adding 
libstdc++ testcases to the C++ testsuite is not very popular.


Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2018-07-25  Marc Glisse  

* include/bits/stl_vector.h (_Vector_impl_data::_M_copy_data): New.
(_Vector_impl_data::_M_swap_data): Use _M_copy_data.
(vector::_M_move_assign): Reorder the swaps.

--
Marc GlisseIndex: libstdc++-v3/include/bits/stl_vector.h
===
--- libstdc++-v3/include/bits/stl_vector.h	(revision 262948)
+++ libstdc++-v3/include/bits/stl_vector.h	(working copy)
@@ -96,25 +96,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	{ }
 
 #if __cplusplus >= 201103L
 	_Vector_impl_data(_Vector_impl_data&& __x) noexcept
 	: _M_start(__x._M_start), _M_finish(__x._M_finish),
 	  _M_end_of_storage(__x._M_end_of_storage)
 	{ __x._M_start = __x._M_finish = __x._M_end_of_storage = pointer(); }
 #endif
 
 	void
+	_M_copy_data(_Vector_impl_data const& __x) _GLIBCXX_NOEXCEPT
+	{
+	  _M_start = __x._M_start;
+	  _M_finish = __x._M_finish;
+	  _M_end_of_storage = __x._M_end_of_storage;
+	}
+
+	void
 	_M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT
 	{
-	  std::swap(_M_start, __x._M_start);
-	  std::swap(_M_finish, __x._M_finish);
-	  std::swap(_M_end_of_storage, __x._M_end_of_storage);
+	  // Do not use std::swap(_M_start, __x._M_start), etc as it looses
+	  // information used by TBAA.
+	  _Vector_impl_data __tmp;
+	  __tmp._M_copy_data(*this);
+	  _M_copy_data(__x);
+	  __x._M_copy_data(__tmp);
 	}
   };
 
   struct _Vector_impl
 	: public _Tp_alloc_type, public _Vector_impl_data
   {
 	_Vector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Tp_alloc_type()) )
 	: _Tp_alloc_type()
 	{ }
 
@@ -1724,22 +1735,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
 #if __cplusplus >= 201103L
 private:
   // Constant-time move assignment when source object's memory can be
   // moved, either because the source's allocator will move too
   // or because the allocators are equal.
   void
   _M_move_assign(vector&& __x, true

Re: optimize std::vector move assignment

2018-07-25 Thread Marc Glisse


On Wed, 25 Jul 2018, Jonathan Wakely wrote:

_M_copy_data is not really needed, we could add a defaulted assignment 
operator, or remove the move constructor (and call a new _M_clear() from 
the 2 users), etc. However, it seemed the least intrusive, the least likely 
to have weird consequences.


Yes, the alternative would be:

--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -100,14 +100,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  : _M_start(__x._M_start), _M_finish(__x._M_finish),
_M_end_of_storage(__x._M_end_of_storage)
  { __x._M_start = __x._M_finish = __x._M_end_of_storage = pointer(); }
+
+   _Vector_impl_data(const _Vector_impl_data&) = default;
+   _Vector_impl_data& oeprator=(const _Vector_impl_data&) = default;
#endif

  void
  _M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT
  {
- std::swap(_M_start, __x._M_start);
- std::swap(_M_finish, __x._M_finish);
- std::swap(_M_end_of_storage, __x._M_end_of_storage);
+ _Vector_impl_data __tmp = *this;
+ *this = __x;
+ __x = __tmp;


Or just std::swap(*this, __x).


  }
 };

Your _M_copy_data seems fine. It avoids unintentional copies, because
the copy constructor and copy assignment operator remain deleted (at
least in C++11).

I didn't add a testcase because I don't see any dg-final scan-tree-dump in 
the libstdc++ testsuite. The closest would be g++.dg/pr83239.C, 
g++.dg/vect/pr64410.cc, g++.dg/vect/pr33426-ivdep-4.cc that include 
, but from previous experience (already with vector), adding 
libstdc++ testcases to the C++ testsuite is not very popular.


Yes, C++ tests using std::vector are sometimes a bit fragile.

I don't see any reason we can't use scan-tree-dump in the libstdc++
testsuite, if you wanted to add one. We do have other dg-final tests.


The others only test for the presence of some name in assembly. But I may 
try it later.


--
Marc Glisse

Re: Share ebo helper throughout lib

2018-07-25 Thread Marc Glisse


On Wed, 25 Jul 2018, François Dumont wrote:

    It has already been noticed that there are 2 ebo helpers in the lib. Here 
is a patch to use 1.



    * include/bits/ebo_helper.h: New.
    * include/Makefile.am: Add latter.
    * include/Makefile.in: Regenerate.
    * include/bits/hashtable_policy.h: Adapt.
    * include/bits/shared_ptr_base.h: Adapt.

Tested under linux x86_64.

Ok to commit ?


I don't think we support [[no_unique_address]] yet, but assuming we soon 
will and we enable it also for C++03 (at least with the __attribute__ 
syntax and/or in system headers), do you know if some similar helper will 
still be necessary, with a simpler implementation, or if the attribute 
will magically get rid of it?


(I haven't looked at it at all, the answer may be obvious)

--
Marc Glisse

Re: [PATCH] Add malloc predictor (PR middle-end/83023).

2018-07-26 Thread Marc Glisse


On Thu, 26 Jul 2018, Martin Liška wrote:


Following patch implements new predictors that annotates malloc-like functions.
These almost every time return a non-null value.


Out of curiosity (the __builtin_expect there doesn't hurt and we don't 
need to remove it), does it make __builtin_expect unnecessary in the 
implementation of operator new (libstdc++-v3/libsupc++/new_op.cc)? It 
looks like


  while (__builtin_expect ((p = malloc (sz)) == 0, false))
{
  new_handler handler = std::get_new_handler ();
  if (! handler)
_GLIBCXX_THROW_OR_ABORT(bad_alloc());
  handler ();
}

where being in a loop may trigger opposite heuristics.

--
Marc Glisse

Re: [PATCH] Add malloc predictor (PR middle-end/83023).

2018-07-27 Thread Marc Glisse


On Fri, 27 Jul 2018, Martin Liška wrote:


So answer is yes, the builtin can be then removed.


Good, thanks. While looking at how widely it is going to apply, I noticed 
that the default, throwing operator new has attribute malloc and 
everything, but the non-throwing variant declared in  doesn't, so it 
won't benefit from the new predictor. I don't know if there is a good 
reason for this disparity...


--
Marc Glisse

Re: Fold pointer range checks with equal spans

2018-07-31 Thread Marc Glisse


On Tue, 31 Jul 2018, Richard Biener wrote:


Also, when @2 == @0 + (@1+1) then the original condition is true but
((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2) is not?
   (sizetype) @0 - (sizetype) (@0 + @1 + 1) + @1 > @1 * 2
-> -1 > @1 * 2

which is false.  So I can't really see how you can apply this transform in
general (it would be fine for generating alias checks but a bit more
pessimistic).

But maybe I am missing something?


It relies on sizetype being unsigned: (sizetype)-1 > @1 * 2 is true.


Hmm, so mathematically this is

 (@0 - @2) % modreduce + @1 > @1 * 2

then, but I don't want to think too much about this since Marc didn't
object here ;)


We already transform abs(x)<=3 into (unsigned)x+3u<=6u, that's the usual
way we do range checking so I didn't pay much attention to that part.
(tempted to say: "I didn't want to think too much about this since
Richard was going to do it anyway" ;-)

Turning multiple comparisons of the form P + cst CMP Q + cst into a
range check on P - Q sounds good (we don't really have to restrict to
the case where the range is symmetric). Actually, just turning P + cst
CMP Q + cst into P - Q CMP cst should do it, we should already have code
to handle range checking on integers (modulo the issue of CSE P-Q and
Q-P). But I don't know if a couple :s is sufficient to make this
transformation a good canonicalization.

If we start from a comparison of pointer_plus, I think it would make
sense to use pointer_diff.

I believe currently we try to use pointer operations (pointer_plus,
pointer_diff, lt) only for related pointers and we cast to some integer
type for wilder cases (implementation of std::less in C++ for instance).
On the other hand, in an alias check, the 2 pointers are possibly
unrelated, so maybe the code emitted for an alias check should be
changed.

--
Marc Glisse

Re: [PATCH] Remove unnecessary "& 1" in year_month_day_last::day()

2023-11-05 Thread Marc Glisse


On Sun, 5 Nov 2023, Cassio Neri wrote:


When year_month_day_last::day() was implemented, Dr. Matthias Kretz realised
that the operation "& 1" wasn't necessary but we did not patch it at that
time. This patch removes the unnecessary operation.


Is there an entry in gcc's bugzilla about having the optimizer handle this 
kind of optimization?


unsigned f(unsigned x){
  if(x>=32)__builtin_unreachable();
  return 30|(x&1); // --> 30|x
}

(that optimization would come in addition to your patch, doing the 
optimization by hand is still a good idea)


It looks like the criterion would be a|(b&c) when the possible 1 bits of b 
are included in the certainly 1 bits of a|c.


--
Marc Glisse

Re: [PATCH] match.pd: Use single_use for (T)(A) + CST -> (T)(A + CST) [PR95798]

2021-02-24 Thread Marc Glisse


On Wed, 24 Feb 2021, Jakub Jelinek via Gcc-patches wrote:


The following patch adds single_use case which restores these testcases
but keeps the testcases the patch meant to improve as is.


Hello,

I wonder if :s would be sufficient here? I don't have an opinion on which 
one is better for this particular transformation (don't change the patch 
because of my comment), we just seem to be getting more and more uses of 
single_use in match.pd, maybe at some point we need to revisit the meaning 
of :s or introduce a stronger :S.


--
Marc Glisse

Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

2021-08-06 Thread Marc Glisse


On Fri, 6 Aug 2021, Victor Tong wrote:


Thanks for the feedback. Here's the updated pattern:

 /* X - (X - Y) --> Y */
 (simplify
   (minus (convert1? @0) (convert2? (minus@2 (convert3? @@0) @1)))
   (if (ANY_INTEGRAL_TYPE_P (type)
   && TYPE_OVERFLOW_UNDEFINED(type)
   && !TYPE_OVERFLOW_SANITIZED(type)
   && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@2))
   && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@2))
   && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@2))
   && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@0))
   && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0))
   && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))
   && TYPE_PRECISION (TREE_TYPE (@2)) <= TYPE_PRECISION (type)
   && TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type))
   (convert @1)))

I kept the TYPE_OVERFLOW_SANITIZED checks because I saw other patterns that 
leverage undefined overflows check for it. I think this new pattern shouldn't 
be applied if overflow sanitizer checks are enabled.


why is this testing TREE_TYPE (@0)?


I'm checking the type of @0 because I'm concerned that there could be a case 
where @0's type isn't an integer type with undefined overflow. I tried creating 
a test case and couldn't seem to create one where this is violated but I kept 
the checks to avoid causing a regression. If I'm being overcautious and you 
feel that the type checks on @0 aren't needed, I can remove them. I think the 
precision check on TREE_TYPE(@0) is needed to avoid truncation cases though.


It doesn't matter if @0 has undefined overflow, but it can matter that it 
be signed (yes, the 2 are correlated...) if it has the same precision as 
@2. Otherwise (int64_t)(-1u)-(int64_t)((int)(-1u)-0) is definitely not 0 
and it has type:int64_t, @2:int, @0:unsigned.


Ignoring the sanitizer, the confusing double matching of constants, and 
restricting to scalars, I think the tightest condition (without vrp) that 
works is


TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2)) ||
 TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@2)) &&
  (TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (TREE_TYPE (@2)) ||
   TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@2)) &&
   !TYPE_UNSIGNED (TREE_TYPE (@0))
  )

(where implicitly undefined => signed) but of course it is ok to handle
only a subset. It is too late for me to think about @0 vs @@0.

The patch you posted seems to accept the wrong 
(int128t)LLONG_MAX-(int128_t)((int)LLONG_MAX-0) -> (int128_t)0


Using ANY_INTEGRAL_TYPE_P allows vectors, and TYPE_PRECISION mustn't be 
used for vectors (maybe we should add more checking to catch such uses), 
and they may not be very happy with convert either...



Once we'd "inline" nop_convert genmatch would complain about this.


Maybe the new transform could be about scalars, and we could restrict the 
old one to vectors, to simplify the code, but that wouldn't help genmatch 
with the fact that the pattern x-(x-y) would appear twice. Is it really 
that bad? It is suspicious, but can be justified.



Is someone working on inlining nop_convert? I'd like to avoid breaking someone 
else's work if that's being worked on right now.

I've also added some extra tests to cover this new pattern. I've attached a 
patch with my latest changes.


From: Richard Biener 
Sent: Wednesday, July 28, 2021 2:59 AM
To: Victor Tong 
Cc: Marc Glisse ; gcc-patches@gcc.gnu.org 

Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176] 
 

On Tue, Jun 29, 2021 at 1:10 AM Victor Tong  wrote:


Thanks Richard and Marc.

I wrote the following test case to compare the outputs of fn1() and fn1NoOpt() 
below with my extra pattern being applied. I tested the two functions with all 
of the integers from INT_MIN to INT_MAX.

long
fn1 (int x)
{
   return 42L - (long)(42 - x);
}

#pragma GCC push_options
#pragma GCC optimize ("O0")
long
fn1NoOpt (int x)
{
   volatile int y = (42 - x);
   return 42L - (long)y;
}
#pragma GCC pop_options

int main ()
{
 for (long i=INT_MIN; i<=INT_MAX;i++)
 {
 auto valNoOpt = fn1NoOpt(i);
 auto valOpt = fn1(i);
 if (valNoOpt != valOpt)
 printf("valOpt=%ld, valNoOpt=%ld\n", valOpt, valNoOpt);
 }
 return 0;
}

I saw that the return values of fn1() and fn1NoOpt() differed when the input 
was between INT_MIN and INT_MIN+42 inclusive. When passing values in this range 
to fn1NoOpt(), a signed overflow is triggered which causes the value to differ 
(undefined behavior). This seems to go in line with what Marc described and I 
think the transformation is correct in the scenario above. I do think that type 
casts that result in truncation (i.e. from a higher precision to a lower one) 
or wit

Re: [PATCH] Implement constant-folding simplifications of reductions.

2022-02-21 Thread Marc Glisse


On Mon, 21 Feb 2022, Roger Sayle wrote:


+/* Fold REDUC (@0 op VECTOR_CST) as REDUC (@0) op REDUC (VECTOR_CST).  */
+(for reduc (IFN_REDUC_PLUS IFN_REDUC_MAX IFN_REDUC_MIN IFN_REDUC_FMAX
+IFN_REDUC_FMIN IFN_REDUC_AND IFN_REDUC_IOR IFN_REDUC_XOR)
+ op (plus max min IFN_FMAX IFN_FMIN bit_and bit_ior bit_xor)
+  (simplify (reduc (op @0 VECTOR_CST@1))
+(op (reduc:type @0) (reduc:type @1


I wonder if we need to test flag_associative_math for the 'plus' case,
or if the presence of IFN_REDUC_PLUS is enough to justify the possible
loss of precision.

--
Marc Glisse

Re: [PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization

2021-12-04 Thread Marc Glisse


+/* (a & b) ^ (a == b) -> !(a | b) */
+/* (a & b) == (a ^ b) -> !(a | b) */
+(for first_op (bit_xor eq)
+ second_op (eq bit_xor)
+ (simplify
+  (first_op:c (bit_and:c @0 @1) (second_op:c @0 @1))
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+&& types_match (TREE_TYPE (@0), TREE_TYPE (@1)))
+(convert (bit_not (bit_ior @0 @1))

I don't think you need types_match, if both are operands of bit_and, their 
types must already match.


It isn't clear what the INTEGRAL_TYPE_P test is for. Your 2 
transformations don't seem that similar to me. The first one requires that 
a and b have the same type as the result of ==, so they are boolean-like. 
The second one makes sense for more general integers, but then it looks 
like it should produce (a|b)==0.


It doesn't look like we have a canonical representation between a^b and 
a!=b for booleans :-(


(sorry for the broken thread, I was automatically unsubscribed because 
mailman doesn't like greylisting)


--
Marc Glisse

Re: [Patch] Inline optimization for tanh(x)/sinh(x) -> 1.0/cosh(x)

2020-01-31 Thread Marc Glisse


On Thu, 30 Jan 2020, Vitor Guidi wrote:


+ /* Simplify tanh (x) / sinh (x) -> 1.0 / cosh (x). */
+ (simplify
+   (rdiv (TANH @0) (SINH @0))
+   (rdiv {build_one_cst (type);} (COSH @0)))


The existing

 (simplify
  (rdiv (SINH:s @0) (COSH:s @0))
   (TANH @0))

has :s (which AFAIK are ignored because the output is a single insn) but 
not this new one, where it would not be ignored. That's not very 
consistent.


--
Marc Glisse

Re: [RFC/PATCH] C++ constexpr vs. floating point exceptions.

2021-09-21 Thread Marc Glisse


On Tue, 21 Sep 2021, Jason Merrill via Gcc-patches wrote:


On Tue, Sep 21, 2021 at 4:30 PM Joseph Myers 
wrote:


On Tue, 21 Sep 2021, Jakub Jelinek via Gcc-patches wrote:


On Tue, Sep 21, 2021 at 02:15:59PM +0100, Roger Sayle wrote:

Can you double check?  Integer division by zero is undefined, but

isn't floating point

division by zero defined by the appropriate IEEE standards?


https://eel.is/c++draft/expr.mul#4 doesn't make the division by zero
behavior conditional on integral types.
C has similar wording.


The position for C is that Annex F semantics take precedence over all the
ways in which floating-point arithmetic has undefined behavior in the main
body of the standard.  So floating-point overflow and division by zero are
fully defined in the presence of Annex F support, while out-of-range
conversions from floating point to integer types produce an unspecified
value (not necessarily the same unspecified value for different executions
of the conversion in the abstract machine - as discussed in bug 93806, GCC
can get that wrong and act as if a single execution of such a conversion
in the abstract machine produces more than one result).

In C, as specified in Annex F, initializers for floating-point objects
with static or thread storage duration are evaluated with exceptions
discarded and the default rounding mode in effect; 7.0 / 0.0 is a fully
valid initializer for such an object to initialize it to positive
infinity.  As I understand it, the question for this thread is whether C++
constexpr should have a similar rule to C static initializers (which ought
to apply to 1.0 / 3.0, raising inexact, just as much as to 7.0 / 0.0).



The C rules seem to be

F.8.2 Translation
During translation the IEC 60559 default modes are in effect:
— The rounding direction mode is rounding to nearest.
— The rounding precision mode (if supported) is set so that results are
not shortened.
— Trapping or stopping (if supported) is disabled on all floating-point
exceptions.
Recommended practice:
The implementation should produce a diagnostic message for each
translation-time floating-point exception, other than “inexact”; the
implementation should then proceed with the translation of the program.

I think following the same rules for C++ would be appropriate in a


I agree that looking at the C standard is more interesting, C++ is very 
bad at specifying anything float related.



diagnosing context: warn and continue.  In a template argument deduction
(SFINAE) context, where errors become silent substitution failures, it's
probably better to treat them as non-constant.


I am trying to imagine a sfinae example affected by whether 1./0. is 
constant. Does that mean A<0.,__builtin_inf()> would fail to use the 
specialization in


templatestruct A{};
templatestruct A{};

? I don't like that, I believe it should use the specialization. With 
ieee754, 1./0. is perfectly well defined as +inf, the only question is 
whether it should also set a flag at runtime, which is not relevant in a 
manifestly consteval context (fetestexcept, etc are not constexpr, that 
should be enough to catch mistakes). If some user wants to forbid 
FE_DIVBYZERO, FE_INEXACT, FE_INVALID, FE_OVERFLOW or FE_UNDERFLOW in 
compile-time operations, that looks like it could be part of a separate 
compiler flag or pragma, like C's FENV_ROUND can affect the rounding mode 
in static initializers (of course C++ templates make pragmas less 
convenient).


--
Marc Glisse

Re: [PATCH] Some further match.pd optimizations with nop_convert (PR tree-optimization/92734)

2019-12-05 Thread Marc Glisse


On Wed, 4 Dec 2019, Jakub Jelinek wrote:


--- gcc/match.pd.jj 2019-12-03 10:20:00.244913801 +0100
+++ gcc/match.pd2019-12-03 13:36:01.084435697 +0100
@@ -2159,20 +2159,26 @@ (define_operator_list COND_TERNARY
  /* A - (A +- B)   -> -+ B */
  /* A +- (B -+ A)  ->  +- B */
  (simplify
-(minus (plus:c @0 @1) @0)
-@1)
+   (minus (nop_convert (plus:c (nop_convert @0) @1)) @0)
+   (view_convert @1))


I know I introduced nop_convert, and it can be convenient, but IIRC it
also has some limitations.

int f(int b,unsigned c){
  int a=c;
  int d=a+b;
  return d-a;
}
int g(int a, int b){
  int d=(unsigned)a+b;
  return d-a;
}
int h(int b,int a){
  int d=a+b;
  return d-a;
}

While g and h are properly optimized during forwprop1, f isn't, because 
nop_convert sees that 'a' comes from a conversion, and only returns d 
(unlike 'convert?' which would try both a and d).


When inside nop_convert we have an operation, say (nop_convert (plus
...)), there is no ambiguity, so we are fine. With just (nop_convert @0), 
less so.


It happens that during EVRP, for some reason (different valuization 
function?), nop_convert does not match the conversion, and we do optimize 
f. But that almost looks like an accident.


convert? with explicit checks would probably work better for the inner 
conversion, except that handling the vector view_convert case may become 
painful. I didn't think too hard about possible fancy tricks like a second 
nop_convert:


(minus (nop_convert (plus:c (nop_convert @0) @1)) (nop_convert @0))

--
Marc Glisse

Re: [PATCH] Some further match.pd optimizations with nop_convert (PR tree-optimization/92734)

2019-12-06 Thread Marc Glisse


On Fri, 6 Dec 2019, Richard Biener wrote:


nop_convert sees that 'a' comes from a conversion, and only returns d
(unlike 'convert?' which would try both a and d).


Maybe I should have formulated it as: nop_convert works kind of like a 
strip_nops.



If use gets more and more we can make nop_convert a first class citizen and 
allow a? Variant.


One reason I did not specifically push for that is that nop_convert is 
seldom the right condition. It is convenient because it is usually easy 
enough to check that it is correct, but in most cases one of narrowing / 
zero-extension / sign-extension also works. Still, it is better to handle 
just NOPs than no conversion at all, so I guess making that easy is still 
good.



Like the attached (need to adjust docs & rename some functions still)
which generalizes
[digit]? parsing.  This allows you to write (nop_convert? ...)


I guess once this is in, we should replace all (most?) 'nop_convert' with 
'nop_convert?' (and possibly a digit in some places) and remove the last 
alternative in the definition of nop_convert.


Although that will increase the code size. In case, we could still have 
both a nop_convert and a strip_nop predicates. Just thinking:


(match (nop_convert @0)
 (convert @0)
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)
(match (nop_convert @0)
 (view_convert @0)
 (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0))
  && known_eq (TYPE_VECTOR_SUBPARTS (type),
   TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
  && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))

(match (strip_nops @0)
 (nop_convert? @0))

but that relies on the fact that when there is an optional operation, the 
machinery first tries with the operation, and then without, the order 
matters. Maybe being explicit on the priorities is safer


(match (strip_nops @0)
 (nop_convert @0))
(match (strip_nops @0)
 @0)


It works (generated files are unchanged), so I guess I'll push it
after polishing.

It also extends the 1/2/3 grouping to be able to group like (plus
(nop_convert2? @0) (convert2? @1))
so either both will be present or none (meaning that when grouping you
can choose the "cheaper"
test for one in case you know the conversions will be the same).


Nice.

--
Marc Glisse

Re: [PATCH] Some further match.pd optimizations with nop_convert (PR tree-optimization/92734)

2019-12-06 Thread Marc Glisse


On Fri, 6 Dec 2019, Richard Biener wrote:


Although that will increase the code size. In case, we could still have both a
nop_convert and a strip_nop predicates. Just thinking:


We should measure it, yes, I hope it won't be too bad ;)  In theory
making genmatch emitted code more like a graph rather than a tree
(with shared leafs) might save us a bit here.


Indeed, it isn't worth hacking this specific case. If we really want those 
savings, making it automatic at a higher level is the right way to go.



On Fri, 6 Dec 2019, Richard Biener wrote:


@@ -1684,7 +1681,7 @@ (define_operator_list COND_TERNARY
/* For equality, this is also true with wrapping overflow.  */
(for op (eq ne)
 (simplify
-  (op:c (nop_convert@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1))
+  (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1))
  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
   && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))
   || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))


Possible future clean-up: those convert1 and convert2 have (strange) 
associated tree_nop_conversion_p below and look like candidates to become 
nop_convert.



@@ -2159,25 +2156,25 @@ (define_operator_list COND_TERNARY
  /* A - (A +- B)   -> -+ B */
  /* A +- (B -+ A)  ->  +- B */
  (simplify
-   (minus (nop_convert (plus:c (nop_convert @0) @1)) @0)
+   (minus (nop_convert1? (plus:c (nop_convert2? @0) @1)) @0)
   (view_convert @1))


I was wondering if we could use nop_convert for both (instead of 
nop_convert1 and 2), but for constants that wouldn't work, so this looks 
good.


--
Marc Glisse

Re: [PATCH] Canonicalize fancy ways of expressing blend operation into COND_EXPR (PR tree-optimization/92834)

2019-12-09 Thread Marc Glisse


On Fri, 6 Dec 2019, Jakub Jelinek wrote:


--- gcc/match.pd.jj 2019-12-06 14:07:26.877749065 +0100
+++ gcc/match.pd2019-12-06 15:06:08.042953309 +0100
@@ -2697,6 +2697,31 @@ (define_operator_list COND_TERNARY
  (cmp (minmax @0 INTEGER_CST@1) INTEGER_CST@2)
  (comb (cmp @0 @2) (cmp @1 @2

+/* Undo fancy way of writing max/min or other ?: expressions,
+   like a - ((a - b) & -(a < b)), in this case into (a < b) ? b : a.
+   People normally use ?: and that is what we actually try to optimize.  */
+(for cmp (simple_comparison)
+ (simplify
+  (minus @0 (bit_and:c (minus @0 @1)
+  (convert? (negate@4 (convert? (cmp@5 @2 @3))
+  (if (INTEGRAL_TYPE_P (type)
+   && INTEGRAL_TYPE_P (TREE_TYPE (@4))
+   && TREE_CODE (TREE_TYPE (@4)) != BOOLEAN_TYPE
+   && INTEGRAL_TYPE_P (TREE_TYPE (@5))
+   && (TYPE_PRECISION (TREE_TYPE (@4)) >= TYPE_PRECISION (type)
+  || !TYPE_UNSIGNED (TREE_TYPE (@4
+   (cond (cmp @2 @3) @1 @0)))


I was going to suggest
 (cond @5 @1 @0)

and possibly replacing (cmp@5 @2 @3) with truth_valued_p@5, before 
remembering that COND_EXPR embeds the comparison, and that not 
transforming when we don't see the comparison is likely on purpose. Plus, 
if @5 was in a signed 1-bit type, it may look more like -1 than 1 and 
break the transformation (is that forbidden as return type of a 
comparion?).


--
Marc Glisse

Re: [PATCH] Optimize x < 0 ? ~y : y to (x >> 31) ^ y in match.pd

2021-05-23 Thread Marc Glisse


On Sun, 23 May 2021, apinski--- via Gcc-patches wrote:


+(for cmp (ge lt)
+/* x < 0 ? ~y : y into (x >> (prec-1)) ^ y. */
+/* x >= 0 ? ~y : y into ~((x >> (prec-1)) ^ y). */
+ (simplify
+  (cond (cmp @0 integer_zerop) (bit_not @1) @1)
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+&& !TYPE_UNSIGNED (TREE_TYPE (@0))
+&& TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type))


Is there a risk that x is signed char (precision 8) and y is a vector with 
8 elements?


--
Marc Glisse

Re: [ARM] PR66791: Replace calls to builtin in vmul_n (a, b) intrinsics with a * b

2021-05-26 Thread Marc Glisse


On Wed, 26 May 2021, Prathamesh Kulkarni via Gcc-patches wrote:


The attached patch removes calls to builtins in vmul_n* (a, b) with __a * __b.


I am not familiar with neon, but are __a and __b unsigned here? Otherwise, 
is vmul_n already undefined in case of overflow?


--
Marc Glisse

Re: [PATCH] Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]

2021-06-01 Thread Marc Glisse


On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:


Hi:
 This patch is about to simplify (view_convert:type ~a) < 0 to
(view_convert:type a) >= 0 when type is signed integer. Similar for
(view_convert:type ~a) >= 0.
 Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
 Ok for the trunk?

gcc/ChangeLog:

   PR middle-end/100738
   * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
   (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
   simplification.


We already have

/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
(for cmp (simple_comparison)
 scmp (swapped_simple_comparison)
 (simplify
  (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
  (if (single_use (@2)
   && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
   (scmp @0 (bit_not @1)

Would it make sense to try and generalize it a bit, say with

(cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)

(scmp (view_convert:XXX @0) (bit_not @1))

(I still believe that it is a bad idea that SSA_NAMEs are strongly typed, 
encoding the type in operations would be more convenient, but I think the 
time for that choice has long gone)


--
Marc Glisse

Re: [PATCH] Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]

2021-06-04 Thread Marc Glisse


On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote:


On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse  wrote:


On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote:


Hi:
 This patch is about to simplify (view_convert:type ~a) < 0 to
(view_convert:type a) >= 0 when type is signed integer. Similar for
(view_convert:type ~a) >= 0.
 Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
 Ok for the trunk?

gcc/ChangeLog:

   PR middle-end/100738
   * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0,
   (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE
   simplification.


We already have

/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
(for cmp (simple_comparison)
  scmp (swapped_simple_comparison)
  (simplify
   (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1)
   (if (single_use (@2)
&& (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST))
(scmp @0 (bit_not @1)

Would it make sense to try and generalize it a bit, say with

(cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P)

(scmp (view_convert:XXX @0) (bit_not @1))


Thanks for your advice, it looks great.
And can I use *view_convert1?* instead of *nop_convert1?* here,
because the original case is view_convert, and nop_convert would fail
to simplify the case.


Near the top of match.pd, you can see

/* With nop_convert? combine convert? and view_convert? in one pattern
   plus conditionalize on tree_nop_conversion_p conversions.  */
(match (nop_convert @0)
 (convert @0)
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)
(match (nop_convert @0)
 (view_convert @0)
 (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0))
  && known_eq (TYPE_VECTOR_SUBPARTS (type),
   TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)))
  && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0))

So at least the intention is that it can handle both NOP_EXPR for scalars 
and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way 
in some places in match.pd, like


(simplify
 (negate (nop_convert? (bit_not @0)))
 (plus (view_convert @0) { build_each_one_cst (type); }))

(simplify
 (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1)
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (bit_not (bit_xor (view_convert @0) @1

(the 'if' seems redundant for this one)

 (simplify
  (negate (nop_convert? (negate @1)))
  (if (!TYPE_OVERFLOW_SANITIZED (type)
   && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1)))
   (view_convert @1)))

etc.


At some point this got some genmatch help, to handle '?' and numbers, so I 
don't remember all the details, but following these examples should work.


--
Marc Glisse

Re: [PATCH] Fold bswap32(x) != 0 to x != 0 (and related transforms)

2021-07-18 Thread Marc Glisse


On Sun, 18 Jul 2021, Roger Sayle wrote:


+(if (GIMPLE || !TREE_SIDE_EFFECTS (@0))


I don't think you need to worry about that, the general genmatch machinery 
is already supposed to take care of it. All the existing cases in match.pd 
are about cond_expr, where counting the occurrences of each @i is not 
reliable.


--
Marc Glisse

Re: [PATCH] Fold (X<

2021-07-26 Thread Marc Glisse


On Mon, 26 Jul 2021, Roger Sayle wrote:

The one aspect that's a little odd is that each transform is paired with 
a convert@1 variant, using the efficient match machinery to expose any 
zero extension to fold-const.c's tree_nonzero_bits functionality.


Copying the first transform for context

+(for op (bit_ior bit_xor)
+ (simplify
+  (op (mult:s@0 @1 INTEGER_CST@2)
+  (mult:s@3 @1 INTEGER_CST@4))
+  (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0)
+   (mult @1
+{ wide_int_to_tree (type, wi::to_wide (@2) + wi::to_wide (@4)); })))
+ (simplify
+  (op (mult:s@0 (convert@1 @2) INTEGER_CST@3)
+  (mult:s@4 (convert@1 @2) INTEGER_CST@5))
+  (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)
+   && (tree_nonzero_bits (@0) & tree_nonzero_bits (@4)) == 0)
+   (mult @1
+{ wide_int_to_tree (type, wi::to_wide (@3) + wi::to_wide (@5)); })))

Could you explain how the convert helps exactly?

--
Marc Glisse

Re: [PATCH] phiopt: Optimize (x <=> y) cmp z [PR94589]

2021-05-05 Thread Marc Glisse


On Tue, 4 May 2021, Jakub Jelinek via Gcc-patches wrote:


2) the pr94589-2.C testcase should be matching just 12 times each, but runs
into operator>=(strong_ordering, unspecified) being defined as
(_M_value&1)==_M_value
rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
but is not a single use but two uses.  I'll need to pattern match that case
specially.


Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0, 
that could be worth doing already in GIMPLE.


--
Marc Glisse

Re: [PATCH] phiopt: Optimize (x <=> y) cmp z [PR94589]

2021-05-06 Thread Marc Glisse


On Thu, 6 May 2021, Jakub Jelinek via Gcc-patches wrote:


Though, (x&1) == x is equivalent to both (x&~1)==0 and to x < 2U
and from the latter two it isn't obvious which one is better/more canonical.
On aarch64 I don't see differences in number of insns nor in their size:
 10:13001c00sxtbw0, w0
 14:721f781ftst w0, #0xfffe
 18:1a9f17e0csetw0, eq  // eq = none
 1c:d65f03c0ret
vs.
 20:12001c00and w0, w0, #0xff
 24:7100041fcmp w0, #0x1
 28:1a9f87e0csetw0, ls  // ls = plast
 2c:d65f03c0ret
On x86_64 same number of insns, but the comparison is shorter (note, the
spaceship result is a struct with signed char based enum):
 10:31 c0   xor%eax,%eax
 12:81 e7 fe 00 00 00   and$0xfe,%edi
 18:0f 94 c0sete   %al
 1b:c3  retq
 1c:0f 1f 40 00 nopl   0x0(%rax)
vs.
 20:31 c0   xor%eax,%eax
 22:40 80 ff 01 cmp$0x1,%dil
 26:0f 96 c0setbe  %al
 29:c3  retq
Generally, I'd think that the comparison should be better because it
will be most common in user code that way and VRP will be able to do the
best thing for it.


We can probably do it in 2 steps, first something like

(for cmp (eq ne)
 (simplify
  (cmp (bit_and:c @0 @1) @0)
  (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))

to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a 
mask 111...000 (I thought we already had a function to detect such masks, 
or the 000...111, but I can't find them anymore).


I agree that the comparison seems preferable, although if X is signed, the 
way GIMPLE represents types will add an inconvenient cast. And I think VRP 
already manages to use the bit test to derive a range.


--
Marc Glisse

Re: [PATCH] match.pd: Optimize (x & y) == x into (x & ~y) == 0 [PR94589]

2021-05-11 Thread Marc Glisse


On Tue, 11 May 2021, Jakub Jelinek via Gcc-patches wrote:


On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:

We can probably do it in 2 steps, first something like

(for cmp (eq ne)
 (simplify
  (cmp (bit_and:c @0 @1) @0)
  (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))

to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a
mask 111...000 (I thought we already had a function to detect such masks, or
the 000...111, but I can't find them anymore).


Ok, here is the first step then.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Or should it have cmp:c too given that == and != are commutative too?


Ah, yes, you are right, good point on the cmp:c, thank you.


2021-05-11  Jakub Jelinek  
    Marc Glisse  

PR tree-optimization/94589
* match.pd ((X & Y) == Y -> (X & ~Y) == 0,

   ^

X?


(X | Y) == Y -> (X & ~Y) == 0): New GIMPLE simplifications.

* gcc.dg/tree-ssa/pr94589-1.c: New test.

--- gcc/match.pd.jj 2021-04-27 14:46:56.583716888 +0200
+++ gcc/match.pd2021-05-10 22:31:49.726870421 +0200
@@ -4764,6 +4764,18 @@ (define_operator_list COND_TERNARY
  (cmp:c (bit_xor:c @0 @1) @0)
  (cmp @1 { build_zero_cst (TREE_TYPE (@1)); }))

+#if GIMPLE
+ /* (X & Y) == X becomes (X & ~Y) == 0.  */
+ (simplify
+  (cmp (bit_and:c @0 @1) @0)
+  (cmp (bit_and @0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); }))
+
+ /* (X | Y) == Y becomes (X & ~Y) == 0.  */
+ (simplify
+  (cmp (bit_ior:c @0 @1) @1)
+  (cmp (bit_and @0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); }))
+#endif
+
 /* (X ^ C1) op C2 can be rewritten as X op (C1 ^ C2).  */
 (simplify
  (cmp (convert?@3 (bit_xor @0 INTEGER_CST@1)) INTEGER_CST@2)
--- gcc/testsuite/gcc.dg/tree-ssa/pr94589-1.c.jj2021-05-10 
22:36:10.574130179 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr94589-1.c   2021-05-10 22:36:17.789054362 
+0200
@@ -0,0 +1,21 @@
+/* PR tree-optimization/94589 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int x)
+{
+  return (x & 23) == x;
+/* { dg-final { scan-tree-dump " & -24;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " & 23;" "optimized" } } */
+/* { dg-final { scan-tree-dump " == 0" "optimized" } } */
+}
+
+int
+bar (int x)
+{
+  return (x | 137) != 137;
+/* { dg-final { scan-tree-dump " & -138;" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \\| 137;" "optimized" } } */
+/* { dg-final { scan-tree-dump " != 0" "optimized" } } */
+}


Jakub


--
Marc Glisse

Re: [PATCH] phiopt: Optimize (x <=> y) cmp z [PR94589]

2021-05-15 Thread Marc Glisse


On Fri, 14 May 2021, Jakub Jelinek via Gcc-patches wrote:


On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:

We can probably do it in 2 steps, first something like

(for cmp (eq ne)
 (simplify
  (cmp (bit_and:c @0 @1) @0)
  (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))

to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a
mask 111...000 (I thought we already had a function to detect such masks, or
the 000...111, but I can't find them anymore).

I agree that the comparison seems preferable, although if X is signed, the
way GIMPLE represents types will add an inconvenient cast. And I think VRP
already manages to use the bit test to derive a range.


I've tried the second step, but it unfortunately regresses
+FAIL: gcc.dg/ipa/propbits-2.c scan-tree-dump-not optimized "fail_test"
+FAIL: gcc.dg/tree-ssa/loop-42.c scan-tree-dump-not ivcanon "under assumptions "
so maybe it is better to keep these cases as the users wrote them.


Thank you for trying it, the patch looks good (it would probably also work 
on GENERIC), but indeed it is just a canonicalization, not necessarily 
worth breaking other stuff. This seems to point to another missed 
optimization. Looking at propbits-2.c, if I rewrite the condition in f1 as


  if ((unsigned)x <= 0xf)

I see later in VRP

  # RANGE [0, 4294967295] NONZERO 15
  x.2_1 = (unsigned intD.9) x_4(D);
  if (x.2_1 <= 15)

where we fail to derive a range from the nonzero bits. Maybe VRP expects 
IPA-CP should have done it already and doesn't bother recomputing it.


I understand this may not be a priority though, that's fine.

I didn't look at loop-42.c.

--
Marc Glisse

Re: libstdc++: Turn memmove to memcpy in vector reallocations

2023-11-21 Thread Marc Glisse


On Tue, 21 Nov 2023, Jonathan Wakely wrote:


CC Marc Glisse who added the relocation support. He might recall why
we use memmove when all uses are for newly-allocated storage, which
cannot overlap the existing storage.


Going back a bit:

https://gcc.gnu.org/pipermail/gcc-patches/2019-April/520658.html

"I think the call to memmove in __relocate_a_1 could probably be
memcpy (I don't remember why I chose memmove)"

Going back a bit further:

https://gcc.gnu.org/pipermail/gcc-patches/2018-September/505800.html

"I had to add a special case for trivial types, using memmove, to avoid
perf regressions, since relocation takes precedence over the old path that
is specialized to call memmove."

So the reason seems to be because vector already used memmove before my 
patch. You can dig further if you want to check why that is ;-)


--
Marc Glisse

Re: libstdc++: Speed up push_back

2023-11-24 Thread Marc Glisse


On Thu, 23 Nov 2023, Jonathan Wakely wrote:


That's why we need -fsane-operator-new


Although the standard forbids it, some of us just provide an inline 
implementation


inline void* operator new(std::size_t n){return malloc(n);}
inline void operator delete(void*p)noexcept{free(p);}
inline void operator delete(void*p,std::size_t)noexcept{free(p);}

(I could certainly add a check to abort if malloc returns 0 or other 
details, depending on what the application calls for)


It used to enable a number of optimizations, for instance in gcc-9

auto f(){ return std::vector(4096); }

was optimized to just one call to calloc (someone broke that in gcc-10).

Using LTO on libsupc++ is related.

I don't know if we want to define "sane" operators new/delete, or just 
have a flag that promises that we won't try to replace the default ones.


--
Marc Glisse

Re: [PATCH 3/3] Match: Add pattern for `(a ? b : 0) | (a ? 0 : c)` into `a ? b : c` [PR103660]

2024-08-26 Thread Marc Glisse


--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2339,6 +2339,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (type))
   (bit_and @0 @1)))

+/* Fold `(a ? b : 0) | (a ? 0 : c)` into (a ? b : c).
+Handle also ^ and + in replacement of `|`. */
+(for cnd (cond vec_cond)
+ (for op (bit_ior bit_xor plus)
+  (simplify
+   (op:c
+(cnd:s @0 @00 integer_zerop)
+(cnd:s @0 integer_zerop @01))
+   (cnd @0 @00 @01


Wouldn't it fall into something more generic like

(for cnd (cond vec_cond)
 (for op (any_binary)
  (simplify
   (op
(cnd:s @0 @1 @2)
(cnd:s @0 @3 @4))
   (cnd @0 (op! @1 @3) (op! @2 @4)

?

The example given in the doc for the use of '!' is pretty close

@smallexample
(simplify
  (plus (vec_cond:s @@0 @@1 @@2) @@3)
  (vec_cond @@0 (plus! @@1 @@3) (plus! @@2 @@3)))
@end smallexample

--
Marc Glisse

Re: [PATCH v2] MATCH: Add simplification for MAX and MIN to match.pd [PR109878]

2024-07-23 Thread Marc Glisse

A few ideas in case you want to generalize the transformation (these are 
not requirements to get your patch in, and this is not a review):


On Fri, 19 Jul 2024, Eikansh Gupta wrote:


--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4321,6 +4321,32 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
@0
@2)))

+/* min (a & CST0, a & CST1) -> a & CST0 IFF CST0 & CST1 == CST0 */
+/* max (a & CST0, a & CST1) -> a & CST0 IFF CST0 & CST1 == CST1 */
+/* If signed a, then both the constants should have same sign. */
+(for minmax (min max)
+ (simplify
+  (minmax (bit_and@3 @0 INTEGER_CST@1) (bit_and@4 @0 INTEGER_CST@2))
+   (if (TYPE_UNSIGNED (type)
+|| (tree_int_cst_sgn (@1) == tree_int_cst_sgn (@2)))
+(with { auto andvalue = wi::to_wide (@1) & wi::to_wide (@2); }
+ (if (andvalue == ((minmax == MIN_EXPR)
+   ? wi::to_wide (@1) : wi::to_wide (@2)))
+  @3
+  (if (andvalue == ((minmax != MIN_EXPR)
+? wi::to_wide (@1) : wi::to_wide (@2)))
+   @4))


Since max(a&1,a&3) is a&3, I think in the signed case we could also 
replace max(a&N,a&3) with a&3 if N is 1 | sign-bit (i.e. -1u/2+2). Indeed, 
either a>=0 and a&N is a&1, or a<0 and a&N < 0 <= a&3.



+/* min (a, a & CST) --> a & CST */
+/* max (a, a & CST) --> a */
+(for minmax (min max)
+ (simplify
+  (minmax @0 (bit_and@1 @0 INTEGER_CST@2))


Why do you require that @2 be a constant?


+   (if (TYPE_UNSIGNED(type))
+(if (minmax == MIN_EXPR)
+ @1
+ @0


Do we already have the corresponding transformations for comparisons?

a&b <= a --> true (if unsigned)
etc

Ideally, we would have **1** transformation for max(X,Y) that tries to 
fold X<=Y and if it folds to true then simplifies to Y. This way the 
transformations would only need to be written for comparisons, not minmax.


--
Marc Glisse

Re: vector conditional expression with scalar arguments

2013-08-23 Thread Marc Glisse


On Sun, 4 Aug 2013, Gerald Pfeifer wrote:


On Sat, 13 Jul 2013, Marc Glisse wrote:

2013-07-14  Marc Glisse  

gcc/cp/
* call.c (build_conditional_expr_1): Handle the case with 1 vector
and 2 scalars. Call save_expr before building a vector.
* typeck.c (cp_build_binary_op): Check complain before complaining.


Shouldn't this be documented somewhere (gcc/doc/*.texi and our web based 
release notes)?


Yes, it should. I had posted this some time ago:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00664.html

Here is an updated version that also mentions the new scalar case:

2013-08-23  Marc Glisse  

* doc/extend.texi (Vector Extensions): Document ?: in C++.

--
Marc GlisseIndex: doc/extend.texi
===
--- doc/extend.texi (revision 201948)
+++ doc/extend.texi (working copy)
@@ -7023,20 +7023,33 @@ otherwise. Consider the following exampl
 typedef int v4si __attribute__ ((vector_size (16)));
 
 v4si a = @{1,2,3,4@};
 v4si b = @{3,2,1,4@};
 v4si c;
 
 c = a >  b; /* The result would be @{0, 0,-1, 0@}  */
 c = a == b; /* The result would be @{0,-1, 0,-1@}  */
 @end smallexample
 
+In C++, the ternary operator @code{?:} is available. @code{a?b:c}, where
+@code{b} and @code{c} are vectors of the same type and @code{a} is an
+integer vector of the same size and number of elements as @code{b} and
+@code{c}, computes all three arguments and creates a vector
+@code{@{a[0]?b[0]:c[0], a[1]?b[1]:c[1], @dots{}@}}.  Note that unlike in
+OpenCL, @code{a} is thus interpreted as @code{a != 0} and not @code{a < 0}.
+As in the case of binary operations, this syntax is also accepted when
+one of @code{b} or @code{c} is a scalar that is then transformed into a
+vector. If both @code{b} and @code{c} are scalars and the type of
+@code{true?b:c} has the same size as the element type of @code{a}, then
+@code{b} and @code{c} are converted to a vector type whose elements have
+this type and with the same number of elements as @code{a}.
+
 Vector shuffling is available using functions
 @code{__builtin_shuffle (vec, mask)} and
 @code{__builtin_shuffle (vec0, vec1, mask)}.
 Both functions construct a permutation of elements from one or two
 vectors and return a vector of the same type as the input vector(s).
 The @var{mask} is an integral vector with the same width (@var{W})
 and element count (@var{N}) as the output vector.
 
 The elements of the input vectors are numbered in memory ordering of
 @var{vec0} beginning at 0 and @var{vec1} beginning at @var{N}.  The

Re: Request to merge Undefined Behavior Sanitizer in

2013-08-23 Thread Marc Glisse


On Thu, 8 Aug 2013, Joseph S. Myers wrote:


On Fri, 26 Jul 2013, Marc Glisse wrote:


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57324
(that is using llvm's sanitizer)

and for a first patch (unreviewed):

http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01466.html
(started at http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01402.html )


That patch is OK.


Applied at r201953 after retesting, thanks.

--
Marc Glisse

Re: RFA: Consider int and same-size short as equivalent vector components

2013-08-26 Thread Marc Glisse


On Mon, 26 Aug 2013, Joern Rennecke wrote:


avr currently shows the following failure:
FAIL: c-c++-common/vector-scalar.c  -Wc++-compat  (test for excess errors)
Excess errors:
/home/amylaar/atmel/4.8/unisrc-mainline/gcc/testsuite/c-c++-common/vector-scalar
.c:9:34: error: invalid operands to binary | (have '__vector(8) int' and 
'veci')


The issue is that when we compute the result of an operatiopn of a veci and 
an int, we get a __vector(8) int result (int is the same size as short),


Maybe we could change that?


yet the typechecking later does not accept the vectors as compatible.
Fixed by relaxing the latter for the case that int and short are the same 
size.


If you do this, maybe rename the function, or at least expand the comment, 
to make it clear that it should only be used for comparison operators with 
vectors?


The issue seems larger than just short/int. On x86, (lcompile for a vector of long, with lthat seems wrong.


--
Marc Glisse

Re: RFA: Consider int and same-size short as equivalent vector components

2013-08-26 Thread Marc Glisse


On Mon, 26 Aug 2013, Joern Rennecke wrote:


Quoting Marc Glisse :


The issue seems larger than just short/int. On x86, (l

I don't understand what you mean here.  Could you post the actual code 
sample?


typedef long vl __attribute__((vector_size(16)));

void f(vl l){
  (l(btw, my sentence was ambiguous, what I find wrong is the failure to 
compile, not the type of l

--
Marc Glisse

Re: Folding (a ? b : c) op d

2013-08-30 Thread Marc Glisse



Ping:

http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01624.html

and the related:

http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00230.html

--
Marc Glisse

Re: Folding (a ? b : c) op d

2013-08-30 Thread Marc Glisse


On Fri, 30 Aug 2013, Richard Biener wrote:


On Sat, Jun 29, 2013 at 9:02 AM, Marc Glisse  wrote:

Hello,

this patch changes the test deciding whether to fold "op d" with both
branches in (a ? b : c) op d. I don't know if the new test is right, it
gives what I expect on the new testcase, but I may have missed important
cases. Cc: Eric for comments as the author of the previous conditions.

Passes bootstrap+testsuite on x86_64-unknown-linux-gnu.

2013-06-29  Marc Glisse  

PR tree-optimization/57755
gcc/
* fold-const.c (fold_binary_op_with_conditional_arg): Change
condition under which the transformation is performed.

gcc/testsuite/
* gcc.dg/pr57755.c: New testcase.
* gcc.dg/binop-notand1a.c: Remove xfail.
* gcc.dg/binop-notand4a.c: Likewise.

--
Marc Glisse
Index: fold-const.c
===
--- fold-const.c(revision 200556)
+++ fold-const.c(working copy)
@@ -6097,26 +6097,33 @@ constant_boolean_node (bool value, tree
given here), it is the second argument.  TYPE is the type of the
original expression.  Return NULL_TREE if no simplification is
possible.  */

 static tree
 fold_binary_op_with_conditional_arg (location_t loc,
 enum tree_code code,
 tree type, tree op0, tree op1,
 tree cond, tree arg, int cond_first_p)
 {
-  tree cond_type = cond_first_p ? TREE_TYPE (op0) : TREE_TYPE (op1);
-  tree arg_type = cond_first_p ? TREE_TYPE (op1) : TREE_TYPE (op0);
+  /* ??? This transformation is only worthwhile if we don't have
+ to wrap ARG in a SAVE_EXPR.  */
+  if (TREE_SIDE_EFFECTS (arg))
+return NULL_TREE;
+
+  /* Avoid exponential recursion.  */
+  static int depth = 0;
+  if (depth > 1)
+return NULL_TREE;
+


I don't like this kind of measures ... which one again is the transform that
undoes what this function does?


There is no undoing here (you may be thinking of 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57286 ), it is just a 
recursion that gets very slow for nested conditions:


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55219


   tree test, true_value, false_value;
   tree lhs = NULL_TREE;
   tree rhs = NULL_TREE;
-  enum tree_code cond_code = COND_EXPR;

   if (TREE_CODE (cond) == COND_EXPR
   || TREE_CODE (cond) == VEC_COND_EXPR)
 {
   test = TREE_OPERAND (cond, 0);
   true_value = TREE_OPERAND (cond, 1);
   false_value = TREE_OPERAND (cond, 2);
   /* If this operand throws an expression, then it does not make
 sense to try to perform a logical or arithmetic operation
 involving it.  */
@@ -6126,54 +6133,49 @@ fold_binary_op_with_conditional_arg (loc
rhs = false_value;
 }
   else
 {
   tree testtype = TREE_TYPE (cond);
   test = cond;
   true_value = constant_boolean_node (true, testtype);
   false_value = constant_boolean_node (false, testtype);
 }

-  if (TREE_CODE (TREE_TYPE (test)) == VECTOR_TYPE)
-cond_code = VEC_COND_EXPR;
-
-  /* This transformation is only worthwhile if we don't have to wrap ARG
- in a SAVE_EXPR and the operation can be simplified without recursing
- on at least one of the branches once its pushed inside the COND_EXPR.
*/
-  if (!TREE_CONSTANT (arg)
-  && (TREE_SIDE_EFFECTS (arg)
- || TREE_CODE (arg) == COND_EXPR || TREE_CODE (arg) ==
VEC_COND_EXPR
- || TREE_CONSTANT (true_value) || TREE_CONSTANT (false_value)))
-return NULL_TREE;
+  tree cond_type = cond_first_p ? TREE_TYPE (op0) : TREE_TYPE (op1);
+  tree arg_type = cond_first_p ? TREE_TYPE (op1) : TREE_TYPE (op0);

   arg = fold_convert_loc (loc, arg_type, arg);
+  ++depth;
   if (lhs == 0)
 {
   true_value = fold_convert_loc (loc, cond_type, true_value);
   if (cond_first_p)
lhs = fold_build2_loc (loc, code, type, true_value, arg);
   else
lhs = fold_build2_loc (loc, code, type, arg, true_value);
 }
   if (rhs == 0)
 {
   false_value = fold_convert_loc (loc, cond_type, false_value);
   if (cond_first_p)
rhs = fold_build2_loc (loc, code, type, false_value, arg);
   else
rhs = fold_build2_loc (loc, code, type, arg, false_value);
 }
+  --depth;

   /* Check that we have simplified at least one of the branches.  */
-  if (!TREE_CONSTANT (arg) && !TREE_CONSTANT (lhs) && !TREE_CONSTANT (rhs))
+  if (!TREE_CONSTANT (lhs) && !TREE_CONSTANT (rhs))
 return NULL_TREE;

+  enum tree_code cond_code = VECTOR_TYPE_P (TREE_TYPE (test))
+? VEC_COND_EXPR : COND_EXPR;
   return fold_build3_loc (loc, cond_code, type, test, lhs, rhs);
 }


 /* Subroutine of fold() that checks for the addition of +/- 0.0.

If !NEGATE, return true if ADDEND is +/-0.0 and, for all X of type
TYPE, X + ADDEND is the same

Re: folding (vec_)cond_expr in a binary operation

2013-09-02 Thread Marc Glisse


On Fri, 30 Aug 2013, Richard Biener wrote:


On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse  wrote:

First, the fold-const bit causes an assertion failure (building libjava) in
combine_cond_expr_cond, which calls:

  t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);

and then checks:

  /* Require that we got a boolean type out if we put one in.  */
  gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));

which makes sense... Note that the 2 relevant types are:


well, the check is probably old and can be relaxed to also allow all
compatible types.


Ok. Maybe it could even be removed then, we shouldn't check in every 
function that calls fold_binary_loc that it returns something of the type 
that was asked for.



Second, the way the forwprop transformation is done, it can be necessary to
run the forwprop pass several times in a row, which is not nice. It takes:

stmt_cond
stmt_binop

and produces:

stmt_folded
stmt_cond2

But one of the arguments of stmt_folded could be an ssa_name defined by a
cond_expr, which could now be propagated into stmt_folded (there may be
other new possible transformations). However, that cond_expr has already
been visited and won't be again. The combine part of the pass uses a PLF to
revisit statements, but the forwprop part doesn't have any specific
mechanism.


forwprop is designed to re-process stmts it has folded to catch cascading
effects.  Now, as it as both a forward and a backward run you don't catch
2nd-order effects with iterating them.  On my TODO is to only do one kind,
either forward or backward propagation.


My impression is that even internally in the forward part, the
reprocessing doesn't really work, but that'll disappear anyway when the
forward part disappears.


Btw, as for the patch I don't like that you basically feed everything into
fold.  Yes, I know we do that for conditions because that's quite important
and nobody has re-written the condition folding as gimple pattern matcher.
I doubt that this transform is important enough to warrant another 
exception ;)


I am not sure what you want here. When I notice the pattern (a?b:c) op d, 
I need to check whether b op d and c op d are likely to simplify before 
transforming it to a?(b op d):(c op d). And currently I can't see any way 
to do that other than feeding (b op d) to fold. Even if/when we get a 
gimple folder, we will still need to call it in about the same way.


--
Marc Glisse

Re: folding (vec_)cond_expr in a binary operation

2013-09-03 Thread Marc Glisse


(attaching the latest version. I only added comments since regtesting it)

On Tue, 3 Sep 2013, Richard Biener wrote:


Btw, as for the patch I don't like that you basically feed everything into
fold.  Yes, I know we do that for conditions because that's quite
important
and nobody has re-written the condition folding as gimple pattern matcher.
I doubt that this transform is important enough to warrant another
exception ;)



I am not sure what you want here. When I notice the pattern (a?b:c) op d, I
need to check whether b op d and c op d are likely to simplify before
transforming it to a?(b op d):(c op d). And currently I can't see any way to
do that other than feeding (b op d) to fold. Even if/when we get a gimple
folder, we will still need to call it in about the same way.


With a gimple folder you'd avoid building trees.


Ah, so the problem is the cost of building those 2 trees? It will indeed
be better when we can avoid it, but I really don't expect the cost to be
that much...


You are testing for
a quite complex pattern here - (a?b:c) & (d?e:f).


But it is really handled in several steps. IIRC:
(a?1:0)&x becomes a?(x&1):0.
Since x is d?1:0, x&1 becomes d?1:0.
a?(d?1:0):0 is not (yet?) simplified further.

Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
then a?(d?-1:0):0 and finally a?d:0.

Each step can also trigger individually.


It seems to be that
for example

+  vec c=(a>3)?o:z;
+  vec d=(b>2)?o:z;
+  vec e=c&d;

would be better suited in the combine phase (you are interested in
combining the comparisons).  That is, look for a [&|^] b where
a and b are [VEC_]COND_EXPRs with the same values.


Hmm, I am already looking for a binary op which has at least one operand
which is a [VEC_]COND_EXPR, in the combine (=backward) part of
tree-ssa-forwprop.  Isn't that almost what you are suggesting?

Heh, and it's not obvious to me with looking for a minute what this 
should be simplified to ...


(a?x:y)&(b?x:y) doesn't really simplify in general.


(so the code and the testcase needs some
comments on what you are trying to catch ...)


aIndex: tree-ssa-forwprop.c
===
--- tree-ssa-forwprop.c (revision 202185)
+++ tree-ssa-forwprop.c (working copy)
@@ -363,23 +363,20 @@ combine_cond_expr_cond (gimple stmt, enu
   gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison);
 
   fold_defer_overflow_warnings ();
   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
   if (!t)
 {
   fold_undefer_overflow_warnings (false, NULL, 0);
   return NULL_TREE;
 }
 
-  /* Require that we got a boolean type out if we put one in.  */
-  gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
-
   /* Canonicalize the combined condition for use in a COND_EXPR.  */
   t = canonicalize_cond_expr_cond (t);
 
   /* Bail out if we required an invariant but didn't get one.  */
   if (!t || (invariant_only && !is_gimple_min_invariant (t)))
 {
   fold_undefer_overflow_warnings (false, NULL, 0);
   return NULL_TREE;
 }
 
@@ -1135,20 +1132,131 @@ forward_propagate_comparison (gimple_stm
 
   /* Remove defining statements.  */
   return remove_prop_source_from_use (name);
 
 bailout:
   gsi_next (defgsi);
   return false;
 }
 
 
+/* Combine the binary operation defined in *GSI with one of its arguments
+   that is a condition, i.e. (A ? B : C) op D becomes A ? (B op D) : (C op D).
+   Returns 1 if there were any changes made, 2 if cfg-cleanup needs to
+   run.  Else it returns 0.  */
+
+static int
+combine_binop_with_condition (gimple_stmt_iterator *gsi)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  gimple def_stmt;
+  enum tree_code defcode;
+  bool trueok = false;
+  bool falseok = false;
+  tree true_op, false_op;
+  location_t loc = gimple_location (stmt);
+
+  if (TREE_CODE (rhs1) != SSA_NAME)
+goto second_op;
+
+  def_stmt = get_prop_source_stmt (rhs1, true, NULL);
+  if (!def_stmt || !can_propagate_from (def_stmt))
+goto second_op;
+
+  defcode = gimple_assign_rhs_code (def_stmt);
+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
+goto second_op;
+
+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),
+gimple_assign_rhs2 (stmt));
+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),
+ gimple_assign_rhs2 (stmt));
+
+  if (is_gimple_val (true_op))
+trueok = true;
+  if (is_gimple_val (false_op))
+falseok = true;
+  /* At least one of them has to simplify, or it isn't worth it.  */
+  if (!trueok && !falseok)
+goto second_op;
+  if (!trueok)
+{
+  if (!valid_gimple_rhs_p (true_op))
+   goto second_op;
+  gimple g = gimple_build_assign (make_ssa_name (type, NULL),

operator new returns nonzero

2013-09-07 Thread Marc Glisse


Hello,

this patch teaches the compiler that operator new, when it can throw, 
isn't allowed to return a null pointer. Note that it doesn't work for 
placement new (that one may have to be handled in the front-end or the 
inliner), and it will not work on user-defined operator new if they are 
inlined. I guess it would be possible to teach the inliner to insert an 
assert_expr or something when inlining such a function, but that's not for 
now. The code I removed from vrp_visit_stmt was duplicated in 
stmt_interesting_for_vrp and thus useless.


I ran the c,c++ testsuite on a non-bootstrap compiler. I'll do more proper 
testing when trunk bootstraps again.


2013-09-07  Marc Glisse  

PR c++/19476
gcc/
* fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new.
* tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp):
Likewise.
(vrp_visit_stmt): Remove duplicated code.

gcc/testsuite/
* g++.dg/tree-ssa/pr19476-1.C: New file.
* g++.dg/tree-ssa/pr19476-2.C: Likewise.

--
Marc GlisseIndex: fold-const.c
===
--- fold-const.c(revision 202351)
+++ fold-const.c(working copy)
@@ -16171,21 +16171,29 @@ tree_expr_nonzero_warnv_p (tree t, bool
 case MODIFY_EXPR:
 case BIND_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1),
strict_overflow_p);
 
 case SAVE_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0),
strict_overflow_p);
 
 case CALL_EXPR:
-  return alloca_call_p (t);
+  {
+   tree fn = CALL_EXPR_FN (t);
+   if (TREE_CODE (fn) != ADDR_EXPR) return false;
+   tree fndecl = TREE_OPERAND (fn, 0);
+   if (TREE_CODE (fndecl) != FUNCTION_DECL) return false;
+   if (DECL_IS_OPERATOR_NEW (fndecl) && !TREE_NOTHROW (fndecl))
+ return true;
+   return alloca_call_p (t);
+  }
 
 default:
   break;
 }
   return false;
 }
 
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */
 
Index: testsuite/g++.dg/tree-ssa/pr19476-1.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-1.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-1.C   (revision 0)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ccp1" } */
+
+#include 
+
+int f(){
+  return 33 + (0 == new(std::nothrow) int);
+}
+int g(){
+  return 42 + (0 == new int[50]);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */
+/* { dg-final { cleanup-tree-dump "ccp1" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C
___
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: testsuite/g++.dg/tree-ssa/pr19476-2.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-2.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-2.C   (revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include 
+
+int f(){
+  int *p = new(std::nothrow) int;
+  return 33 + (0 == p);
+}
+int g(){
+  int *p = new int[50];
+  return 42 + (0 == p);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C
___
Added: svn:eol-style
   + native
Added: svn:keywords
   + Author Date Id Revision URL

Index: tree-vrp.c
===
--- tree-vrp.c  (revision 202351)
+++ tree-vrp.c  (working copy)
@@ -1047,21 +1047,27 @@ gimple_assign_nonzero_warnv_p (gimple st
*STRICT_OVERFLOW_P.*/
 
 static bool
 gimple_stmt_nonzero_warnv_p (gimple stmt, bool *strict_overflow_p)
 {
   switch (gimple_code (stmt))
 {
 case GIMPLE_ASSIGN:
   return gimple_assign_nonzero_warnv_p (stmt, strict_overflow_p);
 case GIMPLE_CALL:
-  return gimple_alloca_call_p (stmt);
+  {
+   tree fndecl = gimple_call_fndecl (stmt);
+   if (!fndecl) return false;
+   if (DECL_IS_OPERATOR_NEW (fndecl) && !TREE_NOTHROW (fndecl))
+ return true;
+   return gimple_alloca_call_p (stmt);
+  }
 default:
   gcc_unreachable ();
 }
 }
 
 /* Like tree_expr_nonzero_warnv_p, but this function uses value ranges
obtained so far.  */

Re: operator new returns nonzero

2013-09-07 Thread Marc Glisse


On Sat, 7 Sep 2013, Basile Starynkevitch wrote:


On Sat, 2013-09-07 at 12:33 +0200, Marc Glisse wrote:

Hello,

this patch teaches the compiler that operator new, when it can throw,
isn't allowed to return a null pointer. Note that it doesn't work for
placement new (that one may have to be handled in the front-end or the
inliner), and it will not work on user-defined operator new if they are
inlined. I guess it would be possible to teach the inliner to insert an
assert_expr or something when inlining such a function, but that's not for
now. The code I removed from vrp_visit_stmt was duplicated in
stmt_interesting_for_vrp and thus useless.



Interesting patch. However, I feel that we should give advanced users
the ability to say that their new operator is never returning null...


Easy, don't specify noexcept on it and that's what the patch already does, 
as long as the function is not inlined.



In particular, if I define my own new operator which never returns nil,
I find strange that it would be less optimized than the system's
operator new.

Perhaps we need an attribute `nonnullresult'  which whould means that.
(we already the nonnull attribute for function arguments)


There is already a PR about that, linked from this PR. But even if we 
create this attribute, we will still want to be able to guess that new 
should have it even if it isn't written.


--
Marc Glisse

Re: operator new returns nonzero

2013-09-07 Thread Marc Glisse


On Sat, 7 Sep 2013, Marc Glisse wrote:

this patch teaches the compiler that operator new, when it can throw, isn't 
allowed to return a null pointer. Note that it doesn't work for placement new 
(that one may have to be handled in the front-end or the inliner),


Placement new is a completely different thing. For one, it is nothrow (so 
the test doesn't apply). But mostly, it is a condition on the argument 
more than the result. Using the nonnull attribute would make sense, but 
the compiler doesn't seem clever enough to use that information. The 
easiest seems to be in the library:


--- o/new   2013-09-07 11:11:17.388756320 +0200
+++ i/new   2013-09-07 18:00:32.204797291 +0200
@@ -144,9 +144,9 @@

 // Default placement versions of operator new.
 inline void* operator new(std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT
-{ return __p; }
+{ if (!__p) __builtin_unreachable(); return __p; }
 inline void* operator new[](std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT
-{ return __p; }
+{ if (!__p) __builtin_unreachable(); return __p; }

 // Default placement versions of operator delete.
 inline void operator delete  (void*, void*) _GLIBCXX_USE_NOEXCEPT { }



Though I am not sure what the policy is for this kind of thing. And that's 
assuming it is indeed undefined to pass a null pointer to it, I don't have 
a good reference for that.




and it will not work on user-defined operator new if they are inlined.


But then if that operator new is inlined, it may already contain a line 
like if(p==0)throw(); which lets the compiler know all it needs.


I guess it 
would be possible to teach the inliner to insert an assert_expr or something 
when inlining such a function, but that's not for now. The code I removed 
from vrp_visit_stmt was duplicated in stmt_interesting_for_vrp and thus 
useless.


I ran the c,c++ testsuite on a non-bootstrap compiler. I'll do more proper 
testing when trunk bootstraps again.


2013-09-07  Marc Glisse  

PR c++/19476
gcc/
* fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new.
* tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp):
Likewise.
(vrp_visit_stmt): Remove duplicated code.

gcc/testsuite/
* g++.dg/tree-ssa/pr19476-1.C: New file.
* g++.dg/tree-ssa/pr19476-2.C: Likewise.


--
Marc Glisse

Re: operator new returns nonzero

2013-09-07 Thread Marc Glisse


On Sat, 7 Sep 2013, Gabriel Dos Reis wrote:


On Sat, Sep 7, 2013 at 11:42 AM, Marc Glisse  wrote:

On Sat, 7 Sep 2013, Marc Glisse wrote:


this patch teaches the compiler that operator new, when it can throw,
isn't allowed to return a null pointer. Note that it doesn't work for
placement new (that one may have to be handled in the front-end or the
inliner),



Placement new is a completely different thing. For one, it is nothrow (so
the test doesn't apply). But mostly, it is a condition on the argument more
than the result. Using the nonnull attribute would make sense, but the
compiler doesn't seem clever enough to use that information. The easiest
seems to be in the library:

--- o/new   2013-09-07 11:11:17.388756320 +0200
+++ i/new   2013-09-07 18:00:32.204797291 +0200
@@ -144,9 +144,9 @@

 // Default placement versions of operator new.
 inline void* operator new(std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT
-{ return __p; }
+{ if (!__p) __builtin_unreachable(); return __p; }
 inline void* operator new[](std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT
-{ return __p; }
+{ if (!__p) __builtin_unreachable(); return __p; }

 // Default placement versions of operator delete.
 inline void operator delete  (void*, void*) _GLIBCXX_USE_NOEXCEPT { }



Though I am not sure what the policy is for this kind of thing. And that's
assuming it is indeed undefined to pass a null pointer to it, I don't have a
good reference for that.




and it will not work on user-defined operator new if they are inlined.



But then if that operator new is inlined, it may already contain a line like
if(p==0)throw(); which lets the compiler know all it needs.


I am not sure I like this version of the patch.


The 2 patches are really independent, one (in the compiler) for regular 
operator new, and one (in the library) for the placement version.


I mostly like this second patch because it is so easy ;-)
But I will be happy if someone posts a better solution.


I think the appropriate patch should be in the compiler, not
here.  C++ has several places where certain parameters cannot
have non-null values. For example, the implicit parameter 'this'
in non-static member functions. This is another instance.


Indeed, I didn't check how gcc handles "this" being non-0.


Furthermore, I do think that the compiler should have special nodes
for both standard placement new and the global operator new functions,


That's one way to do it. Since this is the first time I look at those, I 
don't really see the advantage compared to the current status, but I 
trust you. What would you do with this special-node placement new? Keep it 
as is until after vrp so we can use the !=0 property and then expand it to 
its first argument? Or expand it early to the equivalent of the library 
code I wrote above?



as I explained in previous messages.


I couldn't find them, sorry if they contained information that makes this 
post irrelevant.


--
Marc Glisse

Re: operator new returns nonzero

2013-09-07 Thread Marc Glisse


On Sat, 7 Sep 2013, Mike Stump wrote:


On Sep 7, 2013, at 3:33 AM, Marc Glisse  wrote:

this patch teaches the compiler that operator new, when it can throw, isn't 
allowed to return a null pointer.


You sure:

@item -fcheck-new
@opindex fcheck-new
Check that the pointer returned by @code{operator new} is non-null
before attempting to modify the storage allocated.  This check is
normally unnecessary because the C++ standard specifies that
@code{operator new} only returns @code{0} if it is declared
@samp{throw()}, in which case the compiler always checks the
return value even without this option.  In all other cases, when
@code{operator new} has a non-empty exception specification, memory
exhaustion is signalled by throwing @code{std::bad_alloc}.  See also
@samp{new (nothrow)}.

?


Thanks, I didn't know that option. But it doesn't do the same. -fcheck-new 
is about the front-end inserting a test !=0 between malloc and the 
constructor. My patch is about teaching the middle-end that the value is 
not zero (so even user-coded comparisons with 0 can be simplified).


Now flag_check_new should probably disable this optimization... The 3 
-fcheck-new testcases in the testsuite probably only pass because they 
don't have -O2.


--
Marc Glisse

Re: operator new returns nonzero

2013-09-07 Thread Marc Glisse


On Sat, 7 Sep 2013, Mike Stump wrote:



On Sep 7, 2013, at 12:37 PM, Gabriel Dos Reis  
wrote:


On Sat, Sep 7, 2013 at 2:27 PM, Marc Glisse  wrote:

On Sat, 7 Sep 2013, Mike Stump wrote:


On Sep 7, 2013, at 3:33 AM, Marc Glisse  wrote:


this patch teaches the compiler that operator new, when it can throw,
isn't allowed to return a null pointer.



You sure:

@item -fcheck-new
@opindex fcheck-new
Check that the pointer returned by @code{operator new} is non-null
before attempting to modify the storage allocated.  This check is
normally unnecessary because the C++ standard specifies that
@code{operator new} only returns @code{0} if it is declared
@samp{throw()}, in which case the compiler always checks the
return value even without this option.  In all other cases, when
@code{operator new} has a non-empty exception specification, memory
exhaustion is signalled by throwing @code{std::bad_alloc}.  See also
@samp{new (nothrow)}.

?



Thanks, I didn't know that option. But it doesn't do the same.


Indeed.


Can this throw:

void *operator new (long unsigned int s) {
 return 0;
}

?  Is this allowed to return 0?


I think using this function is illegal. It isn't marked noexcept, so it 
isn't allowed to return 0.


--
Marc Glisse

Re: operator new returns nonzero

2013-09-07 Thread Marc Glisse


On Sat, 7 Sep 2013, Mike Stump wrote:


On Sep 7, 2013, at 12:27 PM, Marc Glisse  wrote:

Now flag_check_new should probably disable this optimization…


Yes, this why my point.


Ok, here it is (again, no proper testing until bootstrap is fixed)

2013-09-07  Marc Glisse  

PR c++/19476
gcc/
* fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new.
* tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp):
Likewise.
(vrp_visit_stmt): Remove duplicated code.

gcc/testsuite/
* g++.dg/tree-ssa/pr19476-1.C: New file.
* g++.dg/tree-ssa/pr19476-2.C: Likewise.
* g++.dg/tree-ssa/pr19476-3.C: Likewise.

--
Marc GlisseIndex: testsuite/g++.dg/tree-ssa/pr19476-1.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-1.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-1.C   (revision 0)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ccp1" } */
+
+#include 
+
+int f(){
+  return 33 + (0 == new(std::nothrow) int);
+}
+int g(){
+  return 42 + (0 == new int[50]);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */
+/* { dg-final { cleanup-tree-dump "ccp1" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C
___
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: testsuite/g++.dg/tree-ssa/pr19476-2.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-2.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-2.C   (revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include 
+
+int f(){
+  int *p = new(std::nothrow) int;
+  return 33 + (0 == p);
+}
+int g(){
+  int *p = new int[50];
+  return 42 + (0 == p);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C
___
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: testsuite/g++.dg/tree-ssa/pr19476-3.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-3.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-3.C   (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcheck-new -fdump-tree-optimized" } */
+
+#include 
+
+int g(){
+  return 42 + (0 == new int);
+}
+
+/* { dg-final { scan-tree-dump-not "return 42" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-3.C
___
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: fold-const.c
===
--- fold-const.c(revision 202351)
+++ fold-const.c(working copy)
@@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool
 case MODIFY_EXPR:
 case BIND_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1),
strict_overflow_p);
 
 case SAVE_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0),
strict_overflow_p);
 
 case CALL_EXPR:
-  return alloca_call_p (t);
+  {
+   tree fn = CALL_EXPR_FN (t);
+   if (TREE_CODE (fn) != ADDR_EXPR) return false;
+   tree fndecl = TREE_OPERAND (fn, 0);
+   if (TREE_CODE (fndecl) != FUNCTION_DECL) return false;
+   if (!flag_check_new
+   && DECL_IS_OPERATOR_NEW (fndecl)
+   && !TREE_NOTHROW (fndecl))
+ return true;
+   return alloca_call_p (t);
+  }
 
 default:
   break;
 }
   return false;
 }
 
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */
 
Index: tree-vrp.c
===
--- tree-vrp.c  (revision 202351)
+++ tree-vrp.c  (working copy)
@@ -1047,21 +1047,29 @@ gimple_assign_nonzero_warnv_p (gimple st
*STRICT_OVERFLOW_P.*/
 
 static bool
 gimple_stmt_nonzero_warnv_p (gimple stmt, bool *strict_overflow_p)
 {
   switch (gimple_code (stmt))
 {
 case GIMPLE_ASSIGN:
   return gimple_assign_nonzero_warnv_p (stmt, strict_overflow_p);
 case

Re: operator new returns nonzero

2013-09-07 Thread Marc Glisse


On Sat, 7 Sep 2013, Marc Glisse wrote:


On Sat, 7 Sep 2013, Mike Stump wrote:


Can this throw:

void *operator new (long unsigned int s) {
 return 0;
}

?  Is this allowed to return 0?


I think using this function is illegal. It isn't marked noexcept, so it isn't 
allowed to return 0.


And if I compile your code with gcc, I get nice warnings (though I get 
them twice and the column number is not so good):


m.cc: In function 'void* operator new(long unsigned int)':
m.cc:2:12: warning: 'operator new' must not return NULL unless it is 
declared 'throw()' (or -fcheck-new is in effect) [enabled by default]

 return 0;
^
m.cc: At global scope:
m.cc:1:7: warning: unused parameter 's' [-Wunused-parameter]
 void *operator new (long unsigned int s) {
   ^


--
Marc Glisse

Re: operator new returns nonzero

2013-09-09 Thread Marc Glisse


On Mon, 9 Sep 2013, Richard Biener wrote:


On Sat, Sep 7, 2013 at 11:00 PM, Marc Glisse  wrote:

On Sat, 7 Sep 2013, Mike Stump wrote:


On Sep 7, 2013, at 12:27 PM, Marc Glisse  wrote:


Now flag_check_new should probably disable this optimization…



Yes, this why my point.



Ok, here it is (again, no proper testing until bootstrap is fixed)


I wonder what happens on targets where 0 is a valid address of an object
(stated by !flag_delete_null_pointer_checks)?


I am not at all familiar with those targets (I thought you had to write
asm to access 0 so the compiler doesn't mess with your code), but it
makes sense to me to test (flag_delete_null_pointer_checks &&
!flag_check_new) instead of just !flag_check_new. Consider the patch
changed this way. (we have so many options, I wouldn't be surprised if
there is yet another one to check...)

--
Marc Glisse

Re: operator new returns nonzero

2013-09-09 Thread Marc Glisse


I have now tested bootstrap+testsuite and there was no regression.

2013-09-07  Marc Glisse  

PR c++/19476
gcc/
* fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new.
* tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp):
Likewise.
(vrp_visit_stmt): Remove duplicated code.

gcc/testsuite/
* g++.dg/tree-ssa/pr19476-1.C: New file.
* g++.dg/tree-ssa/pr19476-2.C: Likewise.
* g++.dg/tree-ssa/pr19476-3.C: Likewise.
* g++.dg/tree-ssa/pr19476-4.C: Likewise.

--
Marc GlisseIndex: fold-const.c
===
--- fold-const.c(revision 202413)
+++ fold-const.c(working copy)
@@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool
 case MODIFY_EXPR:
 case BIND_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1),
strict_overflow_p);
 
 case SAVE_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0),
strict_overflow_p);
 
 case CALL_EXPR:
-  return alloca_call_p (t);
+  {
+   tree fn = CALL_EXPR_FN (t);
+   if (TREE_CODE (fn) != ADDR_EXPR) return false;
+   tree fndecl = TREE_OPERAND (fn, 0);
+   if (TREE_CODE (fndecl) != FUNCTION_DECL) return false;
+   if (flag_delete_null_pointer_checks && !flag_check_new
+   && DECL_IS_OPERATOR_NEW (fndecl)
+   && !TREE_NOTHROW (fndecl))
+ return true;
+   return alloca_call_p (t);
+  }
 
 default:
   break;
 }
   return false;
 }
 
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */
 
Index: testsuite/g++.dg/tree-ssa/pr19476-1.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-1.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-1.C   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ccp1" } */
+
+#include 
+
+int f(){
+  return 33 + (0 == new(std::nothrow) int);
+}
+int g(){
+  return 42 + (0 == new int[50]);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */
+/* { dg-final { cleanup-tree-dump "ccp1" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C
___
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Index: testsuite/g++.dg/tree-ssa/pr19476-2.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-2.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-2.C   (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include 
+
+int f(){
+  int *p = new(std::nothrow) int;
+  return 33 + (0 == p);
+}
+int g(){
+  int *p = new int[50];
+  return 42 + (0 == p);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C
___
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Index: testsuite/g++.dg/tree-ssa/pr19476-3.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-3.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-3.C   (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcheck-new -fdump-tree-optimized" } */
+
+#include 
+
+int g(){
+  return 42 + (0 == new int);
+}
+
+/* { dg-final { scan-tree-dump-not "return 42" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-3.C
___
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Index: testsuite/g++.dg/tree-ssa/pr19476-4.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-4.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-4.C   (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-delete-null-pointer-chec

Re: folding (vec_)cond_expr in a binary operation

2013-09-11 Thread Marc Glisse


Any other comments on this patch?

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00129.html

On Tue, 3 Sep 2013, Marc Glisse wrote:


(attaching the latest version. I only added comments since regtesting it)

On Tue, 3 Sep 2013, Richard Biener wrote:

Btw, as for the patch I don't like that you basically feed everything 
into

fold.  Yes, I know we do that for conditions because that's quite
important
and nobody has re-written the condition folding as gimple pattern 
matcher.

I doubt that this transform is important enough to warrant another
exception ;)



I am not sure what you want here. When I notice the pattern (a?b:c) op d, 
I

need to check whether b op d and c op d are likely to simplify before
transforming it to a?(b op d):(c op d). And currently I can't see any way 
to

do that other than feeding (b op d) to fold. Even if/when we get a gimple
folder, we will still need to call it in about the same way.


With a gimple folder you'd avoid building trees.


Ah, so the problem is the cost of building those 2 trees? It will indeed
be better when we can avoid it, but I really don't expect the cost to be
that much...


You are testing for
a quite complex pattern here - (a?b:c) & (d?e:f).


But it is really handled in several steps. IIRC:
(a?1:0)&x becomes a?(x&1):0.
Since x is d?1:0, x&1 becomes d?1:0.
a?(d?1:0):0 is not (yet?) simplified further.

Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
then a?(d?-1:0):0 and finally a?d:0.

Each step can also trigger individually.


It seems to be that
for example

+  vec c=(a>3)?o:z;
+  vec d=(b>2)?o:z;
+  vec e=c&d;

would be better suited in the combine phase (you are interested in
combining the comparisons).  That is, look for a [&|^] b where
a and b are [VEC_]COND_EXPRs with the same values.


Hmm, I am already looking for a binary op which has at least one operand
which is a [VEC_]COND_EXPR, in the combine (=backward) part of
tree-ssa-forwprop.  Isn't that almost what you are suggesting?

Heh, and it's not obvious to me with looking for a minute what this should 
be simplified to ...


(a?x:y)&(b?x:y) doesn't really simplify in general.


(so the code and the testcase needs some
comments on what you are trying to catch ...)


a

--
Marc Glisse

Re: vector conditional expression with scalar arguments

2013-09-13 Thread Marc Glisse


Ping
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01381.html

On Fri, 23 Aug 2013, Marc Glisse wrote:


On Sun, 4 Aug 2013, Gerald Pfeifer wrote:


On Sat, 13 Jul 2013, Marc Glisse wrote:

2013-07-14  Marc Glisse  

gcc/cp/
* call.c (build_conditional_expr_1): Handle the case with 1 vector
and 2 scalars. Call save_expr before building a vector.
* typeck.c (cp_build_binary_op): Check complain before complaining.


Shouldn't this be documented somewhere (gcc/doc/*.texi and our web based 
release notes)?


Yes, it should. I had posted this some time ago:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00664.html

Here is an updated version that also mentions the new scalar case:

2013-08-23  Marc Glisse  

* doc/extend.texi (Vector Extensions): Document ?: in C++.


--
Marc Glisse

[v3] More noexcept for vectors

2013-09-14 Thread Marc Glisse


Hello,

this patch passes bootstrap+testsuite. The guarantees given by the 
standard on allocators are a bit weird, but I see there is already DR2016 
taking care of it.


2013-09-14  Marc Glisse  

PR libstdc++/58338
* include/bits/stl_vector.h
(_Vector_impl::_Vector_impl(_Tp_alloc_type const&),
_Vector_impl::_Vector_impl(_Tp_alloc_type&&),
_Vector_impl::_M_swap_data,
_Vector_base::_Vector_base(const allocator_type&),
_Vector_base::_Vector_base(allocator_type&&),
_Vector_base::_Vector_base(_Vector_base&&),
vector::vector(const allocator_type&), vector::operator[],
vector::operator[] const, vector::front, vector::front const,
vector::back, vector::back const, vector::pop_back,
vector::_M_erase_at_end): Mark as noexcept.
(vector::~vector): Remove useless noexcept.


--
Marc GlisseIndex: include/bits/stl_vector.h
===
--- include/bits/stl_vector.h   (revision 202588)
+++ include/bits/stl_vector.h   (working copy)
@@ -80,32 +80,32 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   : public _Tp_alloc_type
   {
pointer _M_start;
pointer _M_finish;
pointer _M_end_of_storage;
 
_Vector_impl()
: _Tp_alloc_type(), _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 
-   _Vector_impl(_Tp_alloc_type const& __a)
+   _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT
: _Tp_alloc_type(__a), _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 
 #if __cplusplus >= 201103L
-   _Vector_impl(_Tp_alloc_type&& __a)
+   _Vector_impl(_Tp_alloc_type&& __a) noexcept
: _Tp_alloc_type(std::move(__a)),
  _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 #endif
 
-   void _M_swap_data(_Vector_impl& __x)
+   void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT
{
  std::swap(_M_start, __x._M_start);
  std::swap(_M_finish, __x._M_finish);
  std::swap(_M_end_of_storage, __x._M_end_of_storage);
}
   };
   
 public:
   typedef _Alloc allocator_type;
 
@@ -117,36 +117,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT
   { return *static_cast(&this->_M_impl); }
 
   allocator_type
   get_allocator() const _GLIBCXX_NOEXCEPT
   { return allocator_type(_M_get_Tp_allocator()); }
 
   _Vector_base()
   : _M_impl() { }
 
-  _Vector_base(const allocator_type& __a)
+  _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT
   : _M_impl(__a) { }
 
   _Vector_base(size_t __n)
   : _M_impl()
   { _M_create_storage(__n); }
 
   _Vector_base(size_t __n, const allocator_type& __a)
   : _M_impl(__a)
   { _M_create_storage(__n); }
 
 #if __cplusplus >= 201103L
-  _Vector_base(_Tp_alloc_type&& __a)
+  _Vector_base(_Tp_alloc_type&& __a) noexcept
   : _M_impl(std::move(__a)) { }
 
-  _Vector_base(_Vector_base&& __x)
+  _Vector_base(_Vector_base&& __x) noexcept
   : _M_impl(std::move(__x._M_get_Tp_allocator()))
   { this->_M_impl._M_swap_data(__x._M_impl); }
 
   _Vector_base(_Vector_base&& __x, const allocator_type& __a)
   : _M_impl(__a)
   {
if (__x.get_allocator() == __a)
  this->_M_impl._M_swap_data(__x._M_impl);
else
  {
@@ -246,21 +246,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*  @brief  Default constructor creates no elements.
*/
   vector()
   : _Base() { }
 
   /**
*  @brief  Creates a %vector with no elements.
*  @param  __a  An allocator object.
*/
   explicit
-  vector(const allocator_type& __a)
+  vector(const allocator_type& __a) _GLIBCXX_NOEXCEPT
   : _Base(__a) { }
 
 #if __cplusplus >= 201103L
   /**
*  @brief  Creates a %vector with default constructed elements.
*  @param  __n  The number of elements to initially create.
*  @param  __a  An allocator.
*
*  This constructor fills the %vector with @a __n default
*  constructed elements.
@@ -404,21 +404,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  _M_initialize_dispatch(__first, __last, _Integral());
}
 #endif
 
   /**
*  The dtor only erases the elements, and note that if the
*  elements themselves are pointers, the pointed-to memory is
*  not touched in any way.  Managing the pointer is the user's
*  responsibility.
*/
-  ~vector() _GLIBCXX_NOEXCEPT
+  ~vector()
   { std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
  _M_get_Tp_allocator()); }
 
   /**
*  @brief  %Vector assignment oper

Re: [v3] More noexcept for vectors

2013-09-14 Thread Marc Glisse


On Sat, 14 Sep 2013, Paolo Carlini wrote:

... what about debug-mode (and profile-mode)? Is in principle possible 
to detect the noexcepts? If we can't exclude that, we should probably 
change at the same time the special modes too. Otherwise seems just 
matter of consistency?


I was going one file at a time, and the priority is clearly 
"release"-mode, since this is about performance. I don't think there is 
anything wrong with debug-mode having a different exception specification 
(it is already binary-incompatible with the default mode) but I understand 
why people may want to keep some consistency. I can do vector in the other 
modes next if you want.


--
Marc Glisse

Re: [v3] More noexcept for vectors

2013-09-15 Thread Marc Glisse

I had to separate the constructor that takes an allocator from the default 
constructor in debug/profile, since in release the noexcept only applies 
to one of them (and the testsuite asserts that release and debug agree on 
this). An alternative would be to make the release vector default 
constructor conditionally noexcept (depending on the allocator). Or to use 
explicit vector(const Allocator& = Allocator()); also in normal mode 
instead of splitting it in two.


I also removed a noexcept in profile mode where the release mode doesn't 
have noexcept.


Tested, no regression.

2013-09-15  Marc Glisse  

PR libstdc++/58338
* include/bits/stl_vector.h
(_Vector_impl::_Vector_impl(_Tp_alloc_type const&),
_Vector_impl::_Vector_impl(_Tp_alloc_type&&),
_Vector_impl::_M_swap_data,
_Vector_base::_Vector_base(const allocator_type&),
_Vector_base::_Vector_base(allocator_type&&),
_Vector_base::_Vector_base(_Vector_base&&),
vector::vector(const allocator_type&), vector::operator[],
vector::operator[] const, vector::front, vector::front const,
vector::back, vector::back const, vector::pop_back,
vector::_M_erase_at_end): Mark as noexcept.
(vector::~vector): Remove useless noexcept.
* include/debug/vector (vector::vector()): Split.
(vector::vector(const _Allocator&), vector::operator[],
vector::operator[] const, vector::front, vector::front const,
vector::back, vector::back const, vector::pop_back,
_M_requires_reallocation, _M_update_guaranteed_capacity,
_M_invalidate_after_nth): Mark as noexcept.
(vector::~vector): Remove useless noexcept.
* include/profile/vector (vector::vector()): Split.
(vector::vector(const _Allocator&), vector::operator[],
vector::operator[] const, vector::front, vector::front const,
vector::back, vector::back const): Mark as noexcept.
(vector::vector(vector&&, const _Allocator&)): Remove wrong noexcept.
(vector::~vector): Remove useless noexcept.

--
Marc GlisseIndex: include/bits/stl_vector.h
===
--- include/bits/stl_vector.h   (revision 202588)
+++ include/bits/stl_vector.h   (working copy)
@@ -80,32 +80,32 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   : public _Tp_alloc_type
   {
pointer _M_start;
pointer _M_finish;
pointer _M_end_of_storage;
 
_Vector_impl()
: _Tp_alloc_type(), _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 
-   _Vector_impl(_Tp_alloc_type const& __a)
+   _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT
: _Tp_alloc_type(__a), _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 
 #if __cplusplus >= 201103L
-   _Vector_impl(_Tp_alloc_type&& __a)
+   _Vector_impl(_Tp_alloc_type&& __a) noexcept
: _Tp_alloc_type(std::move(__a)),
  _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 #endif
 
-   void _M_swap_data(_Vector_impl& __x)
+   void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT
{
  std::swap(_M_start, __x._M_start);
  std::swap(_M_finish, __x._M_finish);
  std::swap(_M_end_of_storage, __x._M_end_of_storage);
}
   };
   
 public:
   typedef _Alloc allocator_type;
 
@@ -117,36 +117,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT
   { return *static_cast(&this->_M_impl); }
 
   allocator_type
   get_allocator() const _GLIBCXX_NOEXCEPT
   { return allocator_type(_M_get_Tp_allocator()); }
 
   _Vector_base()
   : _M_impl() { }
 
-  _Vector_base(const allocator_type& __a)
+  _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT
   : _M_impl(__a) { }
 
   _Vector_base(size_t __n)
   : _M_impl()
   { _M_create_storage(__n); }
 
   _Vector_base(size_t __n, const allocator_type& __a)
   : _M_impl(__a)
   { _M_create_storage(__n); }
 
 #if __cplusplus >= 201103L
-  _Vector_base(_Tp_alloc_type&& __a)
+  _Vector_base(_Tp_alloc_type&& __a) noexcept
   : _M_impl(std::move(__a)) { }
 
-  _Vector_base(_Vector_base&& __x)
+  _Vector_base(_Vector_base&& __x) noexcept
   : _M_impl(std::move(__x._M_get_Tp_allocator()))
   { this->_M_impl._M_swap_data(__x._M_impl); }
 
   _Vector_base(_Vector_base&& __x, const allocator_type& __a)
   : _M_impl(__a)
   {
if (__x.get_allocator() == __a)
  this->_M_impl._M_swap_data(__x._M_impl);
else
  {
@@ -246,21 +246,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*  @brief  Default constructor creates no elements.
*/
   vector()
   :

Re: [v3] More noexcept for vectors

2013-09-16 Thread Marc Glisse


On Mon, 16 Sep 2013, Paolo Carlini wrote:


On 09/15/2013 11:12 AM, Marc Glisse wrote:
I had to separate the constructor that takes an allocator from the default 
constructor in debug/profile, since in release the noexcept only applies to 
one of them (and the testsuite asserts that release and debug agree on 
this). An alternative would be to make the release vector default 
constructor conditionally noexcept (depending on the allocator). Or to use 
explicit vector(const Allocator& = Allocator()); also in normal mode 
instead of splitting it in two.
Thanks a lot. Now I'm wondering if we shouldn't really do the latter: the 
issue is, if I remember correctly, in C++11, at variance with C++98, 
allocators aren't necessarily default constructible, thus by explicit 
instantiatiation the user can easily tell whether that constructor is split 
or not. What do you think?


Shouldn't it just be illegal to explicitly instantiate a full class like 
std::vector?


But ok, I'll post that version as soon as I can test it.

--
Marc Glisse

Re: [v3] More noexcept for vectors

2013-09-16 Thread Marc Glisse


On Mon, 16 Sep 2013, Jonathan Wakely wrote:


Are you sure the noexcept on the destructors is useless?


Ok, I am putting it back in.

--
Marc Glisse

Re: [v3] More noexcept for vectors

2013-09-16 Thread Marc Glisse


New version that passed testing.

2013-09-16  Marc Glisse  

PR libstdc++/58338
* include/bits/stl_vector.h (vector::vector(),
vector::vector(const allocator_type&)): Merge.
(_Vector_impl::_Vector_impl(_Tp_alloc_type const&),
_Vector_impl::_Vector_impl(_Tp_alloc_type&&),
_Vector_impl::_M_swap_data,
_Vector_base::_Vector_base(const allocator_type&),
_Vector_base::_Vector_base(allocator_type&&),
_Vector_base::_Vector_base(_Vector_base&&), _Vector_base::~_Vector_base,
vector::vector(const allocator_type&), vector::operator[],
vector::operator[] const, vector::front, vector::front const,
vector::back, vector::back const, vector::pop_back,
vector::_M_erase_at_end): Mark as noexcept.
* include/debug/vector (vector::vector(const _Allocator&),
vector::operator[], vector::operator[] const, vector::front,
vector::front const, vector::back, vector::back const, vector::pop_back,
_M_requires_reallocation, _M_update_guaranteed_capacity,
_M_invalidate_after_nth): Mark as noexcept.
* include/profile/vector (vector::vector(const _Allocator&),
vector::operator[], vector::operator[] const, vector::front,
vector::front const, vector::back, vector::back const): Mark as
noexcept.
(vector::vector(vector&&, const _Allocator&)): Remove wrong noexcept.
* testsuite/23_containers/vector/requirements/dr438/assign_neg.cc:
Adjust line number.
* testsuite/23_containers/vector/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/insert_neg.cc:
Likewise.

--
Marc GlisseIndex: include/bits/stl_vector.h
===
--- include/bits/stl_vector.h   (revision 202625)
+++ include/bits/stl_vector.h   (working copy)
@@ -80,32 +80,32 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   : public _Tp_alloc_type
   {
pointer _M_start;
pointer _M_finish;
pointer _M_end_of_storage;
 
_Vector_impl()
: _Tp_alloc_type(), _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 
-   _Vector_impl(_Tp_alloc_type const& __a)
+   _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT
: _Tp_alloc_type(__a), _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 
 #if __cplusplus >= 201103L
-   _Vector_impl(_Tp_alloc_type&& __a)
+   _Vector_impl(_Tp_alloc_type&& __a) noexcept
: _Tp_alloc_type(std::move(__a)),
  _M_start(0), _M_finish(0), _M_end_of_storage(0)
{ }
 #endif
 
-   void _M_swap_data(_Vector_impl& __x)
+   void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT
{
  std::swap(_M_start, __x._M_start);
  std::swap(_M_finish, __x._M_finish);
  std::swap(_M_end_of_storage, __x._M_end_of_storage);
}
   };
   
 public:
   typedef _Alloc allocator_type;
 
@@ -117,53 +117,53 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT
   { return *static_cast(&this->_M_impl); }
 
   allocator_type
   get_allocator() const _GLIBCXX_NOEXCEPT
   { return allocator_type(_M_get_Tp_allocator()); }
 
   _Vector_base()
   : _M_impl() { }
 
-  _Vector_base(const allocator_type& __a)
+  _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT
   : _M_impl(__a) { }
 
   _Vector_base(size_t __n)
   : _M_impl()
   { _M_create_storage(__n); }
 
   _Vector_base(size_t __n, const allocator_type& __a)
   : _M_impl(__a)
   { _M_create_storage(__n); }
 
 #if __cplusplus >= 201103L
-  _Vector_base(_Tp_alloc_type&& __a)
+  _Vector_base(_Tp_alloc_type&& __a) noexcept
   : _M_impl(std::move(__a)) { }
 
-  _Vector_base(_Vector_base&& __x)
+  _Vector_base(_Vector_base&& __x) noexcept
   : _M_impl(std::move(__x._M_get_Tp_allocator()))
   { this->_M_impl._M_swap_data(__x._M_impl); }
 
   _Vector_base(_Vector_base&& __x, const allocator_type& __a)
   : _M_impl(__a)
   {
if (__x.get_allocator() == __a)
  this->_M_impl._M_swap_data(__x._M_impl);
else
  {
size_t __n = __x._M_impl._M_finish - __x._M_impl._M_start;
_M_create_storage(__n);
  }
   }
 #endif
 
-  ~_Vector_base()
+  ~_Vector_base() _GLIBCXX_NOEXCEPT
   { _M_deallocate(this->_M_impl._M_start, this->_M_impl._M_end_of_storage
  - this->_M_impl._M_start); }
 
 public:
   _Vector_impl _M_impl;
 
   pointer
   _M_allocat

Re: operator new returns nonzero

2013-09-16 Thread Marc Glisse

Nobody has expressed concern for a week, so it may be worth doing an 
official review ;-)


http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00676.html

On Mon, 9 Sep 2013, Marc Glisse wrote:


I have now tested bootstrap+testsuite and there was no regression.

2013-09-07  Marc Glisse  

PR c++/19476
gcc/
* fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new.
* tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp):
Likewise.
(vrp_visit_stmt): Remove duplicated code.

gcc/testsuite/
* g++.dg/tree-ssa/pr19476-1.C: New file.
* g++.dg/tree-ssa/pr19476-2.C: Likewise.
* g++.dg/tree-ssa/pr19476-3.C: Likewise.
* g++.dg/tree-ssa/pr19476-4.C: Likewise.


--
Marc Glisse

[v3] More noexcept for lists

2013-09-17 Thread Marc Glisse


Hello,

after vectors, lists. I didn't touch the throw we were discussing earlier 
today for now. There will be an inconsistency with debug list iterators 
because they use a general wrapper:
- I would need François to tell if that wrapper is ever used with 
iterators that can throw,
- the same wrapper is used for several containers, so unless we change all 
containers at once it can't stay consistent.


Bootstrap+testsuite ok.

2013-09-18  Marc Glisse  

PR libstdc++/58338
* include/bits/list.tcc (_List_base::_M_clear, list::erase): Mark as
noexcept.
* include/bits/stl_list.h (_List_iterator) [_List_iterator,
_M_const_cast, operator*, operator->, operator++, operator--,
operator==, operator!=]: Likewise.
(_List_const_iterator) [_List_const_iterator, _M_const_cast, operator*,
operator->, operator++, operator--, operator==, operator!=]: Likewise.
(operator==(const _List_iterator&, const _List_const_iterator&),
operator!=(const _List_iterator&, const _List_const_iterator&)):
Likewise.
(_List_impl) [_List_impl(const _Node_alloc_type&),
_List_impl(_Node_alloc_type&&)]: Likewise.
(_List_base) [_M_put_node, _List_base(const _Node_alloc_type&),
_List_base(_List_base&&), _M_clear, _M_init]: Likewise.
(list) [list(), list(const allocator_type&)]: Merge.
(list) [list(const allocator_type&), front, back, pop_front, pop_back,
erase, _M_erase]: Mark as noexcept.
* include/debug/list (list) [list(const _Allocator&), front, back,
pop_front, pop_back, _M_erase, erase]: Likewise.
* include/profile/list (list) [list(const _Allocator&), front, back,
pop_front, pop_back, erase]: Likewise.
* testsuite/23_containers/list/requirements/dr438/assign_neg.cc:
Adjust line number.
* testsuite/23_containers/list/requirements/dr438/constructor_1_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/constructor_2_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/insert_neg.cc:
Likewise.

--
Marc GlisseIndex: include/bits/list.tcc
===
--- include/bits/list.tcc   (revision 202655)
+++ include/bits/list.tcc   (working copy)
@@ -56,21 +56,21 @@
 #ifndef _LIST_TCC
 #define _LIST_TCC 1
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   template
 void
 _List_base<_Tp, _Alloc>::
-_M_clear()
+_M_clear() _GLIBCXX_NOEXCEPT
 {
   typedef _List_node<_Tp>  _Node;
   _Node* __cur = static_cast<_Node*>(_M_impl._M_node._M_next);
   while (__cur != &_M_impl._M_node)
{
  _Node* __tmp = __cur;
  __cur = static_cast<_Node*>(__cur->_M_next);
 #if __cplusplus >= 201103L
  _M_get_Node_allocator().destroy(__tmp);
 #else
@@ -138,21 +138,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
return __it;
  }
return __position._M_const_cast();
   }
 #endif
 
   template
 typename list<_Tp, _Alloc>::iterator
 list<_Tp, _Alloc>::
 #if __cplusplus >= 201103L
-erase(const_iterator __position)
+erase(const_iterator __position) noexcept
 #else
 erase(iterator __position)
 #endif
 {
   iterator __ret = iterator(__position._M_node->_M_next);
   _M_erase(__position._M_const_cast());
   return __ret;
 }
 
 #if __cplusplus >= 201103L
Index: include/bits/stl_list.h
===
--- include/bits/stl_list.h (revision 202655)
+++ include/bits/stl_list.h (working copy)
@@ -126,76 +126,76 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 {
   typedef _List_iterator<_Tp>_Self;
   typedef _List_node<_Tp>_Node;
 
   typedef ptrdiff_t  difference_type;
   typedef std::bidirectional_iterator_tagiterator_category;
   typedef _Tpvalue_type;
   typedef _Tp*   pointer;
   typedef _Tp&   reference;
 
-  _List_iterator()
+  _List_iterator() _GLIBCXX_NOEXCEPT
   : _M_node() { }
 
   explicit
-  _List_iterator(__detail::_List_node_base* __x)
+  _List_iterator(__detail::_List_node_base* __x) _GLIBCXX_NOEXCEPT
   : _M_node(__x) { }
 
   _Self
-  _M_const_cast() const
+  _M_const_cast() const _GLIBCXX_NOEXCEPT
   { return *this; }
 
   // Must downcast from _List_node_base to _List_node to get to _M_data.
   reference
-  operator*() const
+  operator*() const _GLIBCXX_NOEXCEPT
   { return static_cast<_Node*>(_M_node)->_M_data; }
 
   pointer
-

Re: [v3] More noexcept for lists

2013-09-18 Thread Marc Glisse


On Wed, 18 Sep 2013, Paolo Carlini wrote:

On 09/17/2013 08:44 PM, Marc Glisse wrote:
after vectors, lists. I didn't touch the throw we were discussing earlier 
today for now. There will be an inconsistency with debug list iterators 
because they use a general wrapper:
- I would need François to tell if that wrapper is ever used with iterators 
that can throw,
- the same wrapper is used for several containers, so unless we change all 
containers at once it can't stay consistent.
Thus the idea is changing first all the containers and eventually go back to 
__gnu_debug::_Safe_iterator and consistently add the noexcepts there?


Yes.

Let's not forget that! (or alternately leave out all the iterators 
related bits for the time being ;)


That would mean one mega-patch doing all the iterators at once :-(


Patch is otherwise Ok with me, thanks.


Ok, I'll commit it now and move to other containers.

--
Marc Glisse

[v3] More noexcept -- 3rd

2013-09-18 Thread Marc Glisse


Hello,

some more containers...

In debug array, we already have throw in noexcept functions, but if I
understand correctly it is only because of syntax limitations for constexpr
functions and aborts before throwing, although the use of
_GLIBCXX_THROW_OR_ABORT is suspicious. In any case, I am not changing this
with my patch.

I replaced throw with abort in list, as discussed, and thus removed the
corresponding testcase.

bootstrap+testsuite ok.

2013-09-19  Marc Glisse  

PR libstdc++/58338
* include/bits/stl_iterator.h (__normal_iterator) [__normal_iterator,
_M_const_cast, operator*, operator->, operator++, operator--,
operator[], operator+=, operator+, operator-=, operator-, base]:
Mark as noexcept.
(operator==(const __normal_iterator&, const __normal_iterator&),
operator!=(const __normal_iterator&, const __normal_iterator&),
operator<(const __normal_iterator&, const __normal_iterator&),
operator>(const __normal_iterator&, const __normal_iterator&),
operator<=(const __normal_iterator&, const __normal_iterator&),
operator>=(const __normal_iterator&, const __normal_iterator&),
operator-(const __normal_iterator&, const __normal_iterator&),
operator+(difference_type, const __normal_iterator&)): Likewise.
* include/bits/stl_list.h (list) [splice, _M_check_equal_allocators]:
Likewise.
(list::_M_check_equal_allocators): Abort instead of throwing.
* include/debug/array (array) [operator[], front, back]: Mark as
noexcept.
* include/profile/array (array) [operator[], front, back]: Likewise.
* include/std/array (array) [operator[], front, back]: Likewise.
* include/debug/list (list::splice): Likewise.
* include/profile/list (list::splice): Likewise.
* testsuite/23_containers/list/operations/5.cc: Remove file.
* testsuite/23_containers/list/operations/5.h: Likewise.

--
Marc GlisseIndex: include/bits/stl_iterator.h
===
--- include/bits/stl_iterator.h (revision 202699)
+++ include/bits/stl_iterator.h (working copy)
@@ -714,216 +714,232 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef iterator_traits<_Iterator>   __traits_type;
 
 public:
   typedef _Iteratoriterator_type;
   typedef typename __traits_type::iterator_category iterator_category;
   typedef typename __traits_type::value_type   value_type;
   typedef typename __traits_type::difference_type  difference_type;
   typedef typename __traits_type::referencereference;
   typedef typename __traits_type::pointer  pointer;
 
-  _GLIBCXX_CONSTEXPR __normal_iterator() : _M_current(_Iterator()) { }
+  _GLIBCXX_CONSTEXPR __normal_iterator() _GLIBCXX_NOEXCEPT
+  : _M_current(_Iterator()) { }
 
   explicit
-  __normal_iterator(const _Iterator& __i) : _M_current(__i) { }
+  __normal_iterator(const _Iterator& __i) _GLIBCXX_NOEXCEPT
+  : _M_current(__i) { }
 
   // Allow iterator to const_iterator conversion
   template
 __normal_iterator(const __normal_iterator<_Iter,
  typename __enable_if<
   (std::__are_same<_Iter, typename _Container::pointer>::__value),
- _Container>::__type>& __i)
+ _Container>::__type>& __i) _GLIBCXX_NOEXCEPT
 : _M_current(__i.base()) { }
 
 #if __cplusplus >= 201103L
   __normal_iterator
-  _M_const_cast() const
+  _M_const_cast() const noexcept
   {
using _PTraits = std::pointer_traits;
return __normal_iterator
  (_PTraits::pointer_to(const_cast
(*_M_current)));
   }
 #else
   __normal_iterator
   _M_const_cast() const
   { return *this; }
 #endif
 
   // Forward iterator requirements
   reference
-  operator*() const
+  operator*() const _GLIBCXX_NOEXCEPT
   { return *_M_current; }
 
   pointer
-  operator->() const
+  operator->() const _GLIBCXX_NOEXCEPT
   { return _M_current; }
 
   __normal_iterator&
-  operator++()
+  operator++() _GLIBCXX_NOEXCEPT
   {
++_M_current;
return *this;
   }
 
   __normal_iterator
-  operator++(int)
+  operator++(int) _GLIBCXX_NOEXCEPT
   { return __normal_iterator(_M_current++); }
 
   // Bidirectional iterator requirements
   __normal_iterator&
-  operator--()
+  operator--() _GLIBCXX_NOEXCEPT
   {
--_M_current;
return *this;
   }
 
   __normal_iterator
-  operator--(int)
+  operator--(int) _GLIBCXX_NOEXCEPT
   { return __normal_iterator(_M_current--); }
 
   // Rand

Re: [v3] More noexcept -- 3rd

2013-09-18 Thread Marc Glisse


On Wed, 18 Sep 2013, Paolo Carlini wrote:

On 09/18/2013 05:51 PM, Marc Glisse wrote:

In debug array, we already have throw in noexcept functions, but if I
understand correctly it is only because of syntax limitations for constexpr
functions and aborts before throwing, although the use of
_GLIBCXX_THROW_OR_ABORT is suspicious. In any case, I am not changing this
with my patch.
If I remember correctly, somebody invented that mild hack and suggested it to 
indeed have a check as part of a constexpr function, not a trivial task. Jon 
participated to that discussion. After a while I resurrected that old 
discussion, tested the code and it appeared to work well. In practice, are 
you experiencing specific problems with it?


No, no problem. For some reason I thought there would be issues when the 
macro expands to __builtin_abort(), but there aren't, great.


Any other comments on the patch? (Jon's "great" doesn't really sound like 
an "ok")


--
Marc Glisse

[v3] More noexcept -- 4th

2013-09-18 Thread Marc Glisse


Hello,

I did not touch the regular basic_string because Paulo usually says not to 
touch it, but I could do it as well if wanted. I didn't add noexcept to 
the debug string swap (and move assignments) because the regular 
basic_string swap can currently throw if the allocators are distinct.


bootstrap+testsuite ok.

2013-09-19  Marc Glisse  

PR libstdc++/58338
* include/bits/stl_tree.h (_Rb_tree_node_base) [_S_minimum, _S_maximum]:
Mark as noexcept.
(_Rb_tree_iterator) [_Rb_tree_iterator, operator*, operator->,
operator++, operator--, operator==, operator!=]: Likewise.
(_Rb_tree_const_iterator) [_Rb_tree_const_iterator, _M_const_cast,
operator*, operator->, operator++, operator--, operator==, operator!=]:
Likewise.
(operator==(const _Rb_tree_iterator&, const _Rb_tree_const_iterator&),
operator!=(const _Rb_tree_iterator&, const _Rb_tree_const_iterator&)):
Likewise.
(_Rb_tree) [_M_put_node, _M_destroy_node, _M_root, _M_leftmost,
_M_rightmost, _M_begin, _M_end, _S_left, _S_right, _S_minimum,
_S_maximum]: Likewise.
* include/debug/string (basic_string) [basic_string(const _Allocator&),
shrink_to_fit, operator[], pop_back]: Likewise.
* include/ext/vstring.h (__versa_string) [_M_limit, _M_disjunct,
_M_ibegin, _M_iend, __versa_string(const _Alloc&),
operator=(__versa_string&&), shrink_to_fit, operator[], front,
back, assign(__versa_string&&), swap]: Likewise.
(__versa_string) [__versa_string(), __versa_string(const _Alloc&)]:
Merge.

--
Marc GlisseIndex: include/bits/stl_tree.h
===
--- include/bits/stl_tree.h (revision 202722)
+++ include/bits/stl_tree.h (working copy)
@@ -92,42 +92,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 typedef _Rb_tree_node_base* _Base_ptr;
 typedef const _Rb_tree_node_base* _Const_Base_ptr;
 
 _Rb_tree_color _M_color;
 _Base_ptr  _M_parent;
 _Base_ptr  _M_left;
 _Base_ptr  _M_right;
 
 static _Base_ptr
-_S_minimum(_Base_ptr __x)
+_S_minimum(_Base_ptr __x) _GLIBCXX_NOEXCEPT
 {
   while (__x->_M_left != 0) __x = __x->_M_left;
   return __x;
 }
 
 static _Const_Base_ptr
-_S_minimum(_Const_Base_ptr __x)
+_S_minimum(_Const_Base_ptr __x) _GLIBCXX_NOEXCEPT
 {
   while (__x->_M_left != 0) __x = __x->_M_left;
   return __x;
 }
 
 static _Base_ptr
-_S_maximum(_Base_ptr __x)
+_S_maximum(_Base_ptr __x) _GLIBCXX_NOEXCEPT
 {
   while (__x->_M_right != 0) __x = __x->_M_right;
   return __x;
 }
 
 static _Const_Base_ptr
-_S_maximum(_Const_Base_ptr __x)
+_S_maximum(_Const_Base_ptr __x) _GLIBCXX_NOEXCEPT
 {
   while (__x->_M_right != 0) __x = __x->_M_right;
   return __x;
 }
   };
 
   template
 struct _Rb_tree_node : public _Rb_tree_node_base
 {
   typedef _Rb_tree_node<_Val>* _Link_type;
@@ -160,72 +160,72 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef _Tp& reference;
   typedef _Tp* pointer;
 
   typedef bidirectional_iterator_tag iterator_category;
   typedef ptrdiff_t  difference_type;
 
   typedef _Rb_tree_iterator<_Tp>_Self;
   typedef _Rb_tree_node_base::_Base_ptr _Base_ptr;
   typedef _Rb_tree_node<_Tp>*   _Link_type;
 
-  _Rb_tree_iterator()
+  _Rb_tree_iterator() _GLIBCXX_NOEXCEPT
   : _M_node() { }
 
   explicit
-  _Rb_tree_iterator(_Link_type __x)
+  _Rb_tree_iterator(_Link_type __x) _GLIBCXX_NOEXCEPT
   : _M_node(__x) { }
 
   reference
-  operator*() const
+  operator*() const _GLIBCXX_NOEXCEPT
   { return static_cast<_Link_type>(_M_node)->_M_value_field; }
 
   pointer
-  operator->() const
+  operator->() const _GLIBCXX_NOEXCEPT
   { return std::__addressof(static_cast<_Link_type>
(_M_node)->_M_value_field); }
 
   _Self&
-  operator++()
+  operator++() _GLIBCXX_NOEXCEPT
   {
_M_node = _Rb_tree_increment(_M_node);
return *this;
   }
 
   _Self
-  operator++(int)
+  operator++(int) _GLIBCXX_NOEXCEPT
   {
_Self __tmp = *this;
_M_node = _Rb_tree_increment(_M_node);
return __tmp;
   }
 
   _Self&
-  operator--()
+  operator--() _GLIBCXX_NOEXCEPT
   {
_M_node = _Rb_tree_decrement(_M_node);
return *this;
   }
 
   _Self
-  operator--(int)
+  operator--(int) _GLIBCXX_NOEXCEPT
   {
_Self __tmp = *this;
_M_node = _Rb_tree_decrement(_M_node);
return __tmp;
   }
 
   bool
-  operator==(const _Self& __x) const
+  operator==(co

Re: [v3] More noexcept -- 5th

2013-09-20 Thread Marc Glisse


On Fri, 20 Sep 2013, Paolo Carlini wrote:


On 09/20/2013 09:46 AM, Marc Glisse wrote:

Hello,

for basic_string, I tried not to add lies about exceptions, but I didn't
remove existing ones.
Of course we should not have lies, I thought we didn't, besides maybe special 
cases having to do with the FULLY_DYNAMIC string thing, really a C++98 legacy 
wa, which will not exist in the future. Can you please send an updated patch 
fixing those?


Would you mind if we did that as a separate follow-up patch, unless there 
are other problems with the patch? One is adding noexcept for 
optimization, the other one would be removing some (no intersection) for 
correctness. I'll do it this WE. I'll also need to remove the 
corresponding noexcept from debug/profile mode...


--
Marc Glisse

Re: [v3] More noexcept -- 5th

2013-09-20 Thread Marc Glisse


On Fri, 20 Sep 2013, Paolo Carlini wrote:

By the way, I would be curious at some point to actually see with my eyes the 
effect of those optimizations in the assembly: is it easy to produce 
examples? Even at say -O2?


If you use "if(noexcept(container.shrink_to_fit()))", you can easily cause 
different code to be used. Now whether for regular code it somehow 
produces fewer implicit try-catch or some optimization like that, I have 
no idea. If it ever makes code worse, please beat the core- people who 
required std::terminate with a screen showing the benchmarks (I keep 
wanting to introduce -fno-abort -fno-terminate flags to turn those 2 calls 
into __builtin_unreachable).


Note that I am still a proponent of noexcept(auto), if it can't be the 
default. If someone feels like implementing it as an extension, we could 
use it in the library.


--
Marc Glisse

Re: [patch] Make vector::at() assertion message more useful (try #2)

2013-09-23 Thread Marc Glisse


On Mon, 23 Sep 2013, Paolo Carlini wrote:


There are a lot of targets using unsigned int for size_t, which would
have been uncovered by proper testing.


We can't test all patches on 3-4 different targets... It wasn't obvious 
this patch could be that sensitive to the target.


That's true, just remember to test *both* -m32 and -m64, for non trivial 
changes.


So how do you do that in practice ? Is it done by default if multilib is 
enabled? You also mentioned doing something special to check debug/profile 
modes recently, is there a make target to really perform all the tests 
necessary for a submission?


http://gcc.gnu.org/contribute.html has an outdated section on testing. It 
mentions that you should do a bootstrap for a change to the C front-end 
(should also be for the C++ front-end and I guess libstdc++ even if it 
isn't used much inside gcc).


--
Marc Glisse

[v3] Less noexcept

2013-09-23 Thread Marc Glisse


Hello,

this patch was tested on x86_64 with a bootstrap and a simple make -k 
check, without regression. Note that it doesn't completely fix 56166, it

merely admits that we may currently throw and avoids turning that into
std::terminate.

2013-09-24  Marc Glisse  

PR libstdc++/58338
PR libstdc++/56166
* include/bits/basic_string.h (basic_string)
[basic_string(basic_string&&)]: Make the noexcept conditional.
[operator=(basic_string&&), assign(basic_string&&)]: Link to PR 58265.
[begin(), end(), rbegin(), rend(), clear]: Remove noexcept.
* include/debug/string (basic_string) [basic_string(const _Allocator&),
basic_string(basic_string&&), begin(), end(), rbegin(), rend(), clear,
operator[](size_type), pop_back]: Comment out noexcept, until vstring
replaces basic_string.

--
Marc GlisseIndex: include/bits/basic_string.h
===
--- include/bits/basic_string.h (revision 202831)
+++ include/bits/basic_string.h (working copy)
@@ -502,21 +502,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   basic_string(size_type __n, _CharT __c, const _Alloc& __a = _Alloc());
 
 #if __cplusplus >= 201103L
   /**
*  @brief  Move construct string.
*  @param  __str  Source string.
*
*  The newly-created string contains the exact contents of @a __str.
*  @a __str is a valid, but unspecified string.
**/
-  basic_string(basic_string&& __str) noexcept
+  basic_string(basic_string&& __str)
+#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
+  noexcept
+#endif
   : _M_dataplus(__str._M_dataplus)
   {
 #if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
__str._M_data(_S_empty_rep()._M_refdata());
 #else
__str._M_data(_S_construct(size_type(), _CharT(), get_allocator()));
 #endif
   }
 
   /**
@@ -574,20 +577,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
 #if __cplusplus >= 201103L
   /**
*  @brief  Move assign the value of @a str to this string.
*  @param  __str  Source string.
*
*  The contents of @a str are moved into this string (without copying).
*  @a str is a valid, but unspecified string.
**/
+  // PR 58265, this should be noexcept.
   basic_string&
   operator=(basic_string&& __str)
   {
// NB: DR 1204.
this->swap(__str);
return *this;
   }
 
   /**
*  @brief  Set value to string constructed from initializer %list.
@@ -600,78 +604,78 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return *this;
   }
 #endif // C++11
 
   // Iterators:
   /**
*  Returns a read/write iterator that points to the first character in
*  the %string.  Unshares the string.
*/
   iterator
-  begin() _GLIBCXX_NOEXCEPT
+  begin() // Can throw.
   {
_M_leak();
return iterator(_M_data());
   }
 
   /**
*  Returns a read-only (constant) iterator that points to the first
*  character in the %string.
*/
   const_iterator
   begin() const _GLIBCXX_NOEXCEPT
   { return const_iterator(_M_data()); }
 
   /**
*  Returns a read/write iterator that points one past the last
*  character in the %string.  Unshares the string.
*/
   iterator
-  end() _GLIBCXX_NOEXCEPT
+  end() // Can throw.
   {
_M_leak();
return iterator(_M_data() + this->size());
   }
 
   /**
*  Returns a read-only (constant) iterator that points one past the
*  last character in the %string.
*/
   const_iterator
   end() const _GLIBCXX_NOEXCEPT
   { return const_iterator(_M_data() + this->size()); }
 
   /**
*  Returns a read/write reverse iterator that points to the last
*  character in the %string.  Iteration is done in reverse element
*  order.  Unshares the string.
*/
   reverse_iterator
-  rbegin() _GLIBCXX_NOEXCEPT
+  rbegin() // Can throw.
   { return reverse_iterator(this->end()); }
 
   /**
*  Returns a read-only (constant) reverse iterator that points
*  to the last character in the %string.  Iteration is done in
*  reverse element order.
*/
   const_reverse_iterator
   rbegin() const _GLIBCXX_NOEXCEPT
   { return const_reverse_iterator(this->end()); }
 
   /**
*  Returns a read/write reverse iterator that points to one before the
*  first character in the %string.  Iteration is done in reverse
*  element order.  Unshares the string.
*/
   reverse_iterator
-  rend() _GLIBCXX_NOEXCEPT
+  rend() // Can throw.
   { return reverse_iterator(this->begin()); }
 
   /**
*  Returns a read-only (constant) rev

Re: [v3] Less noexcept

2013-09-23 Thread Marc Glisse


On Mon, 23 Sep 2013, Paolo Carlini wrote:


On 9/23/13 10:55 AM, Marc Glisse wrote:

Hello,

this patch was tested on x86_64 with a bootstrap and a simple make -k 
check, without regression. Note that it doesn't completely fix 56166, it

merely admits that we may currently throw and avoids turning that into
std::terminate.

Of course.

Patch basically Ok, can you be a little less terse in the comments before the 
noexcepts which you are removing and in fact must be there for conformance, 
aren't just QoI? Point to an existing bug, when it exists, like 56166, add a 
FIXME C++11 at least.


Like this?

It is funny that with fully dynamic strings, the copy constructor is 
"better" than the move constructor: faster, doesn't throw, etc. I think we 
should remove the move constructor in that case, or at least make it act 
the same as the copy constructor. I didn't mark the copy constructor as 
noexcept, but without checking the code it seems likely we could.


Also, __versa_string actually has the same issues as basic_string, if you 
choose the reference counted base... I guess I have to do a patch to 
remove noexcept there now :-( Has someone started work on some branch to 
get a real C++11 basic_string, or are we waiting until the "lack of 
manpower" argument convinces everyone to forget about trying to preserve 
any ABI compatibility?


2013-09-24  Marc Glisse  

PR libstdc++/58338
PR libstdc++/56166
* include/bits/basic_string.h (basic_string)
[basic_string(basic_string&&)]: Make the noexcept conditional.
[operator=(basic_string&&), assign(basic_string&&)]: Link to PR 58265.
[begin(), end(), rbegin(), rend(), clear]: Remove noexcept.
[pop_back]: Comment on the lack of noexcept.
* include/debug/string (basic_string) [basic_string(const _Allocator&),
basic_string(basic_string&&), begin(), end(), rbegin(), rend(), clear,
operator[](size_type), pop_back]: Comment out noexcept, until vstring
replaces basic_string.

--
Marc GlisseIndex: include/bits/basic_string.h
===
--- include/bits/basic_string.h (revision 202838)
+++ include/bits/basic_string.h (working copy)
@@ -502,21 +502,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   basic_string(size_type __n, _CharT __c, const _Alloc& __a = _Alloc());
 
 #if __cplusplus >= 201103L
   /**
*  @brief  Move construct string.
*  @param  __str  Source string.
*
*  The newly-created string contains the exact contents of @a __str.
*  @a __str is a valid, but unspecified string.
**/
-  basic_string(basic_string&& __str) noexcept
+  basic_string(basic_string&& __str)
+#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
+  noexcept // FIXME C++11: should always be noexcept.
+#endif
   : _M_dataplus(__str._M_dataplus)
   {
 #if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
__str._M_data(_S_empty_rep()._M_refdata());
 #else
__str._M_data(_S_construct(size_type(), _CharT(), get_allocator()));
 #endif
   }
 
   /**
@@ -574,20 +577,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
 #if __cplusplus >= 201103L
   /**
*  @brief  Move assign the value of @a str to this string.
*  @param  __str  Source string.
*
*  The contents of @a str are moved into this string (without copying).
*  @a str is a valid, but unspecified string.
**/
+  // PR 58265, this should be noexcept.
   basic_string&
   operator=(basic_string&& __str)
   {
// NB: DR 1204.
this->swap(__str);
return *this;
   }
 
   /**
*  @brief  Set value to string constructed from initializer %list.
@@ -600,78 +604,78 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return *this;
   }
 #endif // C++11
 
   // Iterators:
   /**
*  Returns a read/write iterator that points to the first character in
*  the %string.  Unshares the string.
*/
   iterator
-  begin() _GLIBCXX_NOEXCEPT
+  begin() // FIXME C++11: should be noexcept.
   {
_M_leak();
return iterator(_M_data());
   }
 
   /**
*  Returns a read-only (constant) iterator that points to the first
*  character in the %string.
*/
   const_iterator
   begin() const _GLIBCXX_NOEXCEPT
   { return const_iterator(_M_data()); }
 
   /**
*  Returns a read/write iterator that points one past the last
*  character in the %string.  Unshares the string.
*/
   iterator
-  end() _GLIBCXX_NOEXCEPT
+  end() // FIXME C++11: should be noexcept.
   {
_M_leak();
return iterator(_M_data() + this->size());
   }
 
   /**
*  Returns a read-only (constant) iterator that points one

Re: [v3] Less noexcept

2013-09-23 Thread Marc Glisse


On Mon, 23 Sep 2013, Paolo Carlini wrote:


Hi again,
It is funny that with fully dynamic strings, the copy constructor is 
"better" than the move constructor: faster, doesn't throw, etc. I think we 
should remove the move constructor in that case, or at least make it act 
the same as the copy constructor. I didn't mark the copy constructor as 
noexcept, but without checking the code it seems likely we could.


I took a look at the code. First I got a headache because of the allocator 
stuff. And once the paracetamol started helping, I noticed that the 
is_leaked mechanism means it can throw anyway, so I was wrong.


We could, but in my opinion fiddling with those isn't worth the trouble, 
because the whole "fully dynamic string" thing is just a workaround for 
issues of the current reference counted implementation vs the statically 
allocated empty string on some targets.
Well, I have a second (practical) thought about this part. If you are willing 
to spend a little more time on this, and can confirm your preliminary 
analysis about copy-constructor vs move-constructor, first blush it 
definitely makes sense to me, I'm certainly not against your proposal of 
having the move-constructor identical to the copy-constructor in that case. 
In 4.9.x some targets, not Linux, would benefit from it.


Sorry.

--
Marc Glisse

[v3] plus

2013-09-24 Thread Marc Glisse


Hello,

this was only minimally tested since trunk is broken at the moment. There 
don't seem to be specific tests for the existing functors, so a couple 
tests for the new specializations seem good enough.


2013-09-25  Marc Glisse  

* include/bits/stl_function.h: Include  for std::forward.
(bit_not): New class.
(plus, minus ,multiplies, divides,
modulus, negate, equal_to, not_equal_to,
greater, less, greater_equal, less_equal,
logical_and, logical_or, logical_not, bit_and,
bit_or, bit_xor, bit_not): New specializations.
* testsuite/20_util/headers/functional/generic_function_objects.cc:
New file.

--
Marc GlisseIndex: include/bits/stl_function.h
===
--- include/bits/stl_function.h (revision 202872)
+++ include/bits/stl_function.h (working copy)
@@ -49,20 +49,22 @@
  */
 
 /** @file bits/stl_function.h
  *  This is an internal header file, included by other library headers.
  *  Do not attempt to use it directly. @headername{functional}
  */
 
 #ifndef _STL_FUNCTION_H
 #define _STL_FUNCTION_H 1
 
+#include  // for std::forward
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // 20.3.1 base classes
   /** @defgroup functors Function Objects
* @ingroup utilities
*
*  Function objects, or @e functors, are objects with an @c operator()
*  defined and accessible.  They can be passed as arguments to algorithm
@@ -129,201 +131,514 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @ingroup functors
*
*  Because basic math often needs to be done during an algorithm,
*  the library provides functors for those operations.  See the
*  documentation for @link functors the base classes@endlink
*  for examples of their use.
*
*  @{
*/
   /// One of the @link arithmetic_functors math functors@endlink.
-  template
+  template
 struct plus : public binary_function<_Tp, _Tp, _Tp>
 {
   _Tp
   operator()(const _Tp& __x, const _Tp& __y) const
   { return __x + __y; }
 };
 
+#if __cplusplus >= 201103L
+  template<>
+struct plus
+{
+  template 
+  constexpr auto
+  operator()(_Tp&& __x, _Up&& __y) const
+  noexcept(noexcept(std::forward<_Tp>(__x) + std::forward<_Up>(__y)))
+  -> decltype(std::forward<_Tp>(__x) + std::forward<_Up>(__y))
+  {
+   return std::forward<_Tp>(__x) + std::forward<_Up>(__y);
+  }
+  typedef void is_transparent;
+  };
+#endif
+
   /// One of the @link arithmetic_functors math functors@endlink.
-  template
+  template
 struct minus : public binary_function<_Tp, _Tp, _Tp>
 {
   _Tp
   operator()(const _Tp& __x, const _Tp& __y) const
   { return __x - __y; }
 };
 
+#if __cplusplus >= 201103L
+  template<>
+struct minus
+{
+  template 
+  constexpr auto
+  operator()(_Tp&& __x, _Up&& __y) const
+  noexcept(noexcept(std::forward<_Tp>(__x) - std::forward<_Up>(__y)))
+  -> decltype(std::forward<_Tp>(__x) - std::forward<_Up>(__y))
+  {
+   return std::forward<_Tp>(__x) - std::forward<_Up>(__y);
+  }
+  typedef void is_transparent;
+  };
+#endif
+
   /// One of the @link arithmetic_functors math functors@endlink.
-  template
+  template
 struct multiplies : public binary_function<_Tp, _Tp, _Tp>
 {
   _Tp
   operator()(const _Tp& __x, const _Tp& __y) const
   { return __x * __y; }
 };
 
+#if __cplusplus >= 201103L
+  template<>
+struct multiplies
+{
+  template 
+  constexpr auto
+  operator()(_Tp&& __x, _Up&& __y) const
+  noexcept(noexcept(std::forward<_Tp>(__x) * std::forward<_Up>(__y)))
+  -> decltype(std::forward<_Tp>(__x) * std::forward<_Up>(__y))
+  {
+   return std::forward<_Tp>(__x) * std::forward<_Up>(__y);
+  }
+  typedef void is_transparent;
+  };
+#endif
+
   /// One of the @link arithmetic_functors math functors@endlink.
-  template
+  template
 struct divides : public binary_function<_Tp, _Tp, _Tp>
 {
   _Tp
   operator()(const _Tp& __x, const _Tp& __y) const
   { return __x / __y; }
 };
 
+#if __cplusplus >= 201103L
+  template<>
+struct divides
+{
+  template 
+  constexpr auto
+  operator()(_Tp&& __x, _Up&& __y) const
+  noexcept(noexcept(std::forward<_Tp>(__x) / std::forward<_Up>(__y)))
+  -> decltype(std::forward<_Tp>(__x) / std::forward<_Up>(__y))
+  {
+   return std::forward<_Tp>(__x) / std::forward<_Up>(__y);
+  }
+  typedef void is_transparent;
+  };
+#endif
+
   /// One of the @link arithme

Re: [v3] plus

2013-09-24 Thread Marc Glisse


On Wed, 25 Sep 2013, Jonathan Wakely wrote:


I've had this sitting in my tree waiting to do something with,


I did ask last week if someone had done it already...

I'm about to go to sleep so didn't check if the test covers anything 
yours doesn't.


In the test you have basic cover for all functors, and I cover only 2 
cases (more specifically though, since I look at the return type cv-qual).


In my patch, I added constexpr and noexcept, I couldn't see any reason not 
to for such basic utilities. Yes, I did read the wiki and noticed the vote 
yesterday about constexpr, but imho that's wrong.



It looks like your patch adds the default template argument even in
C++98 mode, I avoided that by putting forward declarations at the top
of the file, in a #if block.


This only lets me write:
std::plus<> *p;
in C++98 mode (std::plus<> p; gives an error), doesn't seem that bad.

And I also add the void specializations in C++11 mode, as an extension.

Well, let's forget my patch and go with yours, though at least adding 
noexcept seems like a good idea.


(too bad we don't have noexcept(auto) here)
(too bad we can't use decltype(auto) for the return type, as that would 
disable sfinae, it is a bad sign when the standard turns its nose up at 
its own dog food...)


--
Marc Glisse

Re: operator new returns nonzero

2013-10-02 Thread Marc Glisse

It isn't a front-end patch, but it is still a C++ patch, maybe Jason will 
have comments? Anyone else?


On Mon, 16 Sep 2013, Marc Glisse wrote:

Nobody has expressed concern for a week, so it may be worth doing an official 
review ;-)


http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00676.html

On Mon, 9 Sep 2013, Marc Glisse wrote:


I have now tested bootstrap+testsuite and there was no regression.

2013-09-07  Marc Glisse  

PR c++/19476
gcc/
* fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new.
* tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp):
Likewise.
(vrp_visit_stmt): Remove duplicated code.

gcc/testsuite/
* g++.dg/tree-ssa/pr19476-1.C: New file.
* g++.dg/tree-ssa/pr19476-2.C: Likewise.
* g++.dg/tree-ssa/pr19476-3.C: Likewise.
* g++.dg/tree-ssa/pr19476-4.C: Likewise.


--
Marc Glisse

Re: operator new returns nonzero

2013-10-02 Thread Marc Glisse


On Wed, 2 Oct 2013, Jakub Jelinek wrote:


On Mon, Sep 09, 2013 at 10:49:40PM +0200, Marc Glisse wrote:

--- fold-const.c(revision 202413)
+++ fold-const.c(working copy)
@@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool
 case MODIFY_EXPR:
 case BIND_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1),
strict_overflow_p);

 case SAVE_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0),
strict_overflow_p);

 case CALL_EXPR:
-  return alloca_call_p (t);
+  {
+   tree fn = CALL_EXPR_FN (t);
+   if (TREE_CODE (fn) != ADDR_EXPR) return false;
+   tree fndecl = TREE_OPERAND (fn, 0);
+   if (TREE_CODE (fndecl) != FUNCTION_DECL) return false;
+   if (flag_delete_null_pointer_checks && !flag_check_new
+   && DECL_IS_OPERATOR_NEW (fndecl)
+   && !TREE_NOTHROW (fndecl))
+ return true;
+   return alloca_call_p (t);


Not commenting on what this patch does, but how:
why don't you use tree fndecl = get_callee_fndecl (t);
if (fndecl && ...) return true; instead?


Because I copied the code from alloca_call_p ;-)
get_callee_fndecl does look better indeed.


Perhaps alloca_call_p should use it too.


Thanks, I'll prepare a new patch.

--
Marc Glisse

Make the 2 versions of delete more similar

2013-10-02 Thread Marc Glisse


Hello,

I don't understand why those 2 files differ by more than 1 extra argument, 
so I am changing that.


Bootstrap and testsuite on x86_64.

2013-10-03  Marc Glisse  

* libsupc++/del_op.cc (operator delete): Don't test for 0 before free.
* libsupc++/del_opnt.cc (free): Only declare if freestanding.
(operator delete): Qualify free with std::.

--
Marc GlisseIndex: libsupc++/del_op.cc
===
--- libsupc++/del_op.cc (revision 203101)
+++ libsupc++/del_op.cc (working copy)
@@ -36,13 +36,12 @@ _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 #else
 # include 
 #endif
 
 #include "new"
 
 _GLIBCXX_WEAK_DEFINITION void
 operator delete(void* ptr) _GLIBCXX_USE_NOEXCEPT
 {
-  if (ptr)
-std::free(ptr);
+  std::free(ptr);
 }
Index: libsupc++/del_opnt.cc
===
--- libsupc++/del_opnt.cc   (revision 203101)
+++ libsupc++/del_opnt.cc   (working copy)
@@ -17,19 +17,31 @@
 // Under Section 7 of GPL version 3, you are granted additional
 // permissions described in the GCC Runtime Library Exception, version
 // 3.1, as published by the Free Software Foundation.
 
 // You should have received a copy of the GNU General Public License and
 // a copy of the GCC Runtime Library Exception along with this program;
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // <http://www.gnu.org/licenses/>.
 
 #include 
-#include "new"
 
-extern "C" void free (void *);
+#if !_GLIBCXX_HOSTED
+// A freestanding C runtime may not provide "free" -- but there is no
+// other reasonable way to implement "operator delete".
+namespace std
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+  extern "C" void free(void*);
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace
+#else
+# include 
+#endif
+
+#include "new"
 
 _GLIBCXX_WEAK_DEFINITION void
 operator delete (void *ptr, const std::nothrow_t&) _GLIBCXX_USE_NOEXCEPT
 {
-  free (ptr);
+  std::free(ptr);
 }

Re: operator new returns nonzero

2013-10-02 Thread Marc Glisse


New version after Jakub's comment, bootstrap and testsuite on x86_64.

2013-10-03  Marc Glisse  

PR c++/19476
gcc/
* calls.c (alloca_call_p): Use get_callee_fndecl.
* fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new.
* tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp):
Likewise.
(vrp_visit_stmt): Remove duplicated code.

gcc/testsuite/
* g++.dg/tree-ssa/pr19476-1.C: New file.
* g++.dg/tree-ssa/pr19476-2.C: Likewise.
* g++.dg/tree-ssa/pr19476-3.C: Likewise.
* g++.dg/tree-ssa/pr19476-4.C: Likewise.


--
Marc GlisseIndex: calls.c
===
--- calls.c (revision 203101)
+++ calls.c (working copy)
@@ -628,25 +628,24 @@ gimple_alloca_call_p (const_gimple stmt)
 return true;
 
   return false;
 }
 
 /* Return true when exp contains alloca call.  */
 
 bool
 alloca_call_p (const_tree exp)
 {
+  tree fndecl;
   if (TREE_CODE (exp) == CALL_EXPR
-  && TREE_CODE (CALL_EXPR_FN (exp)) == ADDR_EXPR
-  && (TREE_CODE (TREE_OPERAND (CALL_EXPR_FN (exp), 0)) == FUNCTION_DECL)
-  && (special_function_p (TREE_OPERAND (CALL_EXPR_FN (exp), 0), 0)
- & ECF_MAY_BE_ALLOCA))
+  && (fndecl = get_callee_fndecl (exp))
+  && (special_function_p (fndecl, 0) & ECF_MAY_BE_ALLOCA))
 return true;
   return false;
 }
 
 /* Return TRUE if FNDECL is either a TM builtin or a TM cloned
function.  Return FALSE otherwise.  */
 
 static bool
 is_tm_builtin (const_tree fndecl)
 {
Index: fold-const.c
===
--- fold-const.c(revision 203101)
+++ fold-const.c(working copy)
@@ -16215,21 +16215,29 @@ tree_expr_nonzero_warnv_p (tree t, bool
 case MODIFY_EXPR:
 case BIND_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1),
strict_overflow_p);
 
 case SAVE_EXPR:
   return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0),
strict_overflow_p);
 
 case CALL_EXPR:
-  return alloca_call_p (t);
+  {
+   tree fndecl = get_callee_fndecl (t);
+   if (!fndecl) return false;
+   if (flag_delete_null_pointer_checks && !flag_check_new
+   && DECL_IS_OPERATOR_NEW (fndecl)
+   && !TREE_NOTHROW (fndecl))
+ return true;
+   return alloca_call_p (t);
+  }
 
 default:
   break;
 }
   return false;
 }
 
 /* Return true when T is an address and is known to be nonzero.
Handle warnings about undefined signed overflow.  */
 
Index: testsuite/g++.dg/tree-ssa/pr19476-1.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-1.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-1.C   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ccp1" } */
+
+#include 
+
+int f(){
+  return 33 + (0 == new(std::nothrow) int);
+}
+int g(){
+  return 42 + (0 == new int[50]);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */
+/* { dg-final { cleanup-tree-dump "ccp1" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C
___
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Index: testsuite/g++.dg/tree-ssa/pr19476-2.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-2.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-2.C   (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#include 
+
+int f(){
+  int *p = new(std::nothrow) int;
+  return 33 + (0 == p);
+}
+int g(){
+  int *p = new int[50];
+  return 42 + (0 == p);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C
___
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: testsuite/g++.dg/tree-ssa/pr19476-3.C
===
--- testsuite/g++.dg/tree-ssa/pr19476-3.C   (revision 0)
+++ testsuite/g++.dg/tree-ssa/pr19476-3.C   (working copy)
@@ -0,0 +1,11

Re: operator new returns nonzero

2013-10-02 Thread Marc Glisse


On Wed, 2 Oct 2013, Jason Merrill wrote:


On 10/02/2013 08:33 AM, Marc Glisse wrote:

+   if (flag_delete_null_pointer_checks && !flag_check_new


You can't use flag_check_new in language-independent code without moving it 
from c.opt to common.opt.


Thanks, that makes sense and I'll do that, but I am surprised that the 
fortran and java compilers built without complaining.


--
Marc Glisse

Re: Make the 2 versions of delete more similar

2013-10-02 Thread Marc Glisse


On Wed, 2 Oct 2013, Christopher Jefferson wrote:


On 2 October 2013 13:28, Marc Glisse  wrote:

Hello,

I don't understand why those 2 files differ by more than 1 extra argument,
so I am changing that.

Bootstrap and testsuite on x86_64.

2013-10-03  Marc Glisse  

* libsupc++/del_op.cc (operator delete): Don't test for 0 before
free.


Just checking, for the nervous:

Is the plan that this change will not effect any code behaviour (as
correct implementations of free are happy to take a NULL pointer, and
not do anything)?


Yes. As far as I can tell from the logs, there was a massive change from 
if(p)free(p) to just free(p) that missed this file, while the patch on how 
to declare free missed the other.


--
Marc Glisse

Re: vec_cond_expr adjustments

2012-09-28 Thread Marc Glisse


On Fri, 28 Sep 2012, Richard Guenther wrote:


On Fri, Sep 28, 2012 at 12:42 AM, Marc Glisse  wrote:

Hello,

I have been experimenting with generating VEC_COND_EXPR from the front-end,
and these are just a couple things I noticed.

1) optabs.c requires that the first argument of vec_cond_expr be a
comparison, but verify_gimple_assign_ternary only checks is_gimple_condexpr,
like for COND_EXPR. In the long term, it seems better to also allow ssa_name
and vector_cst (thus match the is_gimple_condexpr condition), but for now I
just want to know early if I created an invalid vec_cond_expr.


optabs should be fixed instead, an is_gimple_val condition is implicitely
val != 0.


For vectors, I think it should be val < 0 (with an appropriate cast of val 
to a signed integer vector type if necessary). Or (val & highbit) != 0, 
but that's longer.



The tree.[ch] and gimple-fold.c hunks are ok if tested properly, the
tree-ssa-forwprop.c idea of using TREE_TYPE (cond), too.


Ok, I will retest that way.


I don't like the tree-cfg.c change, instead re-factor optabs.c to
get a decomposed cond for vector_compare_rtx and appropriately
"decompose" a non-comparison-class cond in expand_vec_cond_expr.


So vector_compare_rtx will take as arguments rcode, t_op0, t_op1 instead 
of cond. And in expand_vec_cond_expr, if I have a condition, I pass its 
elements to vector_compare_rtx, and otherwise I use 0 and the code for 
LT_EXPR as the other arguments.



If we for example have

predicate = a < b;
x = predicate ? d : e;
y = predicate ? f : g;

we ideally want to re-use the predicate computation on targets where
that would be optimal (and combine should be able to recover the
case where it is not).


That I don't understand. The vcond instruction implemented by targets 
takes as arguments d, e, cmp, a, b and emits the comparison itself. I 
don't see how I can avoid sending to the targets both (d,e,<,a,b) and 
(f,g,<,a,b). They will notice eventually that aremove one of the two, but I don't see how to do that in optabs.c. Or I 
can compute x = a < b, use x < 0 as the comparison passed to the targets, 
and expect targets (those for which it is true) to recognize that < 0 is 
useless in a vector condition (PR54700), or is useless on a comparison 
result.


Thanks for the comments,

--
Marc Glisse

Constant-fold vector comparisons

2012-09-29 Thread Marc Glisse


Hello,

this patch does 2 things (I should have split it in 2, but the questions 
go together):


1) it handles constant folding of vector comparisons,

2) it fixes another place where vectors are not expected (I'll probably 
wait to have front-end support and testcases to do more of those, but 
there is something to discuss).


I wasn't sure what integer_truep should test exactly. For integer: == 1 or 
!= 0? For vectors: == -1 or < 0? I chose the one that worked best for the 
forwprop case where I used it.


It seems that before this patch, the middle-end didn't know how comparison 
results were encoded (a good reason for VEC_COND_EXPR to require a 
comparison as its first argument). I am using the OpenCL encoding that 
what matters is the high bit of each vector element. I am not quite sure 
what happens for targets (are there any?) that use a different encoding. 
When expanding vcond, they can do the comparison as they like. When 
expanding an isolated comparison, I expect they have to expand it as 
vcond(asomething.



2012-10-01  Marc Glisse  

gcc/
* tree.c (integer_truep): New function.
* tree.h (integer_truep): Declare.
* tree-ssa-forwprop.c (forward_propagate_into_cond): Call it.
Don't use boolean_type_node for vectors.
* fold-const.c (fold_relational_const): Handle VECTOR_CST.

gcc/testsuite/
* gcc.dg/tree-ssa/foldconst-6.c: New testcase.

--
Marc GlisseIndex: gcc/tree.h
===
--- gcc/tree.h  (revision 191850)
+++ gcc/tree.h  (working copy)
@@ -5272,20 +5272,25 @@ extern int integer_zerop (const_tree);
 
 /* integer_onep (tree x) is nonzero if X is an integer constant of value 1.  */
 
 extern int integer_onep (const_tree);
 
 /* integer_all_onesp (tree x) is nonzero if X is an integer constant
all of whose significant bits are 1.  */
 
 extern int integer_all_onesp (const_tree);
 
+/* integer_truep (tree x) is nonzero if X is an integer constant of value 1,
+   or a vector constant of value < 0.  */
+
+extern bool integer_truep (const_tree);
+
 /* integer_pow2p (tree x) is nonzero is X is an integer constant with
exactly one bit 1.  */
 
 extern int integer_pow2p (const_tree);
 
 /* integer_nonzerop (tree x) is nonzero if X is an integer constant
with a nonzero value.  */
 
 extern int integer_nonzerop (const_tree);
 
Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 191850)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -564,46 +564,46 @@ forward_propagate_into_cond (gimple_stmt
   enum tree_code code;
   tree name = cond;
   gimple def_stmt = get_prop_source_stmt (name, true, NULL);
   if (!def_stmt || !can_propagate_from (def_stmt))
return 0;
 
   code = gimple_assign_rhs_code (def_stmt);
   if (TREE_CODE_CLASS (code) == tcc_comparison)
tmp = fold_build2_loc (gimple_location (def_stmt),
   code,
-  boolean_type_node,
+  TREE_TYPE (cond),
   gimple_assign_rhs1 (def_stmt),
   gimple_assign_rhs2 (def_stmt));
   else if ((code == BIT_NOT_EXPR
&& TYPE_PRECISION (TREE_TYPE (cond)) == 1)
   || (code == BIT_XOR_EXPR
-  && integer_onep (gimple_assign_rhs2 (def_stmt
+  && integer_truep (gimple_assign_rhs2 (def_stmt
{
  tmp = gimple_assign_rhs1 (def_stmt);
  swap = true;
}
 }
 
   if (tmp
   && is_gimple_condexpr (tmp))
 {
   if (dump_file && tmp)
{
  fprintf (dump_file, "  Replaced '");
  print_generic_expr (dump_file, cond, 0);
  fprintf (dump_file, "' with '");
  print_generic_expr (dump_file, tmp, 0);
  fprintf (dump_file, "'\n");
}
 
-  if (integer_onep (tmp))
+  if (integer_truep (tmp))
gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs2 (stmt));
   else if (integer_zerop (tmp))
gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs3 (stmt));
   else
{
  gimple_assign_set_rhs1 (stmt, unshare_expr (tmp));
  if (swap)
{
  tree t = gimple_assign_rhs2 (stmt);
  gimple_assign_set_rhs2 (stmt, gimple_assign_rhs3 (stmt));
Index: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0)
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-ccp1" } */
+
+typedef long vec __attribute__ ((vector_size (2 * sizeof(long;
+
+

Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector

2012-09-30 Thread Marc Glisse


On Sat, 29 Sep 2012, Eric Botcazou wrote:


this patch lets the compiler try to rewrite:

(vec_concat (vec_select x [a]) (vec_select x [b]))

as:

vec_select x [a b]

or even just "x" if appropriate.

[...]

OK, but:

1) Always add a comment describing the simplification when you add one,

2) The condition:


+   if (GET_MODE (XEXP (trueop0, 0)) == mode
+   && INTVAL (XVECEXP (XEXP (trueop1, 1), 0, 0))
+  - INTVAL (XVECEXP (XEXP (trueop0, 1), 0, 0)) == 1)
+ return XEXP (trueop0, 0);


can be simplified: if GET_MODE (XEXP (trueop0, 0)) == mode, then XEXP
(trueop0, 0) is a 2-element vector so the only possible case is (0,1).
That would probably even be more correct since you don't test CONST_INT_P for
the indices, while the test is done in the VEC_SELECT case.


It looks like I was trying to be clever by replacing 2 understandable 
tests with a single more obscure one, bad idea.



Why not generalizing to all kinds of VEC_SELECTs instead of just scalar ones?


Ok, I changed the patch a bit to handle arbitrary VEC_SELECTs, and moved 
the identity recognition to VEC_SELECT handling (where it belonged). 
Testing with non-scalar VEC_SELECTs was limited though, because they are 
not that easy to generate. Also, the identity case is the only one where 
it actually optimized. To handle more cases, I'd have to look through 
several layers of VEC_SELECTs, which gets a bit complicated (for instance, 
the permutation 0,1,3,2 will appear as a vec_concat of a 
vec_select(v,[0,1]) and a vec_select(vec_select(v,[2,3]),[1,0]), or worse 
with a vec_concat in the middle). It also didn't optimize 3,2,3,2, 
possibly because that meant substituting the same rtx twice (I didn't go 
that far in gdb). Then there is also the vec_duplicate case (I should try 
to replace vec_duplicate with vec_concat in simplify-rtx to see what 
happens...). Still, the identity case is nice to have.


Thanks for your comments.

bootstrap+testsuite on x86_64-linux-gnu with default languages.

2012-09-09  Marc Glisse  

gcc/
* simplify-rtx.c (simplify_binary_operation_1) :
Detect the identity.
: Handle VEC_SELECTs from the same vector.

gcc/testsuite/
* gcc.target/i386/vect-rebuild.c: New testcase.


--
Marc GlisseIndex: gcc/testsuite/gcc.target/i386/vect-rebuild.c
===
--- gcc/testsuite/gcc.target/i386/vect-rebuild.c(revision 0)
+++ gcc/testsuite/gcc.target/i386/vect-rebuild.c(revision 0)
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mavx -fno-tree-forwprop" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+typedef double v4df __attribute__ ((__vector_size__ (32)));
+
+v2df f1 (v2df x)
+{
+  v2df xx = { x[0], x[1] };
+  return xx;
+}
+
+v4df f2 (v4df x)
+{
+  v4df xx = { x[0], x[1], x[2], x[3] };
+  return xx;
+}
+
+v2df g (v2df x)
+{
+  v2df xx = { x[1], x[0] };
+  return xx;
+}
+
+v2df h (v4df x)
+{
+  v2df xx = { x[2], x[3] };
+  return xx;
+}
+
+/* { dg-final { scan-assembler-not "unpck" } } */
+/* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
+/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */

Property changes on: gcc/testsuite/gcc.target/i386/vect-rebuild.c
___
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  (revision 191865)
+++ gcc/simplify-rtx.c  (working copy)
@@ -3239,20 +3239,37 @@ simplify_binary_operation_1 (enum rtx_co
  rtx x = XVECEXP (trueop1, 0, i);
 
  gcc_assert (CONST_INT_P (x));
  RTVEC_ELT (v, i) = CONST_VECTOR_ELT (trueop0,
   INTVAL (x));
}
 
  return gen_rtx_CONST_VECTOR (mode, v);
}
 
+ /* Recognize the identity.  */
+ if (GET_MODE (trueop0) == mode)
+   {
+ bool maybe_ident = true;
+ for (int i = 0; i < XVECLEN (trueop1, 0); i++)
+   {
+ rtx j = XVECEXP (trueop1, 0, i);
+ if (!CONST_INT_P (j) || INTVAL (j) != i)
+   {
+ maybe_ident = false;
+ break;
+   }
+   }
+ if (maybe_ident)
+   return trueop0;
+   }
+
  /* If we build {a,b} then permute it, build the result directly.  */
  if (XVECLEN (trueop1, 0) == 2
  && CONST_INT_P (XVECEXP (trueop1, 0, 0))
  && CONST_INT_P (XVECEXP (trueop1, 0, 1))
  && GET_CODE (trueop0) == VEC_CONCAT
  && GET_CODE (XEXP (trueop0,

Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector

2012-10-01 Thread Marc Glisse


On Mon, 1 Oct 2012, Eric Botcazou wrote:


2012-09-09  Marc Glisse  

gcc/
* simplify-rtx.c (simplify_binary_operation_1) :
Detect the identity.
: Handle VEC_SELECTs from the same vector.

gcc/testsuite/
* gcc.target/i386/vect-rebuild.c: New testcase.


OK if you adjust the above date and add the missing space at the end of:

/* Try to merge 2 VEC_SELECTs from the same vector into a single one. */


I was trying to avoid splitting in 2 lines, but ok I'll split.

Thank you for the quick reply,

--
Marc Glisse

Re: vec_cond_expr adjustments

2012-10-01 Thread Marc Glisse


[merging both threads, thanks for the answers]

On Mon, 1 Oct 2012, Richard Guenther wrote:


optabs should be fixed instead, an is_gimple_val condition is implicitely
val != 0.


For vectors, I think it should be val < 0 (with an appropriate cast of val
to a signed integer vector type if necessary). Or (val & highbit) != 0, but
that's longer.


I don't think so.  Throughout the compiler we generally assume false == 0
and anything else is true.  (yes, for FP there is STORE_FLAG_VALUE, but
it's scope is quite limited - if we want sth similar for vectors we'd have to
invent it).


See below.


If we for example have

predicate = a < b;
x = predicate ? d : e;
y = predicate ? f : g;

we ideally want to re-use the predicate computation on targets where
that would be optimal (and combine should be able to recover the
case where it is not).


That I don't understand. The vcond instruction implemented by targets takes
as arguments d, e, cmp, a, b and emits the comparison itself. I don't see
how I can avoid sending to the targets both (d,e,<,a,b) and (f,g,<,a,b).
They will notice eventually that a

But that's a limitation of how vcond works.  ISTR there is/was a vselect
instruction as well, taking a "mask" and two vectors to select from.  At least
that's how vcond works internally for some sub-targets.


vselect seems to only appear in config/. Would it be defined as:
vselect(m,a,b)=(a&m)|(b&~m) ? I would almost be tempted to just define a 
pattern in .md files and let combine handle it, although it might be one 
instruction too long for that (and if m is x=y).

Or would it match the OpenCL select: "For each component of a vector type,
result[i] = if MSB of c[i] is set ? b[i] : a[i]."? Or the pattern with &
and | but with a precondition that the value of each element of the mask
must be 0 or ±1?

I don't find vcond that bad, as long as targets check for trivial 
comparisons in the expansion (what trivial means may depend on the 
platform). It is quite flexible for targets.



On Mon, 1 Oct 2012, Richard Guenther wrote:


tmp = fold_build2_loc (gimple_location (def_stmt),
   code,
-  boolean_type_node,
+  TREE_TYPE (cond),


That's obvious.


Ok, I'll test and commit that line separately.


+  if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
+{
+  int count = VECTOR_CST_NELTS (op0);
+  tree *elts =  XALLOCAVEC (tree, count);
+  gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
+
+  for (int i = 0; i < count; i++)
+   {
+ tree elem_type = TREE_TYPE (type);
+ tree elem0 = VECTOR_CST_ELT (op0, i);
+ tree elem1 = VECTOR_CST_ELT (op1, i);
+
+ elts[i] = fold_relational_const (code, elem_type,
+  elem0, elem1);
+
+ if(elts[i] == NULL_TREE)
+   return NULL_TREE;
+
+ elts[i] = fold_negate_const (elts[i], elem_type);


I think you need to invent something new similar to STORE_FLAG_VALUE
or use STORE_FLAG_VALUE here.  With the above you try to map
{0, 1} to {0, -1} which is only true if the operation on the element types
returns {0, 1} (thus, STORE_FLAG_VALUE is 1).


Er, seems to me that constant folding of a scalar comparison in the
front/middle-end only returns {0, 1}.


+/* Return true if EXPR is an integer constant representing true.  */
+
+bool
+integer_truep (const_tree expr)
+{
+  STRIP_NOPS (expr);
+
+  switch (TREE_CODE (expr))
+{
+case INTEGER_CST:
+  /* Do not just test != 0, some places expect the value 1.  */
+  return (TREE_INT_CST_LOW (expr) == 1
+ && TREE_INT_CST_HIGH (expr) == 0);


I wonder if using STORE_FLAG_VALUE is better here (note that it
usually differs for FP vs. integral comparisons and the mode passed
to STORE_FLAG_VALUE is that of the comparison result).


I notice there is already a VECTOR_STORE_FLAG_VALUE (used only once in
simplify-rtx, in a way that seems a bit strange but I'll try to
understand that later). Thanks for showing me this macro, it seems
important indeed. However the STORE_FLAG_VALUE mechanism seems to be for
the RTL level.

It looks like it would be possible to have 3 different semantics:
source code is OpenCL, middle-end whatever we want (0 / 1 for instance),
and back-end is whatever the target wants. The front-end would generate
for a
That said, until we are sure what semantics we want here (forwprop
for example doesn't look at 'comparisons' but operations on special
values and types) I'd prefer to not introduce integer_truep ().


I completely agree that defining the semantics comes first :-)

--
Marc Glisse

abs(long long)

2012-10-02 Thread Marc Glisse


Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to 
do, but still) is meant to return a double...
* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it 
should use defined(__SIZEOF_INT128__), which would help other compilers.
* newlib has llabs, according to the doc. It would be good to know what 
newlib is missing for libstdc++ to detect it as C99-ready.


I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu 
and Oleg Endo did a basic check on sh/newlib. I'll do a last check after 
the review (no point if the patch needs changing again).


2012-10-02  Marc Glisse  

PR libstdc++/54686
* include/c_std/cstdlib (abs(long long)): Define fallback whenever
we have long long but possibly not llabs.
(abs(long long)): Use llabs when available.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.
* testsuite/26_numerics/headers/cstdlib/54686.c: New file.

--
Marc GlisseIndex: include/c_std/cstdlib
===
--- include/c_std/cstdlib   (revision 191941)
+++ include/c_std/cstdlib   (working copy)
@@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using ::strtoul;
   using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
   using ::wcstombs;
   using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 
   inline long
   abs(long __i) { return labs(__i); }
 
+#if defined (_GLIBCXX_USE_LONG_LONG) \
+&& (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC)
+  // Fallback version if we don't have llabs but still allow long long.
+  inline long long
+  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  inline __int128
+  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
   inline ldiv_t
   div(long __i, long __j) { return ldiv(__i, __j); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #if _GLIBCXX_USE_C99
 
 #undef _Exit
 #undef llabs
@@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
   extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
   using ::_Exit;
 #endif
 
+#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   inline long long
-  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+  abs(long long __x) { return ::llabs (__x); }
 
-#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::llabs;
 
   inline lldiv_t
   div(long long __n, long long __d)
-  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+  { return ::lldiv (__n, __d); }
 
   using ::lldiv;
 #endif
 
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   extern "C" long long int (atoll)(const char *) throw ();
   extern "C" long long int
 (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
   extern "C" unsigned long long int
 (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::lldiv_t;
 #endif
   using ::__gnu_cxx::_Exit;
-  using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
+  using ::__gnu_cxx::abs;
   using ::__gnu_cxx::llabs;
   using ::__gnu_cxx::div;
   using ::__gnu_cxx::lldiv;
 #endif
   using ::__gnu_cxx::atoll;
   using ::__gnu_cxx::strtof;
   using ::__gnu_cxx::strtoll;
   using ::__gnu_cxx::strtoull;
   using ::__gnu_cxx::strtold;
 } // namespace std
Index: testsuite/26_numerics/headers/cstdlib/54686.c
===
--- testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
+++ testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+// { dg-options "-std=c++11" }
+
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.or

abs(long long)

2012-10-02 Thread Marc Glisse


(Forgot libstdc++...)

Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but 
still) is meant to return a double...
* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should 
use defined(__SIZEOF_INT128__), which would help other compilers.
* newlib has llabs, according to the doc. It would be good to know what newlib 
is missing for libstdc++ to detect it as C99-ready.


I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and 
Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review 
(no point if the patch needs changing again).


2012-10-02  Marc Glisse  

PR libstdc++/54686
* include/c_std/cstdlib (abs(long long)): Define fallback whenever
we have long long but possibly not llabs.
(abs(long long)): Use llabs when available.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.
* testsuite/26_numerics/headers/cstdlib/54686.c: New file.

--
Marc GlisseIndex: include/c_std/cstdlib
===
--- include/c_std/cstdlib   (revision 191941)
+++ include/c_std/cstdlib   (working copy)
@@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using ::strtoul;
   using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
   using ::wcstombs;
   using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 
   inline long
   abs(long __i) { return labs(__i); }
 
+#if defined (_GLIBCXX_USE_LONG_LONG) \
+&& (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC)
+  // Fallback version if we don't have llabs but still allow long long.
+  inline long long
+  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  inline __int128
+  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
   inline ldiv_t
   div(long __i, long __j) { return ldiv(__i, __j); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #if _GLIBCXX_USE_C99
 
 #undef _Exit
 #undef llabs
@@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
   extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
   using ::_Exit;
 #endif
 
+#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   inline long long
-  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+  abs(long long __x) { return ::llabs (__x); }
 
-#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::llabs;
 
   inline lldiv_t
   div(long long __n, long long __d)
-  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+  { return ::lldiv (__n, __d); }
 
   using ::lldiv;
 #endif
 
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   extern "C" long long int (atoll)(const char *) throw ();
   extern "C" long long int
 (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
   extern "C" unsigned long long int
 (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::lldiv_t;
 #endif
   using ::__gnu_cxx::_Exit;
-  using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
+  using ::__gnu_cxx::abs;
   using ::__gnu_cxx::llabs;
   using ::__gnu_cxx::div;
   using ::__gnu_cxx::lldiv;
 #endif
   using ::__gnu_cxx::atoll;
   using ::__gnu_cxx::strtof;
   using ::__gnu_cxx::strtoll;
   using ::__gnu_cxx::strtoull;
   using ::__gnu_cxx::strtold;
 } // namespace std
Index: testsuite/26_numerics/headers/cstdlib/54686.c
===
--- testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
+++ testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+// { dg-options "-std=c++11" }
+
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not

Re: abs(long long)

2012-10-02 Thread Marc Glisse


On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse  wrote:

(Forgot libstdc++...)


Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do,
but still) is meant to return a double...


don't we have a core issue about preferring unsigned -> long or long long?


Here I am talking of a library issue: the wording that says that there are 
sufficient overloads such that integer types call the double version of 
math functions. It is fairly obvious that it doesn't apply to abs(long) 
for instance which has an explicit overload. For short or unsigned, I 
still read it as saying that it converts to double...



* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it
should use defined(__SIZEOF_INT128__), which would help other compilers.


Why would that be a problem with the appropriate #define?


The library installed by the system was compiled with g++, and is then 
used with clang++. If we can avoid installing 2 config.h files to make 
that work...



* newlib has llabs, according to the doc. It would be good to know what
newlib is missing for libstdc++ to detect it as C99-ready.

I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu
and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the
review (no point if the patch needs changing again).


In general, I think I have a bias toward using compiler intrinsics,
for which the
compiler already has lot of knowledge about.


More precisely, does that mean you want __builtin_llabs instead of 
::llabs? I thought the compiler knew they were the same.


--
Marc Glisse

Re: abs(long long)

2012-10-02 Thread Marc Glisse


On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


I understand that it is originally a library issue, but I don't think
it makes sense to resolve it in isolation of that core issue.


They seem mostly orthogonal to me, since the library only uses an informal 
language describing the desired outcome and not the actual overloads 
necessary to achieve it, whereas the core issue is about determining 
priorities for a non-ambiguous overload resolution (if we are talking 
about the same, where Jens Maurer has a proposal).



The library installed by the system was compiled with g++, and is then used
with clang++. If we can avoid installing 2 config.h files to make that
work...


Two things:
 1. that is clearly a clang problem.  I don't think it is libstdc++'s job
 tp try to solve clang's misguided configuration and installation.


Translated: libstdc++ should only ever be used with the very version of 
g++ that was used to compile it. clang++, icpc, sunCC, etc should never 
try to use a libstdc++ compiled with another compiler.


I am not saying libstdc++ should go to great lengths to support other 
compilers, but when it is actually easier to support them than not to...

(testing a macro is easier than a configure test)


 2. I am not sure you understand what I wrote: you can leave the
 use of the current macro the way it is if you appropriately
 define it in terms of what you want to change it to.


I was complaining about the configure-time nature of the macro. If it is 
defined at each compiler run based on __SIZEOF_INT128__, I am happy.



More precisely, does that mean you want __builtin_llabs instead of ::llabs?
I thought the compiler knew they were the same.


Yes. Another reason is that it simplifies the implementation AND if
people want want to do something with the intrinsics' fallback
libstdc++ will gracefully deliver that.


I don't see how that simplifies the implementation, it is several 
characters longer than ::llabs, and we still need to handle llabs. Or do 
you mean: always call __builtin_llabs (whether we have an llabs or not), 
and let the compiler replace it with either (x<0)?-x:x or a library call 
(I assume it never does that unless it has seen a corresponding 
declaration)?


Note that I am happy to let you take over this PR if you like.

--
Marc Glisse

Re: abs(long long)

2012-10-02 Thread Marc Glisse


On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


Whining on this list about libstdc++ internal macros and your dislike
of them is not going to produce anything today or tomorrow.


Other compilers using libstdc++ was just an extra argument. Even if g++ 
was the only compiler on earth, I would still consider a compile-time test 
superior to a configure test. The macro __SIZEOF_INT128__ was invented 
precisely for this purpose. Yes, that's just more whining ;-)



On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse  wrote:


Or do you mean:
always call __builtin_llabs (whether we have an llabs or not), and let the
compiler replace it with either (x<0)?-x:x or a library call (I assume it
never does that unless it has seen a corresponding declaration)?


See what we did in c/cmath and c_global/cmath.


Note that llabs is quite different from asin. __builtin_llabs generates an 
ABS_EXPR, which will later be expanded either to a special instruction or 
to a condition. It never generates a call to llabs (I am not sure exactly 
if Paolo's instructions to use llabs meant he wanted an actual library 
call). __builtin_asin on the other hand is never expanded inline (except 
maybe for special constant input like 0) and expands to a call to the 
library function asin.


Would the attached patch be better, assuming it passes testing? For lldiv, 
there is no builtin (for good reason).


* include/c_std/cstdlib (abs(long long)): Define with
__builtin_llabs when we have long long.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.


--
Marc GlisseIndex: cstdlib
===
--- cstdlib (revision 191941)
+++ cstdlib (working copy)
@@ -128,21 +128,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using ::strtod;
   using ::strtol;
   using ::strtoul;
   using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
   using ::wcstombs;
   using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 
   inline long
-  abs(long __i) { return labs(__i); }
+  abs(long __i) { return __builtin_labs(__i); }
+
+#ifdef _GLIBCXX_USE_LONG_LONG
+  inline long long
+  abs(long long __x) { return __builtin_llabs (__x); }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  inline __int128
+  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
 
   inline ldiv_t
   div(long __i, long __j) { return ldiv(__i, __j); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #if _GLIBCXX_USE_C99
 
 #undef _Exit
@@ -161,29 +171,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
   extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
   using ::_Exit;
 #endif
 
-  inline long long
-  abs(long long __x) { return __x >= 0 ? __x : -__x; }
-
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::llabs;
 
   inline lldiv_t
   div(long long __n, long long __d)
-  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+  { return ::lldiv (__n, __d); }
 
   using ::lldiv;
 #endif
 
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   extern "C" long long int (atoll)(const char *) throw ();
   extern "C" long long int
 (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
   extern "C" unsigned long long int
 (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,21 +205,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::lldiv_t;
 #endif
   using ::__gnu_cxx::_Exit;
-  using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::llabs;
   using ::__gnu_cxx::div;
   using ::__gnu_cxx::lldiv;
 #endif
   using ::__gnu_cxx::atoll;
   using ::__gnu_cxx::strtof;
   using ::__gnu_cxx::strtoll;
   using ::__gnu_cxx::strtoull;
   using ::__gnu_cxx::strtold;

Re: abs(long long)

2012-10-03 Thread Marc Glisse


On Wed, 3 Oct 2012, Gabriel Dos Reis wrote:


On Tue, Oct 2, 2012 at 1:53 PM, Marc Glisse  wrote:


* include/c_std/cstdlib (abs(long long)): Define with
__builtin_llabs when we have long long.

(abs(__int128)): Define when we have __int128.


This change is OK


Thanks, I'll commit that part only.


(div(long long, long long)): Use lldiv.


not this one.


Ok. Note that glibc's implementation does more than just / and %. Possible 
reasons are:

1) glibc does useless work
2) libstdc++ has a bug
3) there are platforms supported by glibc but not by libstdc++

I choose to believe it is option 3.

--
Marc Glisse

Re: patch to fix

2012-10-03 Thread Marc Glisse


On Wed, 3 Oct 2012, Kenneth Zadeck wrote:


The patch defines a new datatype, a 'wide_int' (defined in
wide-int.[ch], and this datatype will be used to perform all of the
integer constant math in the compiler.  Externally, wide-int is very
similar to double-int except that it does not have the limitation that
math must be done on exactly two HOST_WIDE_INTs.

Internally, a wide_int is a structure that contains a fixed sized
array of HOST_WIDE_INTs, a length field and a mode.  The size of the
array is determined at generation time by dividing the number of bits
of the largest integer supported on the target by the number of bits
in a HOST_WIDE_INT of the host.  Thus, with this format, any length of
integer can be supported on any host.


Hello,

did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.


--
Marc Glisse

Re: patch to fix

2012-10-04 Thread Marc Glisse


On Wed, 3 Oct 2012, Mike Stump wrote:

On Oct 3, 2012, at 1:47 PM, Marc Glisse  wrote:
did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.


No, not really.  I'd maybe answer it this way, we put in a type 
(singular) to support all integral constants in all languages on a port. 
Since we only needed 1, there was little need to templatize it.  By 
supporting all integral constants in all languages, there is little need 
for more.  If Ada say, wanted a 2048 bit integer, then, we just have it 
drop off the size it wants someplace and we would mix that in on a 
MAX(….) line, net result, the type we use would then directly support 
the needs of Ada.  If vpr wanted 2x of all existing modes, we could 
simply change the MAX equation and essentially double it; if people need 
that.  This comes as a cost, as the intermediate wide values are fixed 
size allocated (not variable); so these all would be larger.


And this cost could be eliminated by having a template wide_int_ so only 
the places that need it actually use the extra size ;-)



On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

i have already converted the vrp code, so i have some guess at where you are 
talking about.  (of course correct me if i am wrong).


in the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is twice as 
wide as the inputs.


Yes, exactly.

my library is able to do that with one catch (and this is a big catch): the 
target has to have an integer mode that is twice as big as the mode of the 
operands. The issue is that wide-ints actually carry around the mode of the 
value in order to get the bitsize and precision of the operands (it does not 
have the type, because this code has to both work on the rtl and tree level 
and i generally do not want the signness anyway).


my current code in vrp checks to see if such a mode exists and if it does, it 
produces the product.   if the mode does not exist, it returns bottom.   What 
this means is that for most (many or some) targets that have a TImode, the 
largest thing that particular vrp discover ranges for is a DImode value.   We 
could get around this by defining the next larger mode than what the target 
really needs but i wonder how much mileage you are going to get out of that 
with really large numbers.


This will be for discussion when you submit that next patch, but currently 
VRP handles integers the same size as double_int. In particular, it 
handles __int128. I would be unhappy if introducing a larger bigint type 
in gcc made us regress there.



--
Marc Glisse

Re: patch to fix

2012-10-04 Thread Marc Glisse


On Thu, 4 Oct 2012, Kenneth Zadeck wrote:


On 10/04/2012 09:17 AM, Marc Glisse wrote:

On Wed, 3 Oct 2012, Mike Stump wrote:

On Oct 3, 2012, at 1:47 PM, Marc Glisse  wrote:
did you consider making the size of wide_int a template parameter, now 
that we are using C++? All with a convenient typedef or macro so it 
doesn't show. I am asking because in vrp I do some arithmetic that 
requires 2*N+1 bits where N is the size of double_int.


No, not really.  I'd maybe answer it this way, we put in a type (singular) 
to support all integral constants in all languages on a port. Since we 
only needed 1, there was little need to templatize it.  By supporting all 
integral constants in all languages, there is little need for more.  If 
Ada say, wanted a 2048 bit integer, then, we just have it drop off the 
size it wants someplace and we would mix that in on a MAX(….) line, net 
result, the type we use would then directly support the needs of Ada.  If 
vpr wanted 2x of all existing modes, we could simply change the MAX 
equation and essentially double it; if people need that.  This comes as a 
cost, as the intermediate wide values are fixed size allocated (not 
variable); so these all would be larger.


And this cost could be eliminated by having a template wide_int_ so only 
the places that need it actually use the extra size ;-)


The space is not really an issue in most places since wide-ints tend to be 
short lived.


You were the one talking of a cost.

However the real question is what are you going to instantiate the template 
on?What we do is look at the target and determine the largest type that 
the target supports and build a wide int type that supports that.how are 
you going to do better?


In a single place in tree-vrp.c in the code that evaluates 
multiplications, I would instantiate the template on the double (possibly 
+1) of the value you selected as large enough for all constants. For all 
the rest, your type is fine.


This will be for discussion when you submit that next patch, but currently 
VRP handles integers the same size as double_int. In particular, it handles 
__int128. I would be unhappy if introducing a larger bigint type in gcc 
made us regress there.


You are only happy now because you do not really understand the world around 
you.


I did not want to go into details, but let me re-phrase: I do not want to 
regress. Currently, hosts with a 64 bit hwi can handle VRP multiplications 
on __int128. If your patch introducing better big integers breaks that, 
that sounds bad to me, since I would expect s/double_int/wide_int/ to just 
work, and using wide_int<2*MAX> would just be a potential simplification 
of the code for later.



Note that VRP is just the one case I am familiar with. Using templates 
should (I haven't checked) be completely trivial and help the next person 
who needs bigger integers for a specific purpose and doesn't want to 
penalize the whole compiler. If the size of wide_int is completely 
irrelevant and we can make it 10 times larger without thinking, I guess 
some numbers showing it would be great (or maybe that's common 
knowledge, then I guess it is fine).



Now those are only some comments from an occasional contributor, not 
reviewer requirements, it is fine to ignore them.


--
Marc Glisse

Re: patch to fix

2012-10-04 Thread Marc Glisse


On Wed, 3 Oct 2012, Kenneth Zadeck wrote:

i have already converted the vrp code, so i have some guess at where you are 
talking about.  (of course correct me if i am wrong).


in the code that computes the range when two variables are multiplied 
together needs to do a multiplication that produces a result that is twice as 
wide as the inputs.


my library is able to do that with one catch (and this is a big catch): the 
target has to have an integer mode that is twice as big as the mode of the 
operands. The issue is that wide-ints actually carry around the mode of the 
value in order to get the bitsize and precision of the operands (it does not 
have the type, because this code has to both work on the rtl and tree level 
and i generally do not want the signness anyway).


Ah, after reading the whole thread, now I understand that it is because 
wide_int carries a mode that it makes little sense making it a template 
(sorry that it took me so long when the information was in your first 
answer). I understand that it would be inconvenient (much longer code) to 
have a base_wide_int that does just the arithmetic and a wrapper that 
contains the mode as well.


Your idea below to define dummy extra modes does bring the template idea 
back to the table though ;-)


my current code in vrp checks to see if such a mode exists and if it does, it 
produces the product.   if the mode does not exist, it returns bottom.   What 
this means is that for most (many or some) targets that have a TImode, the 
largest thing that particular vrp discover ranges for is a DImode value.   We 
could get around this by defining the next larger mode than what the target 
really needs but i wonder how much mileage you are going to get out of that 
with really large numbers.


The current wrapping multiplication code in vrp works with a pair of 
double_int, so it should keep working with a pair of wide_int. I see now 
why wide_int doesn't allow to simplify the code, but it doesn't have to 
break.


--
Marc Glisse

Re: patch to fix

2012-10-05 Thread Marc Glisse


On Thu, 4 Oct 2012, Kenneth Zadeck wrote:


There are a bunch of ways to skin the cat.

1) we can define the extra mode.
2) if we get rid of the mode inside the wide int and replace it with an 
explicit precision and bitsize, then we can just make the size of the buffer 
twice as big as the analysis of the modes indicates.
3) or we can leave your code in a form that uses 2 wide ints.   my current 
patch (which i have not gotten working yet) changes this to use the mul_full 
call, but it could be changed.   It is much simpler that the existing code.


Thanks, we are exactly on the same page :-)

i do not see how templates offer any solution at all.   the wide int code 
really needs to have something valid to indicate the length of the object, 
and if there is no mode that big, the code will ice.


It would be possible to define the regular wide_int taking into account 
only valid modes, and only use the dummy larger modes in very specific 
circumstances, where the template parameter would somehow indicate the 
last mode that may appear in it. This is not a recommendation at all, just 
an explanation of why templates might have had something to do with it.


my personal feeling is that range analysis is quite useful for small integers 
and not so much as the values get larger.   The only really large integer 
constants that you are likely to find in real code are encryption keys,


Note that gcc regularly decides to use unsigned types where the code is 
signed, so a "small" constant like -1 can be huge.


--
Marc Glisse

Re: vec_cond_expr adjustments

2012-10-05 Thread Marc Glisse

ble to easily recover from this intermediate step
with combine (I'm not sure), so a target dependent value may
be prefered.


Being able to optimize it is indeed a key point. Let's try on an example 
(not assuming any specific representation in the middle-end for now). Say 
I write this C/OpenCL code: ((a

The front-end gives to the middle-end: aOn an architecture like sse, neon or altivec where VECTOR_STORE_FLAG_VALUE 
is -1 (well, should be), expansion of (acan also disappear if the vcond instruction only looks at the MSB (x86). 
And we are left in the back-end with ((a

On other architectures, expecting the back-end to simplify everything does 
seem hard. But it isn't obvious how to handle it in the middle end either.

Some other forms we could imagine the middle-end producing:
(aBut then how do we handle for instance sparc, where IIUC comparing 2 
vectors returns an integer, where bits 0, 1, etc of the integer represent 
true/false for the comparisons of elements 0, 1, etc of the vectors (as in 
vec_merge, but not constant)? Defining VECTOR_STORE_FLAG_VALUE is not 
possible since comparisons don't return a vector, but we would still want 
to compute athe usual code for the selection.



If we assume a -1/0 and MSB representation in the middle-end, the 
front-end could just pass ((amoving to the back-end, "nothing" would happen on x86.



Comparing x86, neon and altivec, they all have comparisons that return a 
vector of -1 and 0. On the other hand, they have different selection 
instructions. x86 uses <0, altivec uses !=0 and neon has a bitwise select 
and thus requires exactly -1 or 0. It thus seems to me that we should 
decide in the middle-end that vector comparisons return vectors of -1 and 
0. VEC_COND_EXPR is more complicated. We could for instance require that 
it takes as first argument a vector of -1 and 0 (thus <0, !=0 and the neon 
thing are equivalent). Which would leave to decide what the expansion of 
vec_cond_expr passes to the targets when the first argument is not a 
comparison, between !=0, <0, ==-1 or others (I vote for <0 because of 
opencl). One issue is that targets wouldn't know if it was a dummy 
comparison that can safely be ignored because the other part is the result 
of logical operations on comparisons (thus composed of -1 and 0) or a 
genuine comparison with an arbitrary vector, so a new optimization would 
be needed (in the back-end I guess or we would need an alternate 
instruction to vcond) to detect if a vector is a "signed boolean" vector.
We could instead say that vec_cond_expr really follows OpenCL's semantics 
and looks at the MSB of each element. I am not sure that would change 
much, it would mostly delay the apparition of <0 to RTL expansion time 
(and thus make gimple slightly lighter).




+/* Return true if EXPR is an integer constant representing true.  */
+
+bool
+integer_truep (const_tree expr)
+{
+  STRIP_NOPS (expr);
+
+  switch (TREE_CODE (expr))
+{
+case INTEGER_CST:
+  /* Do not just test != 0, some places expect the value 1.  */
+  return (TREE_INT_CST_LOW (expr) == 1
+ && TREE_INT_CST_HIGH (expr) == 0);



I wonder if using STORE_FLAG_VALUE is better here (note that it
usually differs for FP vs. integral comparisons and the mode passed
to STORE_FLAG_VALUE is that of the comparison result).



I notice there is already a VECTOR_STORE_FLAG_VALUE (used only once in
simplify-rtx, in a way that seems a bit strange but I'll try to
understand that later). Thanks for showing me this macro, it seems
important indeed. However the STORE_FLAG_VALUE mechanism seems to be for
the RTL level.

It looks like it would be possible to have 3 different semantics:
source code is OpenCL, middle-end whatever we want (0 / 1 for instance),
and back-end is whatever the target wants. The front-end would generate
for a

seems like the middle-end uses this for lowering vector compares,
a < b -> { a[0] < b[0] ? -1 : 0, ... }


and for a?b:c : vec_cond_expr(a<0,b,c)


it looks like ?: is not generally handled by tree-vect-generic, so it must
be either not supported by the frontend or lowered therein (ISTR
it is forced to appear as a != {0,...} ? ... : ...)


Not supported by the front-end yet (not even by the gimplifier), I have 
(bad) patches but I can't really finish them before this conversation is 
done.




I think there are quite few places in the middle-end that assume that 
comparisons return a vector of -1/0 and even fewer that vec_cond_expr only 
looks at the MSB of each element. So it is still time to change that if 
you want to. But if we want to change it, I think it should happen now 
before even more vector code gets in (not particularly my patches, I am 
thinking of cilk and others too).



Ok, that's long enough, I need to send it now...

--
Marc Glisse

Re: [C++] Mixed scalar-vector operations

2012-10-05 Thread Marc Glisse



Ping
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01557.html

On Fri, 21 Sep 2012, Marc Glisse wrote:


Hello,

this patch adds mixed scalar-vector operations to the C++ front-end. It also 
adds a few operators to the C front-end (comparisons in particular). This 
patch is mostly an import from the C front-end (with the maybe_const stuff 
removed).


2012-09-22  Marc Glisse  

PR c++/54427

c/
* c-typeck.c: Include c-common.h.
(enum stv_conv): Moved to c-common.h.
(scalar_to_vector): Moved to c-common.c.
* Make-lang.in: c-typeck.c depends on c-common.h.

c-family/
* c-common.c (scalar_to_vector): Moved from c-typeck.c. Support
more operations.
* c-common.h (enum stv_conv): Moved from c-typeck.c.
(scalar_to_vector): Declare.

cp/
* typeck.c (cp_build_binary_op): Handle mixed scalar-vector
operations.
[LSHIFT_EXPR, RSHIFT_EXPR]: Likewise.

gcc/
* fold-const.c (fold_binary_loc): Use build_zero_cst instead of
build_int_cst for a potential vector.

testsuite/
* c-c++-common/vector-scalar.c: New testcase.
* g++.dg/ext/vector5.C: This is not an error anymore.
* gcc.dg/init-vec-1.c: Move ...
* c-c++-common/init-vec-1.c: ... here. Adapt error message.
* gcc.c-torture/execute/vector-shift1.c: Move ...
* c-c++-common/torture/vector-shift1.c: ... here.
* gcc.dg/scal-to-vec1.c: Move ...
* c-c++-common/scal-to-vec1.c: ... here. Avoid narrowing for
C++11. Adapt error message.
* gcc.dg/convert-vec-1.c: Move ...
* c-c++-common/convert-vec-1.c: ... here.
* gcc.dg/scal-to-vec2.c: Move ...
* c-c++-common/scal-to-vec2.c: ... here.


--
Marc Glisse

Re: [C++] Mixed scalar-vector operations

2012-10-05 Thread Marc Glisse


On Fri, 5 Oct 2012, Jason Merrill wrote:


On 09/21/2012 02:32 PM, Marc Glisse wrote:

+  gcc_assert (TREE_CODE (type0) == VECTOR_TYPE
+ || TREE_CODE (type1) == VECTOR_TYPE);
+  switch (code)
+{
+  case RSHIFT_EXPR:
+  case LSHIFT_EXPR:
+   if (TREE_CODE (type0) == INTEGER_TYPE
+   && TREE_CODE (TREE_TYPE (type1)) == INTEGER_TYPE)


Here you're asserting that one of the types is a vector and then assuming 
that type1 is a vector and type0 is not.  I guess you need to move the 
swapping code out of the switch.


I didn't write this code, but my understanding is the following. Most 
operations (like PLUS_EXPR) want 2 arguments of the same type. 
[LR]SHIFT_EXPR are special and also accept to have a vector as first 
argument and a scalar as second argument (but not the reverse).


If the first argument is scalar (TREE_CODE (type0) == INTEGER_TYPE), then 
the second must be a vector, and we return that the first argument needs 
converting. Otherwise, we don't perform any conversion (return 
stv_nothing).


cp_build_binary_op has special code for *SHIFT_EXPR for the case of a 
vector and a scalar in this order.


(I don't know why it was decided that *SHIFT_EXPR would be special that 
way, and I don't mind handling it like the other operations if you prefer)



+   error_at (loc, "conversion of scalar to vector "
+  "involves truncation");


These errors should print the types involved.  They also need to be 
suppressed when !(complain & tf_error).


Will do.


+  op0 = convert (TREE_TYPE (type1), op0);
+  op0 = build_vector_from_val (type1, op0);


I don't see anything in cp_build_binary_op that makes sure that the 
VECTOR_TYPE is in type1.


These 2 lines are in a switch in the case where scalar_to_vector returned 
stv_firstarg, meaning that the first arg (op0) is a scalar that needs to 
be converted to a vector of the same type as op1. And some lines above, 
there is:


  type1 = TREE_TYPE (op1);


Am I just missing some comments in the code, or is there something wrong?

I think I should at least change the comment on scalar_to_vector from:

/* Convert scalar to vector for the range of operations.  */

to something like:

/* Determine which of the operands, if any, is a scalar that needs to be
   converted to a vector for the range of operations.  */


And add something in scalar_to_vector about the SHIFT_EXPRs.

Thanks for the comments,

--
Marc Glisse

Re: vec_cond_expr adjustments

2012-10-08 Thread Marc Glisse


On Mon, 8 Oct 2012, Richard Guenther wrote:


VEC_COND_EXPR is more complicated. We could for instance require that it
takes as first argument a vector of -1 and 0 (thus <0, !=0 and the neon
thing are equivalent). Which would leave to decide what the expansion of
vec_cond_expr passes to the targets when the first argument is not a
comparison, between !=0, <0, ==-1 or others (I vote for <0 because of
opencl). One issue is that targets wouldn't know if it was a dummy
comparison that can safely be ignored because the other part is the result
of logical operations on comparisons (thus composed of -1 and 0) or a
genuine comparison with an arbitrary vector, so a new optimization would be
needed (in the back-end I guess or we would need an alternate instruction to
vcond) to detect if a vector is a "signed boolean" vector.
We could instead say that vec_cond_expr really follows OpenCL's semantics
and looks at the MSB of each element. I am not sure that would change much,
it would mostly delay the apparition of <0 to RTL expansion time (and thus
make gimple slightly lighter).


I think we should delay the decision on how to optimize this.  It's indeed
not trivial and the GIMPLE middle-end aggressively forwards feeding
comparisons into the VEC_COND_EXPR expressions already (somewhat
defeating any CSE that might be possible here) in forwprop.


Thanks for going through the long email :-)

What does that imply for the first argument of VEC_COND_EXPR? Currently, 
the expander asserts that it is a comparison, but that is not reflected in 
the gimple checkers.


If we document that VEC_COND_EXPR takes a vector of -1 and 0 (which is the 
case for a comparison), I don't think it prevents from later relaxing that 
to <0 or !=0. But then I don't know how to handle expansion when the 
argument is neither a comparison (vcond) nor a constant (vec_merge? I 
haven't tried but that should be doable), I would have to pass <0 or !=0 
to the target. So is the best choice to document that VEC_COND_EXPR takes 
as first argument a comparison and make gimple checking reflect that? 
(seems sad, but at least that would tell me what I can/can't do)


By the way, since we are documenting comparisons as returning 0 and -1, 
does that bring back the integer_truep predicate?


--
Marc Glisse

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1601 matches

Mail list logo