Re: [PATCH] Don't fold nextafter/nexttoward if -ftrapping-math or -fmath-errno if they produce denormal results (PR c/86420)
On Sat, 7 Jul 2018, Jakub Jelinek wrote: 2018-07-07 Jakub Jelinek PR c/86420 * real.c (real_nextafter): Return true if result is denormal. I have a question on the side: would it be hard / useful, in cases where nextafter may set errno or some exception flag, to fold the result to a constant while keeping the function call (ignoring the value it returns)? To clarify, I mean replace _2 = nextafter(DBL_DENORM_MIN, 0); with nextafter(DBL_DENORM_MIN, 0); _2 = 0; I think we already do that for some other calls, although I can't remember where. The point would be that we have the value of _2 and can keep folding its uses. -- Marc Glisse
Re: [PATCH] Don't fold nextafter/nexttoward if -ftrapping-math or -fmath-errno if they produce denormal results (PR c/86420)
On Sat, 7 Jul 2018, Jakub Jelinek wrote: On Sat, Jul 07, 2018 at 11:55:17AM +0200, Marc Glisse wrote: On Sat, 7 Jul 2018, Jakub Jelinek wrote: 2018-07-07 Jakub Jelinek PR c/86420 * real.c (real_nextafter): Return true if result is denormal. I have a question on the side: would it be hard / useful, in cases where nextafter may set errno or some exception flag, to fold the result to a constant while keeping the function call (ignoring the value it returns)? To clarify, I mean replace _2 = nextafter(DBL_DENORM_MIN, 0); with nextafter(DBL_DENORM_MIN, 0); _2 = 0; I think we already do that for some other calls, although I can't remember where. The point would be that we have the value of _2 and can keep folding its uses. For errno purposes alone that would be possible, but the function is marked #define ATTR_MATHFN_ERRNO (flag_errno_math ? \ ATTR_NOTHROW_LEAF_LIST : ATTR_CONST_NOTHROW_LEAF_LIST) and thus with -ftrapping-math -fno-math-errno I'm afraid we'd immediately DCE the call in the second form (without lhs). That looks like a problem we'll have to fix eventually. But not as part of this patch indeed. -- Marc Glisse
Re: Fold pointer range checks with equal spans
On Fri, 20 Jul 2018, Richard Sandiford wrote: --- gcc/match.pd2018-07-18 18:44:22.565914281 +0100 +++ gcc/match.pd2018-07-20 11:24:33.692045585 +0100 @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (inverse_conditions_p (@0, @2) && element_precision (type) == element_precision (op_type)) (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1))) + +/* For pointers @0 and @2 and unsigned constant offset @1, look for + expressions like: + + A: (@0 + @1 < @2) | (@2 + @1 < @0) + B: (@0 + @1 <= @2) | (@2 + @1 <= @0) + + If pointers are known not to wrap, B checks whether @1 bytes starting at + @0 and @2 do not overlap, while A tests the same thing for @1 + 1 bytes. + A is more efficiently tested as: + + ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2) + + as long as @1 * 2 doesn't overflow. B is the same with @1 replaced + with @1 - 1. */ +(for ior (truth_orif truth_or bit_ior) + (for cmp (le lt) + (simplify + (ior (cmp (pointer_plus:s @0 INTEGER_CST@1) @2) + (cmp (pointer_plus:s @2 @1) @0)) Do you want :c on cmp, in case it appears as @2 > @0 + @1 ? (may need some care with "cmp == LE_EXPR" below) Do you want :s on cmp as well? + (if (!flag_trapv && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) Don't you want TYPE_OVERFLOW_UNDEFINED? +/* Convert the B form to the A form. */ +(with { offset_int off = wi::to_offset (@1) - (cmp == LE_EXPR ? 1 : 0); } + /* Always fails for negative values. */ + (if (wi::min_precision (off, UNSIGNED) * 2 <= TYPE_PRECISION (sizetype)) + /* It doesn't matter whether we use @2 - @0 or @0 - @2, so let +tree_swap_operands_p pick a canonical order. */ + (with { tree ptr1 = @0, ptr2 = @2; + if (tree_swap_operands_p (ptr1, ptr2)) + std::swap (ptr1, ptr2); } + (gt (plus (minus (convert:sizetype { ptr1; }) + (convert:sizetype { ptr2; })) +{ wide_int_to_tree (sizetype, off); }) + { wide_int_to_tree (sizetype, off * 2); } -- Marc Glisse
optimize std::vector move assignment
Hello, I talked about this last year (https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01063.html and thread), this tweaks std::vector move assignment to help gcc generate better code for it. For this code #include #include typedef std::vector V; void f(V&a,V&b){a=std::move(b);} with -O2 -fdump-tree-optimized on powerpc64le-unknown-linux-gnu, we currently have _5 = MEM[(int * &)a_3(D)]; MEM[(int * &)a_3(D)] = 0B; MEM[(int * &)a_3(D) + 8] = 0B; MEM[(int * &)a_3(D) + 16] = 0B; _6 = MEM[(int * &)b_2(D)]; MEM[(int * &)a_3(D)] = _6; MEM[(int * &)b_2(D)] = 0B; _7 = MEM[(int * &)a_3(D) + 8]; _8 = MEM[(int * &)b_2(D) + 8]; MEM[(int * &)a_3(D) + 8] = _8; MEM[(int * &)b_2(D) + 8] = _7; _9 = MEM[(int * &)a_3(D) + 16]; _10 = MEM[(int * &)b_2(D) + 16]; MEM[(int * &)a_3(D) + 16] = _10; MEM[(int * &)b_2(D) + 16] = _9; if (_5 != 0B) with the patch, we go down to _3 = MEM[(const struct _Vector_impl_data &)a_4(D)]._M_start; _5 = MEM[(const struct _Vector_impl_data &)b_2(D)]._M_start; MEM[(struct _Vector_impl_data *)a_4(D)]._M_start = _5; _6 = MEM[(const struct _Vector_impl_data &)b_2(D)]._M_finish; MEM[(struct _Vector_impl_data *)a_4(D)]._M_finish = _6; _7 = MEM[(const struct _Vector_impl_data &)b_2(D)]._M_end_of_storage; MEM[(struct _Vector_impl_data *)a_4(D)]._M_end_of_storage = _7; MEM[(struct _Vector_impl_data *)b_2(D)]._M_start = 0B; MEM[(struct _Vector_impl_data *)b_2(D)]._M_finish = 0B; MEM[(struct _Vector_impl_data *)b_2(D)]._M_end_of_storage = 0B; if (_3 != 0B) removing 2 reads and 3 writes. At -O3 we also get more vectorization. The main idea is to give the compiler more type information. Currently, the only type information the compiler cares about is the type used for a memory read/write. Using std::swap as before, the reads/writes are done on int&. Doing it directly, they are done on _Vector_impl_data::_M_start, a more precise information. Maybe some day the compiler will get stricter and be able to deduce the same information, but not yet. The second point is to rotate { new, old, tmp } in an order that's simpler for the compiler. I was going to remove the swaps and use _M_copy_data directly, but since doing the swaps in a different order works... _M_copy_data is not really needed, we could add a defaulted assignment operator, or remove the move constructor (and call a new _M_clear() from the 2 users), etc. However, it seemed the least intrusive, the least likely to have weird consequences. I didn't add a testcase because I don't see any dg-final scan-tree-dump in the libstdc++ testsuite. The closest would be g++.dg/pr83239.C, g++.dg/vect/pr64410.cc, g++.dg/vect/pr33426-ivdep-4.cc that include , but from previous experience (already with vector), adding libstdc++ testcases to the C++ testsuite is not very popular. Bootstrap+regtest on powerpc64le-unknown-linux-gnu. 2018-07-25 Marc Glisse * include/bits/stl_vector.h (_Vector_impl_data::_M_copy_data): New. (_Vector_impl_data::_M_swap_data): Use _M_copy_data. (vector::_M_move_assign): Reorder the swaps. -- Marc GlisseIndex: libstdc++-v3/include/bits/stl_vector.h === --- libstdc++-v3/include/bits/stl_vector.h (revision 262948) +++ libstdc++-v3/include/bits/stl_vector.h (working copy) @@ -96,25 +96,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER { } #if __cplusplus >= 201103L _Vector_impl_data(_Vector_impl_data&& __x) noexcept : _M_start(__x._M_start), _M_finish(__x._M_finish), _M_end_of_storage(__x._M_end_of_storage) { __x._M_start = __x._M_finish = __x._M_end_of_storage = pointer(); } #endif void + _M_copy_data(_Vector_impl_data const& __x) _GLIBCXX_NOEXCEPT + { + _M_start = __x._M_start; + _M_finish = __x._M_finish; + _M_end_of_storage = __x._M_end_of_storage; + } + + void _M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT { - std::swap(_M_start, __x._M_start); - std::swap(_M_finish, __x._M_finish); - std::swap(_M_end_of_storage, __x._M_end_of_storage); + // Do not use std::swap(_M_start, __x._M_start), etc as it looses + // information used by TBAA. + _Vector_impl_data __tmp; + __tmp._M_copy_data(*this); + _M_copy_data(__x); + __x._M_copy_data(__tmp); } }; struct _Vector_impl : public _Tp_alloc_type, public _Vector_impl_data { _Vector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Tp_alloc_type()) ) : _Tp_alloc_type() { } @@ -1724,22 +1735,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER #if __cplusplus >= 201103L private: // Constant-time move assignment when source object's memory can be // moved, either because the source's allocator will move too // or because the allocators are equal. void _M_move_assign(vector&& __x, true
Re: optimize std::vector move assignment
On Wed, 25 Jul 2018, Jonathan Wakely wrote: _M_copy_data is not really needed, we could add a defaulted assignment operator, or remove the move constructor (and call a new _M_clear() from the 2 users), etc. However, it seemed the least intrusive, the least likely to have weird consequences. Yes, the alternative would be: --- a/libstdc++-v3/include/bits/stl_vector.h +++ b/libstdc++-v3/include/bits/stl_vector.h @@ -100,14 +100,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER : _M_start(__x._M_start), _M_finish(__x._M_finish), _M_end_of_storage(__x._M_end_of_storage) { __x._M_start = __x._M_finish = __x._M_end_of_storage = pointer(); } + + _Vector_impl_data(const _Vector_impl_data&) = default; + _Vector_impl_data& oeprator=(const _Vector_impl_data&) = default; #endif void _M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT { - std::swap(_M_start, __x._M_start); - std::swap(_M_finish, __x._M_finish); - std::swap(_M_end_of_storage, __x._M_end_of_storage); + _Vector_impl_data __tmp = *this; + *this = __x; + __x = __tmp; Or just std::swap(*this, __x). } }; Your _M_copy_data seems fine. It avoids unintentional copies, because the copy constructor and copy assignment operator remain deleted (at least in C++11). I didn't add a testcase because I don't see any dg-final scan-tree-dump in the libstdc++ testsuite. The closest would be g++.dg/pr83239.C, g++.dg/vect/pr64410.cc, g++.dg/vect/pr33426-ivdep-4.cc that include , but from previous experience (already with vector), adding libstdc++ testcases to the C++ testsuite is not very popular. Yes, C++ tests using std::vector are sometimes a bit fragile. I don't see any reason we can't use scan-tree-dump in the libstdc++ testsuite, if you wanted to add one. We do have other dg-final tests. The others only test for the presence of some name in assembly. But I may try it later. -- Marc Glisse
Re: Share ebo helper throughout lib
On Wed, 25 Jul 2018, François Dumont wrote: It has already been noticed that there are 2 ebo helpers in the lib. Here is a patch to use 1. * include/bits/ebo_helper.h: New. * include/Makefile.am: Add latter. * include/Makefile.in: Regenerate. * include/bits/hashtable_policy.h: Adapt. * include/bits/shared_ptr_base.h: Adapt. Tested under linux x86_64. Ok to commit ? I don't think we support [[no_unique_address]] yet, but assuming we soon will and we enable it also for C++03 (at least with the __attribute__ syntax and/or in system headers), do you know if some similar helper will still be necessary, with a simpler implementation, or if the attribute will magically get rid of it? (I haven't looked at it at all, the answer may be obvious) -- Marc Glisse
Re: [PATCH] Add malloc predictor (PR middle-end/83023).
On Thu, 26 Jul 2018, Martin Liška wrote: Following patch implements new predictors that annotates malloc-like functions. These almost every time return a non-null value. Out of curiosity (the __builtin_expect there doesn't hurt and we don't need to remove it), does it make __builtin_expect unnecessary in the implementation of operator new (libstdc++-v3/libsupc++/new_op.cc)? It looks like while (__builtin_expect ((p = malloc (sz)) == 0, false)) { new_handler handler = std::get_new_handler (); if (! handler) _GLIBCXX_THROW_OR_ABORT(bad_alloc()); handler (); } where being in a loop may trigger opposite heuristics. -- Marc Glisse
Re: [PATCH] Add malloc predictor (PR middle-end/83023).
On Fri, 27 Jul 2018, Martin Liška wrote: So answer is yes, the builtin can be then removed. Good, thanks. While looking at how widely it is going to apply, I noticed that the default, throwing operator new has attribute malloc and everything, but the non-throwing variant declared in doesn't, so it won't benefit from the new predictor. I don't know if there is a good reason for this disparity... -- Marc Glisse
Re: Fold pointer range checks with equal spans
On Tue, 31 Jul 2018, Richard Biener wrote: Also, when @2 == @0 + (@1+1) then the original condition is true but ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2) is not? (sizetype) @0 - (sizetype) (@0 + @1 + 1) + @1 > @1 * 2 -> -1 > @1 * 2 which is false. So I can't really see how you can apply this transform in general (it would be fine for generating alias checks but a bit more pessimistic). But maybe I am missing something? It relies on sizetype being unsigned: (sizetype)-1 > @1 * 2 is true. Hmm, so mathematically this is (@0 - @2) % modreduce + @1 > @1 * 2 then, but I don't want to think too much about this since Marc didn't object here ;) We already transform abs(x)<=3 into (unsigned)x+3u<=6u, that's the usual way we do range checking so I didn't pay much attention to that part. (tempted to say: "I didn't want to think too much about this since Richard was going to do it anyway" ;-) Turning multiple comparisons of the form P + cst CMP Q + cst into a range check on P - Q sounds good (we don't really have to restrict to the case where the range is symmetric). Actually, just turning P + cst CMP Q + cst into P - Q CMP cst should do it, we should already have code to handle range checking on integers (modulo the issue of CSE P-Q and Q-P). But I don't know if a couple :s is sufficient to make this transformation a good canonicalization. If we start from a comparison of pointer_plus, I think it would make sense to use pointer_diff. I believe currently we try to use pointer operations (pointer_plus, pointer_diff, lt) only for related pointers and we cast to some integer type for wilder cases (implementation of std::less in C++ for instance). On the other hand, in an alias check, the 2 pointers are possibly unrelated, so maybe the code emitted for an alias check should be changed. -- Marc Glisse
Re: [PATCH] Remove unnecessary "& 1" in year_month_day_last::day()
On Sun, 5 Nov 2023, Cassio Neri wrote: When year_month_day_last::day() was implemented, Dr. Matthias Kretz realised that the operation "& 1" wasn't necessary but we did not patch it at that time. This patch removes the unnecessary operation. Is there an entry in gcc's bugzilla about having the optimizer handle this kind of optimization? unsigned f(unsigned x){ if(x>=32)__builtin_unreachable(); return 30|(x&1); // --> 30|x } (that optimization would come in addition to your patch, doing the optimization by hand is still a good idea) It looks like the criterion would be a|(b&c) when the possible 1 bits of b are included in the certainly 1 bits of a|c. -- Marc Glisse
Re: [PATCH] match.pd: Use single_use for (T)(A) + CST -> (T)(A + CST) [PR95798]
On Wed, 24 Feb 2021, Jakub Jelinek via Gcc-patches wrote: The following patch adds single_use case which restores these testcases but keeps the testcases the patch meant to improve as is. Hello, I wonder if :s would be sufficient here? I don't have an opinion on which one is better for this particular transformation (don't change the patch because of my comment), we just seem to be getting more and more uses of single_use in match.pd, maybe at some point we need to revisit the meaning of :s or introduce a stronger :S. -- Marc Glisse
Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]
On Fri, 6 Aug 2021, Victor Tong wrote: Thanks for the feedback. Here's the updated pattern: /* X - (X - Y) --> Y */ (simplify (minus (convert1? @0) (convert2? (minus@2 (convert3? @@0) @1))) (if (ANY_INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_UNDEFINED(type) && !TYPE_OVERFLOW_SANITIZED(type) && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@2)) && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@2)) && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@2)) && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0)) && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0)) && TYPE_PRECISION (TREE_TYPE (@2)) <= TYPE_PRECISION (type) && TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)) (convert @1))) I kept the TYPE_OVERFLOW_SANITIZED checks because I saw other patterns that leverage undefined overflows check for it. I think this new pattern shouldn't be applied if overflow sanitizer checks are enabled. why is this testing TREE_TYPE (@0)? I'm checking the type of @0 because I'm concerned that there could be a case where @0's type isn't an integer type with undefined overflow. I tried creating a test case and couldn't seem to create one where this is violated but I kept the checks to avoid causing a regression. If I'm being overcautious and you feel that the type checks on @0 aren't needed, I can remove them. I think the precision check on TREE_TYPE(@0) is needed to avoid truncation cases though. It doesn't matter if @0 has undefined overflow, but it can matter that it be signed (yes, the 2 are correlated...) if it has the same precision as @2. Otherwise (int64_t)(-1u)-(int64_t)((int)(-1u)-0) is definitely not 0 and it has type:int64_t, @2:int, @0:unsigned. Ignoring the sanitizer, the confusing double matching of constants, and restricting to scalars, I think the tightest condition (without vrp) that works is TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2)) || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@2)) && (TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (TREE_TYPE (@2)) || TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@2)) && !TYPE_UNSIGNED (TREE_TYPE (@0)) ) (where implicitly undefined => signed) but of course it is ok to handle only a subset. It is too late for me to think about @0 vs @@0. The patch you posted seems to accept the wrong (int128t)LLONG_MAX-(int128_t)((int)LLONG_MAX-0) -> (int128_t)0 Using ANY_INTEGRAL_TYPE_P allows vectors, and TYPE_PRECISION mustn't be used for vectors (maybe we should add more checking to catch such uses), and they may not be very happy with convert either... Once we'd "inline" nop_convert genmatch would complain about this. Maybe the new transform could be about scalars, and we could restrict the old one to vectors, to simplify the code, but that wouldn't help genmatch with the fact that the pattern x-(x-y) would appear twice. Is it really that bad? It is suspicious, but can be justified. Is someone working on inlining nop_convert? I'd like to avoid breaking someone else's work if that's being worked on right now. I've also added some extra tests to cover this new pattern. I've attached a patch with my latest changes. From: Richard Biener Sent: Wednesday, July 28, 2021 2:59 AM To: Victor Tong Cc: Marc Glisse ; gcc-patches@gcc.gnu.org Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176] On Tue, Jun 29, 2021 at 1:10 AM Victor Tong wrote: Thanks Richard and Marc. I wrote the following test case to compare the outputs of fn1() and fn1NoOpt() below with my extra pattern being applied. I tested the two functions with all of the integers from INT_MIN to INT_MAX. long fn1 (int x) { return 42L - (long)(42 - x); } #pragma GCC push_options #pragma GCC optimize ("O0") long fn1NoOpt (int x) { volatile int y = (42 - x); return 42L - (long)y; } #pragma GCC pop_options int main () { for (long i=INT_MIN; i<=INT_MAX;i++) { auto valNoOpt = fn1NoOpt(i); auto valOpt = fn1(i); if (valNoOpt != valOpt) printf("valOpt=%ld, valNoOpt=%ld\n", valOpt, valNoOpt); } return 0; } I saw that the return values of fn1() and fn1NoOpt() differed when the input was between INT_MIN and INT_MIN+42 inclusive. When passing values in this range to fn1NoOpt(), a signed overflow is triggered which causes the value to differ (undefined behavior). This seems to go in line with what Marc described and I think the transformation is correct in the scenario above. I do think that type casts that result in truncation (i.e. from a higher precision to a lower one) or wit
Re: [PATCH] Implement constant-folding simplifications of reductions.
On Mon, 21 Feb 2022, Roger Sayle wrote: +/* Fold REDUC (@0 op VECTOR_CST) as REDUC (@0) op REDUC (VECTOR_CST). */ +(for reduc (IFN_REDUC_PLUS IFN_REDUC_MAX IFN_REDUC_MIN IFN_REDUC_FMAX +IFN_REDUC_FMIN IFN_REDUC_AND IFN_REDUC_IOR IFN_REDUC_XOR) + op (plus max min IFN_FMAX IFN_FMIN bit_and bit_ior bit_xor) + (simplify (reduc (op @0 VECTOR_CST@1)) +(op (reduc:type @0) (reduc:type @1 I wonder if we need to test flag_associative_math for the 'plus' case, or if the presence of IFN_REDUC_PLUS is enough to justify the possible loss of precision. -- Marc Glisse
Re: [PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization
+/* (a & b) ^ (a == b) -> !(a | b) */ +/* (a & b) == (a ^ b) -> !(a | b) */ +(for first_op (bit_xor eq) + second_op (eq bit_xor) + (simplify + (first_op:c (bit_and:c @0 @1) (second_op:c @0 @1)) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) +&& types_match (TREE_TYPE (@0), TREE_TYPE (@1))) +(convert (bit_not (bit_ior @0 @1)) I don't think you need types_match, if both are operands of bit_and, their types must already match. It isn't clear what the INTEGRAL_TYPE_P test is for. Your 2 transformations don't seem that similar to me. The first one requires that a and b have the same type as the result of ==, so they are boolean-like. The second one makes sense for more general integers, but then it looks like it should produce (a|b)==0. It doesn't look like we have a canonical representation between a^b and a!=b for booleans :-( (sorry for the broken thread, I was automatically unsubscribed because mailman doesn't like greylisting) -- Marc Glisse
Re: [Patch] Inline optimization for tanh(x)/sinh(x) -> 1.0/cosh(x)
On Thu, 30 Jan 2020, Vitor Guidi wrote: + /* Simplify tanh (x) / sinh (x) -> 1.0 / cosh (x). */ + (simplify + (rdiv (TANH @0) (SINH @0)) + (rdiv {build_one_cst (type);} (COSH @0))) The existing (simplify (rdiv (SINH:s @0) (COSH:s @0)) (TANH @0)) has :s (which AFAIK are ignored because the output is a single insn) but not this new one, where it would not be ignored. That's not very consistent. -- Marc Glisse
Re: [RFC/PATCH] C++ constexpr vs. floating point exceptions.
On Tue, 21 Sep 2021, Jason Merrill via Gcc-patches wrote: On Tue, Sep 21, 2021 at 4:30 PM Joseph Myers wrote: On Tue, 21 Sep 2021, Jakub Jelinek via Gcc-patches wrote: On Tue, Sep 21, 2021 at 02:15:59PM +0100, Roger Sayle wrote: Can you double check? Integer division by zero is undefined, but isn't floating point division by zero defined by the appropriate IEEE standards? https://eel.is/c++draft/expr.mul#4 doesn't make the division by zero behavior conditional on integral types. C has similar wording. The position for C is that Annex F semantics take precedence over all the ways in which floating-point arithmetic has undefined behavior in the main body of the standard. So floating-point overflow and division by zero are fully defined in the presence of Annex F support, while out-of-range conversions from floating point to integer types produce an unspecified value (not necessarily the same unspecified value for different executions of the conversion in the abstract machine - as discussed in bug 93806, GCC can get that wrong and act as if a single execution of such a conversion in the abstract machine produces more than one result). In C, as specified in Annex F, initializers for floating-point objects with static or thread storage duration are evaluated with exceptions discarded and the default rounding mode in effect; 7.0 / 0.0 is a fully valid initializer for such an object to initialize it to positive infinity. As I understand it, the question for this thread is whether C++ constexpr should have a similar rule to C static initializers (which ought to apply to 1.0 / 3.0, raising inexact, just as much as to 7.0 / 0.0). The C rules seem to be F.8.2 Translation During translation the IEC 60559 default modes are in effect: — The rounding direction mode is rounding to nearest. — The rounding precision mode (if supported) is set so that results are not shortened. — Trapping or stopping (if supported) is disabled on all floating-point exceptions. Recommended practice: The implementation should produce a diagnostic message for each translation-time floating-point exception, other than “inexact”; the implementation should then proceed with the translation of the program. I think following the same rules for C++ would be appropriate in a I agree that looking at the C standard is more interesting, C++ is very bad at specifying anything float related. diagnosing context: warn and continue. In a template argument deduction (SFINAE) context, where errors become silent substitution failures, it's probably better to treat them as non-constant. I am trying to imagine a sfinae example affected by whether 1./0. is constant. Does that mean A<0.,__builtin_inf()> would fail to use the specialization in templatestruct A{}; templatestruct A{}; ? I don't like that, I believe it should use the specialization. With ieee754, 1./0. is perfectly well defined as +inf, the only question is whether it should also set a flag at runtime, which is not relevant in a manifestly consteval context (fetestexcept, etc are not constexpr, that should be enough to catch mistakes). If some user wants to forbid FE_DIVBYZERO, FE_INEXACT, FE_INVALID, FE_OVERFLOW or FE_UNDERFLOW in compile-time operations, that looks like it could be part of a separate compiler flag or pragma, like C's FENV_ROUND can affect the rounding mode in static initializers (of course C++ templates make pragmas less convenient). -- Marc Glisse
Re: [PATCH] Some further match.pd optimizations with nop_convert (PR tree-optimization/92734)
On Wed, 4 Dec 2019, Jakub Jelinek wrote: --- gcc/match.pd.jj 2019-12-03 10:20:00.244913801 +0100 +++ gcc/match.pd2019-12-03 13:36:01.084435697 +0100 @@ -2159,20 +2159,26 @@ (define_operator_list COND_TERNARY /* A - (A +- B) -> -+ B */ /* A +- (B -+ A) -> +- B */ (simplify -(minus (plus:c @0 @1) @0) -@1) + (minus (nop_convert (plus:c (nop_convert @0) @1)) @0) + (view_convert @1)) I know I introduced nop_convert, and it can be convenient, but IIRC it also has some limitations. int f(int b,unsigned c){ int a=c; int d=a+b; return d-a; } int g(int a, int b){ int d=(unsigned)a+b; return d-a; } int h(int b,int a){ int d=a+b; return d-a; } While g and h are properly optimized during forwprop1, f isn't, because nop_convert sees that 'a' comes from a conversion, and only returns d (unlike 'convert?' which would try both a and d). When inside nop_convert we have an operation, say (nop_convert (plus ...)), there is no ambiguity, so we are fine. With just (nop_convert @0), less so. It happens that during EVRP, for some reason (different valuization function?), nop_convert does not match the conversion, and we do optimize f. But that almost looks like an accident. convert? with explicit checks would probably work better for the inner conversion, except that handling the vector view_convert case may become painful. I didn't think too hard about possible fancy tricks like a second nop_convert: (minus (nop_convert (plus:c (nop_convert @0) @1)) (nop_convert @0)) -- Marc Glisse
Re: [PATCH] Some further match.pd optimizations with nop_convert (PR tree-optimization/92734)
On Fri, 6 Dec 2019, Richard Biener wrote: nop_convert sees that 'a' comes from a conversion, and only returns d (unlike 'convert?' which would try both a and d). Maybe I should have formulated it as: nop_convert works kind of like a strip_nops. If use gets more and more we can make nop_convert a first class citizen and allow a? Variant. One reason I did not specifically push for that is that nop_convert is seldom the right condition. It is convenient because it is usually easy enough to check that it is correct, but in most cases one of narrowing / zero-extension / sign-extension also works. Still, it is better to handle just NOPs than no conversion at all, so I guess making that easy is still good. Like the attached (need to adjust docs & rename some functions still) which generalizes [digit]? parsing. This allows you to write (nop_convert? ...) I guess once this is in, we should replace all (most?) 'nop_convert' with 'nop_convert?' (and possibly a digit in some places) and remove the last alternative in the definition of nop_convert. Although that will increase the code size. In case, we could still have both a nop_convert and a strip_nop predicates. Just thinking: (match (nop_convert @0) (convert @0) (if (tree_nop_conversion_p (type, TREE_TYPE (@0) (match (nop_convert @0) (view_convert @0) (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0)) && known_eq (TYPE_VECTOR_SUBPARTS (type), TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))) && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0)) (match (strip_nops @0) (nop_convert? @0)) but that relies on the fact that when there is an optional operation, the machinery first tries with the operation, and then without, the order matters. Maybe being explicit on the priorities is safer (match (strip_nops @0) (nop_convert @0)) (match (strip_nops @0) @0) It works (generated files are unchanged), so I guess I'll push it after polishing. It also extends the 1/2/3 grouping to be able to group like (plus (nop_convert2? @0) (convert2? @1)) so either both will be present or none (meaning that when grouping you can choose the "cheaper" test for one in case you know the conversions will be the same). Nice. -- Marc Glisse
Re: [PATCH] Some further match.pd optimizations with nop_convert (PR tree-optimization/92734)
On Fri, 6 Dec 2019, Richard Biener wrote: Although that will increase the code size. In case, we could still have both a nop_convert and a strip_nop predicates. Just thinking: We should measure it, yes, I hope it won't be too bad ;) In theory making genmatch emitted code more like a graph rather than a tree (with shared leafs) might save us a bit here. Indeed, it isn't worth hacking this specific case. If we really want those savings, making it automatic at a higher level is the right way to go. On Fri, 6 Dec 2019, Richard Biener wrote: @@ -1684,7 +1681,7 @@ (define_operator_list COND_TERNARY /* For equality, this is also true with wrapping overflow. */ (for op (eq ne) (simplify - (op:c (nop_convert@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) + (op:c (nop_convert?@3 (plus:c@2 @0 (convert1? @1))) (convert2? @1)) (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) || TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))) Possible future clean-up: those convert1 and convert2 have (strange) associated tree_nop_conversion_p below and look like candidates to become nop_convert. @@ -2159,25 +2156,25 @@ (define_operator_list COND_TERNARY /* A - (A +- B) -> -+ B */ /* A +- (B -+ A) -> +- B */ (simplify - (minus (nop_convert (plus:c (nop_convert @0) @1)) @0) + (minus (nop_convert1? (plus:c (nop_convert2? @0) @1)) @0) (view_convert @1)) I was wondering if we could use nop_convert for both (instead of nop_convert1 and 2), but for constants that wouldn't work, so this looks good. -- Marc Glisse
Re: [PATCH] Canonicalize fancy ways of expressing blend operation into COND_EXPR (PR tree-optimization/92834)
On Fri, 6 Dec 2019, Jakub Jelinek wrote: --- gcc/match.pd.jj 2019-12-06 14:07:26.877749065 +0100 +++ gcc/match.pd2019-12-06 15:06:08.042953309 +0100 @@ -2697,6 +2697,31 @@ (define_operator_list COND_TERNARY (cmp (minmax @0 INTEGER_CST@1) INTEGER_CST@2) (comb (cmp @0 @2) (cmp @1 @2 +/* Undo fancy way of writing max/min or other ?: expressions, + like a - ((a - b) & -(a < b)), in this case into (a < b) ? b : a. + People normally use ?: and that is what we actually try to optimize. */ +(for cmp (simple_comparison) + (simplify + (minus @0 (bit_and:c (minus @0 @1) + (convert? (negate@4 (convert? (cmp@5 @2 @3)) + (if (INTEGRAL_TYPE_P (type) + && INTEGRAL_TYPE_P (TREE_TYPE (@4)) + && TREE_CODE (TREE_TYPE (@4)) != BOOLEAN_TYPE + && INTEGRAL_TYPE_P (TREE_TYPE (@5)) + && (TYPE_PRECISION (TREE_TYPE (@4)) >= TYPE_PRECISION (type) + || !TYPE_UNSIGNED (TREE_TYPE (@4 + (cond (cmp @2 @3) @1 @0))) I was going to suggest (cond @5 @1 @0) and possibly replacing (cmp@5 @2 @3) with truth_valued_p@5, before remembering that COND_EXPR embeds the comparison, and that not transforming when we don't see the comparison is likely on purpose. Plus, if @5 was in a signed 1-bit type, it may look more like -1 than 1 and break the transformation (is that forbidden as return type of a comparion?). -- Marc Glisse
Re: [PATCH] Optimize x < 0 ? ~y : y to (x >> 31) ^ y in match.pd
On Sun, 23 May 2021, apinski--- via Gcc-patches wrote: +(for cmp (ge lt) +/* x < 0 ? ~y : y into (x >> (prec-1)) ^ y. */ +/* x >= 0 ? ~y : y into ~((x >> (prec-1)) ^ y). */ + (simplify + (cond (cmp @0 integer_zerop) (bit_not @1) @1) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) +&& !TYPE_UNSIGNED (TREE_TYPE (@0)) +&& TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type)) Is there a risk that x is signed char (precision 8) and y is a vector with 8 elements? -- Marc Glisse
Re: [ARM] PR66791: Replace calls to builtin in vmul_n (a, b) intrinsics with __a * __b
On Wed, 26 May 2021, Prathamesh Kulkarni via Gcc-patches wrote: The attached patch removes calls to builtins in vmul_n* (a, b) with __a * __b. I am not familiar with neon, but are __a and __b unsigned here? Otherwise, is vmul_n already undefined in case of overflow? -- Marc Glisse
Re: [PATCH] Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]
On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: Hi: This patch is about to simplify (view_convert:type ~a) < 0 to (view_convert:type a) >= 0 when type is signed integer. Similar for (view_convert:type ~a) >= 0. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for the trunk? gcc/ChangeLog: PR middle-end/100738 * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE simplification. We already have /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ (for cmp (simple_comparison) scmp (swapped_simple_comparison) (simplify (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) (if (single_use (@2) && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) (scmp @0 (bit_not @1) Would it make sense to try and generalize it a bit, say with (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) (scmp (view_convert:XXX @0) (bit_not @1)) (I still believe that it is a bad idea that SSA_NAMEs are strongly typed, encoding the type in operations would be more convenient, but I think the time for that choice has long gone) -- Marc Glisse
Re: [PATCH] Simplify (view_convert ~a) < 0 to (view_convert a) >= 0 [PR middle-end/100738]
On Fri, 4 Jun 2021, Hongtao Liu via Gcc-patches wrote: On Tue, Jun 1, 2021 at 6:17 PM Marc Glisse wrote: On Tue, 1 Jun 2021, Hongtao Liu via Gcc-patches wrote: Hi: This patch is about to simplify (view_convert:type ~a) < 0 to (view_convert:type a) >= 0 when type is signed integer. Similar for (view_convert:type ~a) >= 0. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for the trunk? gcc/ChangeLog: PR middle-end/100738 * match.pd ((view_convert ~a) < 0 --> (view_convert a) >= 0, (view_convert ~a) >= 0 --> (view_convert a) < 0): New GIMPLE simplification. We already have /* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */ (for cmp (simple_comparison) scmp (swapped_simple_comparison) (simplify (cmp (bit_not@2 @0) CONSTANT_CLASS_P@1) (if (single_use (@2) && (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)) (scmp @0 (bit_not @1) Would it make sense to try and generalize it a bit, say with (cmp (nop_convert1? (bit_not @0)) CONSTANT_CLASS_P) (scmp (view_convert:XXX @0) (bit_not @1)) Thanks for your advice, it looks great. And can I use *view_convert1?* instead of *nop_convert1?* here, because the original case is view_convert, and nop_convert would fail to simplify the case. Near the top of match.pd, you can see /* With nop_convert? combine convert? and view_convert? in one pattern plus conditionalize on tree_nop_conversion_p conversions. */ (match (nop_convert @0) (convert @0) (if (tree_nop_conversion_p (type, TREE_TYPE (@0) (match (nop_convert @0) (view_convert @0) (if (VECTOR_TYPE_P (type) && VECTOR_TYPE_P (TREE_TYPE (@0)) && known_eq (TYPE_VECTOR_SUBPARTS (type), TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))) && tree_nop_conversion_p (TREE_TYPE (type), TREE_TYPE (TREE_TYPE (@0)) So at least the intention is that it can handle both NOP_EXPR for scalars and VIEW_CONVERT_EXPR for vectors, and I think we alread use it that way in some places in match.pd, like (simplify (negate (nop_convert? (bit_not @0))) (plus (view_convert @0) { build_each_one_cst (type); })) (simplify (bit_xor:c (nop_convert?:s (bit_not:s @0)) @1) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (bit_not (bit_xor (view_convert @0) @1 (the 'if' seems redundant for this one) (simplify (negate (nop_convert? (negate @1))) (if (!TYPE_OVERFLOW_SANITIZED (type) && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@1))) (view_convert @1))) etc. At some point this got some genmatch help, to handle '?' and numbers, so I don't remember all the details, but following these examples should work. -- Marc Glisse
Re: [PATCH] Fold bswap32(x) != 0 to x != 0 (and related transforms)
On Sun, 18 Jul 2021, Roger Sayle wrote: +(if (GIMPLE || !TREE_SIDE_EFFECTS (@0)) I don't think you need to worry about that, the general genmatch machinery is already supposed to take care of it. All the existing cases in match.pd are about cond_expr, where counting the occurrences of each @i is not reliable. -- Marc Glisse
Re: [PATCH] Fold (X<
On Mon, 26 Jul 2021, Roger Sayle wrote: The one aspect that's a little odd is that each transform is paired with a convert@1 variant, using the efficient match machinery to expose any zero extension to fold-const.c's tree_nonzero_bits functionality. Copying the first transform for context +(for op (bit_ior bit_xor) + (simplify + (op (mult:s@0 @1 INTEGER_CST@2) + (mult:s@3 @1 INTEGER_CST@4)) + (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type) + && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0) + (mult @1 +{ wide_int_to_tree (type, wi::to_wide (@2) + wi::to_wide (@4)); }))) + (simplify + (op (mult:s@0 (convert@1 @2) INTEGER_CST@3) + (mult:s@4 (convert@1 @2) INTEGER_CST@5)) + (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type) + && (tree_nonzero_bits (@0) & tree_nonzero_bits (@4)) == 0) + (mult @1 +{ wide_int_to_tree (type, wi::to_wide (@3) + wi::to_wide (@5)); }))) Could you explain how the convert helps exactly? -- Marc Glisse
Re: [PATCH] phiopt: Optimize (x <=> y) cmp z [PR94589]
On Tue, 4 May 2021, Jakub Jelinek via Gcc-patches wrote: 2) the pr94589-2.C testcase should be matching just 12 times each, but runs into operator>=(strong_ordering, unspecified) being defined as (_M_value&1)==_M_value rather than _M_value>=0. When not honoring NaNs, the 2 case should be unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0, but is not a single use but two uses. I'll need to pattern match that case specially. Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0, that could be worth doing already in GIMPLE. -- Marc Glisse
Re: [PATCH] phiopt: Optimize (x <=> y) cmp z [PR94589]
On Thu, 6 May 2021, Jakub Jelinek via Gcc-patches wrote: Though, (x&1) == x is equivalent to both (x&~1)==0 and to x < 2U and from the latter two it isn't obvious which one is better/more canonical. On aarch64 I don't see differences in number of insns nor in their size: 10:13001c00sxtbw0, w0 14:721f781ftst w0, #0xfffe 18:1a9f17e0csetw0, eq // eq = none 1c:d65f03c0ret vs. 20:12001c00and w0, w0, #0xff 24:7100041fcmp w0, #0x1 28:1a9f87e0csetw0, ls // ls = plast 2c:d65f03c0ret On x86_64 same number of insns, but the comparison is shorter (note, the spaceship result is a struct with signed char based enum): 10:31 c0 xor%eax,%eax 12:81 e7 fe 00 00 00 and$0xfe,%edi 18:0f 94 c0sete %al 1b:c3 retq 1c:0f 1f 40 00 nopl 0x0(%rax) vs. 20:31 c0 xor%eax,%eax 22:40 80 ff 01 cmp$0x1,%dil 26:0f 96 c0setbe %al 29:c3 retq Generally, I'd think that the comparison should be better because it will be most common in user code that way and VRP will be able to do the best thing for it. We can probably do it in 2 steps, first something like (for cmp (eq ne) (simplify (cmp (bit_and:c @0 @1) @0) (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); }))) to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a mask 111...000 (I thought we already had a function to detect such masks, or the 000...111, but I can't find them anymore). I agree that the comparison seems preferable, although if X is signed, the way GIMPLE represents types will add an inconvenient cast. And I think VRP already manages to use the bit test to derive a range. -- Marc Glisse
Re: [PATCH] match.pd: Optimize (x & y) == x into (x & ~y) == 0 [PR94589]
On Tue, 11 May 2021, Jakub Jelinek via Gcc-patches wrote: On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote: We can probably do it in 2 steps, first something like (for cmp (eq ne) (simplify (cmp (bit_and:c @0 @1) @0) (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); }))) to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a mask 111...000 (I thought we already had a function to detect such masks, or the 000...111, but I can't find them anymore). Ok, here is the first step then. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Or should it have cmp:c too given that == and != are commutative too? Ah, yes, you are right, good point on the cmp:c, thank you. 2021-05-11 Jakub Jelinek Marc Glisse PR tree-optimization/94589 * match.pd ((X & Y) == Y -> (X & ~Y) == 0, ^ X? (X | Y) == Y -> (X & ~Y) == 0): New GIMPLE simplifications. * gcc.dg/tree-ssa/pr94589-1.c: New test. --- gcc/match.pd.jj 2021-04-27 14:46:56.583716888 +0200 +++ gcc/match.pd2021-05-10 22:31:49.726870421 +0200 @@ -4764,6 +4764,18 @@ (define_operator_list COND_TERNARY (cmp:c (bit_xor:c @0 @1) @0) (cmp @1 { build_zero_cst (TREE_TYPE (@1)); })) +#if GIMPLE + /* (X & Y) == X becomes (X & ~Y) == 0. */ + (simplify + (cmp (bit_and:c @0 @1) @0) + (cmp (bit_and @0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })) + + /* (X | Y) == Y becomes (X & ~Y) == 0. */ + (simplify + (cmp (bit_ior:c @0 @1) @1) + (cmp (bit_and @0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })) +#endif + /* (X ^ C1) op C2 can be rewritten as X op (C1 ^ C2). */ (simplify (cmp (convert?@3 (bit_xor @0 INTEGER_CST@1)) INTEGER_CST@2) --- gcc/testsuite/gcc.dg/tree-ssa/pr94589-1.c.jj2021-05-10 22:36:10.574130179 +0200 +++ gcc/testsuite/gcc.dg/tree-ssa/pr94589-1.c 2021-05-10 22:36:17.789054362 +0200 @@ -0,0 +1,21 @@ +/* PR tree-optimization/94589 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +int +foo (int x) +{ + return (x & 23) == x; +/* { dg-final { scan-tree-dump " & -24;" "optimized" } } */ +/* { dg-final { scan-tree-dump-not " & 23;" "optimized" } } */ +/* { dg-final { scan-tree-dump " == 0" "optimized" } } */ +} + +int +bar (int x) +{ + return (x | 137) != 137; +/* { dg-final { scan-tree-dump " & -138;" "optimized" } } */ +/* { dg-final { scan-tree-dump-not " \\| 137;" "optimized" } } */ +/* { dg-final { scan-tree-dump " != 0" "optimized" } } */ +} Jakub -- Marc Glisse
Re: [PATCH] phiopt: Optimize (x <=> y) cmp z [PR94589]
On Fri, 14 May 2021, Jakub Jelinek via Gcc-patches wrote: On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote: We can probably do it in 2 steps, first something like (for cmp (eq ne) (simplify (cmp (bit_and:c @0 @1) @0) (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); }))) to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a mask 111...000 (I thought we already had a function to detect such masks, or the 000...111, but I can't find them anymore). I agree that the comparison seems preferable, although if X is signed, the way GIMPLE represents types will add an inconvenient cast. And I think VRP already manages to use the bit test to derive a range. I've tried the second step, but it unfortunately regresses +FAIL: gcc.dg/ipa/propbits-2.c scan-tree-dump-not optimized "fail_test" +FAIL: gcc.dg/tree-ssa/loop-42.c scan-tree-dump-not ivcanon "under assumptions " so maybe it is better to keep these cases as the users wrote them. Thank you for trying it, the patch looks good (it would probably also work on GENERIC), but indeed it is just a canonicalization, not necessarily worth breaking other stuff. This seems to point to another missed optimization. Looking at propbits-2.c, if I rewrite the condition in f1 as if ((unsigned)x <= 0xf) I see later in VRP # RANGE [0, 4294967295] NONZERO 15 x.2_1 = (unsigned intD.9) x_4(D); if (x.2_1 <= 15) where we fail to derive a range from the nonzero bits. Maybe VRP expects IPA-CP should have done it already and doesn't bother recomputing it. I understand this may not be a priority though, that's fine. I didn't look at loop-42.c. -- Marc Glisse
Re: libstdc++: Turn memmove to memcpy in vector reallocations
On Tue, 21 Nov 2023, Jonathan Wakely wrote: CC Marc Glisse who added the relocation support. He might recall why we use memmove when all uses are for newly-allocated storage, which cannot overlap the existing storage. Going back a bit: https://gcc.gnu.org/pipermail/gcc-patches/2019-April/520658.html "I think the call to memmove in __relocate_a_1 could probably be memcpy (I don't remember why I chose memmove)" Going back a bit further: https://gcc.gnu.org/pipermail/gcc-patches/2018-September/505800.html "I had to add a special case for trivial types, using memmove, to avoid perf regressions, since relocation takes precedence over the old path that is specialized to call memmove." So the reason seems to be because vector already used memmove before my patch. You can dig further if you want to check why that is ;-) -- Marc Glisse
Re: libstdc++: Speed up push_back
On Thu, 23 Nov 2023, Jonathan Wakely wrote: That's why we need -fsane-operator-new Although the standard forbids it, some of us just provide an inline implementation inline void* operator new(std::size_t n){return malloc(n);} inline void operator delete(void*p)noexcept{free(p);} inline void operator delete(void*p,std::size_t)noexcept{free(p);} (I could certainly add a check to abort if malloc returns 0 or other details, depending on what the application calls for) It used to enable a number of optimizations, for instance in gcc-9 auto f(){ return std::vector(4096); } was optimized to just one call to calloc (someone broke that in gcc-10). Using LTO on libsupc++ is related. I don't know if we want to define "sane" operators new/delete, or just have a flag that promises that we won't try to replace the default ones. -- Marc Glisse
Re: [PATCH 3/3] Match: Add pattern for `(a ? b : 0) | (a ? 0 : c)` into `a ? b : c` [PR103660]
--- a/gcc/match.pd +++ b/gcc/match.pd @@ -2339,6 +2339,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (INTEGRAL_TYPE_P (type)) (bit_and @0 @1))) +/* Fold `(a ? b : 0) | (a ? 0 : c)` into (a ? b : c). +Handle also ^ and + in replacement of `|`. */ +(for cnd (cond vec_cond) + (for op (bit_ior bit_xor plus) + (simplify + (op:c +(cnd:s @0 @00 integer_zerop) +(cnd:s @0 integer_zerop @01)) + (cnd @0 @00 @01 Wouldn't it fall into something more generic like (for cnd (cond vec_cond) (for op (any_binary) (simplify (op (cnd:s @0 @1 @2) (cnd:s @0 @3 @4)) (cnd @0 (op! @1 @3) (op! @2 @4) ? The example given in the doc for the use of '!' is pretty close @smallexample (simplify (plus (vec_cond:s @@0 @@1 @@2) @@3) (vec_cond @@0 (plus! @@1 @@3) (plus! @@2 @@3))) @end smallexample -- Marc Glisse
Re: [PATCH v2] MATCH: Add simplification for MAX and MIN to match.pd [PR109878]
A few ideas in case you want to generalize the transformation (these are not requirements to get your patch in, and this is not a review): On Fri, 19 Jul 2024, Eikansh Gupta wrote: --- a/gcc/match.pd +++ b/gcc/match.pd @@ -4321,6 +4321,32 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) @0 @2))) +/* min (a & CST0, a & CST1) -> a & CST0 IFF CST0 & CST1 == CST0 */ +/* max (a & CST0, a & CST1) -> a & CST0 IFF CST0 & CST1 == CST1 */ +/* If signed a, then both the constants should have same sign. */ +(for minmax (min max) + (simplify + (minmax (bit_and@3 @0 INTEGER_CST@1) (bit_and@4 @0 INTEGER_CST@2)) + (if (TYPE_UNSIGNED (type) +|| (tree_int_cst_sgn (@1) == tree_int_cst_sgn (@2))) +(with { auto andvalue = wi::to_wide (@1) & wi::to_wide (@2); } + (if (andvalue == ((minmax == MIN_EXPR) + ? wi::to_wide (@1) : wi::to_wide (@2))) + @3 + (if (andvalue == ((minmax != MIN_EXPR) +? wi::to_wide (@1) : wi::to_wide (@2))) + @4)) Since max(a&1,a&3) is a&3, I think in the signed case we could also replace max(a&N,a&3) with a&3 if N is 1 | sign-bit (i.e. -1u/2+2). Indeed, either a>=0 and a&N is a&1, or a<0 and a&N < 0 <= a&3. +/* min (a, a & CST) --> a & CST */ +/* max (a, a & CST) --> a */ +(for minmax (min max) + (simplify + (minmax @0 (bit_and@1 @0 INTEGER_CST@2)) Why do you require that @2 be a constant? + (if (TYPE_UNSIGNED(type)) +(if (minmax == MIN_EXPR) + @1 + @0 Do we already have the corresponding transformations for comparisons? a&b <= a --> true (if unsigned) etc Ideally, we would have **1** transformation for max(X,Y) that tries to fold X<=Y and if it folds to true then simplifies to Y. This way the transformations would only need to be written for comparisons, not minmax. -- Marc Glisse
Re: vector conditional expression with scalar arguments
On Sun, 4 Aug 2013, Gerald Pfeifer wrote: On Sat, 13 Jul 2013, Marc Glisse wrote: 2013-07-14 Marc Glisse gcc/cp/ * call.c (build_conditional_expr_1): Handle the case with 1 vector and 2 scalars. Call save_expr before building a vector. * typeck.c (cp_build_binary_op): Check complain before complaining. Shouldn't this be documented somewhere (gcc/doc/*.texi and our web based release notes)? Yes, it should. I had posted this some time ago: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00664.html Here is an updated version that also mentions the new scalar case: 2013-08-23 Marc Glisse * doc/extend.texi (Vector Extensions): Document ?: in C++. -- Marc GlisseIndex: doc/extend.texi === --- doc/extend.texi (revision 201948) +++ doc/extend.texi (working copy) @@ -7023,20 +7023,33 @@ otherwise. Consider the following exampl typedef int v4si __attribute__ ((vector_size (16))); v4si a = @{1,2,3,4@}; v4si b = @{3,2,1,4@}; v4si c; c = a > b; /* The result would be @{0, 0,-1, 0@} */ c = a == b; /* The result would be @{0,-1, 0,-1@} */ @end smallexample +In C++, the ternary operator @code{?:} is available. @code{a?b:c}, where +@code{b} and @code{c} are vectors of the same type and @code{a} is an +integer vector of the same size and number of elements as @code{b} and +@code{c}, computes all three arguments and creates a vector +@code{@{a[0]?b[0]:c[0], a[1]?b[1]:c[1], @dots{}@}}. Note that unlike in +OpenCL, @code{a} is thus interpreted as @code{a != 0} and not @code{a < 0}. +As in the case of binary operations, this syntax is also accepted when +one of @code{b} or @code{c} is a scalar that is then transformed into a +vector. If both @code{b} and @code{c} are scalars and the type of +@code{true?b:c} has the same size as the element type of @code{a}, then +@code{b} and @code{c} are converted to a vector type whose elements have +this type and with the same number of elements as @code{a}. + Vector shuffling is available using functions @code{__builtin_shuffle (vec, mask)} and @code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct a permutation of elements from one or two vectors and return a vector of the same type as the input vector(s). The @var{mask} is an integral vector with the same width (@var{W}) and element count (@var{N}) as the output vector. The elements of the input vectors are numbered in memory ordering of @var{vec0} beginning at 0 and @var{vec1} beginning at @var{N}. The
Re: Request to merge Undefined Behavior Sanitizer in
On Thu, 8 Aug 2013, Joseph S. Myers wrote: On Fri, 26 Jul 2013, Marc Glisse wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57324 (that is using llvm's sanitizer) and for a first patch (unreviewed): http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01466.html (started at http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01402.html ) That patch is OK. Applied at r201953 after retesting, thanks. -- Marc Glisse
Re: RFA: Consider int and same-size short as equivalent vector components
On Mon, 26 Aug 2013, Joern Rennecke wrote: avr currently shows the following failure: FAIL: c-c++-common/vector-scalar.c -Wc++-compat (test for excess errors) Excess errors: /home/amylaar/atmel/4.8/unisrc-mainline/gcc/testsuite/c-c++-common/vector-scalar .c:9:34: error: invalid operands to binary | (have '__vector(8) int' and 'veci') The issue is that when we compute the result of an operatiopn of a veci and an int, we get a __vector(8) int result (int is the same size as short), Maybe we could change that? yet the typechecking later does not accept the vectors as compatible. Fixed by relaxing the latter for the case that int and short are the same size. If you do this, maybe rename the function, or at least expand the comment, to make it clear that it should only be used for comparison operators with vectors? The issue seems larger than just short/int. On x86, (lcompile for a vector of long, with lthat seems wrong. -- Marc Glisse
Re: RFA: Consider int and same-size short as equivalent vector components
On Mon, 26 Aug 2013, Joern Rennecke wrote: Quoting Marc Glisse : The issue seems larger than just short/int. On x86, (l I don't understand what you mean here. Could you post the actual code sample? typedef long vl __attribute__((vector_size(16))); void f(vl l){ (l(btw, my sentence was ambiguous, what I find wrong is the failure to compile, not the type of l -- Marc Glisse
Re: Folding (a ? b : c) op d
Ping: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01624.html and the related: http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00230.html -- Marc Glisse
Re: Folding (a ? b : c) op d
On Fri, 30 Aug 2013, Richard Biener wrote: On Sat, Jun 29, 2013 at 9:02 AM, Marc Glisse wrote: Hello, this patch changes the test deciding whether to fold "op d" with both branches in (a ? b : c) op d. I don't know if the new test is right, it gives what I expect on the new testcase, but I may have missed important cases. Cc: Eric for comments as the author of the previous conditions. Passes bootstrap+testsuite on x86_64-unknown-linux-gnu. 2013-06-29 Marc Glisse PR tree-optimization/57755 gcc/ * fold-const.c (fold_binary_op_with_conditional_arg): Change condition under which the transformation is performed. gcc/testsuite/ * gcc.dg/pr57755.c: New testcase. * gcc.dg/binop-notand1a.c: Remove xfail. * gcc.dg/binop-notand4a.c: Likewise. -- Marc Glisse Index: fold-const.c === --- fold-const.c(revision 200556) +++ fold-const.c(working copy) @@ -6097,26 +6097,33 @@ constant_boolean_node (bool value, tree given here), it is the second argument. TYPE is the type of the original expression. Return NULL_TREE if no simplification is possible. */ static tree fold_binary_op_with_conditional_arg (location_t loc, enum tree_code code, tree type, tree op0, tree op1, tree cond, tree arg, int cond_first_p) { - tree cond_type = cond_first_p ? TREE_TYPE (op0) : TREE_TYPE (op1); - tree arg_type = cond_first_p ? TREE_TYPE (op1) : TREE_TYPE (op0); + /* ??? This transformation is only worthwhile if we don't have + to wrap ARG in a SAVE_EXPR. */ + if (TREE_SIDE_EFFECTS (arg)) +return NULL_TREE; + + /* Avoid exponential recursion. */ + static int depth = 0; + if (depth > 1) +return NULL_TREE; + I don't like this kind of measures ... which one again is the transform that undoes what this function does? There is no undoing here (you may be thinking of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57286 ), it is just a recursion that gets very slow for nested conditions: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55219 tree test, true_value, false_value; tree lhs = NULL_TREE; tree rhs = NULL_TREE; - enum tree_code cond_code = COND_EXPR; if (TREE_CODE (cond) == COND_EXPR || TREE_CODE (cond) == VEC_COND_EXPR) { test = TREE_OPERAND (cond, 0); true_value = TREE_OPERAND (cond, 1); false_value = TREE_OPERAND (cond, 2); /* If this operand throws an expression, then it does not make sense to try to perform a logical or arithmetic operation involving it. */ @@ -6126,54 +6133,49 @@ fold_binary_op_with_conditional_arg (loc rhs = false_value; } else { tree testtype = TREE_TYPE (cond); test = cond; true_value = constant_boolean_node (true, testtype); false_value = constant_boolean_node (false, testtype); } - if (TREE_CODE (TREE_TYPE (test)) == VECTOR_TYPE) -cond_code = VEC_COND_EXPR; - - /* This transformation is only worthwhile if we don't have to wrap ARG - in a SAVE_EXPR and the operation can be simplified without recursing - on at least one of the branches once its pushed inside the COND_EXPR. */ - if (!TREE_CONSTANT (arg) - && (TREE_SIDE_EFFECTS (arg) - || TREE_CODE (arg) == COND_EXPR || TREE_CODE (arg) == VEC_COND_EXPR - || TREE_CONSTANT (true_value) || TREE_CONSTANT (false_value))) -return NULL_TREE; + tree cond_type = cond_first_p ? TREE_TYPE (op0) : TREE_TYPE (op1); + tree arg_type = cond_first_p ? TREE_TYPE (op1) : TREE_TYPE (op0); arg = fold_convert_loc (loc, arg_type, arg); + ++depth; if (lhs == 0) { true_value = fold_convert_loc (loc, cond_type, true_value); if (cond_first_p) lhs = fold_build2_loc (loc, code, type, true_value, arg); else lhs = fold_build2_loc (loc, code, type, arg, true_value); } if (rhs == 0) { false_value = fold_convert_loc (loc, cond_type, false_value); if (cond_first_p) rhs = fold_build2_loc (loc, code, type, false_value, arg); else rhs = fold_build2_loc (loc, code, type, arg, false_value); } + --depth; /* Check that we have simplified at least one of the branches. */ - if (!TREE_CONSTANT (arg) && !TREE_CONSTANT (lhs) && !TREE_CONSTANT (rhs)) + if (!TREE_CONSTANT (lhs) && !TREE_CONSTANT (rhs)) return NULL_TREE; + enum tree_code cond_code = VECTOR_TYPE_P (TREE_TYPE (test)) +? VEC_COND_EXPR : COND_EXPR; return fold_build3_loc (loc, cond_code, type, test, lhs, rhs); } /* Subroutine of fold() that checks for the addition of +/- 0.0. If !NEGATE, return true if ADDEND is +/-0.0 and, for all X of type TYPE, X + ADDEND is the same
Re: folding (vec_)cond_expr in a binary operation
On Fri, 30 Aug 2013, Richard Biener wrote: On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse wrote: First, the fold-const bit causes an assertion failure (building libjava) in combine_cond_expr_cond, which calls: t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1); and then checks: /* Require that we got a boolean type out if we put one in. */ gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type)); which makes sense... Note that the 2 relevant types are: well, the check is probably old and can be relaxed to also allow all compatible types. Ok. Maybe it could even be removed then, we shouldn't check in every function that calls fold_binary_loc that it returns something of the type that was asked for. Second, the way the forwprop transformation is done, it can be necessary to run the forwprop pass several times in a row, which is not nice. It takes: stmt_cond stmt_binop and produces: stmt_folded stmt_cond2 But one of the arguments of stmt_folded could be an ssa_name defined by a cond_expr, which could now be propagated into stmt_folded (there may be other new possible transformations). However, that cond_expr has already been visited and won't be again. The combine part of the pass uses a PLF to revisit statements, but the forwprop part doesn't have any specific mechanism. forwprop is designed to re-process stmts it has folded to catch cascading effects. Now, as it as both a forward and a backward run you don't catch 2nd-order effects with iterating them. On my TODO is to only do one kind, either forward or backward propagation. My impression is that even internally in the forward part, the reprocessing doesn't really work, but that'll disappear anyway when the forward part disappears. Btw, as for the patch I don't like that you basically feed everything into fold. Yes, I know we do that for conditions because that's quite important and nobody has re-written the condition folding as gimple pattern matcher. I doubt that this transform is important enough to warrant another exception ;) I am not sure what you want here. When I notice the pattern (a?b:c) op d, I need to check whether b op d and c op d are likely to simplify before transforming it to a?(b op d):(c op d). And currently I can't see any way to do that other than feeding (b op d) to fold. Even if/when we get a gimple folder, we will still need to call it in about the same way. -- Marc Glisse
Re: folding (vec_)cond_expr in a binary operation
(attaching the latest version. I only added comments since regtesting it) On Tue, 3 Sep 2013, Richard Biener wrote: Btw, as for the patch I don't like that you basically feed everything into fold. Yes, I know we do that for conditions because that's quite important and nobody has re-written the condition folding as gimple pattern matcher. I doubt that this transform is important enough to warrant another exception ;) I am not sure what you want here. When I notice the pattern (a?b:c) op d, I need to check whether b op d and c op d are likely to simplify before transforming it to a?(b op d):(c op d). And currently I can't see any way to do that other than feeding (b op d) to fold. Even if/when we get a gimple folder, we will still need to call it in about the same way. With a gimple folder you'd avoid building trees. Ah, so the problem is the cost of building those 2 trees? It will indeed be better when we can avoid it, but I really don't expect the cost to be that much... You are testing for a quite complex pattern here - (a?b:c) & (d?e:f). But it is really handled in several steps. IIRC: (a?1:0)&x becomes a?(x&1):0. Since x is d?1:0, x&1 becomes d?1:0. a?(d?1:0):0 is not (yet?) simplified further. Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0 then a?(d?-1:0):0 and finally a?d:0. Each step can also trigger individually. It seems to be that for example + vec c=(a>3)?o:z; + vec d=(b>2)?o:z; + vec e=c&d; would be better suited in the combine phase (you are interested in combining the comparisons). That is, look for a [&|^] b where a and b are [VEC_]COND_EXPRs with the same values. Hmm, I am already looking for a binary op which has at least one operand which is a [VEC_]COND_EXPR, in the combine (=backward) part of tree-ssa-forwprop. Isn't that almost what you are suggesting? Heh, and it's not obvious to me with looking for a minute what this should be simplified to ... (a?x:y)&(b?x:y) doesn't really simplify in general. (so the code and the testcase needs some comments on what you are trying to catch ...) aIndex: tree-ssa-forwprop.c === --- tree-ssa-forwprop.c (revision 202185) +++ tree-ssa-forwprop.c (working copy) @@ -363,23 +363,20 @@ combine_cond_expr_cond (gimple stmt, enu gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison); fold_defer_overflow_warnings (); t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1); if (!t) { fold_undefer_overflow_warnings (false, NULL, 0); return NULL_TREE; } - /* Require that we got a boolean type out if we put one in. */ - gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type)); - /* Canonicalize the combined condition for use in a COND_EXPR. */ t = canonicalize_cond_expr_cond (t); /* Bail out if we required an invariant but didn't get one. */ if (!t || (invariant_only && !is_gimple_min_invariant (t))) { fold_undefer_overflow_warnings (false, NULL, 0); return NULL_TREE; } @@ -1135,20 +1132,131 @@ forward_propagate_comparison (gimple_stm /* Remove defining statements. */ return remove_prop_source_from_use (name); bailout: gsi_next (defgsi); return false; } +/* Combine the binary operation defined in *GSI with one of its arguments + that is a condition, i.e. (A ? B : C) op D becomes A ? (B op D) : (C op D). + Returns 1 if there were any changes made, 2 if cfg-cleanup needs to + run. Else it returns 0. */ + +static int +combine_binop_with_condition (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + tree type = TREE_TYPE (gimple_assign_lhs (stmt)); + enum tree_code code = gimple_assign_rhs_code (stmt); + tree rhs1 = gimple_assign_rhs1 (stmt); + tree rhs2 = gimple_assign_rhs2 (stmt); + gimple def_stmt; + enum tree_code defcode; + bool trueok = false; + bool falseok = false; + tree true_op, false_op; + location_t loc = gimple_location (stmt); + + if (TREE_CODE (rhs1) != SSA_NAME) +goto second_op; + + def_stmt = get_prop_source_stmt (rhs1, true, NULL); + if (!def_stmt || !can_propagate_from (def_stmt)) +goto second_op; + + defcode = gimple_assign_rhs_code (def_stmt); + if (defcode != COND_EXPR && defcode != VEC_COND_EXPR) +goto second_op; + + true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt), +gimple_assign_rhs2 (stmt)); + false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt), + gimple_assign_rhs2 (stmt)); + + if (is_gimple_val (true_op)) +trueok = true; + if (is_gimple_val (false_op)) +falseok = true; + /* At least one of them has to simplify, or it isn't worth it. */ + if (!trueok && !falseok) +goto second_op; + if (!trueok) +{ + if (!valid_gimple_rhs_p (true_op)) + goto second_op; + gimple g = gimple_build_assign (make_ssa_name (type, NULL),
operator new returns nonzero
Hello, this patch teaches the compiler that operator new, when it can throw, isn't allowed to return a null pointer. Note that it doesn't work for placement new (that one may have to be handled in the front-end or the inliner), and it will not work on user-defined operator new if they are inlined. I guess it would be possible to teach the inliner to insert an assert_expr or something when inlining such a function, but that's not for now. The code I removed from vrp_visit_stmt was duplicated in stmt_interesting_for_vrp and thus useless. I ran the c,c++ testsuite on a non-bootstrap compiler. I'll do more proper testing when trunk bootstraps again. 2013-09-07 Marc Glisse PR c++/19476 gcc/ * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. -- Marc GlisseIndex: fold-const.c === --- fold-const.c(revision 202351) +++ fold-const.c(working copy) @@ -16171,21 +16171,29 @@ tree_expr_nonzero_warnv_p (tree t, bool case MODIFY_EXPR: case BIND_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1), strict_overflow_p); case SAVE_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0), strict_overflow_p); case CALL_EXPR: - return alloca_call_p (t); + { + tree fn = CALL_EXPR_FN (t); + if (TREE_CODE (fn) != ADDR_EXPR) return false; + tree fndecl = TREE_OPERAND (fn, 0); + if (TREE_CODE (fndecl) != FUNCTION_DECL) return false; + if (DECL_IS_OPERATOR_NEW (fndecl) && !TREE_NOTHROW (fndecl)) + return true; + return alloca_call_p (t); + } default: break; } return false; } /* Return true when T is an address and is known to be nonzero. Handle warnings about undefined signed overflow. */ Index: testsuite/g++.dg/tree-ssa/pr19476-1.C === --- testsuite/g++.dg/tree-ssa/pr19476-1.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-1.C (revision 0) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ccp1" } */ + +#include + +int f(){ + return 33 + (0 == new(std::nothrow) int); +} +int g(){ + return 42 + (0 == new int[50]); +} + +/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */ +/* { dg-final { cleanup-tree-dump "ccp1" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: testsuite/g++.dg/tree-ssa/pr19476-2.C === --- testsuite/g++.dg/tree-ssa/pr19476-2.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-2.C (revision 0) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#include + +int f(){ + int *p = new(std::nothrow) int; + return 33 + (0 == p); +} +int g(){ + int *p = new int[50]; + return 42 + (0 == p); +} + +/* { dg-final { scan-tree-dump "return 42" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C ___ Added: svn:eol-style + native Added: svn:keywords + Author Date Id Revision URL Index: tree-vrp.c === --- tree-vrp.c (revision 202351) +++ tree-vrp.c (working copy) @@ -1047,21 +1047,27 @@ gimple_assign_nonzero_warnv_p (gimple st *STRICT_OVERFLOW_P.*/ static bool gimple_stmt_nonzero_warnv_p (gimple stmt, bool *strict_overflow_p) { switch (gimple_code (stmt)) { case GIMPLE_ASSIGN: return gimple_assign_nonzero_warnv_p (stmt, strict_overflow_p); case GIMPLE_CALL: - return gimple_alloca_call_p (stmt); + { + tree fndecl = gimple_call_fndecl (stmt); + if (!fndecl) return false; + if (DECL_IS_OPERATOR_NEW (fndecl) && !TREE_NOTHROW (fndecl)) + return true; + return gimple_alloca_call_p (stmt); + } default: gcc_unreachable (); } } /* Like tree_expr_nonzero_warnv_p, but this function uses value ranges obtained so far. */
Re: operator new returns nonzero
On Sat, 7 Sep 2013, Basile Starynkevitch wrote: On Sat, 2013-09-07 at 12:33 +0200, Marc Glisse wrote: Hello, this patch teaches the compiler that operator new, when it can throw, isn't allowed to return a null pointer. Note that it doesn't work for placement new (that one may have to be handled in the front-end or the inliner), and it will not work on user-defined operator new if they are inlined. I guess it would be possible to teach the inliner to insert an assert_expr or something when inlining such a function, but that's not for now. The code I removed from vrp_visit_stmt was duplicated in stmt_interesting_for_vrp and thus useless. Interesting patch. However, I feel that we should give advanced users the ability to say that their new operator is never returning null... Easy, don't specify noexcept on it and that's what the patch already does, as long as the function is not inlined. In particular, if I define my own new operator which never returns nil, I find strange that it would be less optimized than the system's operator new. Perhaps we need an attribute `nonnullresult' which whould means that. (we already the nonnull attribute for function arguments) There is already a PR about that, linked from this PR. But even if we create this attribute, we will still want to be able to guess that new should have it even if it isn't written. -- Marc Glisse
Re: operator new returns nonzero
On Sat, 7 Sep 2013, Marc Glisse wrote: this patch teaches the compiler that operator new, when it can throw, isn't allowed to return a null pointer. Note that it doesn't work for placement new (that one may have to be handled in the front-end or the inliner), Placement new is a completely different thing. For one, it is nothrow (so the test doesn't apply). But mostly, it is a condition on the argument more than the result. Using the nonnull attribute would make sense, but the compiler doesn't seem clever enough to use that information. The easiest seems to be in the library: --- o/new 2013-09-07 11:11:17.388756320 +0200 +++ i/new 2013-09-07 18:00:32.204797291 +0200 @@ -144,9 +144,9 @@ // Default placement versions of operator new. inline void* operator new(std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT -{ return __p; } +{ if (!__p) __builtin_unreachable(); return __p; } inline void* operator new[](std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT -{ return __p; } +{ if (!__p) __builtin_unreachable(); return __p; } // Default placement versions of operator delete. inline void operator delete (void*, void*) _GLIBCXX_USE_NOEXCEPT { } Though I am not sure what the policy is for this kind of thing. And that's assuming it is indeed undefined to pass a null pointer to it, I don't have a good reference for that. and it will not work on user-defined operator new if they are inlined. But then if that operator new is inlined, it may already contain a line like if(p==0)throw(); which lets the compiler know all it needs. I guess it would be possible to teach the inliner to insert an assert_expr or something when inlining such a function, but that's not for now. The code I removed from vrp_visit_stmt was duplicated in stmt_interesting_for_vrp and thus useless. I ran the c,c++ testsuite on a non-bootstrap compiler. I'll do more proper testing when trunk bootstraps again. 2013-09-07 Marc Glisse PR c++/19476 gcc/ * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. -- Marc Glisse
Re: operator new returns nonzero
On Sat, 7 Sep 2013, Gabriel Dos Reis wrote: On Sat, Sep 7, 2013 at 11:42 AM, Marc Glisse wrote: On Sat, 7 Sep 2013, Marc Glisse wrote: this patch teaches the compiler that operator new, when it can throw, isn't allowed to return a null pointer. Note that it doesn't work for placement new (that one may have to be handled in the front-end or the inliner), Placement new is a completely different thing. For one, it is nothrow (so the test doesn't apply). But mostly, it is a condition on the argument more than the result. Using the nonnull attribute would make sense, but the compiler doesn't seem clever enough to use that information. The easiest seems to be in the library: --- o/new 2013-09-07 11:11:17.388756320 +0200 +++ i/new 2013-09-07 18:00:32.204797291 +0200 @@ -144,9 +144,9 @@ // Default placement versions of operator new. inline void* operator new(std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT -{ return __p; } +{ if (!__p) __builtin_unreachable(); return __p; } inline void* operator new[](std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT -{ return __p; } +{ if (!__p) __builtin_unreachable(); return __p; } // Default placement versions of operator delete. inline void operator delete (void*, void*) _GLIBCXX_USE_NOEXCEPT { } Though I am not sure what the policy is for this kind of thing. And that's assuming it is indeed undefined to pass a null pointer to it, I don't have a good reference for that. and it will not work on user-defined operator new if they are inlined. But then if that operator new is inlined, it may already contain a line like if(p==0)throw(); which lets the compiler know all it needs. I am not sure I like this version of the patch. The 2 patches are really independent, one (in the compiler) for regular operator new, and one (in the library) for the placement version. I mostly like this second patch because it is so easy ;-) But I will be happy if someone posts a better solution. I think the appropriate patch should be in the compiler, not here. C++ has several places where certain parameters cannot have non-null values. For example, the implicit parameter 'this' in non-static member functions. This is another instance. Indeed, I didn't check how gcc handles "this" being non-0. Furthermore, I do think that the compiler should have special nodes for both standard placement new and the global operator new functions, That's one way to do it. Since this is the first time I look at those, I don't really see the advantage compared to the current status, but I trust you. What would you do with this special-node placement new? Keep it as is until after vrp so we can use the !=0 property and then expand it to its first argument? Or expand it early to the equivalent of the library code I wrote above? as I explained in previous messages. I couldn't find them, sorry if they contained information that makes this post irrelevant. -- Marc Glisse
Re: operator new returns nonzero
On Sat, 7 Sep 2013, Mike Stump wrote: On Sep 7, 2013, at 3:33 AM, Marc Glisse wrote: this patch teaches the compiler that operator new, when it can throw, isn't allowed to return a null pointer. You sure: @item -fcheck-new @opindex fcheck-new Check that the pointer returned by @code{operator new} is non-null before attempting to modify the storage allocated. This check is normally unnecessary because the C++ standard specifies that @code{operator new} only returns @code{0} if it is declared @samp{throw()}, in which case the compiler always checks the return value even without this option. In all other cases, when @code{operator new} has a non-empty exception specification, memory exhaustion is signalled by throwing @code{std::bad_alloc}. See also @samp{new (nothrow)}. ? Thanks, I didn't know that option. But it doesn't do the same. -fcheck-new is about the front-end inserting a test !=0 between malloc and the constructor. My patch is about teaching the middle-end that the value is not zero (so even user-coded comparisons with 0 can be simplified). Now flag_check_new should probably disable this optimization... The 3 -fcheck-new testcases in the testsuite probably only pass because they don't have -O2. -- Marc Glisse
Re: operator new returns nonzero
On Sat, 7 Sep 2013, Mike Stump wrote: On Sep 7, 2013, at 12:37 PM, Gabriel Dos Reis wrote: On Sat, Sep 7, 2013 at 2:27 PM, Marc Glisse wrote: On Sat, 7 Sep 2013, Mike Stump wrote: On Sep 7, 2013, at 3:33 AM, Marc Glisse wrote: this patch teaches the compiler that operator new, when it can throw, isn't allowed to return a null pointer. You sure: @item -fcheck-new @opindex fcheck-new Check that the pointer returned by @code{operator new} is non-null before attempting to modify the storage allocated. This check is normally unnecessary because the C++ standard specifies that @code{operator new} only returns @code{0} if it is declared @samp{throw()}, in which case the compiler always checks the return value even without this option. In all other cases, when @code{operator new} has a non-empty exception specification, memory exhaustion is signalled by throwing @code{std::bad_alloc}. See also @samp{new (nothrow)}. ? Thanks, I didn't know that option. But it doesn't do the same. Indeed. Can this throw: void *operator new (long unsigned int s) { return 0; } ? Is this allowed to return 0? I think using this function is illegal. It isn't marked noexcept, so it isn't allowed to return 0. -- Marc Glisse
Re: operator new returns nonzero
On Sat, 7 Sep 2013, Mike Stump wrote: On Sep 7, 2013, at 12:27 PM, Marc Glisse wrote: Now flag_check_new should probably disable this optimization… Yes, this why my point. Ok, here it is (again, no proper testing until bootstrap is fixed) 2013-09-07 Marc Glisse PR c++/19476 gcc/ * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. * g++.dg/tree-ssa/pr19476-3.C: Likewise. -- Marc GlisseIndex: testsuite/g++.dg/tree-ssa/pr19476-1.C === --- testsuite/g++.dg/tree-ssa/pr19476-1.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-1.C (revision 0) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ccp1" } */ + +#include + +int f(){ + return 33 + (0 == new(std::nothrow) int); +} +int g(){ + return 42 + (0 == new int[50]); +} + +/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */ +/* { dg-final { cleanup-tree-dump "ccp1" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: testsuite/g++.dg/tree-ssa/pr19476-2.C === --- testsuite/g++.dg/tree-ssa/pr19476-2.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-2.C (revision 0) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#include + +int f(){ + int *p = new(std::nothrow) int; + return 33 + (0 == p); +} +int g(){ + int *p = new int[50]; + return 42 + (0 == p); +} + +/* { dg-final { scan-tree-dump "return 42" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: testsuite/g++.dg/tree-ssa/pr19476-3.C === --- testsuite/g++.dg/tree-ssa/pr19476-3.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-3.C (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fcheck-new -fdump-tree-optimized" } */ + +#include + +int g(){ + return 42 + (0 == new int); +} + +/* { dg-final { scan-tree-dump-not "return 42" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-3.C ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: fold-const.c === --- fold-const.c(revision 202351) +++ fold-const.c(working copy) @@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool case MODIFY_EXPR: case BIND_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1), strict_overflow_p); case SAVE_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0), strict_overflow_p); case CALL_EXPR: - return alloca_call_p (t); + { + tree fn = CALL_EXPR_FN (t); + if (TREE_CODE (fn) != ADDR_EXPR) return false; + tree fndecl = TREE_OPERAND (fn, 0); + if (TREE_CODE (fndecl) != FUNCTION_DECL) return false; + if (!flag_check_new + && DECL_IS_OPERATOR_NEW (fndecl) + && !TREE_NOTHROW (fndecl)) + return true; + return alloca_call_p (t); + } default: break; } return false; } /* Return true when T is an address and is known to be nonzero. Handle warnings about undefined signed overflow. */ Index: tree-vrp.c === --- tree-vrp.c (revision 202351) +++ tree-vrp.c (working copy) @@ -1047,21 +1047,29 @@ gimple_assign_nonzero_warnv_p (gimple st *STRICT_OVERFLOW_P.*/ static bool gimple_stmt_nonzero_warnv_p (gimple stmt, bool *strict_overflow_p) { switch (gimple_code (stmt)) { case GIMPLE_ASSIGN: return gimple_assign_nonzero_warnv_p (stmt, strict_overflow_p); case
Re: operator new returns nonzero
On Sat, 7 Sep 2013, Marc Glisse wrote: On Sat, 7 Sep 2013, Mike Stump wrote: Can this throw: void *operator new (long unsigned int s) { return 0; } ? Is this allowed to return 0? I think using this function is illegal. It isn't marked noexcept, so it isn't allowed to return 0. And if I compile your code with gcc, I get nice warnings (though I get them twice and the column number is not so good): m.cc: In function 'void* operator new(long unsigned int)': m.cc:2:12: warning: 'operator new' must not return NULL unless it is declared 'throw()' (or -fcheck-new is in effect) [enabled by default] return 0; ^ m.cc: At global scope: m.cc:1:7: warning: unused parameter 's' [-Wunused-parameter] void *operator new (long unsigned int s) { ^ -- Marc Glisse
Re: operator new returns nonzero
On Mon, 9 Sep 2013, Richard Biener wrote: On Sat, Sep 7, 2013 at 11:00 PM, Marc Glisse wrote: On Sat, 7 Sep 2013, Mike Stump wrote: On Sep 7, 2013, at 12:27 PM, Marc Glisse wrote: Now flag_check_new should probably disable this optimization… Yes, this why my point. Ok, here it is (again, no proper testing until bootstrap is fixed) I wonder what happens on targets where 0 is a valid address of an object (stated by !flag_delete_null_pointer_checks)? I am not at all familiar with those targets (I thought you had to write asm to access 0 so the compiler doesn't mess with your code), but it makes sense to me to test (flag_delete_null_pointer_checks && !flag_check_new) instead of just !flag_check_new. Consider the patch changed this way. (we have so many options, I wouldn't be surprised if there is yet another one to check...) -- Marc Glisse
Re: operator new returns nonzero
I have now tested bootstrap+testsuite and there was no regression. 2013-09-07 Marc Glisse PR c++/19476 gcc/ * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. * g++.dg/tree-ssa/pr19476-3.C: Likewise. * g++.dg/tree-ssa/pr19476-4.C: Likewise. -- Marc GlisseIndex: fold-const.c === --- fold-const.c(revision 202413) +++ fold-const.c(working copy) @@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool case MODIFY_EXPR: case BIND_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1), strict_overflow_p); case SAVE_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0), strict_overflow_p); case CALL_EXPR: - return alloca_call_p (t); + { + tree fn = CALL_EXPR_FN (t); + if (TREE_CODE (fn) != ADDR_EXPR) return false; + tree fndecl = TREE_OPERAND (fn, 0); + if (TREE_CODE (fndecl) != FUNCTION_DECL) return false; + if (flag_delete_null_pointer_checks && !flag_check_new + && DECL_IS_OPERATOR_NEW (fndecl) + && !TREE_NOTHROW (fndecl)) + return true; + return alloca_call_p (t); + } default: break; } return false; } /* Return true when T is an address and is known to be nonzero. Handle warnings about undefined signed overflow. */ Index: testsuite/g++.dg/tree-ssa/pr19476-1.C === --- testsuite/g++.dg/tree-ssa/pr19476-1.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-1.C (working copy) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ccp1" } */ + +#include + +int f(){ + return 33 + (0 == new(std::nothrow) int); +} +int g(){ + return 42 + (0 == new int[50]); +} + +/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */ +/* { dg-final { cleanup-tree-dump "ccp1" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C ___ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Index: testsuite/g++.dg/tree-ssa/pr19476-2.C === --- testsuite/g++.dg/tree-ssa/pr19476-2.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-2.C (working copy) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#include + +int f(){ + int *p = new(std::nothrow) int; + return 33 + (0 == p); +} +int g(){ + int *p = new int[50]; + return 42 + (0 == p); +} + +/* { dg-final { scan-tree-dump "return 42" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C ___ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Index: testsuite/g++.dg/tree-ssa/pr19476-3.C === --- testsuite/g++.dg/tree-ssa/pr19476-3.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-3.C (working copy) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fcheck-new -fdump-tree-optimized" } */ + +#include + +int g(){ + return 42 + (0 == new int); +} + +/* { dg-final { scan-tree-dump-not "return 42" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-3.C ___ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Index: testsuite/g++.dg/tree-ssa/pr19476-4.C === --- testsuite/g++.dg/tree-ssa/pr19476-4.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-4.C (working copy) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fno-delete-null-pointer-chec
Re: folding (vec_)cond_expr in a binary operation
Any other comments on this patch? http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00129.html On Tue, 3 Sep 2013, Marc Glisse wrote: (attaching the latest version. I only added comments since regtesting it) On Tue, 3 Sep 2013, Richard Biener wrote: Btw, as for the patch I don't like that you basically feed everything into fold. Yes, I know we do that for conditions because that's quite important and nobody has re-written the condition folding as gimple pattern matcher. I doubt that this transform is important enough to warrant another exception ;) I am not sure what you want here. When I notice the pattern (a?b:c) op d, I need to check whether b op d and c op d are likely to simplify before transforming it to a?(b op d):(c op d). And currently I can't see any way to do that other than feeding (b op d) to fold. Even if/when we get a gimple folder, we will still need to call it in about the same way. With a gimple folder you'd avoid building trees. Ah, so the problem is the cost of building those 2 trees? It will indeed be better when we can avoid it, but I really don't expect the cost to be that much... You are testing for a quite complex pattern here - (a?b:c) & (d?e:f). But it is really handled in several steps. IIRC: (a?1:0)&x becomes a?(x&1):0. Since x is d?1:0, x&1 becomes d?1:0. a?(d?1:0):0 is not (yet?) simplified further. Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0 then a?(d?-1:0):0 and finally a?d:0. Each step can also trigger individually. It seems to be that for example + vec c=(a>3)?o:z; + vec d=(b>2)?o:z; + vec e=c&d; would be better suited in the combine phase (you are interested in combining the comparisons). That is, look for a [&|^] b where a and b are [VEC_]COND_EXPRs with the same values. Hmm, I am already looking for a binary op which has at least one operand which is a [VEC_]COND_EXPR, in the combine (=backward) part of tree-ssa-forwprop. Isn't that almost what you are suggesting? Heh, and it's not obvious to me with looking for a minute what this should be simplified to ... (a?x:y)&(b?x:y) doesn't really simplify in general. (so the code and the testcase needs some comments on what you are trying to catch ...) a -- Marc Glisse
Re: vector conditional expression with scalar arguments
Ping http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01381.html On Fri, 23 Aug 2013, Marc Glisse wrote: On Sun, 4 Aug 2013, Gerald Pfeifer wrote: On Sat, 13 Jul 2013, Marc Glisse wrote: 2013-07-14 Marc Glisse gcc/cp/ * call.c (build_conditional_expr_1): Handle the case with 1 vector and 2 scalars. Call save_expr before building a vector. * typeck.c (cp_build_binary_op): Check complain before complaining. Shouldn't this be documented somewhere (gcc/doc/*.texi and our web based release notes)? Yes, it should. I had posted this some time ago: http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00664.html Here is an updated version that also mentions the new scalar case: 2013-08-23 Marc Glisse * doc/extend.texi (Vector Extensions): Document ?: in C++. -- Marc Glisse
[v3] More noexcept for vectors
Hello, this patch passes bootstrap+testsuite. The guarantees given by the standard on allocators are a bit weird, but I see there is already DR2016 taking care of it. 2013-09-14 Marc Glisse PR libstdc++/58338 * include/bits/stl_vector.h (_Vector_impl::_Vector_impl(_Tp_alloc_type const&), _Vector_impl::_Vector_impl(_Tp_alloc_type&&), _Vector_impl::_M_swap_data, _Vector_base::_Vector_base(const allocator_type&), _Vector_base::_Vector_base(allocator_type&&), _Vector_base::_Vector_base(_Vector_base&&), vector::vector(const allocator_type&), vector::operator[], vector::operator[] const, vector::front, vector::front const, vector::back, vector::back const, vector::pop_back, vector::_M_erase_at_end): Mark as noexcept. (vector::~vector): Remove useless noexcept. -- Marc GlisseIndex: include/bits/stl_vector.h === --- include/bits/stl_vector.h (revision 202588) +++ include/bits/stl_vector.h (working copy) @@ -80,32 +80,32 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER : public _Tp_alloc_type { pointer _M_start; pointer _M_finish; pointer _M_end_of_storage; _Vector_impl() : _Tp_alloc_type(), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } - _Vector_impl(_Tp_alloc_type const& __a) + _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT : _Tp_alloc_type(__a), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } #if __cplusplus >= 201103L - _Vector_impl(_Tp_alloc_type&& __a) + _Vector_impl(_Tp_alloc_type&& __a) noexcept : _Tp_alloc_type(std::move(__a)), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } #endif - void _M_swap_data(_Vector_impl& __x) + void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT { std::swap(_M_start, __x._M_start); std::swap(_M_finish, __x._M_finish); std::swap(_M_end_of_storage, __x._M_end_of_storage); } }; public: typedef _Alloc allocator_type; @@ -117,36 +117,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT { return *static_cast(&this->_M_impl); } allocator_type get_allocator() const _GLIBCXX_NOEXCEPT { return allocator_type(_M_get_Tp_allocator()); } _Vector_base() : _M_impl() { } - _Vector_base(const allocator_type& __a) + _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT : _M_impl(__a) { } _Vector_base(size_t __n) : _M_impl() { _M_create_storage(__n); } _Vector_base(size_t __n, const allocator_type& __a) : _M_impl(__a) { _M_create_storage(__n); } #if __cplusplus >= 201103L - _Vector_base(_Tp_alloc_type&& __a) + _Vector_base(_Tp_alloc_type&& __a) noexcept : _M_impl(std::move(__a)) { } - _Vector_base(_Vector_base&& __x) + _Vector_base(_Vector_base&& __x) noexcept : _M_impl(std::move(__x._M_get_Tp_allocator())) { this->_M_impl._M_swap_data(__x._M_impl); } _Vector_base(_Vector_base&& __x, const allocator_type& __a) : _M_impl(__a) { if (__x.get_allocator() == __a) this->_M_impl._M_swap_data(__x._M_impl); else { @@ -246,21 +246,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER * @brief Default constructor creates no elements. */ vector() : _Base() { } /** * @brief Creates a %vector with no elements. * @param __a An allocator object. */ explicit - vector(const allocator_type& __a) + vector(const allocator_type& __a) _GLIBCXX_NOEXCEPT : _Base(__a) { } #if __cplusplus >= 201103L /** * @brief Creates a %vector with default constructed elements. * @param __n The number of elements to initially create. * @param __a An allocator. * * This constructor fills the %vector with @a __n default * constructed elements. @@ -404,21 +404,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _M_initialize_dispatch(__first, __last, _Integral()); } #endif /** * The dtor only erases the elements, and note that if the * elements themselves are pointers, the pointed-to memory is * not touched in any way. Managing the pointer is the user's * responsibility. */ - ~vector() _GLIBCXX_NOEXCEPT + ~vector() { std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish, _M_get_Tp_allocator()); } /** * @brief %Vector assignment oper
Re: [v3] More noexcept for vectors
On Sat, 14 Sep 2013, Paolo Carlini wrote: ... what about debug-mode (and profile-mode)? Is in principle possible to detect the noexcepts? If we can't exclude that, we should probably change at the same time the special modes too. Otherwise seems just matter of consistency? I was going one file at a time, and the priority is clearly "release"-mode, since this is about performance. I don't think there is anything wrong with debug-mode having a different exception specification (it is already binary-incompatible with the default mode) but I understand why people may want to keep some consistency. I can do vector in the other modes next if you want. -- Marc Glisse
Re: [v3] More noexcept for vectors
I had to separate the constructor that takes an allocator from the default constructor in debug/profile, since in release the noexcept only applies to one of them (and the testsuite asserts that release and debug agree on this). An alternative would be to make the release vector default constructor conditionally noexcept (depending on the allocator). Or to use explicit vector(const Allocator& = Allocator()); also in normal mode instead of splitting it in two. I also removed a noexcept in profile mode where the release mode doesn't have noexcept. Tested, no regression. 2013-09-15 Marc Glisse PR libstdc++/58338 * include/bits/stl_vector.h (_Vector_impl::_Vector_impl(_Tp_alloc_type const&), _Vector_impl::_Vector_impl(_Tp_alloc_type&&), _Vector_impl::_M_swap_data, _Vector_base::_Vector_base(const allocator_type&), _Vector_base::_Vector_base(allocator_type&&), _Vector_base::_Vector_base(_Vector_base&&), vector::vector(const allocator_type&), vector::operator[], vector::operator[] const, vector::front, vector::front const, vector::back, vector::back const, vector::pop_back, vector::_M_erase_at_end): Mark as noexcept. (vector::~vector): Remove useless noexcept. * include/debug/vector (vector::vector()): Split. (vector::vector(const _Allocator&), vector::operator[], vector::operator[] const, vector::front, vector::front const, vector::back, vector::back const, vector::pop_back, _M_requires_reallocation, _M_update_guaranteed_capacity, _M_invalidate_after_nth): Mark as noexcept. (vector::~vector): Remove useless noexcept. * include/profile/vector (vector::vector()): Split. (vector::vector(const _Allocator&), vector::operator[], vector::operator[] const, vector::front, vector::front const, vector::back, vector::back const): Mark as noexcept. (vector::vector(vector&&, const _Allocator&)): Remove wrong noexcept. (vector::~vector): Remove useless noexcept. -- Marc GlisseIndex: include/bits/stl_vector.h === --- include/bits/stl_vector.h (revision 202588) +++ include/bits/stl_vector.h (working copy) @@ -80,32 +80,32 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER : public _Tp_alloc_type { pointer _M_start; pointer _M_finish; pointer _M_end_of_storage; _Vector_impl() : _Tp_alloc_type(), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } - _Vector_impl(_Tp_alloc_type const& __a) + _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT : _Tp_alloc_type(__a), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } #if __cplusplus >= 201103L - _Vector_impl(_Tp_alloc_type&& __a) + _Vector_impl(_Tp_alloc_type&& __a) noexcept : _Tp_alloc_type(std::move(__a)), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } #endif - void _M_swap_data(_Vector_impl& __x) + void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT { std::swap(_M_start, __x._M_start); std::swap(_M_finish, __x._M_finish); std::swap(_M_end_of_storage, __x._M_end_of_storage); } }; public: typedef _Alloc allocator_type; @@ -117,36 +117,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT { return *static_cast(&this->_M_impl); } allocator_type get_allocator() const _GLIBCXX_NOEXCEPT { return allocator_type(_M_get_Tp_allocator()); } _Vector_base() : _M_impl() { } - _Vector_base(const allocator_type& __a) + _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT : _M_impl(__a) { } _Vector_base(size_t __n) : _M_impl() { _M_create_storage(__n); } _Vector_base(size_t __n, const allocator_type& __a) : _M_impl(__a) { _M_create_storage(__n); } #if __cplusplus >= 201103L - _Vector_base(_Tp_alloc_type&& __a) + _Vector_base(_Tp_alloc_type&& __a) noexcept : _M_impl(std::move(__a)) { } - _Vector_base(_Vector_base&& __x) + _Vector_base(_Vector_base&& __x) noexcept : _M_impl(std::move(__x._M_get_Tp_allocator())) { this->_M_impl._M_swap_data(__x._M_impl); } _Vector_base(_Vector_base&& __x, const allocator_type& __a) : _M_impl(__a) { if (__x.get_allocator() == __a) this->_M_impl._M_swap_data(__x._M_impl); else { @@ -246,21 +246,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER * @brief Default constructor creates no elements. */ vector() :
Re: [v3] More noexcept for vectors
On Mon, 16 Sep 2013, Paolo Carlini wrote: On 09/15/2013 11:12 AM, Marc Glisse wrote: I had to separate the constructor that takes an allocator from the default constructor in debug/profile, since in release the noexcept only applies to one of them (and the testsuite asserts that release and debug agree on this). An alternative would be to make the release vector default constructor conditionally noexcept (depending on the allocator). Or to use explicit vector(const Allocator& = Allocator()); also in normal mode instead of splitting it in two. Thanks a lot. Now I'm wondering if we shouldn't really do the latter: the issue is, if I remember correctly, in C++11, at variance with C++98, allocators aren't necessarily default constructible, thus by explicit instantiatiation the user can easily tell whether that constructor is split or not. What do you think? Shouldn't it just be illegal to explicitly instantiate a full class like std::vector? But ok, I'll post that version as soon as I can test it. -- Marc Glisse
Re: [v3] More noexcept for vectors
On Mon, 16 Sep 2013, Jonathan Wakely wrote: Are you sure the noexcept on the destructors is useless? Ok, I am putting it back in. -- Marc Glisse
Re: [v3] More noexcept for vectors
New version that passed testing. 2013-09-16 Marc Glisse PR libstdc++/58338 * include/bits/stl_vector.h (vector::vector(), vector::vector(const allocator_type&)): Merge. (_Vector_impl::_Vector_impl(_Tp_alloc_type const&), _Vector_impl::_Vector_impl(_Tp_alloc_type&&), _Vector_impl::_M_swap_data, _Vector_base::_Vector_base(const allocator_type&), _Vector_base::_Vector_base(allocator_type&&), _Vector_base::_Vector_base(_Vector_base&&), _Vector_base::~_Vector_base, vector::vector(const allocator_type&), vector::operator[], vector::operator[] const, vector::front, vector::front const, vector::back, vector::back const, vector::pop_back, vector::_M_erase_at_end): Mark as noexcept. * include/debug/vector (vector::vector(const _Allocator&), vector::operator[], vector::operator[] const, vector::front, vector::front const, vector::back, vector::back const, vector::pop_back, _M_requires_reallocation, _M_update_guaranteed_capacity, _M_invalidate_after_nth): Mark as noexcept. * include/profile/vector (vector::vector(const _Allocator&), vector::operator[], vector::operator[] const, vector::front, vector::front const, vector::back, vector::back const): Mark as noexcept. (vector::vector(vector&&, const _Allocator&)): Remove wrong noexcept. * testsuite/23_containers/vector/requirements/dr438/assign_neg.cc: Adjust line number. * testsuite/23_containers/vector/requirements/dr438/ constructor_1_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/ constructor_2_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/insert_neg.cc: Likewise. -- Marc GlisseIndex: include/bits/stl_vector.h === --- include/bits/stl_vector.h (revision 202625) +++ include/bits/stl_vector.h (working copy) @@ -80,32 +80,32 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER : public _Tp_alloc_type { pointer _M_start; pointer _M_finish; pointer _M_end_of_storage; _Vector_impl() : _Tp_alloc_type(), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } - _Vector_impl(_Tp_alloc_type const& __a) + _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT : _Tp_alloc_type(__a), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } #if __cplusplus >= 201103L - _Vector_impl(_Tp_alloc_type&& __a) + _Vector_impl(_Tp_alloc_type&& __a) noexcept : _Tp_alloc_type(std::move(__a)), _M_start(0), _M_finish(0), _M_end_of_storage(0) { } #endif - void _M_swap_data(_Vector_impl& __x) + void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT { std::swap(_M_start, __x._M_start); std::swap(_M_finish, __x._M_finish); std::swap(_M_end_of_storage, __x._M_end_of_storage); } }; public: typedef _Alloc allocator_type; @@ -117,53 +117,53 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT { return *static_cast(&this->_M_impl); } allocator_type get_allocator() const _GLIBCXX_NOEXCEPT { return allocator_type(_M_get_Tp_allocator()); } _Vector_base() : _M_impl() { } - _Vector_base(const allocator_type& __a) + _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT : _M_impl(__a) { } _Vector_base(size_t __n) : _M_impl() { _M_create_storage(__n); } _Vector_base(size_t __n, const allocator_type& __a) : _M_impl(__a) { _M_create_storage(__n); } #if __cplusplus >= 201103L - _Vector_base(_Tp_alloc_type&& __a) + _Vector_base(_Tp_alloc_type&& __a) noexcept : _M_impl(std::move(__a)) { } - _Vector_base(_Vector_base&& __x) + _Vector_base(_Vector_base&& __x) noexcept : _M_impl(std::move(__x._M_get_Tp_allocator())) { this->_M_impl._M_swap_data(__x._M_impl); } _Vector_base(_Vector_base&& __x, const allocator_type& __a) : _M_impl(__a) { if (__x.get_allocator() == __a) this->_M_impl._M_swap_data(__x._M_impl); else { size_t __n = __x._M_impl._M_finish - __x._M_impl._M_start; _M_create_storage(__n); } } #endif - ~_Vector_base() + ~_Vector_base() _GLIBCXX_NOEXCEPT { _M_deallocate(this->_M_impl._M_start, this->_M_impl._M_end_of_storage - this->_M_impl._M_start); } public: _Vector_impl _M_impl; pointer _M_allocat
Re: operator new returns nonzero
Nobody has expressed concern for a week, so it may be worth doing an official review ;-) http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00676.html On Mon, 9 Sep 2013, Marc Glisse wrote: I have now tested bootstrap+testsuite and there was no regression. 2013-09-07 Marc Glisse PR c++/19476 gcc/ * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. * g++.dg/tree-ssa/pr19476-3.C: Likewise. * g++.dg/tree-ssa/pr19476-4.C: Likewise. -- Marc Glisse
[v3] More noexcept for lists
Hello, after vectors, lists. I didn't touch the throw we were discussing earlier today for now. There will be an inconsistency with debug list iterators because they use a general wrapper: - I would need François to tell if that wrapper is ever used with iterators that can throw, - the same wrapper is used for several containers, so unless we change all containers at once it can't stay consistent. Bootstrap+testsuite ok. 2013-09-18 Marc Glisse PR libstdc++/58338 * include/bits/list.tcc (_List_base::_M_clear, list::erase): Mark as noexcept. * include/bits/stl_list.h (_List_iterator) [_List_iterator, _M_const_cast, operator*, operator->, operator++, operator--, operator==, operator!=]: Likewise. (_List_const_iterator) [_List_const_iterator, _M_const_cast, operator*, operator->, operator++, operator--, operator==, operator!=]: Likewise. (operator==(const _List_iterator&, const _List_const_iterator&), operator!=(const _List_iterator&, const _List_const_iterator&)): Likewise. (_List_impl) [_List_impl(const _Node_alloc_type&), _List_impl(_Node_alloc_type&&)]: Likewise. (_List_base) [_M_put_node, _List_base(const _Node_alloc_type&), _List_base(_List_base&&), _M_clear, _M_init]: Likewise. (list) [list(), list(const allocator_type&)]: Merge. (list) [list(const allocator_type&), front, back, pop_front, pop_back, erase, _M_erase]: Mark as noexcept. * include/debug/list (list) [list(const _Allocator&), front, back, pop_front, pop_back, _M_erase, erase]: Likewise. * include/profile/list (list) [list(const _Allocator&), front, back, pop_front, pop_back, erase]: Likewise. * testsuite/23_containers/list/requirements/dr438/assign_neg.cc: Adjust line number. * testsuite/23_containers/list/requirements/dr438/constructor_1_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/constructor_2_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/insert_neg.cc: Likewise. -- Marc GlisseIndex: include/bits/list.tcc === --- include/bits/list.tcc (revision 202655) +++ include/bits/list.tcc (working copy) @@ -56,21 +56,21 @@ #ifndef _LIST_TCC #define _LIST_TCC 1 namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_CONTAINER template void _List_base<_Tp, _Alloc>:: -_M_clear() +_M_clear() _GLIBCXX_NOEXCEPT { typedef _List_node<_Tp> _Node; _Node* __cur = static_cast<_Node*>(_M_impl._M_node._M_next); while (__cur != &_M_impl._M_node) { _Node* __tmp = __cur; __cur = static_cast<_Node*>(__cur->_M_next); #if __cplusplus >= 201103L _M_get_Node_allocator().destroy(__tmp); #else @@ -138,21 +138,21 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER return __it; } return __position._M_const_cast(); } #endif template typename list<_Tp, _Alloc>::iterator list<_Tp, _Alloc>:: #if __cplusplus >= 201103L -erase(const_iterator __position) +erase(const_iterator __position) noexcept #else erase(iterator __position) #endif { iterator __ret = iterator(__position._M_node->_M_next); _M_erase(__position._M_const_cast()); return __ret; } #if __cplusplus >= 201103L Index: include/bits/stl_list.h === --- include/bits/stl_list.h (revision 202655) +++ include/bits/stl_list.h (working copy) @@ -126,76 +126,76 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER { typedef _List_iterator<_Tp>_Self; typedef _List_node<_Tp>_Node; typedef ptrdiff_t difference_type; typedef std::bidirectional_iterator_tagiterator_category; typedef _Tpvalue_type; typedef _Tp* pointer; typedef _Tp& reference; - _List_iterator() + _List_iterator() _GLIBCXX_NOEXCEPT : _M_node() { } explicit - _List_iterator(__detail::_List_node_base* __x) + _List_iterator(__detail::_List_node_base* __x) _GLIBCXX_NOEXCEPT : _M_node(__x) { } _Self - _M_const_cast() const + _M_const_cast() const _GLIBCXX_NOEXCEPT { return *this; } // Must downcast from _List_node_base to _List_node to get to _M_data. reference - operator*() const + operator*() const _GLIBCXX_NOEXCEPT { return static_cast<_Node*>(_M_node)->_M_data; } pointer -
Re: [v3] More noexcept for lists
On Wed, 18 Sep 2013, Paolo Carlini wrote: On 09/17/2013 08:44 PM, Marc Glisse wrote: after vectors, lists. I didn't touch the throw we were discussing earlier today for now. There will be an inconsistency with debug list iterators because they use a general wrapper: - I would need François to tell if that wrapper is ever used with iterators that can throw, - the same wrapper is used for several containers, so unless we change all containers at once it can't stay consistent. Thus the idea is changing first all the containers and eventually go back to __gnu_debug::_Safe_iterator and consistently add the noexcepts there? Yes. Let's not forget that! (or alternately leave out all the iterators related bits for the time being ;) That would mean one mega-patch doing all the iterators at once :-( Patch is otherwise Ok with me, thanks. Ok, I'll commit it now and move to other containers. -- Marc Glisse
[v3] More noexcept -- 3rd
Hello, some more containers... In debug array, we already have throw in noexcept functions, but if I understand correctly it is only because of syntax limitations for constexpr functions and aborts before throwing, although the use of _GLIBCXX_THROW_OR_ABORT is suspicious. In any case, I am not changing this with my patch. I replaced throw with abort in list, as discussed, and thus removed the corresponding testcase. bootstrap+testsuite ok. 2013-09-19 Marc Glisse PR libstdc++/58338 * include/bits/stl_iterator.h (__normal_iterator) [__normal_iterator, _M_const_cast, operator*, operator->, operator++, operator--, operator[], operator+=, operator+, operator-=, operator-, base]: Mark as noexcept. (operator==(const __normal_iterator&, const __normal_iterator&), operator!=(const __normal_iterator&, const __normal_iterator&), operator<(const __normal_iterator&, const __normal_iterator&), operator>(const __normal_iterator&, const __normal_iterator&), operator<=(const __normal_iterator&, const __normal_iterator&), operator>=(const __normal_iterator&, const __normal_iterator&), operator-(const __normal_iterator&, const __normal_iterator&), operator+(difference_type, const __normal_iterator&)): Likewise. * include/bits/stl_list.h (list) [splice, _M_check_equal_allocators]: Likewise. (list::_M_check_equal_allocators): Abort instead of throwing. * include/debug/array (array) [operator[], front, back]: Mark as noexcept. * include/profile/array (array) [operator[], front, back]: Likewise. * include/std/array (array) [operator[], front, back]: Likewise. * include/debug/list (list::splice): Likewise. * include/profile/list (list::splice): Likewise. * testsuite/23_containers/list/operations/5.cc: Remove file. * testsuite/23_containers/list/operations/5.h: Likewise. -- Marc GlisseIndex: include/bits/stl_iterator.h === --- include/bits/stl_iterator.h (revision 202699) +++ include/bits/stl_iterator.h (working copy) @@ -714,216 +714,232 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typedef iterator_traits<_Iterator> __traits_type; public: typedef _Iteratoriterator_type; typedef typename __traits_type::iterator_category iterator_category; typedef typename __traits_type::value_type value_type; typedef typename __traits_type::difference_type difference_type; typedef typename __traits_type::referencereference; typedef typename __traits_type::pointer pointer; - _GLIBCXX_CONSTEXPR __normal_iterator() : _M_current(_Iterator()) { } + _GLIBCXX_CONSTEXPR __normal_iterator() _GLIBCXX_NOEXCEPT + : _M_current(_Iterator()) { } explicit - __normal_iterator(const _Iterator& __i) : _M_current(__i) { } + __normal_iterator(const _Iterator& __i) _GLIBCXX_NOEXCEPT + : _M_current(__i) { } // Allow iterator to const_iterator conversion template __normal_iterator(const __normal_iterator<_Iter, typename __enable_if< (std::__are_same<_Iter, typename _Container::pointer>::__value), - _Container>::__type>& __i) + _Container>::__type>& __i) _GLIBCXX_NOEXCEPT : _M_current(__i.base()) { } #if __cplusplus >= 201103L __normal_iterator - _M_const_cast() const + _M_const_cast() const noexcept { using _PTraits = std::pointer_traits; return __normal_iterator (_PTraits::pointer_to(const_cast (*_M_current))); } #else __normal_iterator _M_const_cast() const { return *this; } #endif // Forward iterator requirements reference - operator*() const + operator*() const _GLIBCXX_NOEXCEPT { return *_M_current; } pointer - operator->() const + operator->() const _GLIBCXX_NOEXCEPT { return _M_current; } __normal_iterator& - operator++() + operator++() _GLIBCXX_NOEXCEPT { ++_M_current; return *this; } __normal_iterator - operator++(int) + operator++(int) _GLIBCXX_NOEXCEPT { return __normal_iterator(_M_current++); } // Bidirectional iterator requirements __normal_iterator& - operator--() + operator--() _GLIBCXX_NOEXCEPT { --_M_current; return *this; } __normal_iterator - operator--(int) + operator--(int) _GLIBCXX_NOEXCEPT { return __normal_iterator(_M_current--); } // Rand
Re: [v3] More noexcept -- 3rd
On Wed, 18 Sep 2013, Paolo Carlini wrote: On 09/18/2013 05:51 PM, Marc Glisse wrote: In debug array, we already have throw in noexcept functions, but if I understand correctly it is only because of syntax limitations for constexpr functions and aborts before throwing, although the use of _GLIBCXX_THROW_OR_ABORT is suspicious. In any case, I am not changing this with my patch. If I remember correctly, somebody invented that mild hack and suggested it to indeed have a check as part of a constexpr function, not a trivial task. Jon participated to that discussion. After a while I resurrected that old discussion, tested the code and it appeared to work well. In practice, are you experiencing specific problems with it? No, no problem. For some reason I thought there would be issues when the macro expands to __builtin_abort(), but there aren't, great. Any other comments on the patch? (Jon's "great" doesn't really sound like an "ok") -- Marc Glisse
[v3] More noexcept -- 4th
Hello, I did not touch the regular basic_string because Paulo usually says not to touch it, but I could do it as well if wanted. I didn't add noexcept to the debug string swap (and move assignments) because the regular basic_string swap can currently throw if the allocators are distinct. bootstrap+testsuite ok. 2013-09-19 Marc Glisse PR libstdc++/58338 * include/bits/stl_tree.h (_Rb_tree_node_base) [_S_minimum, _S_maximum]: Mark as noexcept. (_Rb_tree_iterator) [_Rb_tree_iterator, operator*, operator->, operator++, operator--, operator==, operator!=]: Likewise. (_Rb_tree_const_iterator) [_Rb_tree_const_iterator, _M_const_cast, operator*, operator->, operator++, operator--, operator==, operator!=]: Likewise. (operator==(const _Rb_tree_iterator&, const _Rb_tree_const_iterator&), operator!=(const _Rb_tree_iterator&, const _Rb_tree_const_iterator&)): Likewise. (_Rb_tree) [_M_put_node, _M_destroy_node, _M_root, _M_leftmost, _M_rightmost, _M_begin, _M_end, _S_left, _S_right, _S_minimum, _S_maximum]: Likewise. * include/debug/string (basic_string) [basic_string(const _Allocator&), shrink_to_fit, operator[], pop_back]: Likewise. * include/ext/vstring.h (__versa_string) [_M_limit, _M_disjunct, _M_ibegin, _M_iend, __versa_string(const _Alloc&), operator=(__versa_string&&), shrink_to_fit, operator[], front, back, assign(__versa_string&&), swap]: Likewise. (__versa_string) [__versa_string(), __versa_string(const _Alloc&)]: Merge. -- Marc GlisseIndex: include/bits/stl_tree.h === --- include/bits/stl_tree.h (revision 202722) +++ include/bits/stl_tree.h (working copy) @@ -92,42 +92,42 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { typedef _Rb_tree_node_base* _Base_ptr; typedef const _Rb_tree_node_base* _Const_Base_ptr; _Rb_tree_color _M_color; _Base_ptr _M_parent; _Base_ptr _M_left; _Base_ptr _M_right; static _Base_ptr -_S_minimum(_Base_ptr __x) +_S_minimum(_Base_ptr __x) _GLIBCXX_NOEXCEPT { while (__x->_M_left != 0) __x = __x->_M_left; return __x; } static _Const_Base_ptr -_S_minimum(_Const_Base_ptr __x) +_S_minimum(_Const_Base_ptr __x) _GLIBCXX_NOEXCEPT { while (__x->_M_left != 0) __x = __x->_M_left; return __x; } static _Base_ptr -_S_maximum(_Base_ptr __x) +_S_maximum(_Base_ptr __x) _GLIBCXX_NOEXCEPT { while (__x->_M_right != 0) __x = __x->_M_right; return __x; } static _Const_Base_ptr -_S_maximum(_Const_Base_ptr __x) +_S_maximum(_Const_Base_ptr __x) _GLIBCXX_NOEXCEPT { while (__x->_M_right != 0) __x = __x->_M_right; return __x; } }; template struct _Rb_tree_node : public _Rb_tree_node_base { typedef _Rb_tree_node<_Val>* _Link_type; @@ -160,72 +160,72 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typedef _Tp& reference; typedef _Tp* pointer; typedef bidirectional_iterator_tag iterator_category; typedef ptrdiff_t difference_type; typedef _Rb_tree_iterator<_Tp>_Self; typedef _Rb_tree_node_base::_Base_ptr _Base_ptr; typedef _Rb_tree_node<_Tp>* _Link_type; - _Rb_tree_iterator() + _Rb_tree_iterator() _GLIBCXX_NOEXCEPT : _M_node() { } explicit - _Rb_tree_iterator(_Link_type __x) + _Rb_tree_iterator(_Link_type __x) _GLIBCXX_NOEXCEPT : _M_node(__x) { } reference - operator*() const + operator*() const _GLIBCXX_NOEXCEPT { return static_cast<_Link_type>(_M_node)->_M_value_field; } pointer - operator->() const + operator->() const _GLIBCXX_NOEXCEPT { return std::__addressof(static_cast<_Link_type> (_M_node)->_M_value_field); } _Self& - operator++() + operator++() _GLIBCXX_NOEXCEPT { _M_node = _Rb_tree_increment(_M_node); return *this; } _Self - operator++(int) + operator++(int) _GLIBCXX_NOEXCEPT { _Self __tmp = *this; _M_node = _Rb_tree_increment(_M_node); return __tmp; } _Self& - operator--() + operator--() _GLIBCXX_NOEXCEPT { _M_node = _Rb_tree_decrement(_M_node); return *this; } _Self - operator--(int) + operator--(int) _GLIBCXX_NOEXCEPT { _Self __tmp = *this; _M_node = _Rb_tree_decrement(_M_node); return __tmp; } bool - operator==(const _Self& __x) const + operator==(co
Re: [v3] More noexcept -- 5th
On Fri, 20 Sep 2013, Paolo Carlini wrote: On 09/20/2013 09:46 AM, Marc Glisse wrote: Hello, for basic_string, I tried not to add lies about exceptions, but I didn't remove existing ones. Of course we should not have lies, I thought we didn't, besides maybe special cases having to do with the FULLY_DYNAMIC string thing, really a C++98 legacy wa, which will not exist in the future. Can you please send an updated patch fixing those? Would you mind if we did that as a separate follow-up patch, unless there are other problems with the patch? One is adding noexcept for optimization, the other one would be removing some (no intersection) for correctness. I'll do it this WE. I'll also need to remove the corresponding noexcept from debug/profile mode... -- Marc Glisse
Re: [v3] More noexcept -- 5th
On Fri, 20 Sep 2013, Paolo Carlini wrote: By the way, I would be curious at some point to actually see with my eyes the effect of those optimizations in the assembly: is it easy to produce examples? Even at say -O2? If you use "if(noexcept(container.shrink_to_fit()))", you can easily cause different code to be used. Now whether for regular code it somehow produces fewer implicit try-catch or some optimization like that, I have no idea. If it ever makes code worse, please beat the core- people who required std::terminate with a screen showing the benchmarks (I keep wanting to introduce -fno-abort -fno-terminate flags to turn those 2 calls into __builtin_unreachable). Note that I am still a proponent of noexcept(auto), if it can't be the default. If someone feels like implementing it as an extension, we could use it in the library. -- Marc Glisse
Re: [patch] Make vector::at() assertion message more useful (try #2)
On Mon, 23 Sep 2013, Paolo Carlini wrote: There are a lot of targets using unsigned int for size_t, which would have been uncovered by proper testing. We can't test all patches on 3-4 different targets... It wasn't obvious this patch could be that sensitive to the target. That's true, just remember to test *both* -m32 and -m64, for non trivial changes. So how do you do that in practice ? Is it done by default if multilib is enabled? You also mentioned doing something special to check debug/profile modes recently, is there a make target to really perform all the tests necessary for a submission? http://gcc.gnu.org/contribute.html has an outdated section on testing. It mentions that you should do a bootstrap for a change to the C front-end (should also be for the C++ front-end and I guess libstdc++ even if it isn't used much inside gcc). -- Marc Glisse
[v3] Less noexcept
Hello, this patch was tested on x86_64 with a bootstrap and a simple make -k check, without regression. Note that it doesn't completely fix 56166, it merely admits that we may currently throw and avoids turning that into std::terminate. 2013-09-24 Marc Glisse PR libstdc++/58338 PR libstdc++/56166 * include/bits/basic_string.h (basic_string) [basic_string(basic_string&&)]: Make the noexcept conditional. [operator=(basic_string&&), assign(basic_string&&)]: Link to PR 58265. [begin(), end(), rbegin(), rend(), clear]: Remove noexcept. * include/debug/string (basic_string) [basic_string(const _Allocator&), basic_string(basic_string&&), begin(), end(), rbegin(), rend(), clear, operator[](size_type), pop_back]: Comment out noexcept, until vstring replaces basic_string. -- Marc GlisseIndex: include/bits/basic_string.h === --- include/bits/basic_string.h (revision 202831) +++ include/bits/basic_string.h (working copy) @@ -502,21 +502,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION basic_string(size_type __n, _CharT __c, const _Alloc& __a = _Alloc()); #if __cplusplus >= 201103L /** * @brief Move construct string. * @param __str Source string. * * The newly-created string contains the exact contents of @a __str. * @a __str is a valid, but unspecified string. **/ - basic_string(basic_string&& __str) noexcept + basic_string(basic_string&& __str) +#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0 + noexcept +#endif : _M_dataplus(__str._M_dataplus) { #if _GLIBCXX_FULLY_DYNAMIC_STRING == 0 __str._M_data(_S_empty_rep()._M_refdata()); #else __str._M_data(_S_construct(size_type(), _CharT(), get_allocator())); #endif } /** @@ -574,20 +577,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } #if __cplusplus >= 201103L /** * @brief Move assign the value of @a str to this string. * @param __str Source string. * * The contents of @a str are moved into this string (without copying). * @a str is a valid, but unspecified string. **/ + // PR 58265, this should be noexcept. basic_string& operator=(basic_string&& __str) { // NB: DR 1204. this->swap(__str); return *this; } /** * @brief Set value to string constructed from initializer %list. @@ -600,78 +604,78 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return *this; } #endif // C++11 // Iterators: /** * Returns a read/write iterator that points to the first character in * the %string. Unshares the string. */ iterator - begin() _GLIBCXX_NOEXCEPT + begin() // Can throw. { _M_leak(); return iterator(_M_data()); } /** * Returns a read-only (constant) iterator that points to the first * character in the %string. */ const_iterator begin() const _GLIBCXX_NOEXCEPT { return const_iterator(_M_data()); } /** * Returns a read/write iterator that points one past the last * character in the %string. Unshares the string. */ iterator - end() _GLIBCXX_NOEXCEPT + end() // Can throw. { _M_leak(); return iterator(_M_data() + this->size()); } /** * Returns a read-only (constant) iterator that points one past the * last character in the %string. */ const_iterator end() const _GLIBCXX_NOEXCEPT { return const_iterator(_M_data() + this->size()); } /** * Returns a read/write reverse iterator that points to the last * character in the %string. Iteration is done in reverse element * order. Unshares the string. */ reverse_iterator - rbegin() _GLIBCXX_NOEXCEPT + rbegin() // Can throw. { return reverse_iterator(this->end()); } /** * Returns a read-only (constant) reverse iterator that points * to the last character in the %string. Iteration is done in * reverse element order. */ const_reverse_iterator rbegin() const _GLIBCXX_NOEXCEPT { return const_reverse_iterator(this->end()); } /** * Returns a read/write reverse iterator that points to one before the * first character in the %string. Iteration is done in reverse * element order. Unshares the string. */ reverse_iterator - rend() _GLIBCXX_NOEXCEPT + rend() // Can throw. { return reverse_iterator(this->begin()); } /** * Returns a read-only (constant) rev
Re: [v3] Less noexcept
On Mon, 23 Sep 2013, Paolo Carlini wrote: On 9/23/13 10:55 AM, Marc Glisse wrote: Hello, this patch was tested on x86_64 with a bootstrap and a simple make -k check, without regression. Note that it doesn't completely fix 56166, it merely admits that we may currently throw and avoids turning that into std::terminate. Of course. Patch basically Ok, can you be a little less terse in the comments before the noexcepts which you are removing and in fact must be there for conformance, aren't just QoI? Point to an existing bug, when it exists, like 56166, add a FIXME C++11 at least. Like this? It is funny that with fully dynamic strings, the copy constructor is "better" than the move constructor: faster, doesn't throw, etc. I think we should remove the move constructor in that case, or at least make it act the same as the copy constructor. I didn't mark the copy constructor as noexcept, but without checking the code it seems likely we could. Also, __versa_string actually has the same issues as basic_string, if you choose the reference counted base... I guess I have to do a patch to remove noexcept there now :-( Has someone started work on some branch to get a real C++11 basic_string, or are we waiting until the "lack of manpower" argument convinces everyone to forget about trying to preserve any ABI compatibility? 2013-09-24 Marc Glisse PR libstdc++/58338 PR libstdc++/56166 * include/bits/basic_string.h (basic_string) [basic_string(basic_string&&)]: Make the noexcept conditional. [operator=(basic_string&&), assign(basic_string&&)]: Link to PR 58265. [begin(), end(), rbegin(), rend(), clear]: Remove noexcept. [pop_back]: Comment on the lack of noexcept. * include/debug/string (basic_string) [basic_string(const _Allocator&), basic_string(basic_string&&), begin(), end(), rbegin(), rend(), clear, operator[](size_type), pop_back]: Comment out noexcept, until vstring replaces basic_string. -- Marc GlisseIndex: include/bits/basic_string.h === --- include/bits/basic_string.h (revision 202838) +++ include/bits/basic_string.h (working copy) @@ -502,21 +502,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION basic_string(size_type __n, _CharT __c, const _Alloc& __a = _Alloc()); #if __cplusplus >= 201103L /** * @brief Move construct string. * @param __str Source string. * * The newly-created string contains the exact contents of @a __str. * @a __str is a valid, but unspecified string. **/ - basic_string(basic_string&& __str) noexcept + basic_string(basic_string&& __str) +#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0 + noexcept // FIXME C++11: should always be noexcept. +#endif : _M_dataplus(__str._M_dataplus) { #if _GLIBCXX_FULLY_DYNAMIC_STRING == 0 __str._M_data(_S_empty_rep()._M_refdata()); #else __str._M_data(_S_construct(size_type(), _CharT(), get_allocator())); #endif } /** @@ -574,20 +577,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } #if __cplusplus >= 201103L /** * @brief Move assign the value of @a str to this string. * @param __str Source string. * * The contents of @a str are moved into this string (without copying). * @a str is a valid, but unspecified string. **/ + // PR 58265, this should be noexcept. basic_string& operator=(basic_string&& __str) { // NB: DR 1204. this->swap(__str); return *this; } /** * @brief Set value to string constructed from initializer %list. @@ -600,78 +604,78 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return *this; } #endif // C++11 // Iterators: /** * Returns a read/write iterator that points to the first character in * the %string. Unshares the string. */ iterator - begin() _GLIBCXX_NOEXCEPT + begin() // FIXME C++11: should be noexcept. { _M_leak(); return iterator(_M_data()); } /** * Returns a read-only (constant) iterator that points to the first * character in the %string. */ const_iterator begin() const _GLIBCXX_NOEXCEPT { return const_iterator(_M_data()); } /** * Returns a read/write iterator that points one past the last * character in the %string. Unshares the string. */ iterator - end() _GLIBCXX_NOEXCEPT + end() // FIXME C++11: should be noexcept. { _M_leak(); return iterator(_M_data() + this->size()); } /** * Returns a read-only (constant) iterator that points one
Re: [v3] Less noexcept
On Mon, 23 Sep 2013, Paolo Carlini wrote: Hi again, It is funny that with fully dynamic strings, the copy constructor is "better" than the move constructor: faster, doesn't throw, etc. I think we should remove the move constructor in that case, or at least make it act the same as the copy constructor. I didn't mark the copy constructor as noexcept, but without checking the code it seems likely we could. I took a look at the code. First I got a headache because of the allocator stuff. And once the paracetamol started helping, I noticed that the is_leaked mechanism means it can throw anyway, so I was wrong. We could, but in my opinion fiddling with those isn't worth the trouble, because the whole "fully dynamic string" thing is just a workaround for issues of the current reference counted implementation vs the statically allocated empty string on some targets. Well, I have a second (practical) thought about this part. If you are willing to spend a little more time on this, and can confirm your preliminary analysis about copy-constructor vs move-constructor, first blush it definitely makes sense to me, I'm certainly not against your proposal of having the move-constructor identical to the copy-constructor in that case. In 4.9.x some targets, not Linux, would benefit from it. Sorry. -- Marc Glisse
[v3] plus
Hello, this was only minimally tested since trunk is broken at the moment. There don't seem to be specific tests for the existing functors, so a couple tests for the new specializations seem good enough. 2013-09-25 Marc Glisse * include/bits/stl_function.h: Include for std::forward. (bit_not): New class. (plus, minus ,multiplies, divides, modulus, negate, equal_to, not_equal_to, greater, less, greater_equal, less_equal, logical_and, logical_or, logical_not, bit_and, bit_or, bit_xor, bit_not): New specializations. * testsuite/20_util/headers/functional/generic_function_objects.cc: New file. -- Marc GlisseIndex: include/bits/stl_function.h === --- include/bits/stl_function.h (revision 202872) +++ include/bits/stl_function.h (working copy) @@ -49,20 +49,22 @@ */ /** @file bits/stl_function.h * This is an internal header file, included by other library headers. * Do not attempt to use it directly. @headername{functional} */ #ifndef _STL_FUNCTION_H #define _STL_FUNCTION_H 1 +#include // for std::forward + namespace std _GLIBCXX_VISIBILITY(default) { _GLIBCXX_BEGIN_NAMESPACE_VERSION // 20.3.1 base classes /** @defgroup functors Function Objects * @ingroup utilities * * Function objects, or @e functors, are objects with an @c operator() * defined and accessible. They can be passed as arguments to algorithm @@ -129,201 +131,514 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * @ingroup functors * * Because basic math often needs to be done during an algorithm, * the library provides functors for those operations. See the * documentation for @link functors the base classes@endlink * for examples of their use. * * @{ */ /// One of the @link arithmetic_functors math functors@endlink. - template + template struct plus : public binary_function<_Tp, _Tp, _Tp> { _Tp operator()(const _Tp& __x, const _Tp& __y) const { return __x + __y; } }; +#if __cplusplus >= 201103L + template<> +struct plus +{ + template + constexpr auto + operator()(_Tp&& __x, _Up&& __y) const + noexcept(noexcept(std::forward<_Tp>(__x) + std::forward<_Up>(__y))) + -> decltype(std::forward<_Tp>(__x) + std::forward<_Up>(__y)) + { + return std::forward<_Tp>(__x) + std::forward<_Up>(__y); + } + typedef void is_transparent; + }; +#endif + /// One of the @link arithmetic_functors math functors@endlink. - template + template struct minus : public binary_function<_Tp, _Tp, _Tp> { _Tp operator()(const _Tp& __x, const _Tp& __y) const { return __x - __y; } }; +#if __cplusplus >= 201103L + template<> +struct minus +{ + template + constexpr auto + operator()(_Tp&& __x, _Up&& __y) const + noexcept(noexcept(std::forward<_Tp>(__x) - std::forward<_Up>(__y))) + -> decltype(std::forward<_Tp>(__x) - std::forward<_Up>(__y)) + { + return std::forward<_Tp>(__x) - std::forward<_Up>(__y); + } + typedef void is_transparent; + }; +#endif + /// One of the @link arithmetic_functors math functors@endlink. - template + template struct multiplies : public binary_function<_Tp, _Tp, _Tp> { _Tp operator()(const _Tp& __x, const _Tp& __y) const { return __x * __y; } }; +#if __cplusplus >= 201103L + template<> +struct multiplies +{ + template + constexpr auto + operator()(_Tp&& __x, _Up&& __y) const + noexcept(noexcept(std::forward<_Tp>(__x) * std::forward<_Up>(__y))) + -> decltype(std::forward<_Tp>(__x) * std::forward<_Up>(__y)) + { + return std::forward<_Tp>(__x) * std::forward<_Up>(__y); + } + typedef void is_transparent; + }; +#endif + /// One of the @link arithmetic_functors math functors@endlink. - template + template struct divides : public binary_function<_Tp, _Tp, _Tp> { _Tp operator()(const _Tp& __x, const _Tp& __y) const { return __x / __y; } }; +#if __cplusplus >= 201103L + template<> +struct divides +{ + template + constexpr auto + operator()(_Tp&& __x, _Up&& __y) const + noexcept(noexcept(std::forward<_Tp>(__x) / std::forward<_Up>(__y))) + -> decltype(std::forward<_Tp>(__x) / std::forward<_Up>(__y)) + { + return std::forward<_Tp>(__x) / std::forward<_Up>(__y); + } + typedef void is_transparent; + }; +#endif + /// One of the @link arithme
Re: [v3] plus
On Wed, 25 Sep 2013, Jonathan Wakely wrote: I've had this sitting in my tree waiting to do something with, I did ask last week if someone had done it already... I'm about to go to sleep so didn't check if the test covers anything yours doesn't. In the test you have basic cover for all functors, and I cover only 2 cases (more specifically though, since I look at the return type cv-qual). In my patch, I added constexpr and noexcept, I couldn't see any reason not to for such basic utilities. Yes, I did read the wiki and noticed the vote yesterday about constexpr, but imho that's wrong. It looks like your patch adds the default template argument even in C++98 mode, I avoided that by putting forward declarations at the top of the file, in a #if block. This only lets me write: std::plus<> *p; in C++98 mode (std::plus<> p; gives an error), doesn't seem that bad. And I also add the void specializations in C++11 mode, as an extension. Well, let's forget my patch and go with yours, though at least adding noexcept seems like a good idea. (too bad we don't have noexcept(auto) here) (too bad we can't use decltype(auto) for the return type, as that would disable sfinae, it is a bad sign when the standard turns its nose up at its own dog food...) -- Marc Glisse
Re: operator new returns nonzero
It isn't a front-end patch, but it is still a C++ patch, maybe Jason will have comments? Anyone else? On Mon, 16 Sep 2013, Marc Glisse wrote: Nobody has expressed concern for a week, so it may be worth doing an official review ;-) http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00676.html On Mon, 9 Sep 2013, Marc Glisse wrote: I have now tested bootstrap+testsuite and there was no regression. 2013-09-07 Marc Glisse PR c++/19476 gcc/ * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. * g++.dg/tree-ssa/pr19476-3.C: Likewise. * g++.dg/tree-ssa/pr19476-4.C: Likewise. -- Marc Glisse
Re: operator new returns nonzero
On Wed, 2 Oct 2013, Jakub Jelinek wrote: On Mon, Sep 09, 2013 at 10:49:40PM +0200, Marc Glisse wrote: --- fold-const.c(revision 202413) +++ fold-const.c(working copy) @@ -16171,21 +16171,31 @@ tree_expr_nonzero_warnv_p (tree t, bool case MODIFY_EXPR: case BIND_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1), strict_overflow_p); case SAVE_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0), strict_overflow_p); case CALL_EXPR: - return alloca_call_p (t); + { + tree fn = CALL_EXPR_FN (t); + if (TREE_CODE (fn) != ADDR_EXPR) return false; + tree fndecl = TREE_OPERAND (fn, 0); + if (TREE_CODE (fndecl) != FUNCTION_DECL) return false; + if (flag_delete_null_pointer_checks && !flag_check_new + && DECL_IS_OPERATOR_NEW (fndecl) + && !TREE_NOTHROW (fndecl)) + return true; + return alloca_call_p (t); Not commenting on what this patch does, but how: why don't you use tree fndecl = get_callee_fndecl (t); if (fndecl && ...) return true; instead? Because I copied the code from alloca_call_p ;-) get_callee_fndecl does look better indeed. Perhaps alloca_call_p should use it too. Thanks, I'll prepare a new patch. -- Marc Glisse
Make the 2 versions of delete more similar
Hello, I don't understand why those 2 files differ by more than 1 extra argument, so I am changing that. Bootstrap and testsuite on x86_64. 2013-10-03 Marc Glisse * libsupc++/del_op.cc (operator delete): Don't test for 0 before free. * libsupc++/del_opnt.cc (free): Only declare if freestanding. (operator delete): Qualify free with std::. -- Marc GlisseIndex: libsupc++/del_op.cc === --- libsupc++/del_op.cc (revision 203101) +++ libsupc++/del_op.cc (working copy) @@ -36,13 +36,12 @@ _GLIBCXX_END_NAMESPACE_VERSION } // namespace #else # include #endif #include "new" _GLIBCXX_WEAK_DEFINITION void operator delete(void* ptr) _GLIBCXX_USE_NOEXCEPT { - if (ptr) -std::free(ptr); + std::free(ptr); } Index: libsupc++/del_opnt.cc === --- libsupc++/del_opnt.cc (revision 203101) +++ libsupc++/del_opnt.cc (working copy) @@ -17,19 +17,31 @@ // Under Section 7 of GPL version 3, you are granted additional // permissions described in the GCC Runtime Library Exception, version // 3.1, as published by the Free Software Foundation. // You should have received a copy of the GNU General Public License and // a copy of the GCC Runtime Library Exception along with this program; // see the files COPYING3 and COPYING.RUNTIME respectively. If not, see // <http://www.gnu.org/licenses/>. #include -#include "new" -extern "C" void free (void *); +#if !_GLIBCXX_HOSTED +// A freestanding C runtime may not provide "free" -- but there is no +// other reasonable way to implement "operator delete". +namespace std +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION + extern "C" void free(void*); +_GLIBCXX_END_NAMESPACE_VERSION +} // namespace +#else +# include +#endif + +#include "new" _GLIBCXX_WEAK_DEFINITION void operator delete (void *ptr, const std::nothrow_t&) _GLIBCXX_USE_NOEXCEPT { - free (ptr); + std::free(ptr); }
Re: operator new returns nonzero
New version after Jakub's comment, bootstrap and testsuite on x86_64. 2013-10-03 Marc Glisse PR c++/19476 gcc/ * calls.c (alloca_call_p): Use get_callee_fndecl. * fold-const.c (tree_expr_nonzero_warnv_p): Handle operator new. * tree-vrp.c (gimple_stmt_nonzero_warnv_p, stmt_interesting_for_vrp): Likewise. (vrp_visit_stmt): Remove duplicated code. gcc/testsuite/ * g++.dg/tree-ssa/pr19476-1.C: New file. * g++.dg/tree-ssa/pr19476-2.C: Likewise. * g++.dg/tree-ssa/pr19476-3.C: Likewise. * g++.dg/tree-ssa/pr19476-4.C: Likewise. -- Marc GlisseIndex: calls.c === --- calls.c (revision 203101) +++ calls.c (working copy) @@ -628,25 +628,24 @@ gimple_alloca_call_p (const_gimple stmt) return true; return false; } /* Return true when exp contains alloca call. */ bool alloca_call_p (const_tree exp) { + tree fndecl; if (TREE_CODE (exp) == CALL_EXPR - && TREE_CODE (CALL_EXPR_FN (exp)) == ADDR_EXPR - && (TREE_CODE (TREE_OPERAND (CALL_EXPR_FN (exp), 0)) == FUNCTION_DECL) - && (special_function_p (TREE_OPERAND (CALL_EXPR_FN (exp), 0), 0) - & ECF_MAY_BE_ALLOCA)) + && (fndecl = get_callee_fndecl (exp)) + && (special_function_p (fndecl, 0) & ECF_MAY_BE_ALLOCA)) return true; return false; } /* Return TRUE if FNDECL is either a TM builtin or a TM cloned function. Return FALSE otherwise. */ static bool is_tm_builtin (const_tree fndecl) { Index: fold-const.c === --- fold-const.c(revision 203101) +++ fold-const.c(working copy) @@ -16215,21 +16215,29 @@ tree_expr_nonzero_warnv_p (tree t, bool case MODIFY_EXPR: case BIND_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 1), strict_overflow_p); case SAVE_EXPR: return tree_expr_nonzero_warnv_p (TREE_OPERAND (t, 0), strict_overflow_p); case CALL_EXPR: - return alloca_call_p (t); + { + tree fndecl = get_callee_fndecl (t); + if (!fndecl) return false; + if (flag_delete_null_pointer_checks && !flag_check_new + && DECL_IS_OPERATOR_NEW (fndecl) + && !TREE_NOTHROW (fndecl)) + return true; + return alloca_call_p (t); + } default: break; } return false; } /* Return true when T is an address and is known to be nonzero. Handle warnings about undefined signed overflow. */ Index: testsuite/g++.dg/tree-ssa/pr19476-1.C === --- testsuite/g++.dg/tree-ssa/pr19476-1.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-1.C (working copy) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ccp1" } */ + +#include + +int f(){ + return 33 + (0 == new(std::nothrow) int); +} +int g(){ + return 42 + (0 == new int[50]); +} + +/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */ +/* { dg-final { cleanup-tree-dump "ccp1" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-1.C ___ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Index: testsuite/g++.dg/tree-ssa/pr19476-2.C === --- testsuite/g++.dg/tree-ssa/pr19476-2.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-2.C (working copy) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +#include + +int f(){ + int *p = new(std::nothrow) int; + return 33 + (0 == p); +} +int g(){ + int *p = new int[50]; + return 42 + (0 == p); +} + +/* { dg-final { scan-tree-dump "return 42" "optimized" } } */ +/* { dg-final { scan-tree-dump-not "return 33" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/g++.dg/tree-ssa/pr19476-2.C ___ Added: svn:keywords ## -0,0 +1 ## +Author Date Id Revision URL \ No newline at end of property Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Index: testsuite/g++.dg/tree-ssa/pr19476-3.C === --- testsuite/g++.dg/tree-ssa/pr19476-3.C (revision 0) +++ testsuite/g++.dg/tree-ssa/pr19476-3.C (working copy) @@ -0,0 +1,11
Re: operator new returns nonzero
On Wed, 2 Oct 2013, Jason Merrill wrote: On 10/02/2013 08:33 AM, Marc Glisse wrote: + if (flag_delete_null_pointer_checks && !flag_check_new You can't use flag_check_new in language-independent code without moving it from c.opt to common.opt. Thanks, that makes sense and I'll do that, but I am surprised that the fortran and java compilers built without complaining. -- Marc Glisse
Re: Make the 2 versions of delete more similar
On Wed, 2 Oct 2013, Christopher Jefferson wrote: On 2 October 2013 13:28, Marc Glisse wrote: Hello, I don't understand why those 2 files differ by more than 1 extra argument, so I am changing that. Bootstrap and testsuite on x86_64. 2013-10-03 Marc Glisse * libsupc++/del_op.cc (operator delete): Don't test for 0 before free. Just checking, for the nervous: Is the plan that this change will not effect any code behaviour (as correct implementations of free are happy to take a NULL pointer, and not do anything)? Yes. As far as I can tell from the logs, there was a massive change from if(p)free(p) to just free(p) that missed this file, while the patch on how to declare free missed the other. -- Marc Glisse
Re: vec_cond_expr adjustments
On Fri, 28 Sep 2012, Richard Guenther wrote: On Fri, Sep 28, 2012 at 12:42 AM, Marc Glisse wrote: Hello, I have been experimenting with generating VEC_COND_EXPR from the front-end, and these are just a couple things I noticed. 1) optabs.c requires that the first argument of vec_cond_expr be a comparison, but verify_gimple_assign_ternary only checks is_gimple_condexpr, like for COND_EXPR. In the long term, it seems better to also allow ssa_name and vector_cst (thus match the is_gimple_condexpr condition), but for now I just want to know early if I created an invalid vec_cond_expr. optabs should be fixed instead, an is_gimple_val condition is implicitely val != 0. For vectors, I think it should be val < 0 (with an appropriate cast of val to a signed integer vector type if necessary). Or (val & highbit) != 0, but that's longer. The tree.[ch] and gimple-fold.c hunks are ok if tested properly, the tree-ssa-forwprop.c idea of using TREE_TYPE (cond), too. Ok, I will retest that way. I don't like the tree-cfg.c change, instead re-factor optabs.c to get a decomposed cond for vector_compare_rtx and appropriately "decompose" a non-comparison-class cond in expand_vec_cond_expr. So vector_compare_rtx will take as arguments rcode, t_op0, t_op1 instead of cond. And in expand_vec_cond_expr, if I have a condition, I pass its elements to vector_compare_rtx, and otherwise I use 0 and the code for LT_EXPR as the other arguments. If we for example have predicate = a < b; x = predicate ? d : e; y = predicate ? f : g; we ideally want to re-use the predicate computation on targets where that would be optimal (and combine should be able to recover the case where it is not). That I don't understand. The vcond instruction implemented by targets takes as arguments d, e, cmp, a, b and emits the comparison itself. I don't see how I can avoid sending to the targets both (d,e,<,a,b) and (f,g,<,a,b). They will notice eventually that aremove one of the two, but I don't see how to do that in optabs.c. Or I can compute x = a < b, use x < 0 as the comparison passed to the targets, and expect targets (those for which it is true) to recognize that < 0 is useless in a vector condition (PR54700), or is useless on a comparison result. Thanks for the comments, -- Marc Glisse
Constant-fold vector comparisons
Hello, this patch does 2 things (I should have split it in 2, but the questions go together): 1) it handles constant folding of vector comparisons, 2) it fixes another place where vectors are not expected (I'll probably wait to have front-end support and testcases to do more of those, but there is something to discuss). I wasn't sure what integer_truep should test exactly. For integer: == 1 or != 0? For vectors: == -1 or < 0? I chose the one that worked best for the forwprop case where I used it. It seems that before this patch, the middle-end didn't know how comparison results were encoded (a good reason for VEC_COND_EXPR to require a comparison as its first argument). I am using the OpenCL encoding that what matters is the high bit of each vector element. I am not quite sure what happens for targets (are there any?) that use a different encoding. When expanding vcond, they can do the comparison as they like. When expanding an isolated comparison, I expect they have to expand it as vcond(asomething. 2012-10-01 Marc Glisse gcc/ * tree.c (integer_truep): New function. * tree.h (integer_truep): Declare. * tree-ssa-forwprop.c (forward_propagate_into_cond): Call it. Don't use boolean_type_node for vectors. * fold-const.c (fold_relational_const): Handle VECTOR_CST. gcc/testsuite/ * gcc.dg/tree-ssa/foldconst-6.c: New testcase. -- Marc GlisseIndex: gcc/tree.h === --- gcc/tree.h (revision 191850) +++ gcc/tree.h (working copy) @@ -5272,20 +5272,25 @@ extern int integer_zerop (const_tree); /* integer_onep (tree x) is nonzero if X is an integer constant of value 1. */ extern int integer_onep (const_tree); /* integer_all_onesp (tree x) is nonzero if X is an integer constant all of whose significant bits are 1. */ extern int integer_all_onesp (const_tree); +/* integer_truep (tree x) is nonzero if X is an integer constant of value 1, + or a vector constant of value < 0. */ + +extern bool integer_truep (const_tree); + /* integer_pow2p (tree x) is nonzero is X is an integer constant with exactly one bit 1. */ extern int integer_pow2p (const_tree); /* integer_nonzerop (tree x) is nonzero if X is an integer constant with a nonzero value. */ extern int integer_nonzerop (const_tree); Index: gcc/tree-ssa-forwprop.c === --- gcc/tree-ssa-forwprop.c (revision 191850) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -564,46 +564,46 @@ forward_propagate_into_cond (gimple_stmt enum tree_code code; tree name = cond; gimple def_stmt = get_prop_source_stmt (name, true, NULL); if (!def_stmt || !can_propagate_from (def_stmt)) return 0; code = gimple_assign_rhs_code (def_stmt); if (TREE_CODE_CLASS (code) == tcc_comparison) tmp = fold_build2_loc (gimple_location (def_stmt), code, - boolean_type_node, + TREE_TYPE (cond), gimple_assign_rhs1 (def_stmt), gimple_assign_rhs2 (def_stmt)); else if ((code == BIT_NOT_EXPR && TYPE_PRECISION (TREE_TYPE (cond)) == 1) || (code == BIT_XOR_EXPR - && integer_onep (gimple_assign_rhs2 (def_stmt + && integer_truep (gimple_assign_rhs2 (def_stmt { tmp = gimple_assign_rhs1 (def_stmt); swap = true; } } if (tmp && is_gimple_condexpr (tmp)) { if (dump_file && tmp) { fprintf (dump_file, " Replaced '"); print_generic_expr (dump_file, cond, 0); fprintf (dump_file, "' with '"); print_generic_expr (dump_file, tmp, 0); fprintf (dump_file, "'\n"); } - if (integer_onep (tmp)) + if (integer_truep (tmp)) gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs2 (stmt)); else if (integer_zerop (tmp)) gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs3 (stmt)); else { gimple_assign_set_rhs1 (stmt, unshare_expr (tmp)); if (swap) { tree t = gimple_assign_rhs2 (stmt); gimple_assign_set_rhs2 (stmt, gimple_assign_rhs3 (stmt)); Index: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c === --- gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0) @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ccp1" } */ + +typedef long vec __attribute__ ((vector_size (2 * sizeof(long; + +
Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector
On Sat, 29 Sep 2012, Eric Botcazou wrote: this patch lets the compiler try to rewrite: (vec_concat (vec_select x [a]) (vec_select x [b])) as: vec_select x [a b] or even just "x" if appropriate. [...] OK, but: 1) Always add a comment describing the simplification when you add one, 2) The condition: + if (GET_MODE (XEXP (trueop0, 0)) == mode + && INTVAL (XVECEXP (XEXP (trueop1, 1), 0, 0)) + - INTVAL (XVECEXP (XEXP (trueop0, 1), 0, 0)) == 1) + return XEXP (trueop0, 0); can be simplified: if GET_MODE (XEXP (trueop0, 0)) == mode, then XEXP (trueop0, 0) is a 2-element vector so the only possible case is (0,1). That would probably even be more correct since you don't test CONST_INT_P for the indices, while the test is done in the VEC_SELECT case. It looks like I was trying to be clever by replacing 2 understandable tests with a single more obscure one, bad idea. Why not generalizing to all kinds of VEC_SELECTs instead of just scalar ones? Ok, I changed the patch a bit to handle arbitrary VEC_SELECTs, and moved the identity recognition to VEC_SELECT handling (where it belonged). Testing with non-scalar VEC_SELECTs was limited though, because they are not that easy to generate. Also, the identity case is the only one where it actually optimized. To handle more cases, I'd have to look through several layers of VEC_SELECTs, which gets a bit complicated (for instance, the permutation 0,1,3,2 will appear as a vec_concat of a vec_select(v,[0,1]) and a vec_select(vec_select(v,[2,3]),[1,0]), or worse with a vec_concat in the middle). It also didn't optimize 3,2,3,2, possibly because that meant substituting the same rtx twice (I didn't go that far in gdb). Then there is also the vec_duplicate case (I should try to replace vec_duplicate with vec_concat in simplify-rtx to see what happens...). Still, the identity case is nice to have. Thanks for your comments. bootstrap+testsuite on x86_64-linux-gnu with default languages. 2012-09-09 Marc Glisse gcc/ * simplify-rtx.c (simplify_binary_operation_1) : Detect the identity. : Handle VEC_SELECTs from the same vector. gcc/testsuite/ * gcc.target/i386/vect-rebuild.c: New testcase. -- Marc GlisseIndex: gcc/testsuite/gcc.target/i386/vect-rebuild.c === --- gcc/testsuite/gcc.target/i386/vect-rebuild.c(revision 0) +++ gcc/testsuite/gcc.target/i386/vect-rebuild.c(revision 0) @@ -0,0 +1,33 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mavx -fno-tree-forwprop" } */ + +typedef double v2df __attribute__ ((__vector_size__ (16))); +typedef double v4df __attribute__ ((__vector_size__ (32))); + +v2df f1 (v2df x) +{ + v2df xx = { x[0], x[1] }; + return xx; +} + +v4df f2 (v4df x) +{ + v4df xx = { x[0], x[1], x[2], x[3] }; + return xx; +} + +v2df g (v2df x) +{ + v2df xx = { x[1], x[0] }; + return xx; +} + +v2df h (v4df x) +{ + v2df xx = { x[2], x[3] }; + return xx; +} + +/* { dg-final { scan-assembler-not "unpck" } } */ +/* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */ +/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */ Property changes on: gcc/testsuite/gcc.target/i386/vect-rebuild.c ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: gcc/simplify-rtx.c === --- gcc/simplify-rtx.c (revision 191865) +++ gcc/simplify-rtx.c (working copy) @@ -3239,20 +3239,37 @@ simplify_binary_operation_1 (enum rtx_co rtx x = XVECEXP (trueop1, 0, i); gcc_assert (CONST_INT_P (x)); RTVEC_ELT (v, i) = CONST_VECTOR_ELT (trueop0, INTVAL (x)); } return gen_rtx_CONST_VECTOR (mode, v); } + /* Recognize the identity. */ + if (GET_MODE (trueop0) == mode) + { + bool maybe_ident = true; + for (int i = 0; i < XVECLEN (trueop1, 0); i++) + { + rtx j = XVECEXP (trueop1, 0, i); + if (!CONST_INT_P (j) || INTVAL (j) != i) + { + maybe_ident = false; + break; + } + } + if (maybe_ident) + return trueop0; + } + /* If we build {a,b} then permute it, build the result directly. */ if (XVECLEN (trueop1, 0) == 2 && CONST_INT_P (XVECEXP (trueop1, 0, 0)) && CONST_INT_P (XVECEXP (trueop1, 0, 1)) && GET_CODE (trueop0) == VEC_CONCAT && GET_CODE (XEXP (trueop0,
Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector
On Mon, 1 Oct 2012, Eric Botcazou wrote: 2012-09-09 Marc Glisse gcc/ * simplify-rtx.c (simplify_binary_operation_1) : Detect the identity. : Handle VEC_SELECTs from the same vector. gcc/testsuite/ * gcc.target/i386/vect-rebuild.c: New testcase. OK if you adjust the above date and add the missing space at the end of: /* Try to merge 2 VEC_SELECTs from the same vector into a single one. */ I was trying to avoid splitting in 2 lines, but ok I'll split. Thank you for the quick reply, -- Marc Glisse
Re: vec_cond_expr adjustments
[merging both threads, thanks for the answers] On Mon, 1 Oct 2012, Richard Guenther wrote: optabs should be fixed instead, an is_gimple_val condition is implicitely val != 0. For vectors, I think it should be val < 0 (with an appropriate cast of val to a signed integer vector type if necessary). Or (val & highbit) != 0, but that's longer. I don't think so. Throughout the compiler we generally assume false == 0 and anything else is true. (yes, for FP there is STORE_FLAG_VALUE, but it's scope is quite limited - if we want sth similar for vectors we'd have to invent it). See below. If we for example have predicate = a < b; x = predicate ? d : e; y = predicate ? f : g; we ideally want to re-use the predicate computation on targets where that would be optimal (and combine should be able to recover the case where it is not). That I don't understand. The vcond instruction implemented by targets takes as arguments d, e, cmp, a, b and emits the comparison itself. I don't see how I can avoid sending to the targets both (d,e,<,a,b) and (f,g,<,a,b). They will notice eventually that a But that's a limitation of how vcond works. ISTR there is/was a vselect instruction as well, taking a "mask" and two vectors to select from. At least that's how vcond works internally for some sub-targets. vselect seems to only appear in config/. Would it be defined as: vselect(m,a,b)=(a&m)|(b&~m) ? I would almost be tempted to just define a pattern in .md files and let combine handle it, although it might be one instruction too long for that (and if m is x=y). Or would it match the OpenCL select: "For each component of a vector type, result[i] = if MSB of c[i] is set ? b[i] : a[i]."? Or the pattern with & and | but with a precondition that the value of each element of the mask must be 0 or ±1? I don't find vcond that bad, as long as targets check for trivial comparisons in the expansion (what trivial means may depend on the platform). It is quite flexible for targets. On Mon, 1 Oct 2012, Richard Guenther wrote: tmp = fold_build2_loc (gimple_location (def_stmt), code, - boolean_type_node, + TREE_TYPE (cond), That's obvious. Ok, I'll test and commit that line separately. + if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST) +{ + int count = VECTOR_CST_NELTS (op0); + tree *elts = XALLOCAVEC (tree, count); + gcc_assert (TREE_CODE (type) == VECTOR_TYPE); + + for (int i = 0; i < count; i++) + { + tree elem_type = TREE_TYPE (type); + tree elem0 = VECTOR_CST_ELT (op0, i); + tree elem1 = VECTOR_CST_ELT (op1, i); + + elts[i] = fold_relational_const (code, elem_type, + elem0, elem1); + + if(elts[i] == NULL_TREE) + return NULL_TREE; + + elts[i] = fold_negate_const (elts[i], elem_type); I think you need to invent something new similar to STORE_FLAG_VALUE or use STORE_FLAG_VALUE here. With the above you try to map {0, 1} to {0, -1} which is only true if the operation on the element types returns {0, 1} (thus, STORE_FLAG_VALUE is 1). Er, seems to me that constant folding of a scalar comparison in the front/middle-end only returns {0, 1}. +/* Return true if EXPR is an integer constant representing true. */ + +bool +integer_truep (const_tree expr) +{ + STRIP_NOPS (expr); + + switch (TREE_CODE (expr)) +{ +case INTEGER_CST: + /* Do not just test != 0, some places expect the value 1. */ + return (TREE_INT_CST_LOW (expr) == 1 + && TREE_INT_CST_HIGH (expr) == 0); I wonder if using STORE_FLAG_VALUE is better here (note that it usually differs for FP vs. integral comparisons and the mode passed to STORE_FLAG_VALUE is that of the comparison result). I notice there is already a VECTOR_STORE_FLAG_VALUE (used only once in simplify-rtx, in a way that seems a bit strange but I'll try to understand that later). Thanks for showing me this macro, it seems important indeed. However the STORE_FLAG_VALUE mechanism seems to be for the RTL level. It looks like it would be possible to have 3 different semantics: source code is OpenCL, middle-end whatever we want (0 / 1 for instance), and back-end is whatever the target wants. The front-end would generate for a That said, until we are sure what semantics we want here (forwprop for example doesn't look at 'comparisons' but operations on special values and types) I'd prefer to not introduce integer_truep (). I completely agree that defining the semantics comes first :-) -- Marc Glisse
abs(long long)
Hello, here is the patch from PR54686. Several notes: * I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but still) is meant to return a double... * I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should use defined(__SIZEOF_INT128__), which would help other compilers. * newlib has llabs, according to the doc. It would be good to know what newlib is missing for libstdc++ to detect it as C99-ready. I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review (no point if the patch needs changing again). 2012-10-02 Marc Glisse PR libstdc++/54686 * include/c_std/cstdlib (abs(long long)): Define fallback whenever we have long long but possibly not llabs. (abs(long long)): Use llabs when available. (abs(__int128)): Define when we have __int128. (div(long long, long long)): Use lldiv. * testsuite/26_numerics/headers/cstdlib/54686.c: New file. -- Marc GlisseIndex: include/c_std/cstdlib === --- include/c_std/cstdlib (revision 191941) +++ include/c_std/cstdlib (working copy) @@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using ::strtoul; using ::system; #ifdef _GLIBCXX_USE_WCHAR_T using ::wcstombs; using ::wctomb; #endif // _GLIBCXX_USE_WCHAR_T inline long abs(long __i) { return labs(__i); } +#if defined (_GLIBCXX_USE_LONG_LONG) \ +&& (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC) + // Fallback version if we don't have llabs but still allow long long. + inline long long + abs(long long __x) { return __x >= 0 ? __x : -__x; } +#endif + +#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128) + inline __int128 + abs(__int128 __x) { return __x >= 0 ? __x : -__x; } +#endif + inline ldiv_t div(long __i, long __j) { return ldiv(__i, __j); } _GLIBCXX_END_NAMESPACE_VERSION } // namespace #if _GLIBCXX_USE_C99 #undef _Exit #undef llabs @@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::lldiv_t; #endif #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN; #endif #if !_GLIBCXX_USE_C99_DYNAMIC using ::_Exit; #endif +#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC inline long long - abs(long long __x) { return __x >= 0 ? __x : -__x; } + abs(long long __x) { return ::llabs (__x); } -#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::llabs; inline lldiv_t div(long long __n, long long __d) - { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; } + { return ::lldiv (__n, __d); } using ::lldiv; #endif #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC extern "C" long long int (atoll)(const char *) throw (); extern "C" long long int (strtoll)(const char * __restrict, char ** __restrict, int) throw (); extern "C" unsigned long long int (strtoull)(const char * __restrict, char ** __restrict, int) throw (); @@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_END_NAMESPACE_VERSION } // namespace __gnu_cxx namespace std { #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::__gnu_cxx::lldiv_t; #endif using ::__gnu_cxx::_Exit; - using ::__gnu_cxx::abs; #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC + using ::__gnu_cxx::abs; using ::__gnu_cxx::llabs; using ::__gnu_cxx::div; using ::__gnu_cxx::lldiv; #endif using ::__gnu_cxx::atoll; using ::__gnu_cxx::strtof; using ::__gnu_cxx::strtoll; using ::__gnu_cxx::strtoull; using ::__gnu_cxx::strtold; } // namespace std Index: testsuite/26_numerics/headers/cstdlib/54686.c === --- testsuite/26_numerics/headers/cstdlib/54686.c (revision 0) +++ testsuite/26_numerics/headers/cstdlib/54686.c (revision 0) @@ -0,0 +1,32 @@ +// { dg-do compile } +// { dg-options "-std=c++11" } + +// Copyright (C) 2012 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.or
abs(long long)
(Forgot libstdc++...) Hello, here is the patch from PR54686. Several notes: * I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but still) is meant to return a double... * I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should use defined(__SIZEOF_INT128__), which would help other compilers. * newlib has llabs, according to the doc. It would be good to know what newlib is missing for libstdc++ to detect it as C99-ready. I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review (no point if the patch needs changing again). 2012-10-02 Marc Glisse PR libstdc++/54686 * include/c_std/cstdlib (abs(long long)): Define fallback whenever we have long long but possibly not llabs. (abs(long long)): Use llabs when available. (abs(__int128)): Define when we have __int128. (div(long long, long long)): Use lldiv. * testsuite/26_numerics/headers/cstdlib/54686.c: New file. -- Marc GlisseIndex: include/c_std/cstdlib === --- include/c_std/cstdlib (revision 191941) +++ include/c_std/cstdlib (working copy) @@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using ::strtoul; using ::system; #ifdef _GLIBCXX_USE_WCHAR_T using ::wcstombs; using ::wctomb; #endif // _GLIBCXX_USE_WCHAR_T inline long abs(long __i) { return labs(__i); } +#if defined (_GLIBCXX_USE_LONG_LONG) \ +&& (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC) + // Fallback version if we don't have llabs but still allow long long. + inline long long + abs(long long __x) { return __x >= 0 ? __x : -__x; } +#endif + +#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128) + inline __int128 + abs(__int128 __x) { return __x >= 0 ? __x : -__x; } +#endif + inline ldiv_t div(long __i, long __j) { return ldiv(__i, __j); } _GLIBCXX_END_NAMESPACE_VERSION } // namespace #if _GLIBCXX_USE_C99 #undef _Exit #undef llabs @@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::lldiv_t; #endif #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN; #endif #if !_GLIBCXX_USE_C99_DYNAMIC using ::_Exit; #endif +#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC inline long long - abs(long long __x) { return __x >= 0 ? __x : -__x; } + abs(long long __x) { return ::llabs (__x); } -#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::llabs; inline lldiv_t div(long long __n, long long __d) - { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; } + { return ::lldiv (__n, __d); } using ::lldiv; #endif #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC extern "C" long long int (atoll)(const char *) throw (); extern "C" long long int (strtoll)(const char * __restrict, char ** __restrict, int) throw (); extern "C" unsigned long long int (strtoull)(const char * __restrict, char ** __restrict, int) throw (); @@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_END_NAMESPACE_VERSION } // namespace __gnu_cxx namespace std { #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::__gnu_cxx::lldiv_t; #endif using ::__gnu_cxx::_Exit; - using ::__gnu_cxx::abs; #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC + using ::__gnu_cxx::abs; using ::__gnu_cxx::llabs; using ::__gnu_cxx::div; using ::__gnu_cxx::lldiv; #endif using ::__gnu_cxx::atoll; using ::__gnu_cxx::strtof; using ::__gnu_cxx::strtoll; using ::__gnu_cxx::strtoull; using ::__gnu_cxx::strtold; } // namespace std Index: testsuite/26_numerics/headers/cstdlib/54686.c === --- testsuite/26_numerics/headers/cstdlib/54686.c (revision 0) +++ testsuite/26_numerics/headers/cstdlib/54686.c (revision 0) @@ -0,0 +1,32 @@ +// { dg-do compile } +// { dg-options "-std=c++11" } + +// Copyright (C) 2012 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. +// +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. +// +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not
Re: abs(long long)
On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse wrote: (Forgot libstdc++...) Hello, here is the patch from PR54686. Several notes: * I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but still) is meant to return a double... don't we have a core issue about preferring unsigned -> long or long long? Here I am talking of a library issue: the wording that says that there are sufficient overloads such that integer types call the double version of math functions. It is fairly obvious that it doesn't apply to abs(long) for instance which has an explicit overload. For short or unsigned, I still read it as saying that it converts to double... * I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should use defined(__SIZEOF_INT128__), which would help other compilers. Why would that be a problem with the appropriate #define? The library installed by the system was compiled with g++, and is then used with clang++. If we can avoid installing 2 config.h files to make that work... * newlib has llabs, according to the doc. It would be good to know what newlib is missing for libstdc++ to detect it as C99-ready. I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review (no point if the patch needs changing again). In general, I think I have a bias toward using compiler intrinsics, for which the compiler already has lot of knowledge about. More precisely, does that mean you want __builtin_llabs instead of ::llabs? I thought the compiler knew they were the same. -- Marc Glisse
Re: abs(long long)
On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: I understand that it is originally a library issue, but I don't think it makes sense to resolve it in isolation of that core issue. They seem mostly orthogonal to me, since the library only uses an informal language describing the desired outcome and not the actual overloads necessary to achieve it, whereas the core issue is about determining priorities for a non-ambiguous overload resolution (if we are talking about the same, where Jens Maurer has a proposal). The library installed by the system was compiled with g++, and is then used with clang++. If we can avoid installing 2 config.h files to make that work... Two things: 1. that is clearly a clang problem. I don't think it is libstdc++'s job tp try to solve clang's misguided configuration and installation. Translated: libstdc++ should only ever be used with the very version of g++ that was used to compile it. clang++, icpc, sunCC, etc should never try to use a libstdc++ compiled with another compiler. I am not saying libstdc++ should go to great lengths to support other compilers, but when it is actually easier to support them than not to... (testing a macro is easier than a configure test) 2. I am not sure you understand what I wrote: you can leave the use of the current macro the way it is if you appropriately define it in terms of what you want to change it to. I was complaining about the configure-time nature of the macro. If it is defined at each compiler run based on __SIZEOF_INT128__, I am happy. More precisely, does that mean you want __builtin_llabs instead of ::llabs? I thought the compiler knew they were the same. Yes. Another reason is that it simplifies the implementation AND if people want want to do something with the intrinsics' fallback libstdc++ will gracefully deliver that. I don't see how that simplifies the implementation, it is several characters longer than ::llabs, and we still need to handle llabs. Or do you mean: always call __builtin_llabs (whether we have an llabs or not), and let the compiler replace it with either (x<0)?-x:x or a library call (I assume it never does that unless it has seen a corresponding declaration)? Note that I am happy to let you take over this PR if you like. -- Marc Glisse
Re: abs(long long)
On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: Whining on this list about libstdc++ internal macros and your dislike of them is not going to produce anything today or tomorrow. Other compilers using libstdc++ was just an extra argument. Even if g++ was the only compiler on earth, I would still consider a compile-time test superior to a configure test. The macro __SIZEOF_INT128__ was invented precisely for this purpose. Yes, that's just more whining ;-) On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse wrote: Or do you mean: always call __builtin_llabs (whether we have an llabs or not), and let the compiler replace it with either (x<0)?-x:x or a library call (I assume it never does that unless it has seen a corresponding declaration)? See what we did in c/cmath and c_global/cmath. Note that llabs is quite different from asin. __builtin_llabs generates an ABS_EXPR, which will later be expanded either to a special instruction or to a condition. It never generates a call to llabs (I am not sure exactly if Paolo's instructions to use llabs meant he wanted an actual library call). __builtin_asin on the other hand is never expanded inline (except maybe for special constant input like 0) and expands to a call to the library function asin. Would the attached patch be better, assuming it passes testing? For lldiv, there is no builtin (for good reason). * include/c_std/cstdlib (abs(long long)): Define with __builtin_llabs when we have long long. (abs(__int128)): Define when we have __int128. (div(long long, long long)): Use lldiv. -- Marc GlisseIndex: cstdlib === --- cstdlib (revision 191941) +++ cstdlib (working copy) @@ -128,21 +128,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION using ::strtod; using ::strtol; using ::strtoul; using ::system; #ifdef _GLIBCXX_USE_WCHAR_T using ::wcstombs; using ::wctomb; #endif // _GLIBCXX_USE_WCHAR_T inline long - abs(long __i) { return labs(__i); } + abs(long __i) { return __builtin_labs(__i); } + +#ifdef _GLIBCXX_USE_LONG_LONG + inline long long + abs(long long __x) { return __builtin_llabs (__x); } +#endif + +#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128) + inline __int128 + abs(__int128 __x) { return __x >= 0 ? __x : -__x; } +#endif inline ldiv_t div(long __i, long __j) { return ldiv(__i, __j); } _GLIBCXX_END_NAMESPACE_VERSION } // namespace #if _GLIBCXX_USE_C99 #undef _Exit @@ -161,29 +171,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::lldiv_t; #endif #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN; #endif #if !_GLIBCXX_USE_C99_DYNAMIC using ::_Exit; #endif - inline long long - abs(long long __x) { return __x >= 0 ? __x : -__x; } - #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::llabs; inline lldiv_t div(long long __n, long long __d) - { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; } + { return ::lldiv (__n, __d); } using ::lldiv; #endif #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC extern "C" long long int (atoll)(const char *) throw (); extern "C" long long int (strtoll)(const char * __restrict, char ** __restrict, int) throw (); extern "C" unsigned long long int (strtoull)(const char * __restrict, char ** __restrict, int) throw (); @@ -198,21 +205,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_END_NAMESPACE_VERSION } // namespace __gnu_cxx namespace std { #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::__gnu_cxx::lldiv_t; #endif using ::__gnu_cxx::_Exit; - using ::__gnu_cxx::abs; #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC using ::__gnu_cxx::llabs; using ::__gnu_cxx::div; using ::__gnu_cxx::lldiv; #endif using ::__gnu_cxx::atoll; using ::__gnu_cxx::strtof; using ::__gnu_cxx::strtoll; using ::__gnu_cxx::strtoull; using ::__gnu_cxx::strtold;
Re: abs(long long)
On Wed, 3 Oct 2012, Gabriel Dos Reis wrote: On Tue, Oct 2, 2012 at 1:53 PM, Marc Glisse wrote: * include/c_std/cstdlib (abs(long long)): Define with __builtin_llabs when we have long long. (abs(__int128)): Define when we have __int128. This change is OK Thanks, I'll commit that part only. (div(long long, long long)): Use lldiv. not this one. Ok. Note that glibc's implementation does more than just / and %. Possible reasons are: 1) glibc does useless work 2) libstdc++ has a bug 3) there are platforms supported by glibc but not by libstdc++ I choose to believe it is option 3. -- Marc Glisse
Re: patch to fix
On Wed, 3 Oct 2012, Kenneth Zadeck wrote: The patch defines a new datatype, a 'wide_int' (defined in wide-int.[ch], and this datatype will be used to perform all of the integer constant math in the compiler. Externally, wide-int is very similar to double-int except that it does not have the limitation that math must be done on exactly two HOST_WIDE_INTs. Internally, a wide_int is a structure that contains a fixed sized array of HOST_WIDE_INTs, a length field and a mode. The size of the array is determined at generation time by dividing the number of bits of the largest integer supported on the target by the number of bits in a HOST_WIDE_INT of the host. Thus, with this format, any length of integer can be supported on any host. Hello, did you consider making the size of wide_int a template parameter, now that we are using C++? All with a convenient typedef or macro so it doesn't show. I am asking because in vrp I do some arithmetic that requires 2*N+1 bits where N is the size of double_int. -- Marc Glisse
Re: patch to fix
On Wed, 3 Oct 2012, Mike Stump wrote: On Oct 3, 2012, at 1:47 PM, Marc Glisse wrote: did you consider making the size of wide_int a template parameter, now that we are using C++? All with a convenient typedef or macro so it doesn't show. I am asking because in vrp I do some arithmetic that requires 2*N+1 bits where N is the size of double_int. No, not really. I'd maybe answer it this way, we put in a type (singular) to support all integral constants in all languages on a port. Since we only needed 1, there was little need to templatize it. By supporting all integral constants in all languages, there is little need for more. If Ada say, wanted a 2048 bit integer, then, we just have it drop off the size it wants someplace and we would mix that in on a MAX(….) line, net result, the type we use would then directly support the needs of Ada. If vpr wanted 2x of all existing modes, we could simply change the MAX equation and essentially double it; if people need that. This comes as a cost, as the intermediate wide values are fixed size allocated (not variable); so these all would be larger. And this cost could be eliminated by having a template wide_int_ so only the places that need it actually use the extra size ;-) On Wed, 3 Oct 2012, Kenneth Zadeck wrote: i have already converted the vrp code, so i have some guess at where you are talking about. (of course correct me if i am wrong). in the code that computes the range when two variables are multiplied together needs to do a multiplication that produces a result that is twice as wide as the inputs. Yes, exactly. my library is able to do that with one catch (and this is a big catch): the target has to have an integer mode that is twice as big as the mode of the operands. The issue is that wide-ints actually carry around the mode of the value in order to get the bitsize and precision of the operands (it does not have the type, because this code has to both work on the rtl and tree level and i generally do not want the signness anyway). my current code in vrp checks to see if such a mode exists and if it does, it produces the product. if the mode does not exist, it returns bottom. What this means is that for most (many or some) targets that have a TImode, the largest thing that particular vrp discover ranges for is a DImode value. We could get around this by defining the next larger mode than what the target really needs but i wonder how much mileage you are going to get out of that with really large numbers. This will be for discussion when you submit that next patch, but currently VRP handles integers the same size as double_int. In particular, it handles __int128. I would be unhappy if introducing a larger bigint type in gcc made us regress there. -- Marc Glisse
Re: patch to fix
On Thu, 4 Oct 2012, Kenneth Zadeck wrote: On 10/04/2012 09:17 AM, Marc Glisse wrote: On Wed, 3 Oct 2012, Mike Stump wrote: On Oct 3, 2012, at 1:47 PM, Marc Glisse wrote: did you consider making the size of wide_int a template parameter, now that we are using C++? All with a convenient typedef or macro so it doesn't show. I am asking because in vrp I do some arithmetic that requires 2*N+1 bits where N is the size of double_int. No, not really. I'd maybe answer it this way, we put in a type (singular) to support all integral constants in all languages on a port. Since we only needed 1, there was little need to templatize it. By supporting all integral constants in all languages, there is little need for more. If Ada say, wanted a 2048 bit integer, then, we just have it drop off the size it wants someplace and we would mix that in on a MAX(….) line, net result, the type we use would then directly support the needs of Ada. If vpr wanted 2x of all existing modes, we could simply change the MAX equation and essentially double it; if people need that. This comes as a cost, as the intermediate wide values are fixed size allocated (not variable); so these all would be larger. And this cost could be eliminated by having a template wide_int_ so only the places that need it actually use the extra size ;-) The space is not really an issue in most places since wide-ints tend to be short lived. You were the one talking of a cost. However the real question is what are you going to instantiate the template on?What we do is look at the target and determine the largest type that the target supports and build a wide int type that supports that.how are you going to do better? In a single place in tree-vrp.c in the code that evaluates multiplications, I would instantiate the template on the double (possibly +1) of the value you selected as large enough for all constants. For all the rest, your type is fine. This will be for discussion when you submit that next patch, but currently VRP handles integers the same size as double_int. In particular, it handles __int128. I would be unhappy if introducing a larger bigint type in gcc made us regress there. You are only happy now because you do not really understand the world around you. I did not want to go into details, but let me re-phrase: I do not want to regress. Currently, hosts with a 64 bit hwi can handle VRP multiplications on __int128. If your patch introducing better big integers breaks that, that sounds bad to me, since I would expect s/double_int/wide_int/ to just work, and using wide_int<2*MAX> would just be a potential simplification of the code for later. Note that VRP is just the one case I am familiar with. Using templates should (I haven't checked) be completely trivial and help the next person who needs bigger integers for a specific purpose and doesn't want to penalize the whole compiler. If the size of wide_int is completely irrelevant and we can make it 10 times larger without thinking, I guess some numbers showing it would be great (or maybe that's common knowledge, then I guess it is fine). Now those are only some comments from an occasional contributor, not reviewer requirements, it is fine to ignore them. -- Marc Glisse
Re: patch to fix
On Wed, 3 Oct 2012, Kenneth Zadeck wrote: i have already converted the vrp code, so i have some guess at where you are talking about. (of course correct me if i am wrong). in the code that computes the range when two variables are multiplied together needs to do a multiplication that produces a result that is twice as wide as the inputs. my library is able to do that with one catch (and this is a big catch): the target has to have an integer mode that is twice as big as the mode of the operands. The issue is that wide-ints actually carry around the mode of the value in order to get the bitsize and precision of the operands (it does not have the type, because this code has to both work on the rtl and tree level and i generally do not want the signness anyway). Ah, after reading the whole thread, now I understand that it is because wide_int carries a mode that it makes little sense making it a template (sorry that it took me so long when the information was in your first answer). I understand that it would be inconvenient (much longer code) to have a base_wide_int that does just the arithmetic and a wrapper that contains the mode as well. Your idea below to define dummy extra modes does bring the template idea back to the table though ;-) my current code in vrp checks to see if such a mode exists and if it does, it produces the product. if the mode does not exist, it returns bottom. What this means is that for most (many or some) targets that have a TImode, the largest thing that particular vrp discover ranges for is a DImode value. We could get around this by defining the next larger mode than what the target really needs but i wonder how much mileage you are going to get out of that with really large numbers. The current wrapping multiplication code in vrp works with a pair of double_int, so it should keep working with a pair of wide_int. I see now why wide_int doesn't allow to simplify the code, but it doesn't have to break. -- Marc Glisse
Re: patch to fix
On Thu, 4 Oct 2012, Kenneth Zadeck wrote: There are a bunch of ways to skin the cat. 1) we can define the extra mode. 2) if we get rid of the mode inside the wide int and replace it with an explicit precision and bitsize, then we can just make the size of the buffer twice as big as the analysis of the modes indicates. 3) or we can leave your code in a form that uses 2 wide ints. my current patch (which i have not gotten working yet) changes this to use the mul_full call, but it could be changed. It is much simpler that the existing code. Thanks, we are exactly on the same page :-) i do not see how templates offer any solution at all. the wide int code really needs to have something valid to indicate the length of the object, and if there is no mode that big, the code will ice. It would be possible to define the regular wide_int taking into account only valid modes, and only use the dummy larger modes in very specific circumstances, where the template parameter would somehow indicate the last mode that may appear in it. This is not a recommendation at all, just an explanation of why templates might have had something to do with it. my personal feeling is that range analysis is quite useful for small integers and not so much as the values get larger. The only really large integer constants that you are likely to find in real code are encryption keys, Note that gcc regularly decides to use unsigned types where the code is signed, so a "small" constant like -1 can be huge. -- Marc Glisse
Re: vec_cond_expr adjustments
ble to easily recover from this intermediate step with combine (I'm not sure), so a target dependent value may be prefered. Being able to optimize it is indeed a key point. Let's try on an example (not assuming any specific representation in the middle-end for now). Say I write this C/OpenCL code: ((a The front-end gives to the middle-end: aOn an architecture like sse, neon or altivec where VECTOR_STORE_FLAG_VALUE is -1 (well, should be), expansion of (acan also disappear if the vcond instruction only looks at the MSB (x86). And we are left in the back-end with ((a On other architectures, expecting the back-end to simplify everything does seem hard. But it isn't obvious how to handle it in the middle end either. Some other forms we could imagine the middle-end producing: (aBut then how do we handle for instance sparc, where IIUC comparing 2 vectors returns an integer, where bits 0, 1, etc of the integer represent true/false for the comparisons of elements 0, 1, etc of the vectors (as in vec_merge, but not constant)? Defining VECTOR_STORE_FLAG_VALUE is not possible since comparisons don't return a vector, but we would still want to compute athe usual code for the selection. If we assume a -1/0 and MSB representation in the middle-end, the front-end could just pass ((amoving to the back-end, "nothing" would happen on x86. Comparing x86, neon and altivec, they all have comparisons that return a vector of -1 and 0. On the other hand, they have different selection instructions. x86 uses <0, altivec uses !=0 and neon has a bitwise select and thus requires exactly -1 or 0. It thus seems to me that we should decide in the middle-end that vector comparisons return vectors of -1 and 0. VEC_COND_EXPR is more complicated. We could for instance require that it takes as first argument a vector of -1 and 0 (thus <0, !=0 and the neon thing are equivalent). Which would leave to decide what the expansion of vec_cond_expr passes to the targets when the first argument is not a comparison, between !=0, <0, ==-1 or others (I vote for <0 because of opencl). One issue is that targets wouldn't know if it was a dummy comparison that can safely be ignored because the other part is the result of logical operations on comparisons (thus composed of -1 and 0) or a genuine comparison with an arbitrary vector, so a new optimization would be needed (in the back-end I guess or we would need an alternate instruction to vcond) to detect if a vector is a "signed boolean" vector. We could instead say that vec_cond_expr really follows OpenCL's semantics and looks at the MSB of each element. I am not sure that would change much, it would mostly delay the apparition of <0 to RTL expansion time (and thus make gimple slightly lighter). +/* Return true if EXPR is an integer constant representing true. */ + +bool +integer_truep (const_tree expr) +{ + STRIP_NOPS (expr); + + switch (TREE_CODE (expr)) +{ +case INTEGER_CST: + /* Do not just test != 0, some places expect the value 1. */ + return (TREE_INT_CST_LOW (expr) == 1 + && TREE_INT_CST_HIGH (expr) == 0); I wonder if using STORE_FLAG_VALUE is better here (note that it usually differs for FP vs. integral comparisons and the mode passed to STORE_FLAG_VALUE is that of the comparison result). I notice there is already a VECTOR_STORE_FLAG_VALUE (used only once in simplify-rtx, in a way that seems a bit strange but I'll try to understand that later). Thanks for showing me this macro, it seems important indeed. However the STORE_FLAG_VALUE mechanism seems to be for the RTL level. It looks like it would be possible to have 3 different semantics: source code is OpenCL, middle-end whatever we want (0 / 1 for instance), and back-end is whatever the target wants. The front-end would generate for a seems like the middle-end uses this for lowering vector compares, a < b -> { a[0] < b[0] ? -1 : 0, ... } and for a?b:c : vec_cond_expr(a<0,b,c) it looks like ?: is not generally handled by tree-vect-generic, so it must be either not supported by the frontend or lowered therein (ISTR it is forced to appear as a != {0,...} ? ... : ...) Not supported by the front-end yet (not even by the gimplifier), I have (bad) patches but I can't really finish them before this conversation is done. I think there are quite few places in the middle-end that assume that comparisons return a vector of -1/0 and even fewer that vec_cond_expr only looks at the MSB of each element. So it is still time to change that if you want to. But if we want to change it, I think it should happen now before even more vector code gets in (not particularly my patches, I am thinking of cilk and others too). Ok, that's long enough, I need to send it now... -- Marc Glisse
Re: [C++] Mixed scalar-vector operations
Ping http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01557.html On Fri, 21 Sep 2012, Marc Glisse wrote: Hello, this patch adds mixed scalar-vector operations to the C++ front-end. It also adds a few operators to the C front-end (comparisons in particular). This patch is mostly an import from the C front-end (with the maybe_const stuff removed). 2012-09-22 Marc Glisse PR c++/54427 c/ * c-typeck.c: Include c-common.h. (enum stv_conv): Moved to c-common.h. (scalar_to_vector): Moved to c-common.c. * Make-lang.in: c-typeck.c depends on c-common.h. c-family/ * c-common.c (scalar_to_vector): Moved from c-typeck.c. Support more operations. * c-common.h (enum stv_conv): Moved from c-typeck.c. (scalar_to_vector): Declare. cp/ * typeck.c (cp_build_binary_op): Handle mixed scalar-vector operations. [LSHIFT_EXPR, RSHIFT_EXPR]: Likewise. gcc/ * fold-const.c (fold_binary_loc): Use build_zero_cst instead of build_int_cst for a potential vector. testsuite/ * c-c++-common/vector-scalar.c: New testcase. * g++.dg/ext/vector5.C: This is not an error anymore. * gcc.dg/init-vec-1.c: Move ... * c-c++-common/init-vec-1.c: ... here. Adapt error message. * gcc.c-torture/execute/vector-shift1.c: Move ... * c-c++-common/torture/vector-shift1.c: ... here. * gcc.dg/scal-to-vec1.c: Move ... * c-c++-common/scal-to-vec1.c: ... here. Avoid narrowing for C++11. Adapt error message. * gcc.dg/convert-vec-1.c: Move ... * c-c++-common/convert-vec-1.c: ... here. * gcc.dg/scal-to-vec2.c: Move ... * c-c++-common/scal-to-vec2.c: ... here. -- Marc Glisse
Re: [C++] Mixed scalar-vector operations
On Fri, 5 Oct 2012, Jason Merrill wrote: On 09/21/2012 02:32 PM, Marc Glisse wrote: + gcc_assert (TREE_CODE (type0) == VECTOR_TYPE + || TREE_CODE (type1) == VECTOR_TYPE); + switch (code) +{ + case RSHIFT_EXPR: + case LSHIFT_EXPR: + if (TREE_CODE (type0) == INTEGER_TYPE + && TREE_CODE (TREE_TYPE (type1)) == INTEGER_TYPE) Here you're asserting that one of the types is a vector and then assuming that type1 is a vector and type0 is not. I guess you need to move the swapping code out of the switch. I didn't write this code, but my understanding is the following. Most operations (like PLUS_EXPR) want 2 arguments of the same type. [LR]SHIFT_EXPR are special and also accept to have a vector as first argument and a scalar as second argument (but not the reverse). If the first argument is scalar (TREE_CODE (type0) == INTEGER_TYPE), then the second must be a vector, and we return that the first argument needs converting. Otherwise, we don't perform any conversion (return stv_nothing). cp_build_binary_op has special code for *SHIFT_EXPR for the case of a vector and a scalar in this order. (I don't know why it was decided that *SHIFT_EXPR would be special that way, and I don't mind handling it like the other operations if you prefer) + error_at (loc, "conversion of scalar to vector " + "involves truncation"); These errors should print the types involved. They also need to be suppressed when !(complain & tf_error). Will do. + op0 = convert (TREE_TYPE (type1), op0); + op0 = build_vector_from_val (type1, op0); I don't see anything in cp_build_binary_op that makes sure that the VECTOR_TYPE is in type1. These 2 lines are in a switch in the case where scalar_to_vector returned stv_firstarg, meaning that the first arg (op0) is a scalar that needs to be converted to a vector of the same type as op1. And some lines above, there is: type1 = TREE_TYPE (op1); Am I just missing some comments in the code, or is there something wrong? I think I should at least change the comment on scalar_to_vector from: /* Convert scalar to vector for the range of operations. */ to something like: /* Determine which of the operands, if any, is a scalar that needs to be converted to a vector for the range of operations. */ And add something in scalar_to_vector about the SHIFT_EXPRs. Thanks for the comments, -- Marc Glisse
Re: vec_cond_expr adjustments
On Mon, 8 Oct 2012, Richard Guenther wrote: VEC_COND_EXPR is more complicated. We could for instance require that it takes as first argument a vector of -1 and 0 (thus <0, !=0 and the neon thing are equivalent). Which would leave to decide what the expansion of vec_cond_expr passes to the targets when the first argument is not a comparison, between !=0, <0, ==-1 or others (I vote for <0 because of opencl). One issue is that targets wouldn't know if it was a dummy comparison that can safely be ignored because the other part is the result of logical operations on comparisons (thus composed of -1 and 0) or a genuine comparison with an arbitrary vector, so a new optimization would be needed (in the back-end I guess or we would need an alternate instruction to vcond) to detect if a vector is a "signed boolean" vector. We could instead say that vec_cond_expr really follows OpenCL's semantics and looks at the MSB of each element. I am not sure that would change much, it would mostly delay the apparition of <0 to RTL expansion time (and thus make gimple slightly lighter). I think we should delay the decision on how to optimize this. It's indeed not trivial and the GIMPLE middle-end aggressively forwards feeding comparisons into the VEC_COND_EXPR expressions already (somewhat defeating any CSE that might be possible here) in forwprop. Thanks for going through the long email :-) What does that imply for the first argument of VEC_COND_EXPR? Currently, the expander asserts that it is a comparison, but that is not reflected in the gimple checkers. If we document that VEC_COND_EXPR takes a vector of -1 and 0 (which is the case for a comparison), I don't think it prevents from later relaxing that to <0 or !=0. But then I don't know how to handle expansion when the argument is neither a comparison (vcond) nor a constant (vec_merge? I haven't tried but that should be doable), I would have to pass <0 or !=0 to the target. So is the best choice to document that VEC_COND_EXPR takes as first argument a comparison and make gimple checking reflect that? (seems sad, but at least that would tell me what I can/can't do) By the way, since we are documenting comparisons as returning 0 and -1, does that bring back the integer_truep predicate? -- Marc Glisse