Re: [PATCH] Add pow -> exp hack for SPEC2k17 628.pop2_s (PR tree-optimization/82004)
On Tue, 27 Mar 2018, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR, sw_absorption.fppized.f90 relies on pow in > x = log10 (something) - y; > for (...) > { > x = x + y; > z = pow (10.0, x); > } > where x + y in the first iteration is exactly -3 to be >= 0.001, > unfortunately with the pow(cst, x) -> exp (log (cst) * x) optimization > with -Ofast and -flto this returns something a few ulps smaller than that > and the benchmark fails. > > In the PR I've attached quite large patch that attempts to optimize the > case using x = x * cst;, unfortunately even for -Ofast measures that > generates quite big relative errors when the loop has 400 iterations. > > So, instead this simple patch just tries to detect the case where we > have on some edge pow (10.0, integer) and just doesn't attempt to optimize > it in that case to exp. If glibc folks add an optimized exp10 eventually, > we can switch it later on to emitting exp10 instead. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Few comments below. > 2018-03-27 Jakub Jelinek > > PR tree-optimization/82004 > * generic-match-head.c (optimize_pow_to_exp): New function. > * gimple-match-head.c (optimize_pow_to_exp): New function. > * match.pd (pow(C,x) -> exp(log(C)*x)): Don't fold if > optimize_pow_to_exp is false. > > * gcc.dg/pr82004.c: New test. > > --- gcc/generic-match-head.c.jj 2018-02-13 09:33:31.089560180 +0100 > +++ gcc/generic-match-head.c 2018-03-27 18:28:36.663913272 +0200 > @@ -77,3 +77,11 @@ canonicalize_math_after_vectorization_p > { >return false; > } > + > +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * x). */ > + > +static bool > +optimize_pow_to_exp (tree arg0, tree arg1) > +{ > + return false; > +} So instead of this simply guard the pattern with #if GIMPLE as we already do for some. That saves in generic-match.c code size. > --- gcc/gimple-match-head.c.jj2018-02-13 09:33:31.107560174 +0100 > +++ gcc/gimple-match-head.c 2018-03-27 18:48:21.205369113 +0200 > @@ -840,3 +840,55 @@ canonicalize_math_after_vectorization_p > { >return !cfun || (cfun->curr_properties & PROP_gimple_lvec) != 0; > } > + > +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * x). > + As a workaround for SPEC CPU2017 628.pop2_s, don't do it if arg0 > + is 10.0, arg1 = phi_res + cst1 and phi_res = PHI > + where cst1 + cst2 is an exact integer, because then pow (10.0, arg1) > + will likely be exact, while exp (log (10.0) * arg1) might be not. */ So this looks somewhat like a very specific SPEC hack to me. Can we make this sligtly more generic and consider pow (integer, cst1 [+- cst2]) where the 2nd arg is an exact integer? Basically allow not only 10 for arg0 but any real_is_integer plus allow arg1 to be directly fed by the PHI, not only via an intermediate plus (here also consider a minus, negate or other binary op we can fed to const_binop). ? The motivation is of course still SPEC but the actual implementation should also consider slightly different but similar situations where people might rely on exact pow() to be not optimized. Btw, did anybody file a defect with SPEC for this? Richard. > +static bool > +optimize_pow_to_exp (tree arg0, tree arg1) > +{ > + gcc_assert (TREE_CODE (arg0) == REAL_CST); > + REAL_VALUE_TYPE ten; > + real_from_integer (&ten, TYPE_MODE (TREE_TYPE (arg0)), 10, SIGNED); > + if (!real_identical (TREE_REAL_CST_PTR (arg0), &ten)) > +return true; > + > + if (TREE_CODE (arg1) != SSA_NAME) > +return true; > + > + gimple *def = SSA_NAME_DEF_STMT (arg1); > + if (!is_gimple_assign (def) > + || gimple_assign_rhs_code (def) != PLUS_EXPR > + || TREE_CODE (gimple_assign_rhs1 (def)) != SSA_NAME > + || TREE_CODE (gimple_assign_rhs2 (def)) != REAL_CST) > +return true; > + > + gphi *phi = dyn_cast (SSA_NAME_DEF_STMT (gimple_assign_rhs1 > (def))); > + if (!phi) > +return true; > + > + tree cst = NULL_TREE; > + int n = gimple_phi_num_args (phi); > + for (int i = 0; i < n; i++) > +{ > + tree arg = PHI_ARG_DEF (phi, i); > + if (TREE_CODE (arg) != REAL_CST) > + continue; > + else if (cst == NULL_TREE) > + cst = arg; > + else if (!operand_equal_p (cst, arg, 0)) > + return true; > +} > + > + tree cst2 = const_binop (PLUS_EXPR, TREE_TYPE (cst), cst, > +gimple_assign_rhs2 (def)); > + if (cst2 > + && TREE_CODE (cst2) == REAL_CST > + && real_isinteger (TREE_REAL_CST_PTR (cst2), > + TYPE_MODE (TREE_TYPE (cst2 > +return false; > + return true; > +} > --- gcc/match.pd.jj 2018-03-13 09:12:29.579110925 +0100 > +++ gcc/match.pd 2018-03-27 18:29:38.292936995 +0200 > @@ -4016,7 +4016,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > because exp(log(C)*x), while faster, will have worse precision > and if x folds into a constant too, th
Re: [PATCH] [PR c++/84979] improve auto handling in explicit tmpl args for concepts
On Mar 23, 2018, Jason Merrill wrote: > On Fri, Mar 23, 2018 at 3:38 PM, Alexandre Oliva wrote: >> + /* Concepts allows 'auto' in template arguments, even multiple >> + 'auto's in a single argument. > I think that's only intended for class templates; Aah, that explains a lot! Dang, it was so close! It actually makes sense; was there any discussion about standardizing this, with reasons to reject this construct, or is it just that we aren't there yet? I wouldn't be surprised if it were to appear in some future version of C++. > we should reject 'auto' as a template argument for function or > variable templates. Is this (in the patch below) the best spot to test for it? [PR c++/84979] reject auto in explicit tmpl args for tmpl-fn With concepts, we accept auto in explicit template arguments, but we should only accept them for template classes. Passing them to template functions is not allowed. So, reject it. Regstrapping on i686- and x86-64-linux-gnu, after a successful make check-c++-all. Ok to install if it passes? for gcc/cp/ChangeLog PR c++/84979 * semantics.c (finish_id_expression): Reject template args referencing auto for template fns. for gcc/testsuite/ChangeLog PR c++/84979 * g++.dg/concepts/pr84979.C: New. --- gcc/cp/semantics.c | 11 +-- gcc/testsuite/g++.dg/concepts/pr84979.C |9 + 2 files changed, 18 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/concepts/pr84979.C diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 035e39515748..9b33234c8b73 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -3528,8 +3528,15 @@ finish_id_expression (tree id_expression, /* If we have a template-id, then no further lookup is required. If the template-id was for a template-class, we will sometimes have a TYPE_DECL at this point. */ - else if (TREE_CODE (decl) == TEMPLATE_ID_EXPR - || TREE_CODE (decl) == TYPE_DECL) + else if (TREE_CODE (decl) == TEMPLATE_ID_EXPR) +{ + if (flag_concepts && type_uses_auto (decl)) + { + *error_msg = "invalid use of % in template argument for template function"; + return error_mark_node; + } +} + else if (TREE_CODE (decl) == TYPE_DECL) ; /* Look up the name. */ else diff --git a/gcc/testsuite/g++.dg/concepts/pr84979.C b/gcc/testsuite/g++.dg/concepts/pr84979.C new file mode 100644 index ..c70b3756f2b8 --- /dev/null +++ b/gcc/testsuite/g++.dg/concepts/pr84979.C @@ -0,0 +1,9 @@ +// { dg-do compile { target c++11 } } +// { dg-options "-fconcepts" } + +template void foo() {} + +void bar() +{ + foo(); // { dg-error "invalid|no match" } +} -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PATCH PR81647][AARCH64] Fix handling of Unordered Comparisons in aarch64-simd.md
Hi On 20/03/18 10:57, Sudakshina Das wrote: Hi On 20/03/18 08:13, Christophe Lyon wrote: On 19 March 2018 at 19:55, Sudakshina Das wrote: Hi On 19/03/18 14:29, James Greenhalgh wrote: On Fri, Dec 15, 2017 at 11:57:46AM +, Sudi Das wrote: Hi This patch fixes the inconsistent behavior observed at -O3 for the unordered comparisons. According to the online docs (https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gccint/Unary-and-Binary-Expressions.html), all of the following should not raise an FP exception: - UNGE_EXPR - UNGT_EXPR - UNLE_EXPR - UNLT_EXPR - UNEQ_EXPR Also ORDERED_EXPR and UNORDERED_EXPR should only return zero or one. The aarch64-simd.md handling of these were generating exception raising instructions such as fcmgt. This patch changes the instructions that are emitted to in order to not give out the exceptions. We first check each operand for NaNs and force any elements containing NaN to zero before using them in the compare. Example: UN (a, b) -> UNORDERED (a, b) | (cm (isnan (a) ? 0.0 : a, isnan (b) ? 0.0 : b)) The ORDERED_EXPR is now handled as (cmeq (a, a) & cmeq (b, b)) and UNORDERED_EXPR as ~ORDERED_EXPR and UNEQ as (~ORDERED_EXPR | cmeq (a,b)). Testing done: Checked for regressions on bootstrapped aarch64-none-linux-gnu and added a new test case. Is this ok for trunk? This will probably need a back-port to gcc-7-branch as well. OK. Let it soak on trunk for a while before the backport. Backported both r258653 and Christophe's r258672 to gcc-7-branch as r258917 (reg-tested). Needed non-functional edits because of the name change from vec_cmp to vec_cmp between gcc-7 and trunk. Thanks Sudi Thanks. Committed to trunk as r258653. Will wait a week before backport. Hi, As the test failed to compile on aarch64 bare-metal targets, I added /* { dg-require-effective-target fenv_exceptions } */ as obvious (r258672). 2018-03-20 Christophe Lyon PR target/81647 * gcc.target/aarch64/pr81647.c: Require fenv_exceptions. Index: testsuite/gcc.target/aarch64/pr81647.c === --- testsuite/gcc.target/aarch64/pr81647.c (revision 258671) +++ testsuite/gcc.target/aarch64/pr81647.c (revision 258672) @@ -1,5 +1,6 @@ /* { dg-do run } */ /* { dg-options "-O3 -fdump-tree-ssa" } */ +/* { dg-require-effective-target fenv_exceptions } */ #include Christophe Thanks for fixing this and apologies for missing it on the first place! Sudi Sudi Thanks, James ChangeLog Entries: *** gcc/ChangeLog *** 2017-12-15 Sudakshina Das PR target/81647 * config/aarch64/aarch64-simd.md (vec_cmp): Modify instructions for UNLT, UNLE, UNGT, UNGE, UNEQ, UNORDERED and ORDERED. *** gcc/testsuite/ChangeLog *** 2017-12-15 Sudakshina Das PR target/81647 * gcc.target/aarch64/pr81647.c: New. diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index f90f74fe7fd5990a97b9f4eb68f5735b7d4fb9aa..acff06c753b3e3aaa5775632929909afa4d3294b 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -2731,10 +2731,10 @@ break; } /* Fall through. */ - case UNGE: + case UNLT: std::swap (operands[2], operands[3]); /* Fall through. */ - case UNLE: + case UNGT: case GT: comparison = gen_aarch64_cmgt; break; @@ -2745,10 +2745,10 @@ break; } /* Fall through. */ - case UNGT: + case UNLE: std::swap (operands[2], operands[3]); /* Fall through. */ - case UNLT: + case UNGE: case GE: comparison = gen_aarch64_cmge; break; @@ -2771,21 +2771,35 @@ case UNGT: case UNLE: case UNLT: - case NE: - /* FCM returns false for lanes which are unordered, so if we use - the inverse of the comparison we actually want to emit, then - invert the result, we will end up with the correct result. - Note that a NE NaN and NaN NE b are true for all a, b. - - Our transformations are: - a UNGE b -> !(b GT a) - a UNGT b -> !(b GE a) - a UNLE b -> !(a GT b) - a UNLT b -> !(a GE b) - a NE b -> !(a EQ b) */ - gcc_assert (comparison != NULL); - emit_insn (comparison (operands[0], operands[2], operands[3])); - emit_insn (gen_one_cmpl2 (operands[0], operands[0])); + { + /* All of the above must not raise any FP exceptions. Thus we first + check each operand for NaNs and force any elements containing NaN to + zero before using them in the compare. + Example: UN (a, b) -> UNORDERED (a, b) | + (cm (isnan (a) ? 0.0 : a, + isnan (b) ? 0.0 : b)) + We use the following transformations for doing the comparisions: +
Re: [PR middle-end/70359] uncoalesce IVs outside of loops
On Tue, Mar 20, 2018 at 2:36 PM, Richard Biener wrote: > On Mon, Mar 19, 2018 at 6:08 PM, Aldy Hernandez wrote: >> Hi Richard. >> >> As discussed in the PR, the problem here is that we have two different >> iterations of an IV live outside of a loop. This inhibits us from using >> autoinc/dec addressing on ARM, and causes extra lea's on x86. >> >> An abbreviated example is this: >> >> loop: >> # p_9 = PHI >> p_20 = p_9 + 18446744073709551615; >> goto loop >> p_24 = p_9 + 18446744073709551614; >> MEM[(char *)p_20 + -1B] = 45; >> >> Here we have both the previous IV (p_9) and the current IV (p_20) used >> outside of the loop. On Arm this keeps us from using auto-dec addressing, >> because one use is -2 and the other one is -1. >> >> With the attached patch we attempt to rewrite out-of-loop uses of the IV in >> terms of the current/last IV (p_20 in the case above). With it, we end up >> with: >> >> p_24 = p_20 + 18446744073709551615; >> *p_24 = 45; >> >> ...which helps both x86 and Arm. >> >> As you have suggested in comment 38 on the PR, I handle specially >> out-of-loop IV uses of the form IV+CST and propagate those accordingly >> (along with the MEM_REF above). Otherwise, in less specific cases, we un-do >> the IV increment, and use that value in all out-of-loop uses. For instance, >> in the attached testcase, we rewrite: >> >> george (p_9); >> >> into >> >> _26 = p_20 + 1; >> ... >> george (_26); >> >> The attached testcase tests the IV+CST specific case, as well as the more >> generic case with george(). >> >> Although the original PR was for ARM, this behavior can be noticed on x86, >> so I tested on x86 with a full bootstrap + tests. I also ran the specific >> test on an x86 cross ARM build and made sure we had 2 auto-dec with the >> test. For the original test (slightly different than the testcase in this >> patch), with this patch we are at 104 bytes versus 116 without it. There is >> still the issue of a division optimization which would further reduce the >> code size. I will discuss this separately as it is independent from this >> patch. >> >> Oh yeah, we could make this more generic, and maybe handle any multiple of >> the constant, or perhaps *= and /=. Perhaps something for next stage1... >> >> OK for trunk? > > Sorry for noticing so late but you use loop properties like ->latch and > ->loop_father (via flow_bb_inside_loop_p) that may not be valid at > this point given expand doesn't do loop_optimizer_init/finalize. Generally > you may face loops with multiple latches (->latch == NULL) here and > loops may have multiple exits. You probably are not hitting any > of those problems because is_iv_plus_constant is quite restrictive > (doesn't handle PLUS_EXPR or MINUS_EXPR). > > Also the latch doesn't need to be the block with the exit conditional. > > So first of all you'd need to wrap insert_backedge_copies () with > > loop_optimizer_init (AVOID_CFG_MODIFICATIONS); > ... > loop_optimizer_finalize (); > > that is required to fixup an eventually broken state. Then you > probably should restrict handling to PHIs in BBs with > bb->loop_father->header == bb (aka loop header PHIs), > use get_loop_exit_edges when the IV increment is of OK form > and then use dominator info to query in which exit block to > insert compensation (or simply refuse to do anything for > loops with multiple exits). > > So if you have more cycles to work on this that would be nice, > otherwise I can certainly take over from here... Ok, so I've finally taken a closer look. There's a thing that goes wrong unpatched with the current insert_backedge_copies code. It does detect the coalescing conflict but instead of solving it in a way so the backedge copy can be coalesced it inserts a copy on that edge (actually before the exit test). It could have done better by inserting a copy before the IV increment and adjusting out-of-loop uses to use that. There's also another IV the existing code triggers on un-helpfully inserting a copy between two uses (exit test and producer of the backedge value) instead of sinking the producer after the last use (splitting the latch edge) (doesn't help on x86 where we combine div and mod later, recreating the conflict). Iff there were not similar increments to be adjusted outside of the loop with our approach inserting the copy inside the loop would have been the better choice I guess. So on x86 I see with the loop copy .L2: movl%ecx, %eax xorl%edx, %edx movq%rbx, %rbp <--- pre-inc copy decq%rbx divl%esi addl$48, %edx movb%dl, (%rbx) cmpl$9, %ecx jbe .L7 movl%eax, %ecx jmp .L2 .L7: testl %edi, %edi jns .L1 movq%rbp, %rdi callgeorge movb$45, -1(%rbx) leaq-2(%rbp), %rbx .L1: movq%rbx, %rax vs. sth like the original fix .L2: movl%ecx, %
Re: [PATCH] Fix compile-time hog in MPX boundary checking (PR target/84988).
On Wed, Mar 28, 2018 at 08:31:38AM +0200, Martin Liška wrote: > Ok, so should we make the set of cum->bnds_in_bt based on > flag_check_pointer_bounds flag? > > If so, I've got patch that I've tested on my x86_64-linux-gnu machin. > > Martin > >From 7b5978e61305c5098a084c2352fcbacb4c347158 Mon Sep 17 00:00:00 2001 > From: marxin > Date: Wed, 21 Mar 2018 10:51:32 +0100 > Subject: [PATCH] Do not call chkp_type_bounds_count if MPX is not enabled (PR > target/84988). > > gcc/ChangeLog: > > 2018-03-21 Martin Liska > > PR target/84988 > * config/i386/i386.c (ix86_function_arg_advance): Do not call > chkp_type_bounds_count if MPX is not enabled. LGTM. > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -8618,7 +8618,8 @@ ix86_function_arg_advance (cumulative_args_t cum_v, > machine_mode mode, >if (cum->caller) > cfun->machine->outgoing_args_on_stack = true; > > - cum->bnds_in_bt = chkp_type_bounds_count (type); > + if (flag_check_pointer_bounds) > + cum->bnds_in_bt = chkp_type_bounds_count (type); > } > } > > -- > 2.16.2 > Jakub
Re: [PATCH][i386,AVX] Fix PR84783 - backport missing permutexvar to GCC7
On Tue, Mar 27, 2018 at 03:18:12PM +, Peryt, Sebastian wrote: > Hi Jakub, > > Gentle ping. Ok for 7.4. > > > 2018-03-22 Sebastian Peryt > > > > > > gcc: > > > PR84783 > > > * config/i386/avx512vlintrin.h (_mm256_permutexvar_epi64) > > > (_mm256_permutexvar_epi32, _mm256_permutex_epi64): New > > intrinsics. > > > > > > gcc/testsuite: > > > PR84783 > > > > > > * gcc.target/i386/avx512vl-vpermd-1.c (_mm256_permutexvar_epi32): > > > Test new intrinsic. > > > * gcc.target/i386/avx512vl-vpermq-imm-1.c > > (_mm256_permutex_epi64): > > > Ditto. > > > * gcc.target/i386/avx512vl-vpermq-var-1.c > > (_mm256_permutexvar_epi64): > > > Ditto. > > > * gcc.target/i386/avx512f-vpermd-2.c: Do not check for AVX512F_LEN. > > > * gcc.target/i386/avx512f-vpermq-imm-2.c: Ditto. > > > * gcc.target/i386/avx512f-vpermq-var-2.c: Ditto. > > > > > > Is it ok for merge? > > Your patch is pretty much simple and is OK to me. > > > > However, since you're aiming to GCC 7, I'd like to here GM's OK here as > > well. Jakub
[C++ Patch] PR 85028 ("[8 Regression] ICE on invalid C++ code: in tsubst_default_argument, at cp/pt.c:12340")
Hi, as I said in the audit trail, in its way this error recovery issue is somewhat interesting: Jason's r251422 added some code at the beginning of tsubst_default_argument, included the gcc_assert that triggers here. In fact, parmtype is only used in the assertion thus the error recovery check could be moved inside the assertion, something like (appears to also pass testing, lightly tested, so far): gcc_assert (parmtype == error_mark_node || same_type_ignoring_top_level_qualifiers_p (type, parmtype)); and for this bug we would be back to the gcc-7 status, thus we would not ICE and we would issue *2* errors, one for the parameter and then one more for the default argument itself, at instantiation time. In fact, some other compilers also do that. Or, as I have below, we can return early, after the first error. Tested x86_64-linux. Thanks, Paolo. /// /cp 2018-03-28 Paolo Carlini PR c++/85028 * pt.c (tsubst_default_argument): Early return if the type of the parameter is erroneous. /testsuite 2018-03-28 Paolo Carlini PR c++/85028 * g++.dg/other/default13.C: New. Index: cp/pt.c === --- cp/pt.c (revision 258915) +++ cp/pt.c (working copy) @@ -12337,6 +12337,9 @@ tsubst_default_argument (tree fn, int parmnum, tre tree parmtype = TREE_TYPE (parm); if (DECL_BY_REFERENCE (parm)) parmtype = TREE_TYPE (parmtype); + if (parmtype == error_mark_node) +return error_mark_node; + gcc_assert (same_type_ignoring_top_level_qualifiers_p (type, parmtype)); tree *slot; Index: testsuite/g++.dg/other/default13.C === --- testsuite/g++.dg/other/default13.C (nonexistent) +++ testsuite/g++.dg/other/default13.C (working copy) @@ -0,0 +1,11 @@ +// PR c++/85028 + +struct A; + +template < typename > struct B +{ + B (int, A = A()) : f (0) {} // { dg-error "incomplete type" } + int f; +}; + +B < int > b (0);
Re: [PATCH,nvptx] Fix PR85056
On 03/27/2018 01:17 AM, Tom de Vries wrote: > On 03/26/2018 11:57 PM, Cesar Philippidis wrote: >> As noted in PR85056, the nvptx BE isn't declaring external arrays using >> PTX array notation. Specifically, it's emitting code that's missing the >> empty angle brackets '[]'. > > [ FYI, see https://en.wikipedia.org/wiki/Bracket > > For '[]' I find "square brackets, closed brackets, hard brackets, third > brackets, crotchets, or brackets (US)". > > Angle brackets are different symbols. ] Sorry, you're correct. I meant square brackets. >> This patch corrects that problem. >> >> Tom, in contrast to my earlier patch in the PR, this patch only >> considers external arrays. The patch I posted incorrectly handled >> zero-length arrays and empty structs. >> >> I tested this patch with a standalone nvptx toolchain using newlib 3.0, >> and I found no new regressions. However I'm still waiting for the >> results that are using the older version of newlib. Is this patch OK for >> trunk if the results come back clean? >> > > OK for stage4 trunk. Can I backport this patch to GCC 6 and 7? That fix is necessary to build an updated version of newlib that I'm working on. Thanks, Cesar > [ A minor style nit: in submission emails, rather than having the very > specific but rather non-descriptive subject "Fix PR85056", move the PR > number to "[PATCH,nvptx,PR85056]" and add a subject line that describes > the nature of the patch, f.i.: "Fix declaration of external array with > unknown size". > > So, something like: > ... > [PATCH,nvptx,PR85056] Fix declaration of external array with unknown size > ... > > Then, use the subject line as commit log header line (dropping "PATCH", > and the PR number): > ... > [nvptx] Fix declaration of external array with unknown size > ... > ] > > Thanks, > - Tom > >> Thanks, >> Cesar >> >> >> nvptx-extern-arrays.diff >> >> >> 2018-03-26 Cesar Philippidis >> >> gcc/ >> >> PR target/85056 >> * config/nvptx/nvptx.c (nvptx_assemble_decl_begin): Add '[]' to >> extern array declarations. >> >> gcc/testsuite/ >> * testsuite/gcc.target/nvptx/pr85056.c: New test. >> * testsuite/gcc.target/nvptx/pr85056a.c: New test. >> >> >> diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c >> index 3cb33ae8c2d..38f25add6ab 100644 >> --- a/gcc/config/nvptx/nvptx.c >> +++ b/gcc/config/nvptx/nvptx.c >> @@ -2038,6 +2038,9 @@ static void >> nvptx_assemble_decl_begin (FILE *file, const char *name, const char >> *section, >> const_tree type, HOST_WIDE_INT size, unsigned align) >> { >> + bool atype = (TREE_CODE (type) == ARRAY_TYPE) >> + && (TYPE_DOMAIN (type) == NULL_TREE); >> + >> while (TREE_CODE (type) == ARRAY_TYPE) >> type = TREE_TYPE (type); >> @@ -2077,6 +2080,8 @@ nvptx_assemble_decl_begin (FILE *file, const >> char *name, const char *section, >> /* We make everything an array, to simplify any initialization >> emission. */ >> fprintf (file, "[" HOST_WIDE_INT_PRINT_DEC "]", >> init_frag.remaining); >> + else if (atype) >> + fprintf (file, "[]"); >> } >> /* Called when the initializer for a decl has been completely >> output through >> diff --git a/gcc/testsuite/gcc.target/nvptx/pr85056.c >> b/gcc/testsuite/gcc.target/nvptx/pr85056.c >> new file mode 100644 >> index 000..fe7f8af856e >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/nvptx/pr85056.c >> @@ -0,0 +1,20 @@ >> +/* { dg-do run } */ >> +/* { dg-additional-sources "pr85056a.c" } */ >> + >> +extern void abort (); >> + >> +extern int a[]; >> + >> +int >> +main () >> +{ >> + int i, sum; >> + >> + for (i = 0; i < 10; i++) >> + sum += a[i]; >> + >> + if (sum != 55) >> + abort (); >> + >> + return 0; >> +} >> diff --git a/gcc/testsuite/gcc.target/nvptx/pr85056a.c >> b/gcc/testsuite/gcc.target/nvptx/pr85056a.c >> new file mode 100644 >> index 000..a45a5f2b07f >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/nvptx/pr85056a.c >> @@ -0,0 +1,3 @@ >> +/* { dg-skip-if "" { *-*-* } } */ >> + >> +int a[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }; >> >
Re: [PATCH,nvptx] Fix PR85056
On 03/28/2018 03:43 PM, Cesar Philippidis wrote: OK for stage4 trunk. Can I backport this patch to GCC 6 and 7? Yes please. Thanks, - Tom
Re: [PATCH] Add pow -> exp hack for SPEC2k17 628.pop2_s (PR tree-optimization/82004, take 2)
On Wed, 28 Mar 2018, Jakub Jelinek wrote: > On Wed, Mar 28, 2018 at 09:46:52AM +0200, Richard Biener wrote: > > > +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * > > > x). */ > > > + > > > +static bool > > > +optimize_pow_to_exp (tree arg0, tree arg1) > > > +{ > > > + return false; > > > +} > > > > So instead of this simply guard the pattern with #if GIMPLE as we already > > do for some. That saves in generic-match.c code size. > > Ok. > > > --- gcc/gimple-match-head.c.jj2018-02-13 09:33:31.107560174 +0100 > > > +++ gcc/gimple-match-head.c 2018-03-27 18:48:21.205369113 +0200 > > > @@ -840,3 +840,55 @@ canonicalize_math_after_vectorization_p > > > { > > >return !cfun || (cfun->curr_properties & PROP_gimple_lvec) != 0; > > > } > > > + > > > +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * x). > > > + As a workaround for SPEC CPU2017 628.pop2_s, don't do it if arg0 > > > + is 10.0, arg1 = phi_res + cst1 and phi_res = PHI > > > + where cst1 + cst2 is an exact integer, because then pow (10.0, arg1) > > > + will likely be exact, while exp (log (10.0) * arg1) might be not. */ > > > > So this looks somewhat like a very specific SPEC hack to me. > > Can we make this sligtly more generic and consider > > > > pow (integer, cst1 [+- cst2]) > > > > where the 2nd arg is an exact integer? Basically allow not only > > 10 for arg0 but any real_is_integer plus allow arg1 to be directly > > fed by the PHI, not only via an intermediate plus (here also consider > > a minus, negate or other binary op we can fed to const_binop). > > Like this? I'm just handling PLUS_EXPR and MINUS_EXPR and no +/- at all. > Furthermore, I've moved it to the exp replacement only, if we do exp2, it is > fine to optimize it even if cst1 + cst2 is integer, it is just the exp and > log that isn't exact for originally integral values. Yes, that looks good. > > Btw, did anybody file a defect with SPEC for this? > > I don't really have any SPEC experience, so will defer that to those > involved with SPEC benchmarking. The fix would be > to add some small epsilon to the >if (chlcnc(m) .le. chlamnt .and. & >chlamnt .le. chlcnc(m+1) ) then > comparisons, but bet they'll argue that the upstream pop does it that way. Probably... Thanks, Richard. > 2018-03-28 Jakub Jelinek > > PR tree-optimization/82004 > * gimple-match-head.c (optimize_pow_to_exp): New function. > * match.pd (pow(C,x) -> exp(log(C)*x)): Wrap with #if GIMPLE. > Don't fold to exp if optimize_pow_to_exp is false. > > * gcc.dg/pr82004.c: New test. > > --- gcc/gimple-match-head.c.jj2018-03-27 19:08:32.047824741 +0200 > +++ gcc/gimple-match-head.c 2018-03-28 15:30:42.687565271 +0200 > @@ -840,3 +840,71 @@ canonicalize_math_after_vectorization_p > { >return !cfun || (cfun->curr_properties & PROP_gimple_lvec) != 0; > } > + > +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * x). > + As a workaround for SPEC CPU2017 628.pop2_s, don't do it if arg0 > + is an exact integer, arg1 = phi_res +/- cst1 and phi_res = PHI > + where cst2 +/- cst1 is an exact integer, because then pow (arg0, arg1) > + will likely be exact, while exp (log (arg0) * arg1) might be not. > + Also don't do it if arg1 is phi_res above and cst2 is an exact integer. > */ > + > +static bool > +optimize_pow_to_exp (tree arg0, tree arg1) > +{ > + gcc_assert (TREE_CODE (arg0) == REAL_CST); > + if (!real_isinteger (TREE_REAL_CST_PTR (arg0), TYPE_MODE (TREE_TYPE > (arg0 > +return true; > + > + if (TREE_CODE (arg1) != SSA_NAME) > +return true; > + > + gimple *def = SSA_NAME_DEF_STMT (arg1); > + gphi *phi = dyn_cast (def); > + tree cst1 = NULL_TREE; > + enum tree_code code = ERROR_MARK; > + if (!phi) > +{ > + if (!is_gimple_assign (def)) > + return true; > + code = gimple_assign_rhs_code (def); > + switch (code) > + { > + case PLUS_EXPR: > + case MINUS_EXPR: > + break; > + default: > + return true; > + } > + if (TREE_CODE (gimple_assign_rhs1 (def)) != SSA_NAME > + || TREE_CODE (gimple_assign_rhs2 (def)) != REAL_CST) > + return true; > + > + cst1 = gimple_assign_rhs2 (def); > + > + phi = dyn_cast (SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def))); > + if (!phi) > + return true; > +} > + > + tree cst2 = NULL_TREE; > + int n = gimple_phi_num_args (phi); > + for (int i = 0; i < n; i++) > +{ > + tree arg = PHI_ARG_DEF (phi, i); > + if (TREE_CODE (arg) != REAL_CST) > + continue; > + else if (cst2 == NULL_TREE) > + cst2 = arg; > + else if (!operand_equal_p (cst2, arg, 0)) > + return true; > +} > + > + if (cst1 && cst2) > +cst2 = const_binop (code, TREE_TYPE (cst2), cst2, cst1); > + if (cst2 > + && TREE_CODE (cst2) == REAL_CST > + && real_isinteger (TREE_REAL_CST_PTR (cst2), > +
[PATCH, GCC-7, GCC-6][ARM][PR target/84826] Backport Fix ICE in extract_insn, at recog.c:2304 on arm-linux-gnueabihf
Hi This patch is a request to backport r258777 and r258805 to gcc-7-branch and gcc-6-branch. The same ICE occurs in both the branches with -fstack-check. Thus the test case directive has been changed. The discussion on the patch that went into trunk is: https://gcc.gnu.org/ml/gcc-patches/2018-03/msg01120.html Testing : Regtested on both the branches with arm-none-linux-gnueabihf Is this ok for gcc-7 and gcc-6? Sudi ChangeLog entries: *** gcc/ChangeLog *** 2018-03-28 Sudakshina Das Backport from mainline 2018-03-22 Sudakshina Das PR target/84826 * config/arm/arm.h (machine_function): Add static_chain_stack_bytes. * config/arm/arm.c (arm_compute_static_chain_stack_bytes): Avoid re-computing once computed. (arm_expand_prologue): Compute machine->static_chain_stack_bytes. (arm_init_machine_status): Initialize machine->static_chain_stack_bytes. *** gcc/testsuite/ChangeLog *** 2018-03-28 Sudakshina Das * gcc.target/arm/pr84826.c: Change dg-option to -fstack-check. Backport from mainline 2018-03-23 Sudakshina Das PR target/84826 * gcc.target/arm/pr84826.c: Add dg directive. Backport from mainline 2018-03-22 Sudakshina Das PR target/84826 * gcc.target/arm/pr84826.c: New test. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 25953f5..68a6fa5 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -1420,6 +1420,9 @@ typedef struct GTY(()) machine_function machine_mode thumb1_cc_mode; /* Set to 1 after arm_reorg has started. */ int after_arm_reorg; + /* The number of bytes used to store the static chain register on the + stack, above the stack frame. */ + int static_chain_stack_bytes; } machine_function; #endif diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 6f7ca43..886bcfa 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -19097,6 +19097,11 @@ arm_r3_live_at_start_p (void) static int arm_compute_static_chain_stack_bytes (void) { + /* Once the value is updated from the init value of -1, do not + re-compute. */ + if (cfun->machine->static_chain_stack_bytes != -1) +return cfun->machine->static_chain_stack_bytes; + /* See the defining assertion in arm_expand_prologue. */ if (IS_NESTED (arm_current_func_type ()) && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM) @@ -21395,6 +21400,11 @@ arm_expand_prologue (void) emit_insn (gen_movsi (stack_pointer_rtx, r1)); } + /* Let's compute the static_chain_stack_bytes required and store it. Right + now the value must the -1 as stored by arm_init_machine_status (). */ + cfun->machine->static_chain_stack_bytes += arm_compute_static_chain_stack_bytes (); + /* The static chain register is the same as the IP register. If it is clobbered when creating the frame, we need to save and restore it. */ clobber_ip = IS_NESTED (func_type) @@ -24542,6 +24552,7 @@ arm_init_machine_status (void) #if ARM_FT_UNKNOWN != 0 machine->func_type = ARM_FT_UNKNOWN; #endif + machine->static_chain_stack_bytes = -1; return machine; } diff --git a/gcc/testsuite/gcc.target/arm/pr84826.c b/gcc/testsuite/gcc.target/arm/pr84826.c new file mode 100644 index 000..563ce51 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr84826.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-options "-Ofast -fstack-check" } */ + +void d (void *); + +void a () +{ + int b; + void bar (int c) + { +if (__builtin_expect (c, 0)) + ++b; + } + d (bar); +}
[PATCH] Fix wrong use-after-scope sanitization for omp variable (PR sanitizer/85081).
Hi. I'm sending Jakub's patch, where I removed the guard in asan_poison_variable. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Martin gcc/ChangeLog: 2018-03-28 Jakub Jelinek Martin Liska PR sanitizer/85081 * gimplify.c (asan_poison_variable): Don't do the check for gimplify_omp_ctxp here. (gimplify_decl_expr): Do it here. (gimplify_target_expr): Likewise. gcc/testsuite/ChangeLog: 2018-03-28 Jakub Jelinek Martin Liska PR sanitizer/85081 * g++.dg/asan/pr85081.C: New test. --- gcc/gimplify.c | 10 -- gcc/testsuite/g++.dg/asan/pr85081.C | 20 2 files changed, 24 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/g++.dg/asan/pr85081.C diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 115f80bef9b..c32869b4c59 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -1168,10 +1168,6 @@ static void asan_poison_variable (tree decl, bool poison, gimple_stmt_iterator *it, bool before) { - /* When within an OMP context, do not emit ASAN_MARK internal fns. */ - if (gimplify_omp_ctxp) -return; - tree unit_size = DECL_SIZE_UNIT (decl); tree base = build_fold_addr_expr (decl); @@ -1689,7 +1685,8 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p) && !TREE_STATIC (decl) && !DECL_HAS_VALUE_EXPR_P (decl) && DECL_ALIGN (decl) <= MAX_SUPPORTED_STACK_ALIGNMENT - && dbg_cnt (asan_use_after_scope)) + && dbg_cnt (asan_use_after_scope) + && !gimplify_omp_ctxp) { asan_poisoned_variables->add (decl); asan_poison_variable (decl, false, seq_p); @@ -6614,7 +6611,8 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p) } if (asan_poisoned_variables && DECL_ALIGN (temp) <= MAX_SUPPORTED_STACK_ALIGNMENT - && dbg_cnt (asan_use_after_scope)) + && dbg_cnt (asan_use_after_scope) + && !gimplify_omp_ctxp) { tree asan_cleanup = build_asan_poison_call_expr (temp); if (asan_cleanup) diff --git a/gcc/testsuite/g++.dg/asan/pr85081.C b/gcc/testsuite/g++.dg/asan/pr85081.C new file mode 100644 index 000..d7dec311450 --- /dev/null +++ b/gcc/testsuite/g++.dg/asan/pr85081.C @@ -0,0 +1,20 @@ +/* PR sanitizer/85081 */ +/* { dg-do run } */ +/* { dg-options "-fopenmp-simd" } */ +/* { dg-require-effective-target fopenmp } */ + +inline const int& max(const int& a, const int& b) +{ + return a < b ? b : a; +} + +int main() +{ + #pragma omp simd + for ( int i = 0; i < 20; ++i ) + { +const int j = max(i, 1); + } + + return 0; +}
Re: [PATCH] Improve adc discovery during combine on x86 (PR target/85095, take 2)
On Wed, Mar 28, 2018 at 2:54 PM, Jakub Jelinek wrote: > On Wed, Mar 28, 2018 at 08:51:07AM +0200, Uros Bizjak wrote: >> > --- gcc/config/i386/i386.md.jj 2018-03-27 12:54:54.685244368 +0200 >> > +++ gcc/config/i386/i386.md 2018-03-27 19:38:43.891451026 +0200 >> > @@ -6854,6 +6854,23 @@ (define_insn "add3_carry" >> > (set_attr "pent_pair" "pu") >> > (set_attr "mode" "")]) >> > >> > +(define_insn "*add3_carry0" >> >> Please name this "*add3_carry_0". You will also need to >> introduce "*addsi3_carry_zext_0". Probably minus patterns have the >> same problem, simplify-rtx probably removes (minus ... const_rtx0), >> too. >> >> > + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") >> > + (plus:SWI >> > + (match_operator:SWI 3 "ix86_carry_flag_operator" >> > + [(match_operand 2 "flags_reg_operand") (const_int 0)]) >> > + (match_operand:SWI 1 "nonimmediate_operand" "0"))) >> > + (clobber (reg:CC FLAGS_REG))] >> > + "ix86_unary_operator_ok (PLUS, mode, operands)" >> > +{ >> > + operands[4] = const0_rtx; >> > + return "adc{}\t{%4, %0|%0, %4}"; >> >> Just use "$0" ("0" in intel syntax) in the insn template. > > So, like this, if it passes bootstrap/regtest? The testcases use all > 4 new patterns now. Yes, this is OK for mainline and eventual backports to release branches. Thanks, Uros. > 2018-03-28 Jakub Jelinek > > PR target/85095 > * config/i386/i386.md (*add3_carry_0, *addsi3_carry_zext_0, > *sub3_carry_0, *subsi3_carry_zext_0): New patterns. > > * gcc.target/i386/pr85095-1.c: New test. > * gcc.target/i386/pr85095-2.c: New test. > * gcc.c-torture/execute/pr85095.c: New test. > > --- gcc/config/i386/i386.md.jj 2018-03-27 21:56:42.855454830 +0200 > +++ gcc/config/i386/i386.md 2018-03-28 14:12:35.430079720 +0200 > @@ -6854,6 +6854,20 @@ (define_insn "add3_carry" > (set_attr "pent_pair" "pu") > (set_attr "mode" "")]) > > +(define_insn "*add3_carry_0" > + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") > + (plus:SWI > + (match_operator:SWI 3 "ix86_carry_flag_operator" > + [(match_operand 2 "flags_reg_operand") (const_int 0)]) > + (match_operand:SWI 1 "nonimmediate_operand" "0"))) > + (clobber (reg:CC FLAGS_REG))] > + "ix86_unary_operator_ok (PLUS, mode, operands)" > + "adc{}\t{$0, %0|%0, 0}" > + [(set_attr "type" "alu") > + (set_attr "use_carry" "1") > + (set_attr "pent_pair" "pu") > + (set_attr "mode" "")]) > + > (define_insn "*addsi3_carry_zext" >[(set (match_operand:DI 0 "register_operand" "=r") > (zero_extend:DI > @@ -6870,6 +6884,20 @@ (define_insn "*addsi3_carry_zext" > (set_attr "pent_pair" "pu") > (set_attr "mode" "SI")]) > > +(define_insn "*addsi3_carry_zext_0" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (zero_extend:DI > + (plus:SI (match_operator:SI 2 "ix86_carry_flag_operator" > + [(reg FLAGS_REG) (const_int 0)]) > + (match_operand:SI 1 "register_operand" "0" > + (clobber (reg:CC FLAGS_REG))] > + "TARGET_64BIT" > + "adc{l}\t{$0, %k0|%k0, 0}" > + [(set_attr "type" "alu") > + (set_attr "use_carry" "1") > + (set_attr "pent_pair" "pu") > + (set_attr "mode" "SI")]) > + > ;; There is no point to generate ADCX instruction. ADC is shorter and faster. > > (define_insn "addcarry" > @@ -6926,6 +6954,20 @@ (define_insn "sub3_carry" > (set_attr "pent_pair" "pu") > (set_attr "mode" "")]) > > +(define_insn "*sub3_carry_0" > + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") > + (minus:SWI > + (match_operand:SWI 1 "nonimmediate_operand" "0") > + (match_operator:SWI 3 "ix86_carry_flag_operator" > + [(match_operand 2 "flags_reg_operand") (const_int 0)]))) > + (clobber (reg:CC FLAGS_REG))] > + "ix86_unary_operator_ok (MINUS, mode, operands)" > + "sbb{}\t{$0, %0|%0, 0}" > + [(set_attr "type" "alu") > + (set_attr "use_carry" "1") > + (set_attr "pent_pair" "pu") > + (set_attr "mode" "")]) > + > (define_insn "*subsi3_carry_zext" >[(set (match_operand:DI 0 "register_operand" "=r") > (zero_extend:DI > @@ -6941,6 +6983,21 @@ (define_insn "*subsi3_carry_zext" >[(set_attr "type" "alu") > (set_attr "use_carry" "1") > (set_attr "pent_pair" "pu") > + (set_attr "mode" "SI")]) > + > +(define_insn "*subsi3_carry_zext_0" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (zero_extend:DI > + (minus:SI > + (match_operand:SI 1 "register_operand" "0") > + (match_operator:SI 2 "ix86_carry_flag_operator" > + [(reg FLAGS_REG) (const_int 0)] > + (clobber (reg:CC FLAGS_REG))] > + "TARGET_64BIT" > + "sbb{l}\t{$0, %k0|%k0, 0}" > + [(set_attr "type" "alu") > + (set_attr "use_carry" "1") > + (set_attr "pent_pair" "pu") > (set_attr "mode" "SI")]) > > (define_insn "sub3_carry_ccc" > --- gcc/testsuite/gcc.target/i386/pr85095-1
Re: [PATCH] Fix wrong use-after-scope sanitization for omp variable (PR sanitizer/85081).
On Wed, Mar 28, 2018 at 04:14:45PM +0200, Martin Liška wrote: > Hi. > > I'm sending Jakub's patch, where I removed the guard in asan_poison_variable. > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Martin > > gcc/ChangeLog: > > 2018-03-28 Jakub Jelinek > Martin Liska > > PR sanitizer/85081 > * gimplify.c (asan_poison_variable): Don't do the check for > gimplify_omp_ctxp here. > (gimplify_decl_expr): Do it here. > (gimplify_target_expr): Likewise. > > gcc/testsuite/ChangeLog: > > 2018-03-28 Jakub Jelinek > Martin Liska > > PR sanitizer/85081 > * g++.dg/asan/pr85081.C: New test. I guess it is ok for trunk for now, but for stage1 I'd like analysis on why you've added that !gimplify_omp_ctxp stuff at all. E.g. for local vars inside of OpenMP/OpenACC constructs there should be no reason why they couldn't be unpoisoned and poisoned inside of the OpenMP region. Jakub
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On 03/21/2018 11:34 AM, Jakub Jelinek wrote: On Wed, Mar 14, 2018 at 02:54:00PM +0100, Martin Liška wrote: The variable is not named very well, shouldn't it be avoid_libcall or something similar? Perhaps the variable should have a comment describing what it is. Do you need separate argument for that bool and is_move_done, rather than just the flag being that some pointer to bool is non-NULL? Can you please explain me how to replace the 2 new arguments? So you have one bool arg and one pointer arg into which the function stores true and optionally based on that other bool arg and other conditions stores false. I'm suggesting just one bool * argument, which is NULL if the bool arg would be false, and non-NULL otherwise. The single argument still could be called bool *avoid_libcall, and you'd just if (avoid_libcall) { *avoid_libcall = true; return retval; } instead of emitting a libcall, the caller would initialize the bool variable to false. Jakub Got it. I'm sending updated version of the patch. Hope I've had enough fantasy to write it nice. Tested on both ppc64le and x86_64. Martin >From d766330364aa2a23512f7d4e60491b634c5d0523 Mon Sep 17 00:00:00 2001 From: marxin Date: Wed, 14 Mar 2018 09:44:18 +0100 Subject: [PATCH] Introduce new libc_func_speed target hook (PR middle-end/81657). gcc/ChangeLog: 2018-03-14 Martin Liska PR middle-end/81657 * builtins.c (expand_builtin_memory_copy_args): Handle situation when libc library provides a fast mempcpy implementation/ * config/i386/i386-protos.h (gnu_libc_func_speed): New. * config/i386/i386.c (enum libc_speed): Likewise. (ix86_libc_func_speed): Likewise. (TARGET_LIBC_FUNC_SPEED): Likewise. * coretypes.h (enum libc_speed): Likewise. * doc/tm.texi: Document new target hook. * doc/tm.texi.in: Likewise. * expr.c (emit_block_move_hints): Handle libc bail out argument. * expr.h (emit_block_move_hints): Add new parameters. * target.def: Add new hook. * targhooks.c (enum libc_speed): New enum. (default_libc_func_speed): Provide a default hook implementation. * targhooks.h (default_libc_func_speed): Likewise. gcc/testsuite/ChangeLog: 2018-03-14 Martin Liska * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target and others. --- gcc/builtins.c | 15 ++- gcc/config/i386/i386-protos.h | 2 ++ gcc/config/i386/i386.c | 24 gcc/coretypes.h | 7 +++ gcc/doc/tm.texi | 4 gcc/doc/tm.texi.in | 1 + gcc/expr.c | 11 ++- gcc/expr.h | 3 ++- gcc/target.def | 7 +++ gcc/targhooks.c | 9 + gcc/targhooks.h | 1 + gcc/testsuite/gcc.dg/string-opt-1.c | 5 +++-- 12 files changed, 84 insertions(+), 5 deletions(-) diff --git a/gcc/builtins.c b/gcc/builtins.c index 487d9d58db2..d3cd93ffbfa 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -3651,13 +3651,26 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len, src_mem = get_memory_rtx (src, len); set_mem_align (src_mem, src_align); + /* emit_block_move_hints can generate a library call to memcpy function. + In situations when a libc library provides fast implementation + of mempcpy, then it's better to call mempcpy directly. */ + bool avoid_libcall += (endp == 1 + && targetm.libc_func_speed ((int)BUILT_IN_MEMPCPY) == LIBC_FAST_SPEED + && target != const0_rtx); + /* Copy word part most expediently. */ + bool libcall_avoided = false; dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx, CALL_EXPR_TAILCALL (exp) && (endp == 0 || target == const0_rtx) ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL, expected_align, expected_size, - min_size, max_size, probable_max_size); + min_size, max_size, probable_max_size, + avoid_libcall ? &libcall_avoided: NULL); + + if (libcall_avoided) +return NULL_RTX; if (dest_addr == 0) { diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index ef7c818986f..d3fc515845b 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -47,6 +47,8 @@ extern void ix86_reset_previous_fndecl (void); extern bool ix86_using_red_zone (void); +extern enum libc_speed gnu_libc_func_speed (int fn); + #ifdef RTX_CODE extern int standard_80387_constant_p (rtx); extern const char *standard_80387_constant_opcode (rtx); diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index b4f6aec1434..aadf9fa5ac3 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -2735,6 +2735,27 @@ ix86_using_red_zone (void) && (!cfun->machine->has_local_indirect_jump || cfun->machine->indirect_branch_type == indirect_branch_keep)); } + +/* This hook determines whether a function from libc has a fast
Re: [C++ Patch] PR 85028 ("[8 Regression] ICE on invalid C++ code: in tsubst_default_argument, at cp/pt.c:12340")
OK. On Wed, Mar 28, 2018 at 8:25 AM, Paolo Carlini wrote: > Hi, > > as I said in the audit trail, in its way this error recovery issue is > somewhat interesting: Jason's r251422 added some code at the beginning of > tsubst_default_argument, included the gcc_assert that triggers here. In > fact, parmtype is only used in the assertion thus the error recovery check > could be moved inside the assertion, something like (appears to also pass > testing, lightly tested, so far): > > gcc_assert (parmtype == error_mark_node || > same_type_ignoring_top_level_qualifiers_p (type, parmtype)); > > and for this bug we would be back to the gcc-7 status, thus we would not ICE > and we would issue *2* errors, one for the parameter and then one more for > the default argument itself, at instantiation time. In fact, some other > compilers also do that. Or, as I have below, we can return early, after the > first error. Tested x86_64-linux. > > Thanks, Paolo. > > /// >
Re: [PR middle-end/70359] uncoalesce IVs outside of loops
On Wed, Mar 28, 2018 at 1:37 PM, Richard Biener wrote: > On Tue, Mar 20, 2018 at 2:36 PM, Richard Biener > wrote: >> On Mon, Mar 19, 2018 at 6:08 PM, Aldy Hernandez wrote: >>> Hi Richard. >>> >>> As discussed in the PR, the problem here is that we have two different >>> iterations of an IV live outside of a loop. This inhibits us from using >>> autoinc/dec addressing on ARM, and causes extra lea's on x86. >>> >>> An abbreviated example is this: >>> >>> loop: >>> # p_9 = PHI >>> p_20 = p_9 + 18446744073709551615; >>> goto loop >>> p_24 = p_9 + 18446744073709551614; >>> MEM[(char *)p_20 + -1B] = 45; >>> >>> Here we have both the previous IV (p_9) and the current IV (p_20) used >>> outside of the loop. On Arm this keeps us from using auto-dec addressing, >>> because one use is -2 and the other one is -1. >>> >>> With the attached patch we attempt to rewrite out-of-loop uses of the IV in >>> terms of the current/last IV (p_20 in the case above). With it, we end up >>> with: >>> >>> p_24 = p_20 + 18446744073709551615; >>> *p_24 = 45; >>> >>> ...which helps both x86 and Arm. >>> >>> As you have suggested in comment 38 on the PR, I handle specially >>> out-of-loop IV uses of the form IV+CST and propagate those accordingly >>> (along with the MEM_REF above). Otherwise, in less specific cases, we un-do >>> the IV increment, and use that value in all out-of-loop uses. For instance, >>> in the attached testcase, we rewrite: >>> >>> george (p_9); >>> >>> into >>> >>> _26 = p_20 + 1; >>> ... >>> george (_26); >>> >>> The attached testcase tests the IV+CST specific case, as well as the more >>> generic case with george(). >>> >>> Although the original PR was for ARM, this behavior can be noticed on x86, >>> so I tested on x86 with a full bootstrap + tests. I also ran the specific >>> test on an x86 cross ARM build and made sure we had 2 auto-dec with the >>> test. For the original test (slightly different than the testcase in this >>> patch), with this patch we are at 104 bytes versus 116 without it. There is >>> still the issue of a division optimization which would further reduce the >>> code size. I will discuss this separately as it is independent from this >>> patch. >>> >>> Oh yeah, we could make this more generic, and maybe handle any multiple of >>> the constant, or perhaps *= and /=. Perhaps something for next stage1... >>> >>> OK for trunk? >> >> Sorry for noticing so late but you use loop properties like ->latch and >> ->loop_father (via flow_bb_inside_loop_p) that may not be valid at >> this point given expand doesn't do loop_optimizer_init/finalize. Generally >> you may face loops with multiple latches (->latch == NULL) here and >> loops may have multiple exits. You probably are not hitting any >> of those problems because is_iv_plus_constant is quite restrictive >> (doesn't handle PLUS_EXPR or MINUS_EXPR). >> >> Also the latch doesn't need to be the block with the exit conditional. >> >> So first of all you'd need to wrap insert_backedge_copies () with >> >> loop_optimizer_init (AVOID_CFG_MODIFICATIONS); >> ... >> loop_optimizer_finalize (); >> >> that is required to fixup an eventually broken state. Then you >> probably should restrict handling to PHIs in BBs with >> bb->loop_father->header == bb (aka loop header PHIs), >> use get_loop_exit_edges when the IV increment is of OK form >> and then use dominator info to query in which exit block to >> insert compensation (or simply refuse to do anything for >> loops with multiple exits). >> >> So if you have more cycles to work on this that would be nice, >> otherwise I can certainly take over from here... > > Ok, so I've finally taken a closer look. There's a thing that goes > wrong unpatched with the current insert_backedge_copies code. > It does detect the coalescing conflict but instead of solving it > in a way so the backedge copy can be coalesced it inserts a > copy on that edge (actually before the exit test). > It could have done better by inserting > a copy before the IV increment and adjusting out-of-loop uses > to use that. There's also another IV the existing code triggers on > un-helpfully inserting a copy between two uses (exit test and > producer of the backedge value) instead of sinking the producer > after the last use (splitting the latch edge) (doesn't help on x86 > where we combine div and mod later, recreating the conflict). So here's the attempt at fixing the existing code. Basically the idea is that instead of inserting the required backedge copy on the backedge (without splitting it, so in its source), insert it before the definition of the backedge value and adjust downstream uses that would cause the coalescing conflict by the copy destination. For the testcase in question where the previous code worked for neither of the two IVs it now works for both and on x86 we get .L2: movl%ecx, %eax xorl%edx, %edx movq%rbx, %rbp decq%rbx divl
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On Wed, Mar 28, 2018 at 04:18:45PM +0200, Martin Liška wrote: > 2018-03-14 Martin Liska > > PR middle-end/81657 > * builtins.c (expand_builtin_memory_copy_args): Handle situation > when libc library provides a fast mempcpy implementation/ > * config/i386/i386-protos.h (gnu_libc_func_speed): New. > * config/i386/i386.c (enum libc_speed): Likewise. > (ix86_libc_func_speed): Likewise. > (TARGET_LIBC_FUNC_SPEED): Likewise. > * coretypes.h (enum libc_speed): Likewise. > * doc/tm.texi: Document new target hook. > * doc/tm.texi.in: Likewise. > * expr.c (emit_block_move_hints): Handle libc bail out argument. > * expr.h (emit_block_move_hints): Add new parameters. > * target.def: Add new hook. > * targhooks.c (enum libc_speed): New enum. > (default_libc_func_speed): Provide a default hook > implementation. > * targhooks.h (default_libc_func_speed): Likewise. > > gcc/testsuite/ChangeLog: > > 2018-03-14 Martin Liska > > * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target > and others. >/* Copy word part most expediently. */ > + bool libcall_avoided = false; >dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx, >CALL_EXPR_TAILCALL (exp) >&& (endp == 0 || target == const0_rtx) >? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL, >expected_align, expected_size, > - min_size, max_size, probable_max_size); > + min_size, max_size, probable_max_size, > + avoid_libcall ? &libcall_avoided: NULL); Missing space before :. > + > + if (libcall_avoided) > +return NULL_RTX; > > --- a/gcc/config/i386/i386-protos.h > +++ b/gcc/config/i386/i386-protos.h > @@ -47,6 +47,8 @@ extern void ix86_reset_previous_fndecl (void); > > extern bool ix86_using_red_zone (void); > > +extern enum libc_speed gnu_libc_func_speed (int fn); Here you declare gnu_libc_func_speed. > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -2735,6 +2735,27 @@ ix86_using_red_zone (void) > && (!cfun->machine->has_local_indirect_jump > || cfun->machine->indirect_branch_type == indirect_branch_keep)); > } > + > +/* This hook determines whether a function from libc has a fast > implementation > + FN is present at the runtime. We override it for i386 and glibc C library > + as this combination provides fast implementation of mempcpy function. */ > + > +enum libc_speed > +ix86_libc_func_speed (int fn) But define a different function. > +{ > + enum built_in_function f = (built_in_function)fn; > + > + if (!OPTION_GLIBC) > +return LIBC_UNKNOWN_SPEED; OPTION_GLIBC is only defined if linux.h is included, so I think you break all other x86 targets this way. The hook should have linux in the name, perhaps defined only in config/i386/linux.h and redefined in i386.c through #ifdef SUBTARGET_LIBC_FUNC_SPEED #undef TARGET_LIBC_FUNC_SPEED #define TARGET_LIBC_FUNC_SPEED SUBTARGET_LIBC_FUNC_SPEED #endif or something similar. Otherwise LGTM, but please get approval also from Richi (for the generic part) and Uros (for the backend side). Jakub
[PATCH] Improve adc discovery during combine on x86 (PR target/85095, take 2)
On Wed, Mar 28, 2018 at 08:51:07AM +0200, Uros Bizjak wrote: > > --- gcc/config/i386/i386.md.jj 2018-03-27 12:54:54.685244368 +0200 > > +++ gcc/config/i386/i386.md 2018-03-27 19:38:43.891451026 +0200 > > @@ -6854,6 +6854,23 @@ (define_insn "add3_carry" > > (set_attr "pent_pair" "pu") > > (set_attr "mode" "")]) > > > > +(define_insn "*add3_carry0" > > Please name this "*add3_carry_0". You will also need to > introduce "*addsi3_carry_zext_0". Probably minus patterns have the > same problem, simplify-rtx probably removes (minus ... const_rtx0), > too. > > > + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") > > + (plus:SWI > > + (match_operator:SWI 3 "ix86_carry_flag_operator" > > + [(match_operand 2 "flags_reg_operand") (const_int 0)]) > > + (match_operand:SWI 1 "nonimmediate_operand" "0"))) > > + (clobber (reg:CC FLAGS_REG))] > > + "ix86_unary_operator_ok (PLUS, mode, operands)" > > +{ > > + operands[4] = const0_rtx; > > + return "adc{}\t{%4, %0|%0, %4}"; > > Just use "$0" ("0" in intel syntax) in the insn template. So, like this, if it passes bootstrap/regtest? The testcases use all 4 new patterns now. 2018-03-28 Jakub Jelinek PR target/85095 * config/i386/i386.md (*add3_carry_0, *addsi3_carry_zext_0, *sub3_carry_0, *subsi3_carry_zext_0): New patterns. * gcc.target/i386/pr85095-1.c: New test. * gcc.target/i386/pr85095-2.c: New test. * gcc.c-torture/execute/pr85095.c: New test. --- gcc/config/i386/i386.md.jj 2018-03-27 21:56:42.855454830 +0200 +++ gcc/config/i386/i386.md 2018-03-28 14:12:35.430079720 +0200 @@ -6854,6 +6854,20 @@ (define_insn "add3_carry" (set_attr "pent_pair" "pu") (set_attr "mode" "")]) +(define_insn "*add3_carry_0" + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (plus:SWI + (match_operator:SWI 3 "ix86_carry_flag_operator" + [(match_operand 2 "flags_reg_operand") (const_int 0)]) + (match_operand:SWI 1 "nonimmediate_operand" "0"))) + (clobber (reg:CC FLAGS_REG))] + "ix86_unary_operator_ok (PLUS, mode, operands)" + "adc{}\t{$0, %0|%0, 0}" + [(set_attr "type" "alu") + (set_attr "use_carry" "1") + (set_attr "pent_pair" "pu") + (set_attr "mode" "")]) + (define_insn "*addsi3_carry_zext" [(set (match_operand:DI 0 "register_operand" "=r") (zero_extend:DI @@ -6870,6 +6884,20 @@ (define_insn "*addsi3_carry_zext" (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) +(define_insn "*addsi3_carry_zext_0" + [(set (match_operand:DI 0 "register_operand" "=r") + (zero_extend:DI + (plus:SI (match_operator:SI 2 "ix86_carry_flag_operator" + [(reg FLAGS_REG) (const_int 0)]) + (match_operand:SI 1 "register_operand" "0" + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT" + "adc{l}\t{$0, %k0|%k0, 0}" + [(set_attr "type" "alu") + (set_attr "use_carry" "1") + (set_attr "pent_pair" "pu") + (set_attr "mode" "SI")]) + ;; There is no point to generate ADCX instruction. ADC is shorter and faster. (define_insn "addcarry" @@ -6926,6 +6954,20 @@ (define_insn "sub3_carry" (set_attr "pent_pair" "pu") (set_attr "mode" "")]) +(define_insn "*sub3_carry_0" + [(set (match_operand:SWI 0 "nonimmediate_operand" "=m") + (minus:SWI + (match_operand:SWI 1 "nonimmediate_operand" "0") + (match_operator:SWI 3 "ix86_carry_flag_operator" + [(match_operand 2 "flags_reg_operand") (const_int 0)]))) + (clobber (reg:CC FLAGS_REG))] + "ix86_unary_operator_ok (MINUS, mode, operands)" + "sbb{}\t{$0, %0|%0, 0}" + [(set_attr "type" "alu") + (set_attr "use_carry" "1") + (set_attr "pent_pair" "pu") + (set_attr "mode" "")]) + (define_insn "*subsi3_carry_zext" [(set (match_operand:DI 0 "register_operand" "=r") (zero_extend:DI @@ -6941,6 +6983,21 @@ (define_insn "*subsi3_carry_zext" [(set_attr "type" "alu") (set_attr "use_carry" "1") (set_attr "pent_pair" "pu") + (set_attr "mode" "SI")]) + +(define_insn "*subsi3_carry_zext_0" + [(set (match_operand:DI 0 "register_operand" "=r") + (zero_extend:DI + (minus:SI + (match_operand:SI 1 "register_operand" "0") + (match_operator:SI 2 "ix86_carry_flag_operator" + [(reg FLAGS_REG) (const_int 0)] + (clobber (reg:CC FLAGS_REG))] + "TARGET_64BIT" + "sbb{l}\t{$0, %k0|%k0, 0}" + [(set_attr "type" "alu") + (set_attr "use_carry" "1") + (set_attr "pent_pair" "pu") (set_attr "mode" "SI")]) (define_insn "sub3_carry_ccc" --- gcc/testsuite/gcc.target/i386/pr85095-1.c.jj2018-03-28 14:30:17.929554377 +0200 +++ gcc/testsuite/gcc.target/i386/pr85095-1.c 2018-03-28 14:44:04.082960231 +0200 @@ -0,0 +1,33 @@ +/* PR target/85095 * +/* { dg-do compile } */ +/* { dg-options "-O2 -masm=att" } */ + +unsigned int +foo (unsigned int a, unsigned int b) +{ + a += b; + if (a < b) a++; + return a; +} + +#ifdef __x86_
[PATCH] Add pow -> exp hack for SPEC2k17 628.pop2_s (PR tree-optimization/82004, take 2)
On Wed, Mar 28, 2018 at 09:46:52AM +0200, Richard Biener wrote: > > +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * x). > > */ > > + > > +static bool > > +optimize_pow_to_exp (tree arg0, tree arg1) > > +{ > > + return false; > > +} > > So instead of this simply guard the pattern with #if GIMPLE as we already > do for some. That saves in generic-match.c code size. Ok. > > --- gcc/gimple-match-head.c.jj 2018-02-13 09:33:31.107560174 +0100 > > +++ gcc/gimple-match-head.c 2018-03-27 18:48:21.205369113 +0200 > > @@ -840,3 +840,55 @@ canonicalize_math_after_vectorization_p > > { > >return !cfun || (cfun->curr_properties & PROP_gimple_lvec) != 0; > > } > > + > > +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * x). > > + As a workaround for SPEC CPU2017 628.pop2_s, don't do it if arg0 > > + is 10.0, arg1 = phi_res + cst1 and phi_res = PHI > > + where cst1 + cst2 is an exact integer, because then pow (10.0, arg1) > > + will likely be exact, while exp (log (10.0) * arg1) might be not. */ > > So this looks somewhat like a very specific SPEC hack to me. > Can we make this sligtly more generic and consider > > pow (integer, cst1 [+- cst2]) > > where the 2nd arg is an exact integer? Basically allow not only > 10 for arg0 but any real_is_integer plus allow arg1 to be directly > fed by the PHI, not only via an intermediate plus (here also consider > a minus, negate or other binary op we can fed to const_binop). Like this? I'm just handling PLUS_EXPR and MINUS_EXPR and no +/- at all. Furthermore, I've moved it to the exp replacement only, if we do exp2, it is fine to optimize it even if cst1 + cst2 is integer, it is just the exp and log that isn't exact for originally integral values. > Btw, did anybody file a defect with SPEC for this? I don't really have any SPEC experience, so will defer that to those involved with SPEC benchmarking. The fix would be to add some small epsilon to the if (chlcnc(m) .le. chlamnt .and. & chlamnt .le. chlcnc(m+1) ) then comparisons, but bet they'll argue that the upstream pop does it that way. 2018-03-28 Jakub Jelinek PR tree-optimization/82004 * gimple-match-head.c (optimize_pow_to_exp): New function. * match.pd (pow(C,x) -> exp(log(C)*x)): Wrap with #if GIMPLE. Don't fold to exp if optimize_pow_to_exp is false. * gcc.dg/pr82004.c: New test. --- gcc/gimple-match-head.c.jj 2018-03-27 19:08:32.047824741 +0200 +++ gcc/gimple-match-head.c 2018-03-28 15:30:42.687565271 +0200 @@ -840,3 +840,71 @@ canonicalize_math_after_vectorization_p { return !cfun || (cfun->curr_properties & PROP_gimple_lvec) != 0; } + +/* Return true if pow(cst, x) should be optimized into exp(log(cst) * x). + As a workaround for SPEC CPU2017 628.pop2_s, don't do it if arg0 + is an exact integer, arg1 = phi_res +/- cst1 and phi_res = PHI + where cst2 +/- cst1 is an exact integer, because then pow (arg0, arg1) + will likely be exact, while exp (log (arg0) * arg1) might be not. + Also don't do it if arg1 is phi_res above and cst2 is an exact integer. */ + +static bool +optimize_pow_to_exp (tree arg0, tree arg1) +{ + gcc_assert (TREE_CODE (arg0) == REAL_CST); + if (!real_isinteger (TREE_REAL_CST_PTR (arg0), TYPE_MODE (TREE_TYPE (arg0 +return true; + + if (TREE_CODE (arg1) != SSA_NAME) +return true; + + gimple *def = SSA_NAME_DEF_STMT (arg1); + gphi *phi = dyn_cast (def); + tree cst1 = NULL_TREE; + enum tree_code code = ERROR_MARK; + if (!phi) +{ + if (!is_gimple_assign (def)) + return true; + code = gimple_assign_rhs_code (def); + switch (code) + { + case PLUS_EXPR: + case MINUS_EXPR: + break; + default: + return true; + } + if (TREE_CODE (gimple_assign_rhs1 (def)) != SSA_NAME + || TREE_CODE (gimple_assign_rhs2 (def)) != REAL_CST) + return true; + + cst1 = gimple_assign_rhs2 (def); + + phi = dyn_cast (SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def))); + if (!phi) + return true; +} + + tree cst2 = NULL_TREE; + int n = gimple_phi_num_args (phi); + for (int i = 0; i < n; i++) +{ + tree arg = PHI_ARG_DEF (phi, i); + if (TREE_CODE (arg) != REAL_CST) + continue; + else if (cst2 == NULL_TREE) + cst2 = arg; + else if (!operand_equal_p (cst2, arg, 0)) + return true; +} + + if (cst1 && cst2) +cst2 = const_binop (code, TREE_TYPE (cst2), cst2, cst1); + if (cst2 + && TREE_CODE (cst2) == REAL_CST + && real_isinteger (TREE_REAL_CST_PTR (cst2), +TYPE_MODE (TREE_TYPE (cst2 +return false; + return true; +} --- gcc/match.pd.jj 2018-03-27 19:08:36.336826318 +0200 +++ gcc/match.pd2018-03-28 15:19:50.208231654 +0200 @@ -4006,6 +4006,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* pow(C,x) -> exp(log(C)*x) if C > 0, or if C is a positive power
Re: [PATCH] Fix wrong use-after-scope sanitization for omp variable (PR sanitizer/85081).
On 03/28/2018 04:17 PM, Jakub Jelinek wrote: On Wed, Mar 28, 2018 at 04:14:45PM +0200, Martin Liška wrote: Hi. I'm sending Jakub's patch, where I removed the guard in asan_poison_variable. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Martin gcc/ChangeLog: 2018-03-28 Jakub Jelinek Martin Liska PR sanitizer/85081 * gimplify.c (asan_poison_variable): Don't do the check for gimplify_omp_ctxp here. (gimplify_decl_expr): Do it here. (gimplify_target_expr): Likewise. gcc/testsuite/ChangeLog: 2018-03-28 Jakub Jelinek Martin Liska PR sanitizer/85081 * g++.dg/asan/pr85081.C: New test. I guess it is ok for trunk for now, but for stage1 I'd like analysis on why you've added that !gimplify_omp_ctxp stuff at all. E.g. for local vars inside of OpenMP/OpenACC constructs there should be no reason why they couldn't be unpoisoned and poisoned inside of the OpenMP region. Jakub Good, I've just installed the patch. I'll write that to my TODO list for next stage1. Martin
Re: [PATCH] Fix typos (PR other/84819).
On 03/20/2018 04:49 PM, Martin Liška wrote: On 03/20/2018 04:28 PM, Gerald Pfeifer wrote: On Tue, 20 Mar 2018, Martin Liška wrote: There's patch for couple of documentation issues reported in the PR. Looks good to me, thanks! Just, should there be an "a" in "use jump table"? Looks good. And 'an indirect jump.' ? Martin Gerald Ok, I installed the patch with both adjustments as r258925. If there's some additional comment I'll do a follow-up. Martin
Hear from Columbia University Medical Center, UCLA, Colorado State Universtiy & More
CANNABINOID THERAPEUTICS SYMPOSIUM June 6 - 7, 2018 | Crowne Plaza Redondo Beach & Marina Hotel | Redondo Beach, CA --- SIGN UP TODAY: http://links.infocastevents.mkt8115.com/ctt?kn=5&ms=MzM2NTI0NzUS1&r=NjkyMTk1NzM3MTk0S0&b=2&j=MTI0MzQyNDM2NgS2&mt=1&rt=0 --- Join the evolving conversation around how cannabis can be better integrated into mainstream medical treatments and standards of care at the Cannabinoid Therapeutics Symposium. The inaugural symposium will kick off with senior researchers exploring topics such as: The role of exocannabinoids & endocannabinoids in the treatment of chronic inflammation associated with a majority of clinical disorders A controlled study of the analgesic effects of cannabis combined with low-dose opioids Patient study on cannabis use for chronic pain in the Minnesota Medical Cannabis Program Clinical case studies on CBD oils, cannabinoids and opioids in pain management United States of Chronic Pain & the Opiods Crisis Featured Speakers Include: Ziva Cooper, PhD: COLUMBIA UNIVERSITY MEDICAL CENTER Ziva Cooper, PhD Associate Professor of Clinical Neurobiology (in Psychology) COLUMBIA UNIVERSITY MEDICAL CENTER Chris Evans, PhD: UCLA Chris Evans, PhD Investigator, Center for Translational Technologies; Director, Brain Research Institute UCLA Barbara A. Brett, PhD: CORADO STATE UNIVERSITY - PUEBLO Barbara A. Brett, PhD Associate Professor, Dept. of Psychology COLORADO STATE UNIVERSITY - PUEBLO Bonni Goldstein, MD: CANNA-CENTERS Sue Sisley, MD PI, Cannabis for Veterans PTSD Study; PI, Cannabis and Pain Study; President UNIVERSITY OF MICHIGAN IRB; SCOTTSDALE RESEARCH INSTITUTE --- SIGN UP TODAY: http://links.infocastevents.mkt8115.com/ctt?kn=9&ms=MzM2NTI0NzUS1&r=NjkyMTk1NzM3MTk0S0&b=2&j=MTI0MzQyNDM2NgS2&mt=1&rt=0 - Unsubscribe http://www.pages03.net/informationforecastinc/SubscriptionPreferences/SubPreferences?spMailingID=33652475&spUserID=NjkyMTk1NzM3MTk0S0&spJobID=MTI0MzQyNDM2NgS2&spReportId=MTI0MzQyNDM2NgS2
Re: Full range of PCBs inquiry
Dear Friends, I am Mr. Aaron from FullyGold Holdings, we can provide high-end PCBs(Printed Circuit Boards) to our customer side. Specialize in mulitlayer(4L~50L) PCB fields more than 13 years, also including rigid, rigid-flex, FPC, and hybrid lam., and high performance PCB, our many clients from Germany, Italy, USA and so on. Here is our company website: http://www.flxpcb.com.cn/, pls check and if you have any PCB demands , please contact me ! I am looking forward to hear your feedback soon. Best regards, Mr. Aaron Quality is always the first criterion we care about from Fully Gold's
Re: [PATCH. rs6000] Fix PR84912: ICE using -m32 on __builtin_divde*, patch #2
On 3/27/18 5:02 PM, Segher Boessenkool wrote: >> @@ -15952,6 +15953,10 @@ rs6000_invalid_builtin (enum rs6000_buil >> name); >>else if ((fnmask & RS6000_BTM_FLOAT128) != 0) >> error ("builtin function %qs requires the %qs option", name, >> "-mfloat128"); >> + else if ((fnmask & (RS6000_BTM_POPCNTD | RS6000_BTM_POWERPC64)) >> + == (RS6000_BTM_POPCNTD | RS6000_BTM_POWERPC64)) >> +error ("builtin function %qs requires the %qs and %qs options", >> + name, "-mcpu=power7 (or newer)", "-m64 or -mpowerpc64"); > > This does not work for translation, and it quotes the wrong things. > Each %qs should be for exactly one option string. I'm confused. :-) What is it I need to do to fix this? I just cut/pasted usage higher up in the function, so does that need fixing too or ??? Peter
Re: [patch, fortran] Simplify constants which come from parameter arrays
Hi Thomas, If I am not mistaken, the patch causes: FAIL: gfortran.dg/pr71935.f90 -O (test for excess errors) FAIL: gfortran.dg/substr_6.f90 -O0 execution test FAIL: gfortran.dg/substr_6.f90 -O1 execution test FAIL: gfortran.dg/substr_6.f90 -O2 execution test FAIL: gfortran.dg/substr_6.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/substr_6.f90 -O3 -g execution test FAIL: gfortran.dg/substr_6.f90 -Os execution test For gfortran.dg/pr71935.f90 I get two additional errors on top of the expected warnings: /opt/gcc/_clean/gcc/testsuite/gfortran.dg/pr71935.f90:5:21: print *, sizeof(z(3)) ! { dg-warning "is out of bounds" } 1 Warning: Array reference at (1) is out of bounds (3 > 2) in dimension 1 /opt/gcc/_clean/gcc/testsuite/gfortran.dg/pr71935.f90:5:21: print *, sizeof(z(3)) ! { dg-warning "is out of bounds" } 1 Error: Index in dimension 1 is out of bounds at (1) /opt/gcc/_clean/gcc/testsuite/gfortran.dg/pr71935.f90:6:23: print *, c_sizeof(z(3))! { dg-warning "is out of bounds" } 1 Warning: Array reference at (1) is out of bounds (3 > 2) in dimension 1 /opt/gcc/_clean/gcc/testsuite/gfortran.dg/pr71935.f90:6:23: print *, c_sizeof(z(3))! { dg-warning "is out of bounds" } 1 Error: Index in dimension 1 is out of bounds at (1) For gfortran.dg/substr_6.f90, I reach STOP 1 when running the executable. TIA Dominique
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On 03/28/2018 04:31 PM, Jakub Jelinek wrote: On Wed, Mar 28, 2018 at 04:18:45PM +0200, Martin Liška wrote: 2018-03-14 Martin Liska PR middle-end/81657 * builtins.c (expand_builtin_memory_copy_args): Handle situation when libc library provides a fast mempcpy implementation/ * config/i386/i386-protos.h (gnu_libc_func_speed): New. * config/i386/i386.c (enum libc_speed): Likewise. (ix86_libc_func_speed): Likewise. (TARGET_LIBC_FUNC_SPEED): Likewise. * coretypes.h (enum libc_speed): Likewise. * doc/tm.texi: Document new target hook. * doc/tm.texi.in: Likewise. * expr.c (emit_block_move_hints): Handle libc bail out argument. * expr.h (emit_block_move_hints): Add new parameters. * target.def: Add new hook. * targhooks.c (enum libc_speed): New enum. (default_libc_func_speed): Provide a default hook implementation. * targhooks.h (default_libc_func_speed): Likewise. gcc/testsuite/ChangeLog: 2018-03-14 Martin Liska * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target and others. /* Copy word part most expediently. */ + bool libcall_avoided = false; dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx, CALL_EXPR_TAILCALL (exp) && (endp == 0 || target == const0_rtx) ? BLOCK_OP_TAILCALL : BLOCK_OP_NORMAL, expected_align, expected_size, -min_size, max_size, probable_max_size); +min_size, max_size, probable_max_size, +avoid_libcall ? &libcall_avoided: NULL); Missing space before :. Fixed. + + if (libcall_avoided) +return NULL_RTX; --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -47,6 +47,8 @@ extern void ix86_reset_previous_fndecl (void); extern bool ix86_using_red_zone (void); +extern enum libc_speed gnu_libc_func_speed (int fn); Here you declare gnu_libc_func_speed. --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -2735,6 +2735,27 @@ ix86_using_red_zone (void) && (!cfun->machine->has_local_indirect_jump || cfun->machine->indirect_branch_type == indirect_branch_keep)); } + +/* This hook determines whether a function from libc has a fast implementation + FN is present at the runtime. We override it for i386 and glibc C library + as this combination provides fast implementation of mempcpy function. */ + +enum libc_speed +ix86_libc_func_speed (int fn) But define a different function. You are right. +{ + enum built_in_function f = (built_in_function)fn; + + if (!OPTION_GLIBC) +return LIBC_UNKNOWN_SPEED; OPTION_GLIBC is only defined if linux.h is included, so I think you break all other x86 targets this way. That said the proper way is probably to define the function in linux64.c (should we also use it on i386, or nobody cares?). And corresponding declaration will be in config/linux-protos.h. The hook should have linux in the name, perhaps defined only in config/i386/linux.h and redefined in i386.c through #ifdef SUBTARGET_LIBC_FUNC_SPEED #undef TARGET_LIBC_FUNC_SPEED #define TARGET_LIBC_FUNC_SPEED SUBTARGET_LIBC_FUNC_SPEED #endif or something similar. I adopted this suggested mechanism. Thanks for review. Martin Otherwise LGTM, but please get approval also from Richi (for the generic part) and Uros (for the backend side). Jakub >From 4378fa93aafac3d28eaae48c9a41a2f2ef5e1d18 Mon Sep 17 00:00:00 2001 From: marxin Date: Wed, 14 Mar 2018 09:44:18 +0100 Subject: [PATCH] Introduce new libc_func_speed target hook (PR middle-end/81657). gcc/ChangeLog: 2018-03-14 Martin Liska PR middle-end/81657 * builtins.c (expand_builtin_memory_copy_args): Handle situation when libc library provides a fast mempcpy implementation/ * config/linux-protos.h (ix86_linux_libc_func_speed): New. * config/linux.c (ix86_linux_libc_func_speed): Likewise. (TARGET_LIBC_FUNC_SPEED): Likewise. * config/i386/linux64.h (SUBTARGET_LIBC_FUNC_SPEED): Define macro. * coretypes.h (enum libc_speed): Likewise. * doc/tm.texi: Document new target hook. * doc/tm.texi.in: Likewise. * expr.c (emit_block_move_hints): Handle libc bail out argument. * expr.h (emit_block_move_hints): Add new parameters. * target.def: Add new hook. * targhooks.c (enum libc_speed): New enum. (default_libc_func_speed): Provide a default hook implementation. * targhooks.h (default_libc_func_speed): Likewise. gcc/testsuite/ChangeLog: 2018-03-28 Martin Liska * gcc.dg/string-opt-1.c: gcc/testsuite/ChangeLog: 2018-03-14 Martin Liska * gcc.dg/string-opt-1.c: Adjust scans for i386 and glibc target and others. --- gcc/builtins.c | 15 ++- gcc/config/i386/i3
Re: [PATCH. rs6000] Fix PR84912: ICE using -m32 on __builtin_divde*, patch #2
On Wed, Mar 28, 2018 at 10:38:49AM -0500, Peter Bergner wrote: > On 3/27/18 5:02 PM, Segher Boessenkool wrote: > >> @@ -15952,6 +15953,10 @@ rs6000_invalid_builtin (enum rs6000_buil > >> name); > >>else if ((fnmask & RS6000_BTM_FLOAT128) != 0) > >> error ("builtin function %qs requires the %qs option", name, > >> "-mfloat128"); > >> + else if ((fnmask & (RS6000_BTM_POPCNTD | RS6000_BTM_POWERPC64)) > >> + == (RS6000_BTM_POPCNTD | RS6000_BTM_POWERPC64)) > >> +error ("builtin function %qs requires the %qs and %qs options", > >> + name, "-mcpu=power7 (or newer)", "-m64 or -mpowerpc64"); > > > > This does not work for translation, and it quotes the wrong things. > > Each %qs should be for exactly one option string. > > I'm confused. :-) What is it I need to do to fix this? I just cut/pasted > usage higher up in the function, so does that need fixing too or ??? It should be something like +error ("builtin function %qs requires the %qs (or newer), and " "%qs or %qs options", + name, "-mcpu=power7", "-m64", "-mpowerpc64"); I don't see other such strings that quote incorrectly? Segher
Re: [PATCH] Prefer mempcpy to memcpy on x86_64 target (PR middle-end/81657).
On Wed, Mar 28, 2018 at 06:30:21PM +0200, Martin Liška wrote: > --- a/gcc/config/linux.c > +++ b/gcc/config/linux.c > @@ -37,3 +37,24 @@ linux_libc_has_function (enum function_class fn_class) > >return false; > } > + > +/* This hook determines whether a function from libc has a fast > implementation > + FN is present at the runtime. We override it for i386 and glibc C library > + as this combination provides fast implementation of mempcpy function. */ > + > +enum libc_speed > +ix86_linux_libc_func_speed (int fn) Putting a ix86_ function into config/linux.c used by most linux targets is weird. Either we multiple linux targets with mempcpy fast, then name it somehow cpu neutral and let all those CPUs pick it up in config/*/linux.h. And yes, we do care about i?86-linux. Or it is for x86 only, and then it shouldn't be in config/linux.c, but either e.g. static inline in config/i386/linux.h, or we need config/i386/linux.c if we don't have it already. Jakub
Re: [PATCH] [PR c++/84979] improve auto handling in explicit tmpl args for concepts
On Wed, Mar 28, 2018 at 5:06 AM, Alexandre Oliva wrote: > On Mar 23, 2018, Jason Merrill wrote: > >> On Fri, Mar 23, 2018 at 3:38 PM, Alexandre Oliva wrote: >>> + /* Concepts allows 'auto' in template arguments, even multiple >>> + 'auto's in a single argument. > >> I think that's only intended for class templates; > > Aah, that explains a lot! Dang, it was so close! > > It actually makes sense; was there any discussion about standardizing > this, with reasons to reject this construct, or is it just that we > aren't there yet? I wouldn't be surprised if it were to appear in some > future version of C++. If what were to appear? I'm not sure what you mean. There is still effort to get more of the Concepts TS into the standard. We only want 'auto' in places where it could be deduced, and there's no way in [temp.deduct.type] to deduce from explicit template arguments to a non-type template. > Is this (in the patch below) the best spot to test for it? This seems like a plausible spot, but is there a reason not to check in cp_parser_template_id? Jason
[PATCH,rs6000] Remove __builtin_fctid and __builtin_fctiw
GCC Maintainers: The following patch reverts commit 253238 to add support for the fctid and fctiw builtins for Steve Munroe. gcc/ChangeLog: 2017-09-27 Carl Love* config/rs6000/rs6000-builtin.def (BU_FP_1MISC_1): Add define macro. (FCTID, FCTIW): Add BU_FP_MISC_1 macro expansion for builtins. * config/rs6000/rs6000.md (lrintsfsi2): Add define_insn for the fctiw instruction. gcc/testsuite/ChangeLog: 2017-09-27 Carl Love * gcc.target/powerpc/builtin-fctid-fctiw-runnable.c: New test file for the __builtin_fctid and __builtin_fctiw. The commit is being revert due to issues with builtins not working correctly with -mcpu=power7. There have been some other concerns about the builtins and if we really should have these undocumented builtins. The decision was made to pull them before they got used. I have tested the following mainline patch on powerpc64-unknown-linux-gnu (Power 8 BE) powerpc64le-unknown-linux-gnu (Power 8 LE) powerpc64le-unknown-linux-gnu (Power 9 LE) with no regressions. Please let me know if the patch looks OK for the GCC 7 branch. Carl Love -- gcc/ChangeLog: 2018-03-20 Carl Love Reverting patch: 2017-09-27 Carl Love * config/rs6000/rs6000-builtin.def: Remove macro expansion for FCTIW and FCTID. Remove macro definition BU_FP_MISC_1. * config/rs6000/rs6000.md: Remove define_insn lrintsfsi2. gcc/testsuite/ChangeLog: 2018-03-20 Carl Love Reverting patch: 2017-09-27 Carl Love * gcc.target/powerpc/builtin-fctid-fctiw-runnable.c: Remove test file. --- gcc/config/rs6000/rs6000-builtin.def | 14 --- gcc/config/rs6000/rs6000.md| 8 -- .../powerpc/builtin-fctid-fctiw-runnable.c | 137 - 3 files changed, 159 deletions(-) delete mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-fctid-fctiw-runnable.c diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 9942d65be..cd9a56e1c 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -615,17 +615,6 @@ | RS6000_BTC_BINARY), \ CODE_FOR_ ## ICODE) /* ICODE */ - -/* Miscellaneous builtins for instructions added prior to ISA 2.04. These - operate on floating point registers. */ -#define BU_FP_MISC_1(ENUM, NAME, ATTR, ICODE) \ - RS6000_BUILTIN_1 (MISC_BUILTIN_ ## ENUM, /* ENUM */ \ - "__builtin_" NAME, /* NAME */ \ - RS6000_BTM_HARD_FLOAT, /* MASK */ \ - (RS6000_BTC_ ## ATTR/* ATTR */ \ -| RS6000_BTC_UNARY), \ - CODE_FOR_ ## ICODE) /* ICODE */ - /* Miscellaneous builtins for instructions added in ISA 2.06. These instructions don't require either the DFP or VSX options, just the basic ISA 2.06 (popcntd) enablement since they operate on general purpose @@ -1880,9 +1869,6 @@ BU_VSX_OVERLOAD_X (XL_BE,"xl_be") BU_VSX_OVERLOAD_X (XST, "xst") BU_VSX_OVERLOAD_X (XST_BE, "xst_be") -/* 1 argument builtins pre ISA 2.04. */ -BU_FP_MISC_1 (FCTID, "fctid",CONST, lrintdfdi2) -BU_FP_MISC_1 (FCTIW, "fctiw",CONST, lrintsfsi2) /* 2 argument CMPB instructions added in ISA 2.05. */ BU_P6_2 (CMPB_32,"cmpb_32",CONST, cmpbsi3) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 7b285efb1..eb77bad71 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -5910,14 +5910,6 @@ [(set_attr "type" "fpload") (set_attr "length" "16")]) -(define_insn "lrintsfsi2" - [(set (match_operand:SI 0 "gpc_reg_operand" "=d") - (unspec:SI [(match_operand:DF 1 "gpc_reg_operand
Re: [PATCH. rs6000] Fix PR84912: ICE using -m32 on __builtin_divde*, patch #2
On 3/28/18 12:59 PM, Segher Boessenkool wrote: > It should be something like > > +error ("builtin function %qs requires the %qs (or newer), and " > "%qs or %qs options", > +name, "-mcpu=power7", "-m64", "-mpowerpc64"); > > I don't see other such strings that quote incorrectly? Ah, I guess I misunderstood what you were saying. So ok for trunk with that change then? Peter
Re: [PATCH] [PR c++/84943] allow folding of array indexing indirect_ref
On Wed, Mar 28, 2018 at 2:18 AM, Alexandre Oliva wrote: > On Mar 23, 2018, Jason Merrill wrote: > >> On Fri, Mar 23, 2018 at 4:55 PM, Jason Merrill wrote: >>> On Fri, Mar 23, 2018 at 12:44 PM, Jason Merrill wrote: Seems like cp_fold should update CALL_EXPR_FN with "callee" if non-null. >>> >>> Did you try this? That should avoid it being ADDR_EXPR of a decl. > >> Oh, I was assuming the ICE was in the middle-end, but it's in >> build_call_a. And it looks like the problem isn't that it's an >> ADDR_EXPR of a decl, but that the function isn't marked TREE_USED. > > Well, yeah. cp_build_function_call_vec marks the function as used when > function is a FUNCTION_DECL. In this testcase, it's INDIRECT_REF of > ADDR_EXPR of FUNCTION_DECL. It should have been marked as used before we get to cp_build_function_call_vec. finish_id_expression doesn't mark it because "done" is false, because we're in a postfix-expression. It looks like cp_build_addr_expr_1 already calls mark_used for single static member functions, it should probably do the same for single non-member functions. Jason
[PATCH] Fix -Wduplicated-branches with -g (PR c/85094)
Hi! With the introduction of DEBUG_BEGIN_STMT/-gstatement-frontiers on by default, the -Wduplicated-branches warning doesn't work anymore with -g, because operand_equal_p in OEP_LEXICOGRAPHIC mode doesn't consider DEBUG_BEGIN_STMTs as equal unless they are pointer equal. For the warning we either need to ignore them completely (but that would need more changes, as the hashing doesn't ignore them), or at least what the second hunk does, consider them equal. The first hunk is just an optimization, if -Wduplicated-branches is used on extremely deeply nested STATEMENT_LISTs, then for -fchecking we'd be computing the hashes etc. at the level of every stmt of every STATEMENT_LIST; we need to do that just once at the toplevel. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-03-28 Jakub Jelinek PR c/85094 * fold-const.c (operand_equal_p): Handle DEBUG_BEGIN_STMT. For STATEMENT_LIST, pass down OEP_LEXICOGRAPHIC and maybe OEP_NO_HASH_CHECK for recursive call, to avoid exponential checking. * c-c++-common/Wduplicated-branches-14.c: New test. --- gcc/fold-const.c.jj 2018-03-27 12:54:43.657242009 +0200 +++ gcc/fold-const.c2018-03-28 16:20:40.024574216 +0200 @@ -3479,7 +3479,8 @@ operand_equal_p (const_tree arg0, const_ if (tsi_end_p (tsi1) && tsi_end_p (tsi2)) return 1; if (!operand_equal_p (tsi_stmt (tsi1), tsi_stmt (tsi2), - OEP_LEXICOGRAPHIC)) + flags & (OEP_LEXICOGRAPHIC +| OEP_NO_HASH_CHECK))) return 0; } } @@ -3492,6 +3493,10 @@ operand_equal_p (const_tree arg0, const_ if (flags & OEP_LEXICOGRAPHIC) return OP_SAME_WITH_NULL (0); return 0; + case DEBUG_BEGIN_STMT: + if (flags & OEP_LEXICOGRAPHIC) + return 1; + return 0; default: return 0; } --- gcc/testsuite/c-c++-common/Wduplicated-branches-14.c.jj 2018-03-28 16:20:00.966553266 +0200 +++ gcc/testsuite/c-c++-common/Wduplicated-branches-14.c2018-03-28 16:19:07.264524448 +0200 @@ -0,0 +1,16 @@ +/* PR c/85094 */ +/* { dg-do compile } */ +/* { dg-options "-O1 -Wduplicated-branches -g" } */ + +extern int g; + +void +foo (int r) +{ + if (r < 64) +g -= 48; + else if (r < 80) /* { dg-warning "this condition has identical branches" } */ +g -= 64 - 45; + else +g -= 80 - 61; +} Jakub
[PATCH] Fix _blendm
Hi! When looking at PR85090, I've looked into tmp-mddump.md and noticed there some instructions that can't assemble - vblendmd etc., there is only vpblendmd or vblendmps. I couldn't make a testcase that would reproduce this though, seems the masked *mov_internal is used instead. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-03-28 Jakub Jelinek * config/i386/sse.md (_blendm): Use . --- gcc/config/i386/sse.md.jj 2018-03-13 09:02:28.714937979 +0100 +++ gcc/config/i386/sse.md 2018-03-28 17:46:49.928240731 +0200 @@ -1138,7 +1138,7 @@ (define_insn "_blendm" (match_operand:V48_AVX512VL 1 "register_operand" "v") (match_operand: 3 "register_operand" "Yk")))] "TARGET_AVX512F" - "vblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}" + "vblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}" [(set_attr "type" "ssemov") (set_attr "prefix" "evex") (set_attr "mode" "")]) Jakub
[PATCH 2/2] Show pertinent parameter (PR c++/85110)
This followup patch updates the specific error-handling path to add a note showing the pertinent parameter decl, taking the output from: test.cc: In function 'void caller(const char*)': test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for argument '2' to 'void callee(int, const char**, int)' callee (1, fmt, 3); ^~~ to: test.cc: In function 'void caller(const char*)': test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for argument '2' to 'void callee(int, const char**, int)' callee (1, fmt, 3); ^~~ test.cc:1:36: note: initializing argument 2 of 'void callee(int, const char**, int)' void callee (int one, const char **two, int three); ~^~~ Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; adds a further 18 PASS results to g++.sum. Again, not a regression as such, but I've been calling out the underlined arguments as a feature of gcc 8, so would be good to fix. OK for trunk? gcc/cp/ChangeLog: PR c++/85110 * call.c (get_fndecl_argument_location): Make non-static. * cp-tree.h (get_fndecl_argument_location): New decl. * typeck.c (convert_for_assignment): When complaining due to conversions for an argument, show the location of the parameter within the decl. gcc/testsuite/ChangeLog: PR c++/85110 * g++.dg/diagnostic/param-type-mismatch-2.C: Update for the cases where we now show the pertinent parameter. --- gcc/cp/call.c | 2 +- gcc/cp/cp-tree.h| 2 ++ gcc/cp/typeck.c | 10 +++--- .../g++.dg/diagnostic/param-type-mismatch-2.C | 21 ++--- 4 files changed, 28 insertions(+), 7 deletions(-) diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 1a87f99..e1a0639 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -6598,7 +6598,7 @@ maybe_print_user_conv_context (conversion *convs) ARGNUM is zero based, -1 indicates the `this' argument of a method. Return the location of the FNDECL itself if there are problems. */ -static location_t +location_t get_fndecl_argument_location (tree fndecl, int argnum) { int i; diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index d5382c2..b45880d 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -5965,6 +5965,8 @@ extern bool can_convert_arg (tree, tree, tree, int, tsubst_flags_t); extern bool can_convert_arg_bad(tree, tree, tree, int, tsubst_flags_t); +extern location_t get_fndecl_argument_location (tree, int); + /* A class for recording information about access failures (e.g. private fields), so that we can potentially supply a fix-it hint about diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index e733c79..742b2e9 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -8781,9 +8781,13 @@ convert_for_assignment (tree type, tree rhs, parmnum, complain, flags); } else if (fndecl) - error_at (EXPR_LOC_OR_LOC (rhs, input_location), - "cannot convert %qH to %qI for argument %qP to %qD", - rhstype, type, parmnum, fndecl); + { + error_at (EXPR_LOC_OR_LOC (rhs, input_location), + "cannot convert %qH to %qI for argument %qP to %qD", + rhstype, type, parmnum, fndecl); + inform (get_fndecl_argument_location (fndecl, parmnum), + " initializing argument %P of %qD", parmnum, fndecl); + } else switch (errtype) { diff --git a/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C index ae84248..39497a5 100644 --- a/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C +++ b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C @@ -4,7 +4,7 @@ /* decl, with argname. */ -extern int callee_1 (int one, const char **two, float three); +extern int callee_1 (int one, const char **two, float three); // { dg-line callee_1 } int test_1 (int first, const char *second, float third) { @@ -13,11 +13,16 @@ int test_1 (int first, const char *second, float third) return callee_1 (first, second, third); ^~ { dg-end-multiline-output "" } */ + // { dg-message "initializing argument 2 of 'int callee_1\\(int, const char\\*\\*, float\\)'" "" { target *-*-* } callee_1 } + /* { dg-begin-multiline-output "" } + extern int callee_1 (int one, const char **two, float three); + ~^~~ + { dg-end-multiline-output "" } */ } /* decl, without argname
[PATCH 1/2] More underlining of bad arguments (PR c++/85110)
As of r256448, the C++ frontend underlines many bad arguments in its diagnostics; those where perform_overload_resolution returns a non-NULL candidate, but there's a failure in convert_like_real. However, for the case where perform_overload_resolution fails, but there's a single non-viable candidate, the error is diagnosed by cp_build_function_call_vec, and that currently doesn't underline the bad argument: $ cat test.cc void callee (int one, const char **two, int three); void caller (const char *fmt) { callee (1, fmt, 3); } We emit: $ g++ test.cc test.cc: In function 'void caller(const char*)': test.cc:6:20: error: cannot convert 'const char*' to 'const char**' for argument '2' to 'void callee(int, const char**, int)' callee (1, fmt, 3); ^ It's going through convert_for_assignment, and implicitly using input_location. This patch updates convert_for_assignment for this case, using an EXPR_LOCATION if there is one, or falling back to input_location otherwise, underlining the argument in question: test.cc: In function 'void caller(const char*)': test.cc:6:14: error: cannot convert 'const char*' to 'const char**' for argument '2' to 'void callee(int, const char**, int)' callee (1, fmt, 3); ^~~ Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; adds 117 PASS results to g++.sum. Not a regression as such, but I've been calling out the underlined arguments as a feature of gcc 8, so would be good to fix. OK for trunk? gcc/cp/ChangeLog: PR c++/85110 * typeck.c (convert_for_assignment): When complaining due to conversions for an argument, attempt to use the location of the argument. gcc/testsuite/ChangeLog: PR c++/85110 * g++.dg/diagnostic/param-type-mismatch-2.C: New test. --- gcc/cp/typeck.c| 5 +- .../g++.dg/diagnostic/param-type-mismatch-2.C | 175 + 2 files changed, 178 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c index 0c2ebd1..e733c79 100644 --- a/gcc/cp/typeck.c +++ b/gcc/cp/typeck.c @@ -8781,8 +8781,9 @@ convert_for_assignment (tree type, tree rhs, parmnum, complain, flags); } else if (fndecl) - error ("cannot convert %qH to %qI for argument %qP to %qD", - rhstype, type, parmnum, fndecl); + error_at (EXPR_LOC_OR_LOC (rhs, input_location), + "cannot convert %qH to %qI for argument %qP to %qD", + rhstype, type, parmnum, fndecl); else switch (errtype) { diff --git a/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C new file mode 100644 index 000..ae84248 --- /dev/null +++ b/gcc/testsuite/g++.dg/diagnostic/param-type-mismatch-2.C @@ -0,0 +1,175 @@ +// { dg-options "-fdiagnostics-show-caret" } + +/* A collection of calls where argument 2 is of the wrong type. */ + +/* decl, with argname. */ + +extern int callee_1 (int one, const char **two, float three); + +int test_1 (int first, const char *second, float third) +{ + return callee_1 (first, second, third); // { dg-error "27: cannot convert 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int callee_1\\(int, const char\\*\\*, float\\)'" } + /* { dg-begin-multiline-output "" } + return callee_1 (first, second, third); + ^~ + { dg-end-multiline-output "" } */ +} + +/* decl, without argname. */ + +extern int callee_2 (int, const char **, float); + +int test_2 (int first, const char *second, float third) +{ + return callee_2 (first, second, third); // { dg-error "27: cannot convert 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int callee_2\\(int, const char\\*\\*, float\\)'" } + /* { dg-begin-multiline-output "" } + return callee_2 (first, second, third); + ^~ + { dg-end-multiline-output "" } */ +} + +/* defn, with argname. */ + +static int callee_3 (int one, const char **two, float three) +{ + return callee_2 (one, two, three); +} + +int test_3 (int first, const char *second, float third) +{ + return callee_3 (first, second, third); // { dg-error "27: cannot convert 'const char\\*' to 'const char\\*\\*' for argument '2' to 'int callee_3\\(int, const char\\*\\*, float\\)'" } + /* { dg-begin-multiline-output "" } + return callee_3 (first, second, third); + ^~ + { dg-end-multiline-output "" } */ +} + +/* static member, with argname. */ + +struct s4 { static int member_1 (int one, const char **two, float three); }; // { dg-line s4_member_1 } + +int test_4 (int first, const char *second, float third) +{ + return s4::member_1 (first, second, third);
Re: [PATCH,rs6000] Remove __builtin_fctid and __builtin_fctiw
Hi Carl, On Wed, Mar 28, 2018 at 11:51:09AM -0700, Carl Love wrote: > gcc/ChangeLog: > > 2018-03-20 Carl Love > > Reverting patch: > 2017-09-27 Carl Love > > * config/rs6000/rs6000-builtin.def: Remove macro expansion for > FCTIW and FCTID. Remove macro definition BU_FP_MISC_1. > * config/rs6000/rs6000.md: Remove define_insn lrintsfsi2. Please just say 2018-03-28 Carl Love Revert 2017-09-27 Carl Love * config/rs6000/rs6000-builtin.def (BU_FP_1MISC_1): Add define macro. (FCTID, FCTIW): Add BU_FP_MISC_1 macro expansion for builtins. * config/rs6000/rs6000.md (lrintsfsi2): Add define_insn for the fctiw instruction. i.e. the exact changelog of what you revert (or the part you revert, if the revert is partial). It's easier for people who want to find things in history, and easier for you to write too :-) Okay for trunk with that change (and the similar one for the testsuite changelog). Oh and don't forget "svn rm" :-) Thanks! Segher
Re: [PATCH. rs6000] Fix PR84912: ICE using -m32 on __builtin_divde*, patch #2
On Wed, Mar 28, 2018 at 01:57:34PM -0500, Peter Bergner wrote: > On 3/28/18 12:59 PM, Segher Boessenkool wrote: > > It should be something like > > > > +error ("builtin function %qs requires the %qs (or newer), and " > >"%qs or %qs options", > > + name, "-mcpu=power7", "-m64", "-mpowerpc64"); > > > > I don't see other such strings that quote incorrectly? > > Ah, I guess I misunderstood what you were saying. So ok for trunk > with that change then? "Something like", I haven't tested anything. Please do test :-) Okay with that. Thanks! Segher
Re: [PATCH. rs6000] Fix PR84912: ICE using -m32 on __builtin_divde*, patch #2
On 3/28/18 4:13 PM, Segher Boessenkool wrote: > On Wed, Mar 28, 2018 at 01:57:34PM -0500, Peter Bergner wrote: >> On 3/28/18 12:59 PM, Segher Boessenkool wrote: >>> It should be something like >>> >>> +error ("builtin function %qs requires the %qs (or newer), and " >>>"%qs or %qs options", >>> + name, "-mcpu=power7", "-m64", "-mpowerpc64"); >>> >>> I don't see other such strings that quote incorrectly? >> >> Ah, I guess I misunderstood what you were saying. So ok for trunk >> with that change then? > > "Something like", I haven't tested anything. Please do test :-) > > Okay with that. Thanks! Tested and committed...both patches. Thanks. Do we care enough to fix these on the release branches? If so, I can backport them easily, since they're not that involved. I'll leave it up to you to decide. Peter
Re: [PATCH. rs6000] Fix PR84912: ICE using -m32 on __builtin_divde*, patch #2
On Wed, Mar 28, 2018 at 07:13:36PM -0500, Peter Bergner wrote: > On 3/28/18 4:13 PM, Segher Boessenkool wrote: > > On Wed, Mar 28, 2018 at 01:57:34PM -0500, Peter Bergner wrote: > >> On 3/28/18 12:59 PM, Segher Boessenkool wrote: > >>> It should be something like > >>> > >>> +error ("builtin function %qs requires the %qs (or newer), and " > >>> "%qs or %qs options", > >>> +name, "-mcpu=power7", "-m64", "-mpowerpc64"); > >>> > >>> I don't see other such strings that quote incorrectly? > >> > >> Ah, I guess I misunderstood what you were saying. So ok for trunk > >> with that change then? > > > > "Something like", I haven't tested anything. Please do test :-) > > > > Okay with that. Thanks! > > Tested and committed...both patches. Thanks. > > Do we care enough to fix these on the release branches? If so, I > can backport them easily, since they're not that involved. > I'll leave it up to you to decide. If it's easy, yes please. After the usual wait. Segher