Re: [4.8] backport fixes for wrong-code PR57425 and PR57569
On Sat, Mar 15, 2014 at 7:05 PM, Mikael Pettersson wrote: > This backports the fixes for wrong-code bugs PR57425 and PR57569, > both marked as 4.8 regressions, from mainline to 4.8 branch. > > Tested since June last year on x86_64, powerpc64, sparc64, armv5tel, > and m68k without regressions. According to Bill Schmidt it also > fixes a wrong-code problem for powerpc64le on IBM's 4.8 branch. > > Ok for 4.8 branch? Ok. Thanks, Richard. > Thanks, > > /Mikael > > (I don't have commit rights, but Bill has agreed to do the commit if > this backport is approved.) > > > gcc/ > > 2014-03-15 Mikael Pettersson > > Backport from mainline: > > 2013-06-20 Joern Rennecke > > PR rtl-optimization/57425 > PR rtl-optimization/57569 > * alias.c (write_dependence_p): Remove parameters mem_mode and > canon_mem_addr. Add parameters x_mode, x_addr and x_canonicalized. > Changed all callers. > (canon_anti_dependence): Get comments and semantics in sync. > Add parameter mem_canonicalized. Changed all callers. > * rtl.h (canon_anti_dependence): Update prototype. > > 2013-06-16 Joern Rennecke > > PR rtl-optimization/57425 > PR rtl-optimization/57569 > * alias.c (write_dependence_p): Add new parameters mem_mode, > canon_mem_addr and mem_canonicalized. Change type of writep to bool. > Changed all callers. > (canon_anti_dependence): New function. > * cse.c (check_dependence): Use canon_anti_dependence. > * cselib.c (cselib_invalidate_mem): Likewise. > * rtl.h (canon_anti_dependence): Declare. > > gcc/testsuite/ > > 2014-03-15 Mikael Pettersson > > Backport from mainline: > > 2013-06-16 Joern Rennecke > > PR rtl-optimization/57425 > PR rtl-optimization/57569 > * gcc.dg/torture/pr57425-1.c, gcc.dg/torture/pr57425-2.c: New files. > * gcc.dg/torture/pr57425-3.c, gcc.dg/torture/pr57569.c: Likewise. 
> > --- gcc-4.8.2/gcc/alias.c.~1~ 2013-03-05 10:40:38.0 +0100 > +++ gcc-4.8.2/gcc/alias.c 2014-03-15 18:18:31.402652881 +0100 > @@ -156,7 +156,9 @@ static int insert_subset_children (splay > static alias_set_entry get_alias_set_entry (alias_set_type); > static bool nonoverlapping_component_refs_p (const_rtx, const_rtx); > static tree decl_for_component_ref (tree); > -static int write_dependence_p (const_rtx, const_rtx, int); > +static int write_dependence_p (const_rtx, > + const_rtx, enum machine_mode, rtx, > + bool, bool, bool); > > static void memory_modified_1 (rtx, const_rtx, void *); > > @@ -2558,15 +2560,24 @@ canon_true_dependence (const_rtx mem, en > } > > /* Returns nonzero if a write to X might alias a previous read from > - (or, if WRITEP is nonzero, a write to) MEM. */ > + (or, if WRITEP is true, a write to) MEM. > + If X_CANONCALIZED is true, then X_ADDR is the canonicalized address of X, > + and X_MODE the mode for that access. > + If MEM_CANONICALIZED is true, MEM is canonicalized. */ > > static int > -write_dependence_p (const_rtx mem, const_rtx x, int writep) > +write_dependence_p (const_rtx mem, > + const_rtx x, enum machine_mode x_mode, rtx x_addr, > + bool mem_canonicalized, bool x_canonicalized, bool writep) > { > - rtx x_addr, mem_addr; > + rtx mem_addr; >rtx base; >int ret; > > + gcc_checking_assert (x_canonicalized > + ? 
(x_addr != NULL_RTX && x_mode != VOIDmode) > + : (x_addr == NULL_RTX && x_mode == VOIDmode)); > + >if (MEM_VOLATILE_P (x) && MEM_VOLATILE_P (mem)) > return 1; > > @@ -2590,17 +2601,21 @@ write_dependence_p (const_rtx mem, const >if (MEM_ADDR_SPACE (mem) != MEM_ADDR_SPACE (x)) > return 1; > > - x_addr = XEXP (x, 0); >mem_addr = XEXP (mem, 0); > - if (!((GET_CODE (x_addr) == VALUE > -&& GET_CODE (mem_addr) != VALUE > -&& reg_mentioned_p (x_addr, mem_addr)) > - || (GET_CODE (x_addr) != VALUE > - && GET_CODE (mem_addr) == VALUE > - && reg_mentioned_p (mem_addr, x_addr > + if (!x_addr) > { > - x_addr = get_addr (x_addr); > - mem_addr = get_addr (mem_addr); > + x_addr = XEXP (x, 0); > + if (!((GET_CODE (x_addr) == VALUE > +&& GET_CODE (mem_addr) != VALUE > +&& reg_mentioned_p (x_addr, mem_addr)) > + || (GET_CODE (x_addr) != VALUE > + && GET_CODE (mem_addr) == VALUE > + && reg_mentioned_p (mem_addr, x_addr > + { > + x_addr = get_addr (x_addr); > + if (!mem_canonicalized) > + mem_addr = get_addr (mem_addr); > + } > } > >if (! writep) > @@ -2616,11 +2631,16 @@ write_dependence_p (const_rtx mem, const > GET_MODE (mem))) > return 0; > > - x_addr = canon_rtx (x_addr); > - mem_ad
Re: C++ PATCH for c++/58678 (devirt vs. KDE)
> Honza suggested that if the destructor for an abstract class can't
> ever be called through the vtable, the front end could avoid
> referring to it from the vtable. This patch replaces such a
> destructor with __cxa_pure_virtual in the vtable.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.

Thank you! I would prefer a different marker than __cxa_pure_virtual in the vtable, most probably simply NULL. The reason is that __cxa_pure_virtual will appear as a possible target in the list, and it will prevent devirtualization when we end up with both __cxa_pure_virtual and the real destructor in the list of possible targets.

gimple_get_virt_method_for_vtable knows that a vtable lookup that does not result in a FUNCTION_DECL should be translated to BUILTIN_UNREACHABLE, and ipa-devirt drops these from the list of targets, unlike __cxa_pure_virtual, which stays.

The other problem with __cxa_pure_virtual is that it needs an external relocation. I sort of wondered whether we want to produce a hidden comdat wrapper for it, so that C++ programs are easier to relocate.

I will still keep the patch that marks ABSTRACT classes with a BINFO flag, and I will send out the patch I made so that ABSTRACT classes are ignored for anonymous namespace types. It seems to make a difference for LibreOffice, which uses a lot of abstract classes.

What do you think of the following patch, which makes ipa-devirt conclude that destructor calls are never made on types in construction? If the effect of doing so is undefined, I think it is safe to drop them from the list of targets, and that really helps to cut the lists down.

Index: ipa-devirt.c
===
--- ipa-devirt.c (revision 208492)
+++ ipa-devirt.c (working copy)
@@ -1511,7 +1558,10 @@ possible_polymorphic_call_targets (tree
           target = NULL;
         }
-      maybe_record_node (nodes, target, inserted, can_refer, &complete);
+      /* Destructors are never called through construction virtual tables,
+         because the type is always known.  */
+      if (target && DECL_CXX_DESTRUCTOR_P (target))
+        context.maybe_in_construction = false;
       if (target)
         {
[PATCH][match-and-simplify] Commit bootstrap workaround
This temporarily adds -fpermissive to the gimple-match.c compile to allow bootstrapping. Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages. Richard. 2014-03-17 Richard Biener * Makefile.in (gimple-match.o-warn): Temporarily add -fpermissive to allow bootstrapping. Index: gcc/Makefile.in === --- gcc/Makefile.in (revision 208609) +++ gcc/Makefile.in (working copy) @@ -196,7 +196,7 @@ GCC_WARN_CXXFLAGS = $(LOOSE_WARN) $($(@D # flex output may yield harmless "no previous prototype" warnings build/gengtype-lex.o-warn = -Wno-error gengtype-lex.o-warn = -Wno-error -gimple-match.o-warn = -Wno-error +gimple-match.o-warn = -Wno-error -fpermissive # All warnings have to be shut off in stage1 if the compiler used then # isn't gcc; configure determines that. WARN_CFLAGS will be either
[patch testsuite]: Correct testcase for LLP64 targets
Hi,

this patch corrects a regression seen in gcc.c-torture/compile/20010327-1.c for LLP64 targets.

ChangeLog

2013-03-17  Kai Tietz

        * gcc.c-torture/compile/20010327-1.c: Adjust testcase for LLP64 targets.

OK to apply?

Regards,
Kai

Index: gcc.c-torture/compile/20010327-1.c
===
--- gcc.c-torture/compile/20010327-1.c(Revision 208594)
+++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie)
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target ptr32plus } */
+/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" } } */
 /* This testcase tests whether GCC can produce static initialized data
    that references addresses of size 'unsigned long', even if that's not
[PATCH] Fix gfortran.dg/unlimited_polymorphic_13.f90
Tested on {x86_64,m68k}-suse-linux and installed as obvious. Andreas. PR testsuite/58851 * gfortran.dg/unlimited_polymorphic_13.f90: Properly compute storage size. --- gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90 | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90 index 0e27b17..8225738 100644 --- a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90 +++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_13.f90 @@ -23,18 +23,24 @@ contains integer :: k integer :: sz +sz = 0 select case (k) case (4) sz = storage_size(r1)*2 +end select +select case (k) case (8) sz = storage_size(r2)*2 - case (10) +end select +select case (k) + case (real_kinds(size(real_kinds)-1)) sz = storage_size(r3)*2 - case (16) +end select +select case (k) + case (real_kinds(size(real_kinds))) sz = storage_size(r4)*2 - case default - call abort() end select +if (sz .eq. 0) call abort() if (storage_size(o) /= sz) call abort() -- 1.9.0 -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
[C++ Patch] PR 59571
Hi,

I noticed this issue, which looks simple to fix. The ICE happens in cxx_eval_constant_expression, because it cannot handle a CAST_EXPR (or any other *_CAST, for that matter). In fact check_narrowing calls maybe_constant_value and, because we are in a template, the latter faces the unfolded CAST_EXPR. Thus it seems easy to just use fold_non_dependent_expr_sfinae.

Tested x86_64-linux.

Thanks,
Paolo.

///

PS: looking forward, I'm wondering whether some semantics/typeck functions should try harder before building a tree node and returning, e.g., instead of just checking processing_template_decl, actually checking whether type and expr are dependent. Does this kind of audit make sense for the next Stage 1?

/cp
2014-03-17  Paolo Carlini

        PR c++/59571
        * typeck2.c (check_narrowing): Use fold_non_dependent_expr_sfinae.

/testsuite
2014-03-17  Paolo Carlini

        PR c++/59571
        * g++.dg/cpp0x/constexpr-ice13.C: New.

Index: cp/typeck2.c
===
--- cp/typeck2.c(revision 208605)
+++ cp/typeck2.c(working copy)
@@ -861,7 +861,7 @@ check_narrowing (tree type, tree init)
       return;
     }
-  init = maybe_constant_value (init);
+  init = maybe_constant_value (fold_non_dependent_expr_sfinae (init, tf_none));
   if (TREE_CODE (type) == INTEGER_TYPE
       && TREE_CODE (ftype) == REAL_TYPE)
Index: testsuite/g++.dg/cpp0x/constexpr-ice13.C
===
--- testsuite/g++.dg/cpp0x/constexpr-ice13.C(revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-ice13.C(working copy)
@@ -0,0 +1,8 @@
+// PR c++/59571
+// { dg-do compile { target c++11 } }
+
+template
+struct foo
+{
+  static constexpr int bar{(int)-1};
+};
[PATCH, 5/n] Handle CCMP in ifcvt to make it work with cmov
Hi,

The patch enhances ifcvt to handle the conditional compare instruction (ccmp) so that it works with cmov.

For ccmp, ALLOW_CC_MODE is set to TRUE when calling canonicalize_condition, and the backend does not need to generate an additional "compare (CC, 0)" for it.

Bootstrapped, with no make check regressions, on x86-64, ARM Chromebook and qemu-aarch64.

Is it OK for next stage1?

Thanks!
-Zhenqiang

ChangeLog:
2014-03-17  Zhenqiang Chen

        * ifcvt.c (struct noce_if_info): Add a new field ccmp_p.
        (noce_emit_cmove): Allow ccmp condition.
        (noce_get_alt_condition): Call canonicalize_condition with ccmp_p.
        (noce_get_condition): Set ALLOW_CC_MODE to TRUE for ccmp.
        (noce_process_if_block): Set ccmp_p for ccmp.
        * recog.h (ccmp_insn_p): New prototype.
        * recog.c (ccmp_insn_p): Make it global.
        * config/aarch64/aarch64.md (movcc): Handle ccmp_cc.

testsuite/ChangeLog:
2014-03-17  Zhenqiang Chen

        * gcc.target/aarch64/ccmn-csel-1.c: New testcase.
        * gcc.target/aarch64/ccmn-csel-2.c: New testcase.
        * gcc.target/aarch64/ccmn-csel-3.c: New testcase.
        * gcc.target/aarch64/ccmp-csel-1.c: New testcase.
        * gcc.target/aarch64/ccmp-csel-2.c: New testcase.
        * gcc.target/aarch64/ccmp-csel-3.c: New testcase.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 79aa2f3..4e18bb2 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -786,6 +786,9 @@ struct noce_if_info
   /* Estimated cost of the particular branch instruction.  */
   int branch_cost;
+
+  /* The COND is a conditional compare.  */
+  bool ccmp_p;
 };
 static rtx noce_emit_store_flag (struct noce_if_info *, rtx, int, int);
@@ -1407,9 +1410,16 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
       end_sequence ();
     }
-  /* Don't even try if the comparison operands are weird.  */
-  if (! general_operand (cmp_a, GET_MODE (cmp_a))
-      || ! general_operand (cmp_b, GET_MODE (cmp_b)))
+  /* Don't even try if the comparison operands are weird
+     except conditional compare.
*/ + if (if_info->ccmp_p) +{ + if (!(GET_MODE_CLASS (GET_MODE (cmp_a)) == MODE_CC +|| GET_MODE_CLASS (GET_MODE (cmp_b)) == MODE_CC)) +return NULL_RTX; +} + else if (! general_operand (cmp_a, GET_MODE (cmp_a)) + || ! general_operand (cmp_b, GET_MODE (cmp_b))) return NULL_RTX; #if HAVE_conditional_move @@ -1849,7 +1859,7 @@ noce_get_alt_condition (struct noce_if_info *if_info, rtx target, } cond = canonicalize_condition (if_info->jump, cond, reverse, - earliest, target, false, true); + earliest, target, if_info->ccmp_p, true); if (! cond || ! reg_mentioned_p (target, cond)) return NULL; @@ -2300,6 +2310,7 @@ noce_get_condition (rtx jump, rtx *earliest, bool then_else_reversed) { rtx cond, set, tmp; bool reverse; + int allow_cc_mode = false; if (! any_condjump_p (jump)) return NULL_RTX; @@ -2333,10 +2344,21 @@ noce_get_condition (rtx jump, rtx *earliest, bool then_else_reversed) return cond; } + /* For conditional compare, set ALLOW_CC_MODE to TRUE. */ + if (targetm.gen_ccmp_first) +{ + rtx prev = prev_nonnote_nondebug_insn (jump); + if (prev + && NONJUMP_INSN_P (prev) + && BLOCK_FOR_INSN (prev) == BLOCK_FOR_INSN (jump) + && ccmp_insn_p (prev)) +allow_cc_mode = true; +} + /* Otherwise, fall back on canonicalize_condition to do the dirty work of manipulating MODE_CC values and COMPARE rtx codes. */ tmp = canonicalize_condition (jump, cond, reverse, earliest, -NULL_RTX, false, true); +NULL_RTX, allow_cc_mode, true); /* We don't handle side-effects in the condition, like handling REG_INC notes and making sure no duplicate conditions are emitted. */ @@ -2577,6 +2599,11 @@ noce_process_if_block (struct noce_if_info *if_info) if_info->a = a; if_info->b = b; + if (targetm.gen_ccmp_first) +if (GET_MODE_CLASS (GET_MODE (XEXP (if_info->cond, 0))) == MODE_CC +|| GET_MODE_CLASS (GET_MODE (XEXP (if_info->cond, 1))) == MODE_CC) + if_info->ccmp_p = true; + /* Try optimizations in some approximation of a useful order. */ /* ??? Should first look to see if X is live incoming at all. 
If it isn't, we don't need anything but an unconditional set. */ diff -aru gcc/gcc/recog.c ccmp-all/gcc/recog.c --- a/gcc/recog.c2014-03-13 16:45:00.524945484 +0800 +++ b/gcc/recog.c2014-03-13 15:30:22.468912473 +0800 @@ -556,7 +556,7 @@ #define CODE_FOR_extzvCODE_FOR_nothing #endif -static bool +bool ccmp_insn_p (rtx object) { rtx x = PATTERN (object); diff -aru gcc/gcc/recog.h ccmp-all/gcc/recog.h --- a/gcc/recog.h2014-03-13 16:44:42.284945350 +0800 +++ b/gcc/recog.h2014-03-13 15:30:22.348912472 +0800 @@ -360,5 +360,6 @@ extern const struct insn_data_d insn_data[]; extern int peep2_current_count; +bool ccmp_insn_p (rtx); #endif /* GCC_RECOG_H */ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aa
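
As an illustration of the kind of source the description above is about, here is a hypothetical C function: two comparisons joined by `&&` choose between two values. On AArch64 the intent of ccmp support in ifcvt is that such code could become a compare, a conditional compare and one conditional select instead of branches. The function name and the constants are made up for this sketch; they are not taken from the ccmp-csel-*.c testcases, which are not shown in this message.

```c
/* Hypothetical illustration only; not one of the testcases from the patch.  */
int
select_if_both (int a, int b, int x, int y)
{
  /* The && of two compares is the candidate for cmp + ccmp; the ?: on
     the combined condition is the candidate for a conditional select.  */
  return (a == 17 && b == 33) ? x : y;
}
```

Whether a given compiler actually emits ccmp/csel for this depends on the target, the optimization level and the state of the patch series, so the assembly should be inspected rather than assumed.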
Re: [patch testsuite]: Correct testcase for LLP64 targets
Hi Kai, > Index: gcc.c-torture/compile/20010327-1.c > === > --- gcc.c-torture/compile/20010327-1.c(Revision 208594) > +++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie) > @@ -1,4 +1,5 @@ > /* { dg-require-effective-target ptr32plus } */ > +/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" } } > */ the usual comments apply: * add a comment/PR reference as the first arg to dg-skip-if explaining the skip * omit the default args { "*" } { "" } Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch testsuite]: Correct testcase for LLP64 targets
On Mon, Mar 17, 2014 at 10:50:35AM +0100, Rainer Orth wrote: > Hi Kai, > > > Index: gcc.c-torture/compile/20010327-1.c > > === > > --- gcc.c-torture/compile/20010327-1.c(Revision 208594) > > +++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie) > > @@ -1,4 +1,5 @@ > > /* { dg-require-effective-target ptr32plus } */ > > +/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" } > > } */ > > the usual comments apply: > > * add a comment/PR reference as the first arg to dg-skip-if explaining > the skip > > * omit the default args { "*" } { "" } Or perhaps just drop dg-require-effective-target directive and instead do /* { dg-do compile { target { ptr32plus && ! llp64 } } } */ Jakub
Re: [patch testsuite]: Correct testcase for LLP64 targets
2014-03-17 10:53 GMT+01:00 Jakub Jelinek :
> On Mon, Mar 17, 2014 at 10:50:35AM +0100, Rainer Orth wrote:
>> Hi Kai,
>>
>> > Index: gcc.c-torture/compile/20010327-1.c
>> > ===
>> > --- gcc.c-torture/compile/20010327-1.c(Revision 208594)
>> > +++ gcc.c-torture/compile/20010327-1.c(Arbeitskopie)
>> > @@ -1,4 +1,5 @@
>> > /* { dg-require-effective-target ptr32plus } */
>> > +/* { dg-skip-if "" { { i?86-*-* x86_64-*-* } && { llp64 } } { "*" } { "" } } */
>>
>> the usual comments apply:
>>
>> * add a comment/PR reference as the first arg to dg-skip-if explaining
>>   the skip
>>
>> * omit the default args { "*" } { "" }
>
> Or perhaps just drop dg-require-effective-target directive and instead do
> /* { dg-do compile { target { ptr32plus && ! llp64 } } } */
>
> Jakub

Yeah, omitting the dg-require-effective-target directive looks to me like the best thing to do. Adding a skip directive is superfluous.

OK with the following patch?

Kai

Index: 20010327-1.c
===
--- 20010327-1.c(Revision 208594)
+++ 20010327-1.c(Arbeitskopie)
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target ptr32plus } */
+/* { dg-do compile { ptr32plus && !llp64 } } */
 /* This testcase tests whether GCC can produce static initialized data
    that references addresses of size 'unsigned long', even if that's not
Re: [PATCH][AARCH64]combine "ubfiz" and "orr" with bfi when certain condition meets.
On 16/03/14 12:30, Renlin Li wrote: > Hi all, > > Thank you for your suggestions, Richard. I have updated the patch > accordingly. > > This is an optimization patch which will combine "ubfiz" and "orr" > insns with a single "bfi" when certain conditions meet. > > tmp = (x & m) | ( (y & n) << lsb) can be presented using > > and tmp, x, m > bfi tmp, y, #lsb, #width > > if ((n+1) == 2^width) && (m & n << lsb) == 0. > > A small test case is also added to verify it. > > Is this Okay for stage-1? > > Kind regards, > Renlin Li > This looks to me more like a 3 into two split operation where combine needs some help to do the split, since the transformation is non-trivial. As such, I think you just need a define_split rather than a define_insn_and_split (there's also no obvious reason why we would want to defer this split until after register allocation). Furthermore, you have an early-clobber situation here: it's important that y and tmp aren't in the same register. You appear to try to cater for this by using an operand tie, but that's unnecessary in general (the AND operation can write any usable register) and won't work in the specific case where x = y. R. > > gcc/ChangeLog: > > 2014-03-14 Renlin Li > > * config/aarch64/aarch64.md (*combine_bfi2, > *combine_bfi3): New. > > gcc/testsuite: > > 2014-03-14 Renlin Li > > * gcc.target/aarch64/combine_and_orr_1.c: New. 
> > > patch.diff > > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 99a6ac8..6c2798b 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -3115,6 +3115,53 @@ >[(set_attr "type" "bfm")] > ) > > +(define_insn_and_split "*combine_bfi2" > + [(set (match_operand:GPI 0 "register_operand" "=r") > +(ior:GPI (and:GPI (ashift:GPI (match_operand:GPI 1 > "register_operand" "r") > + (match_operand 2 "const_int_operand" > "n")) > + (match_operand 3 "const_int_operand" "n")) > + (zero_extend:GPI (match_operand:SHORT 4 "register_operand" > "0"] > + "exact_log2 ((INTVAL (operands[3]) >> INTVAL (operands[2])) + 1) >= 0 > + && <= INTVAL (operands[2])" > + "#" > + "&& reload_completed" > + [(set (match_dup 0) > +(zero_extend:GPI (match_dup 4))) > + (set (zero_extract:GPI (match_dup 0) > + (match_dup 3) > + (match_dup 2)) > + (match_dup 1))] > + { > + int tmp = (INTVAL (operands[3]) >> INTVAL (operands[2])) + 1; > + operands[3] = GEN_INT (exact_log2 (tmp)); > + } > + [(set_attr "type" "multiple")] > +) > + > +(define_insn_and_split "*combine_bfi3" > + [(set (match_operand:GPI 0 "register_operand" "=r") > +(ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0") > + (match_operand 2 "aarch64_logical_immediate" "n")) > + (and:GPI (ashift:GPI (match_operand:GPI 3 > "register_operand" "r") > + (match_operand 4 "const_int_operand" > "n")) > + (match_operand 5 "const_int_operand" "n"] > + "exact_log2 ((INTVAL (operands[5]) >> INTVAL (operands[4])) + 1) >= 0 > + && (INTVAL (operands[2]) & INTVAL (operands[5])) == 0" > + "#" > + "&& reload_completed" > + [(set (match_dup 0) > +(and:GPI (match_dup 1) (match_dup 2))) > + (set (zero_extract:GPI (match_dup 0) > + (match_dup 5) > + (match_dup 4)) > + (match_dup 3))] > + { > + int tmp = (INTVAL (operands[5]) >> INTVAL (operands[4])) + 1; > + operands[5] = GEN_INT (exact_log2 (tmp)); > + } > + [(set_attr "type" "multiple")] > +) > + > (define_insn "*extr_insv_lower_reg" 
>[(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r") > (match_operand 1 "const_int_operand" "n") > diff --git a/gcc/testsuite/gcc.target/aarch64/combine_and_orr_1.c > b/gcc/testsuite/gcc.target/aarch64/combine_and_orr_1.c > new file mode 100644 > index 000..b2c0194 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/combine_and_orr_1.c > @@ -0,0 +1,51 @@ > +/* { dg-do run } */ > +/* { dg-options "-save-temps -O2" } */ > + > +extern void abort (void); > + > +unsigned int __attribute__ ((noinline)) > +foo1 (unsigned int major, unsigned int minor) > +{ > + unsigned int tmp = (minor & 0xff) | ((major & 0xfff) << 8); > + return tmp; > +} > + > +unsigned int __attribute__ ((noinline)) > +foo2 (unsigned int major, unsigned int minor) > +{ > + unsigned int tmp = (minor & 0x1f) | ((major & 0xfff) << 8); > + return tmp; > +} > + > +int > +main (void) > +{ > + unsigned int major[10] = {1947662, 484254, 193508, 4219233, 2211215, > + 3998162, 4240676, 1034099, 54412, 3195572}; > + unsigned int minor[10] = {1027568, 21481, 2746675, 3121857, 2471080, > + 3158801, 237587, 813307, 4073168, 1503494}; > + > + unsigned in
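
The condition quoted at the top of this message can be checked with a small model. The sketch below is not part of the patch: `bfi32` is a hypothetical software model of a 32-bit bit-field insert, and `bfi_applicable` tests the two stated conditions, (n+1) == 2^width and (m & (n << lsb)) == 0. All names are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit bit-field insert: copy the low WIDTH
   bits of SRC into DST at bit position LSB, leaving other bits alone.  */
static uint32_t
bfi32 (uint32_t dst, uint32_t src, unsigned lsb, unsigned width)
{
  assert (width >= 1 && lsb < 32 && lsb + width <= 32);
  uint32_t field = width == 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
  return (dst & ~(field << lsb)) | ((src & field) << lsb);
}

/* The original and/shift/or form: (x & m) | ((y & n) << lsb).  */
static uint32_t
and_orr (uint32_t x, uint32_t m, uint32_t y, uint32_t n, unsigned lsb)
{
  return (x & m) | ((y & n) << lsb);
}

/* Return 1 and set *WIDTH if the and/shift/or form can be rewritten as
   "and tmp, x, m" followed by "bfi tmp, y, #lsb, #width".  */
static int
bfi_applicable (uint32_t m, uint32_t n, unsigned lsb, unsigned *width)
{
  unsigned w = 0;

  /* n + 1 must be a power of two, i.e. n is a run of low one bits.  */
  if (n == 0 || (n & (n + 1)) != 0)
    return 0;
  for (uint32_t t = n; t != 0; t >>= 1)
    w++;
  if (lsb + w > 32)
    return 0;
  /* The bits kept from x must not overlap the inserted field.  */
  if ((m & (n << lsb)) != 0)
    return 0;
  *width = w;
  return 1;
}
```

For the masks used in foo1 above (m = 0xff, n = 0xfff, lsb = 8) the check succeeds with width 12, and `bfi32 (x & m, y, 8, 12)` gives the same value as `and_orr (x, m, y, n, 8)`.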
Re: Try to catch up _GLIBCXX_RESOLVE_LIB_DEFECTS comments and documentation.
On 16 March 2014 16:09, Ed Smith-Rowland wrote: > OK, thinking further on it I actually agree with not mentioning DRs on a > partially baked standard. We advertise that support for new standards is > experimental. I don't think it does any harm to add comments during the C++1y/C++1z process to note that we've incorporated a particular DR against an earlier working paper, because it's not always obvious which draft our work-in-progress follows, but once the standard is finished I'd be in favour of removing those comments. Implementing those DRs is implied by implementing the finished standard.
Re: [patch testsuite]: Correct testcase for LLP64 targets
Sorry, I am reposting the last patch with a small correction to the dg-do directive. The ! in there needs additional bracing, and I missed the target keyword.

Regards,
Kai

Index: 20010327-1.c
===
--- 20010327-1.c(Revision 208594)
+++ 20010327-1.c(Arbeitskopie)
@@ -1,4 +1,4 @@
-/* { dg-require-effective-target ptr32plus } */
+/* { dg-do compile { target { ptr32plus && { ! llp64 } } } } */
 /* This testcase tests whether GCC can produce static initialized data
    that references addresses of size 'unsigned long', even if that's not
[PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)
This patch ensures that we properly expand gomp SIMD builtins even with -fno-tree-loop-optimize. The problem was that we didn't run the loop vectorization at all. -fno-tree-loop-vectorize already contains similar hack. Regtested/bootstrapped on x86_64-linux, ok for trunk (or for 5.0?)? 2014-03-17 Marek Polacek PR middle-end/60534 * omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same as -fno-tree-loop-vectorize. (expand_omp_simd): Likewise. testsuite/ * gcc.dg/gomp/pr60534.c: New test. diff --git gcc/omp-low.c gcc/omp-low.c index 91c8656..fdf3367 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -2931,7 +2931,8 @@ omp_max_vf (void) || optimize_debug || (!flag_tree_loop_vectorize && (global_options_set.x_flag_tree_loop_vectorize - || global_options_set.x_flag_tree_vectorize))) + || global_options_set.x_flag_tree_vectorize + || global_options_set.x_flag_tree_loop_optimize))) return 1; int vs = targetm.vectorize.autovectorize_vector_sizes (); @@ -6834,11 +6835,12 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd) loop->simduid = OMP_CLAUSE__SIMDUID__DECL (simduid); cfun->has_simduid_loops = true; } - /* If not -fno-tree-loop-vectorize, hint that we want to vectorize -the loop. */ + /* If not -fno-tree-loop-vectorize of -fno-tree-loop-optimize, + hint that we want to vectorize the loop. 
*/ if ((flag_tree_loop_vectorize || (!global_options_set.x_flag_tree_loop_vectorize - && !global_options_set.x_flag_tree_vectorize)) + && !global_options_set.x_flag_tree_vectorize + && !global_options_set.x_flag_tree_loop_optimize)) && loop->safelen > 1) { loop->force_vect = true; diff --git gcc/testsuite/gcc.dg/gomp/pr60534.c gcc/testsuite/gcc.dg/gomp/pr60534.c index e69de29..f8a6bdc 100644 --- gcc/testsuite/gcc.dg/gomp/pr60534.c +++ gcc/testsuite/gcc.dg/gomp/pr60534.c @@ -0,0 +1,16 @@ +/* PR middle-end/60534 */ +/* { dg-do compile } */ +/* { dg-options "-fopenmp -O -fno-tree-loop-optimize" } */ + +extern int d[]; + +int +foo (int a) +{ + int c = 0; + int l; +#pragma omp simd reduction(+: c) + for (l = 0; l < a; ++l) +c += d[l]; + return c; +} Marek
Re: [patch testsuite]: Correct testcase for LLP64 targets
On Mon, Mar 17, 2014 at 12:01:41PM +0100, Kai Tietz wrote: > Sorry, I repost last patch with small correction in dg-do directive. > The ! in there needs additional framing, and I missed the target > keyword. > > Regards, > Kai > > Index: 20010327-1.c > === > --- 20010327-1.c(Revision 208594) > +++ 20010327-1.c(Arbeitskopie) > @@ -1,4 +1,4 @@ > -/* { dg-require-effective-target ptr32plus } */ > +/* { dg-do compile { target { ptr32plus && { ! llp64 } } } } */ Ok with proper ChangeLog entry and the double space before && replaced with a single space. Jakub
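
For context on why the directive excludes llp64: the testcase comment says it relies on addresses fitting in 'unsigned long'. The sketch below classifies the host data model from type sizes and reports whether that assumption holds. It is only an approximation of what the ptr32plus and llp64 effective-target checks look at, and every name in it is hypothetical.

```c
/* Hypothetical helper names; a rough model, not the testsuite's checks.  */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_LLP64, MODEL_OTHER };

static enum data_model
host_data_model (void)
{
  if (sizeof (int) == 4 && sizeof (long) == 4 && sizeof (void *) == 4)
    return MODEL_ILP32;
  if (sizeof (int) == 4 && sizeof (long) == 8 && sizeof (void *) == 8)
    return MODEL_LP64;
  /* LLP64: long stays at 32 bits while pointers are 64 bits wide.  */
  if (sizeof (int) == 4 && sizeof (long) == 4 && sizeof (void *) == 8)
    return MODEL_LLP64;
  return MODEL_OTHER;
}

/* Nonzero when an address fits in 'unsigned long'.  */
static int
address_fits_unsigned_long (void)
{
  return sizeof (unsigned long) >= sizeof (void *);
}
```

On an LLP64 host `address_fits_unsigned_long` returns 0, which is the situation the new target selector is meant to exclude.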
Re: [PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)
On Mon, Mar 17, 2014 at 12:01:54PM +0100, Marek Polacek wrote: > This patch ensures that we properly expand gomp SIMD builtins even with > -fno-tree-loop-optimize. The problem was that we didn't run the > loop vectorization at all. -fno-tree-loop-vectorize already contains > similar hack. > > Regtested/bootstrapped on x86_64-linux, ok for trunk (or for 5.0?)? > > 2014-03-17 Marek Polacek > > PR middle-end/60534 > * omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same > as -fno-tree-loop-vectorize. > (expand_omp_simd): Likewise. > testsuite/ > * gcc.dg/gomp/pr60534.c: New test. > > diff --git gcc/omp-low.c gcc/omp-low.c > index 91c8656..fdf3367 100644 > --- gcc/omp-low.c > +++ gcc/omp-low.c > @@ -2931,7 +2931,8 @@ omp_max_vf (void) >|| optimize_debug >|| (!flag_tree_loop_vectorize > && (global_options_set.x_flag_tree_loop_vectorize > - || global_options_set.x_flag_tree_vectorize))) > + || global_options_set.x_flag_tree_vectorize > + || global_options_set.x_flag_tree_loop_optimize))) No. IMHO this needs to be: || optimize_debug + || !flag_no_tree_loop_optimize || (!flag_tree_loop_vectorize && (global_options_set.x_flag_tree_loop_vectorize > @@ -6834,11 +6835,12 @@ expand_omp_simd (struct omp_region *region, struct > omp_for_data *fd) > loop->simduid = OMP_CLAUSE__SIMDUID__DECL (simduid); > cfun->has_simduid_loops = true; > } > - /* If not -fno-tree-loop-vectorize, hint that we want to vectorize > - the loop. */ > + /* If not -fno-tree-loop-vectorize of -fno-tree-loop-optimize, > + hint that we want to vectorize the loop. 
 */
>    if ((flag_tree_loop_vectorize
>         || (!global_options_set.x_flag_tree_loop_vectorize
> -           && !global_options_set.x_flag_tree_vectorize))
> +           && !global_options_set.x_flag_tree_vectorize
> +           && !global_options_set.x_flag_tree_loop_optimize))

Similarly, here it should be added as
+       && flag_tree_loop_optimize

>        && loop->safelen > 1)

The thing is, if -fno-tree-loop-optimize is in effect (whether explicitly added by the user or implicitly through other options), then the loop will never be vectorized. It doesn't matter whether -ftree-vectorize was on or not in that case. The magic with global_options_set is there to make the loop vectorized if either -ftree-loop-vectorize is on (implicitly or explicitly), or we are at least optimizing and vectorization is not disabled explicitly (-fno-tree-vectorize); we then force the vectorization on for the specific loops. But -fno-tree-loop-optimize means the whole loop optimization pipeline is not performed; at that point, forcing vectorization on while all other loop optimizations are disabled might be too problematic/error prone.

E.g. you could try -fopenmp -O -fno-tree-loop-optimize -ftree-vectorize or -fopenmp -O3 -fno-tree-loop-optimize etc.

        Jakub
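
The decision described above can be written down as a small predicate. This is a minimal sketch that models the suggested condition for expand_omp_simd with a hypothetical struct standing in for GCC's option state; the struct, its field names and the function name are assumptions for illustration, not GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's option state.  */
struct simd_opts
{
  bool loop_optimize;       /* models flag_tree_loop_optimize */
  bool loop_vectorize;      /* models flag_tree_loop_vectorize */
  bool loop_vectorize_set;  /* models global_options_set.x_flag_tree_loop_vectorize */
  bool tree_vectorize_set;  /* models global_options_set.x_flag_tree_vectorize */
};

/* Return true if a simd loop with the given safelen should be forced
   to vectorize: vectorization is on, or nobody said anything explicit
   about it, and the loop optimization pipeline runs at all.  */
static bool
want_force_vect (const struct simd_opts *o, int safelen)
{
  assert (o != 0);
  return (o->loop_vectorize
          || (!o->loop_vectorize_set && !o->tree_vectorize_set))
         && o->loop_optimize
         && safelen > 1;
}
```

With this shape, -fno-tree-loop-optimize wins regardless of the vectorizer flags, which matches the reasoning that a disabled loop pipeline means the loop is never vectorized.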
Re: [PATCH] x86: Define _mm*_undefined_*
On 16 Mar 07:12, Ulrich Drepper wrote:
> [This patch is so far really meant for commenting. I haven't tested it
> at all yet.]
>
> Intel's intrinsic specification includes one set which currently is not
> defined in gcc's headers: the _mm*_undefined_* intrinsics.

What specification are you talking about? As far as I know they are present in the ICC headers, but not in manuals such as:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

> The purpose of these intrinsics (currently three classes, three formats
> each) is to create a pseudo-value the compiler does not assume is
> uninitialized without incurring any code doing so. The purpose is to
> use these intrinsics in places where it is known the value of a register
> is never used. This is already important with AVX2 and becomes really
> crucial with AVX512.
>
> Currently three different techniques are used:
>
> - _mm*_setzero_*() is used. Even though the XOR operation does not
>   cost anything it still messes with the instruction scheduling and
>   more code is generated.
>
> - another parameter is duplicated. This leads most of the time to
>   one additional move instruction.
>
> - uninitialized variables are used (this is in new AVX512 code). The
>   compiler should generate warnings for these headers. I haven't
>   tried it.

Uninitialized variables certainly are bad. Replacing them with setzero/undefined is a good idea. Also, in most AVX512 cases those values shouldn't be present in the code: they are either optimized away in the case of a -1 mask or result in zero-masking being applied. Do you know of any cases where an xor is generated (except for the destination in gather/scatter)?

> Using the _mm*_undefined_*() intrinsics is much cleaner and also
> potentially allows generating better code.
>
> For now the implementation uses an inline asm to suggest to the compiler
> that the variable is initialized. This does not prevent a real register
> from being allocated for this purpose but it saves the XOR instruction.
>
> The correct and optimal implementation will require a compiler built-in
> which will do something different based on how the value is used:
>
> - if the value is never modified then any register should be picked.
>   In function/intrinsic calls the parameter simply need not be loaded at
>   all.
>
> - if the value is modified (and allocated to a register or memory
>   location) no initialization for the variable is needed (equivalent
>   to the asm now).
>
>
> The questions are:
>
> - is there interest in adding the necessary compiler built-in?
>
> - if yes, anyone interested in working on this?
>
> - and: is it worth adding a patch like the one here in the meantime?
>
> As it stands now gcc's intrinsics are not complete and programs following
> Intel's manuals can fail to compile.
>

Compatibility with ICC is certainly good. I tried your patch, and undefined is similar in behavior to setzero, but it also clobbers flags. Maybe just define it to setzero for now?

>
>
> 2014-03-16  Ulrich Drepper
>
>         * config/i386/avxintrin.h (_mm256_undefined_si256): Define.
>         (_mm256_undefined_ps): Define.
>         (_mm256_undefined_pd): Define.
>         * config/i386/emmintrin.h (_mm_undefined_si128): Define.
>         (_mm_undefined_pd): Define.
>         * config/i386/xmmintrin.h (_mm_undefined_ps): Define.
>         * config/i386/avx512fintrin.h (_mm512_undefined_si512): Define.
>         (_mm512_undefined_ps): Define.
>         (_mm512_undefined_pd): Define.
>         Use _mm*_undefined_*.
>         * config/i386/avx2intrin.h: Use _mm*_undefined_*.
>
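
The inline-asm technique mentioned in the quoted text can be sketched in a target-neutral way. The helper below is hypothetical and is not the code from the patch, which is not shown in this message and works on vector types; this version uses a general-purpose register so that it builds with any compiler that accepts GCC-style extended asm.

```c
/* Hypothetical helper, not the patch's code.  The empty asm has an
   output operand, so the compiler treats V as written even though no
   instruction is emitted to initialize it.  */
static inline unsigned long
undefined_ulong (void)
{
  unsigned long v;
  __asm__ ("" : "=r" (v));
  return v;
}
```

The returned value is arbitrary, so only operations whose result does not depend on it (for example `u ^ u`) are meaningful. As noted in the reply above, the asm-based approach still has a cost on x86 because the asm statement is treated as clobbering the flags.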
Re: [PATCH,GCC/Thumb1] Correctly reset the variable after_arm_reorg for Thumb1 target
On 17/03/14 02:51, Terry Guo wrote: > Hi, > > I am working on another patch and found this per-function variable isn't > correctly reset for Thumb1 target. Currently no ICE will be triggered > because we don't call function arm_split_constant for Thumb1 target. This > patch intends to define this variable in machine_function struct in arm.h. > In this way, the variable will be correctly reset and ready for being used > for Thumb1 target in future. > > Tested with gcc regression test for Thumb1 target cortex-m0. No new > regressions. > > Is it ok to trunk? > > BR, > Terry > > 2014-03-17 Terry Guo > > * config/arm/arm.h (machine_function): Define variable > after_arm_reorg here. > * config/arm/arm.c (after_arm_reorg): Remove the definition. > (arm_split_constant): Update the way to access variable > after_arm_reorg. > (arm_reorg): Ditto. > (arm_output_function_epilogue): Remove the reset of after_arm_reorg. > > > reset-after_arm_reorg-thumb1-v3.txt > > > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > index 7ca47a7..982ed48 100644 > --- a/gcc/config/arm/arm.h > +++ b/gcc/config/arm/arm.h > @@ -1543,6 +1543,9 @@ typedef struct GTY(()) machine_function >rtx thumb1_cc_op1; >/* Also record the CC mode that is supported. */ >enum machine_mode thumb1_cc_mode; > + /* Set to 1 after arm_reorg has started. Reset to 0 at the start of > + the next function. */ The reset comment is no longer relevant. Please remove. Ok for stage1 with that change. R.
[PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
Hello, Patch in the bottom allows to use ymmXX and zmmXX register names in inline asm statements as well as in `register` variables definitions. New tests pass. Bootstrap pass. Is it ok for trunk? Do we need to backport it to 4.8? gcc/ * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add ymm and zmm register names. testsuite/ * gcc.target/i386/avx-additional-reg-names.c: New. * gcc.target/i386/avx512f-additional-reg-names.c: Ditto. -- Thanks, K commit c3884af93c105115bc1e4d02fa824d24420c5bbf Author: Kirill Yukhin Date: Mon Mar 17 14:56:06 2014 +0400 [AVX, AVX-512]. Extend ADDITIONAL_REGISTER_NAMES to Ymms and Zmms. --- gcc/config/i386/i386.h | 28 +- .../gcc.target/i386/avx-additional-reg-names.c | 9 +++ .../gcc.target/i386/avx512f-additional-reg-names.c | 9 +++ 3 files changed, 40 insertions(+), 6 deletions(-) diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index c80878b..c5c1d58 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -2016,12 +2016,28 @@ do { \ /* Table of additional register names to use in user input. 
*/ #define ADDITIONAL_REGISTER_NAMES \ -{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 }, \ - { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 }, \ - { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 }, \ - { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 }, \ - { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 }, \ - { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 } } +{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 }, \ + { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 }, \ + { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 }, \ + { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 }, \ + { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 }, \ + { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 }, \ + { "ymm0", 21}, { "ymm1", 22}, { "ymm2", 23}, { "ymm3", 24}, \ + { "ymm4", 25}, { "ymm5", 26}, { "ymm6", 27}, { "ymm7", 28}, \ + { "ymm8", 45}, { "ymm9", 46}, { "ymm10", 47}, { "ymm11", 48}, \ + { "ymm12", 49}, { "ymm13", 50}, { "ymm14", 51}, { "ymm15", 52}, \ + { "ymm16", 53}, { "ymm17", 54}, { "ymm18", 55}, { "ymm19", 56}, \ + { "ymm20", 57}, { "ymm21", 58}, { "ymm22", 59}, { "ymm23", 60}, \ + { "ymm24", 61}, { "ymm25", 62}, { "ymm26", 63}, { "ymm27", 64}, \ + { "ymm28", 65}, { "ymm29", 66}, { "ymm30", 67}, { "ymm31", 68}, \ + { "zmm0", 21}, { "zmm1", 22}, { "zmm2", 23}, { "zmm3", 24}, \ + { "zmm4", 25}, { "zmm5", 26}, { "zmm6", 27}, { "zmm7", 28}, \ + { "zmm8", 45}, { "zmm9", 46}, { "zmm10", 47}, { "zmm11", 48}, \ + { "zmm12", 49}, { "zmm13", 50}, { "zmm14", 51}, { "zmm15", 52}, \ + { "zmm16", 53}, { "zmm17", 54}, { "zmm18", 55}, { "zmm19", 56}, \ + { "zmm20", 57}, { "zmm21", 58}, { "zmm22", 59}, { "zmm23", 60}, \ + { "zmm24", 61}, { "zmm25", 62}, { "zmm26", 63}, { "zmm27", 64}, \ + { "zmm28", 65}, { "zmm29", 66}, { "zmm30", 67}, { "zmm31", 68} } /* Note we are omitting these since currently I don't know how to get gcc to use these, since they want the same but different diff --git 
a/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c new file mode 100644 index 000..d984bff --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx" } */ + +void foo () +{ + register int ymm_var asm ("ymm4"); + + __asm__ __volatile__("vxorpd %%ymm0, %%ymm0, %%ymm7\n" : : : "ymm7" ); +} diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c new file mode 100644 index 000..1bd428a --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512f" } */ + +void foo () +{ + register int zmm_var asm ("zmm9"); + + __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" ); +}
Re: [RFC][gomp4] Offloading: Add device initialization and host->target function mapping
Ping. 2014-03-12 21:56 GMT+04:00 Ilya Verbin : > Hi Thomas, > > Here is a new version of this patch (it was discussed in other thread: > http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00573.html ) with ChangeLog. > Bootstrap and make check passed. > Ok to commit? -- Ilya
Consolidate GCC web pages documentation (4/3)
This nearly brings us to the goal of having just one page covering this and simplifies language in about.html a bit on the way. Applied. Gerald Index: about.html === RCS file: /cvs/gcc/wwwdocs/htdocs/about.html,v retrieving revision 1.20 diff -u -r1.20 about.html --- about.html 17 Feb 2014 01:03:10 - 1.20 +++ about.html 15 Mar 2014 11:43:01 - @@ -14,14 +14,14 @@ contributors. The web effort was originally led by Jeff Law. For the last decade -or so Gerald Pfeifer has been leading the effort, but again, there are +or so Gerald Pfeifer has been leading the effort, but there are lots of people who contribute. The web pages are under CVS control and you can http://gcc.gnu.org/cgi-bin/cvsweb.cgi/wwwdocs/";>browse the repository online. -The pages on gcc.gnu.org are updated "live" (that is, directly after a -change has been made); www.gnu.org is updated once a day at 4:00 -0700 +The pages on gcc.gnu.org are updated "live" directly after a +change has been made; www.gnu.org is updated once a day at 4:00 -0700 (PDT). Please send feedback, problem reports and patches to our @@ -83,6 +83,13 @@ list. +As changes are checked in, the respective pages are preprocessed +via the script wwwdocs/bin/preprocess which in turn +uses a tool called MetaHTML. Among others, this preprocessing +adds CSS style sheets, XML and HTML headers, and our standard +footer. The MetaHTML style sheet is in +wwwdocs/htdocs/style.mhtml. + The host system Index: projects/web.html === RCS file: /cvs/gcc/wwwdocs/htdocs/projects/web.html,v retrieving revision 1.15 diff -u -r1.15 web.html --- projects/web.html 17 Feb 2014 01:03:10 - 1.15 +++ projects/web.html 15 Mar 2014 11:43:01 - @@ -11,12 +11,5 @@ Contributing changes to our web pages is simple. -As changes are checked in, the respective pages are preprocessed -via the script wwwdocs/bin/preprocess which in turn -uses a tool called MetaHTML. Among others, this preprocessing -adds CSS style sheets, XML and HTML headers, and our standard -footer. 
The MetaHTML style sheet is in -wwwdocs/htdocs/style.mhtml. -
Re: [patch, libgfortran] PR46800 Handle CTRL-D correctly with STDIN
On Mon, Mar 17, 2014 at 12:50 AM, Jerry DeLisle wrote: > Hi all. > > The problem here was that when reading a value from STDIN and the user just > entered an empty entry (LF), > we would end up getting nested into a second read (via next_char) and the user > would have to press CTRL-D twice to get out of the read. (The correct behavior > is to only hit CTRL-D once which sends us the EOF.) > > This was caused by a call to eat_separator right after we did the initial > read. > The eat_separator function then tries to read again and we get a condition of > waiting for user input on that read. The patch eliminates this call to > eat_separator. This requires explicitly checking for the comma and end-of-line > conditions which are also done in eat_separator. > > Regression tested on x86-64-gnu. No test case can be done since it requires > terminal input to read. > > OK for trunk? Ok, thanks for the patch. I wonder, would it be possible to set up some dejagnu testcases with multiple programs communicating via pipes or such? We occasionally seem to have regressions dealing with non-seekable files/terminals and such which go undetected for a long time, since we're not regularly testing them. -- Janne Blomqvist
Re: [C++ Patch] PR 59571
On 03/17/2014 05:38 AM, Paolo Carlini wrote: I noticed this issue, which looks simple to fix. The ICE happens in cxx_eval_constant_expression, because it cannot handle a CAST_EXPR (or any other *_CAST, for that matter). In fact check_narrowing calls maybe_constant_value, and, because we are in a template, the latter faces the unfolded CAST_EXPR. Thus it seems easy to just use fold_non_dependent_expr_sfinae. Tested x86_64-linux. OK. PS: looking forward, I'm wondering if some semantics/typeck functions shouldn't try harder before building a tree node and returning, e.g., instead of just checking processing_template_decl, actually checking if type and expr are dependent? Does this kind of audit make sense for next Stage 1? In general adding fold_non_dependent_expr where it's needed is the right answer, because normal operation creates tree patterns that tsubst doesn't understand how to deal with. I suppose it might work to always fully build non-instantiation-dependent expressions, wrap them in NON_DEPENDENT_EXPR, and then unshare its operand at instantiation time. But that would be a significant change with unclear benefit. Jason
Re: [PATCH] Fix PR60505
On Fri, 14 Mar 2014, Cong Hou wrote: > On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener wrote: > > On Fri, 14 Mar 2014, Jakub Jelinek wrote: > > > >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote: > >> > > Consider this fact and if there are alias checks, we can safely remove > >> > > the epilogue if the maximum trip count of the loop is less than or > >> > > equal to the calculated threshold. > >> > > >> > You have to consider n % vf != 0, so an argument on only maximum > >> > trip count or threshold cannot work. > >> > >> Well, if you only check if maximum trip count is <= vf and you know > >> that for n < vf the vectorized loop + it's epilogue path will not be taken, > >> then perhaps you could, but it is a very special case. > >> Now, the question is when we are guaranteed we enter the scalar versioned > >> loop instead for n < vf, is that in case of versioning for alias or > >> versioning for alignment? > > > > I think neither - I have plans to do the cost model check together > > with the versioning condition but didn't get around to implement that. > > That would allow stronger max bounds for the epilogue loop. > > In vect_transform_loop(), check_profitability will be set to true if > th >= VF-1 and the number of iteration is unknown (we only consider > unknown trip count here), where th is calculated based on the > parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum > value VF-1. If the loop needs to be versioned, then > check_profitability with true value will be passed to > vect_loop_versioning(), in which an enhanced loop bound check > (considering cost) will be built. So I think if the loop is versioned > and n < VF, then we must enter the scalar version, and in this case > removing epilogue should be safe when the maximum trip count <= th+1. You mean exactly in the case where the profitability check ensures that n % vf == 0? Thus effectively if n == maximum trip count? That's quite a special case, no? Richard. 
-- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
Re: [PATCH] Fix PR60505
On Mon, Mar 17, 2014 at 02:44:29PM +0100, Richard Biener wrote: > You mean exactly in the case where the profitability check ensures > that n % vf == 0? Thus effectively if n == maximum trip count? > That's quite a special case, no? Indeed it is. But I guess that is pretty much the only case where the following optimizers can fold the array accesses in the (unneeded) epilogue loop from some non-constant indexes to constant ones (because they know that the vector loop will iterate exactly once in that case). Jakub
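To make the n % vf argument in this exchange concrete, the following is a small scalar model of how a vectorized loop splits its iterations between the vector body and the epilogue. It is only a sketch: the vectorization factor of 4 and the function name sum_with_epilogue are assumptions for the example, and nothing here is taken from the vectorizer itself.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4u	/* assumed vectorization factor, for illustration only */

/* Sum N ints the way a vectorized loop is laid out: a main loop that
   consumes VF elements per iteration, then a scalar epilogue for the
   remaining n % VF elements.  The two counters record how often each
   loop body ran.  */
static int
sum_with_epilogue (const int *a, size_t n,
		   size_t *vector_iters, size_t *epilogue_iters)
{
  int sum = 0;
  size_t i = 0;

  *vector_iters = 0;
  *epilogue_iters = 0;

  /* "Vector" loop: one iteration stands for one VF-wide operation.  */
  for (; i + VF <= n; i += VF)
    {
      for (size_t lane = 0; lane < VF; lane++)
	sum += a[i + lane];
      ++*vector_iters;
    }

  /* Epilogue: runs n % VF times, so it is dead only when n % VF == 0.  */
  for (; i < n; i++)
    {
      sum += a[i];
      ++*epilogue_iters;
    }

  assert (*vector_iters * VF + *epilogue_iters == n);
  return sum;
}
```

With n equal to VF the vector loop runs exactly once and the epilogue not at all, which is the special case discussed above; any n that is not a multiple of VF still needs the epilogue.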
Re: [PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)
On Mon, Mar 17, 2014 at 12:16:08PM +0100, Jakub Jelinek wrote: > No. IMHO this needs to be: > || optimize_debug > + || !flag_no_tree_loop_optimize > || (!flag_tree_loop_vectorize > && (global_options_set.x_flag_tree_loop_vectorize I presume you mean !flag_tree_loop_optimize. > > @@ -6834,11 +6835,12 @@ expand_omp_simd (struct omp_region *region, struct > > omp_for_data *fd) > > loop->simduid = OMP_CLAUSE__SIMDUID__DECL (simduid); > > cfun->has_simduid_loops = true; > > } > > - /* If not -fno-tree-loop-vectorize, hint that we want to vectorize > > -the loop. */ > > + /* If not -fno-tree-loop-vectorize of -fno-tree-loop-optimize, > > + hint that we want to vectorize the loop. */ > >if ((flag_tree_loop_vectorize > >|| (!global_options_set.x_flag_tree_loop_vectorize > > - && !global_options_set.x_flag_tree_vectorize)) > > + && !global_options_set.x_flag_tree_vectorize > > + && !global_options_set.x_flag_tree_loop_optimize)) > > Similarly, here it should be added as > > + && flag_tree_loop_optimize > > && loop->safelen > 1) > > The thing is, if -fno-tree-loop-optimize (whether explicitly added by user > or implicitly through other options, then the loop will be never vectorized. > It doesn't matter if -ftree-vectorize was on or not in that case. > > The magic with global_options_set is there to make the loop vectorized > if either -ftree-loop-vectorize is on (implicitly or explicitly), or > at least optimizing and not disabled explicitly (-fno-tree-vectorize), > we then force the vectorization on for the specific loops. > > But -fno-tree-loop-optimize means the whole loop optimization pipeline is > not performed, at that point forcing it on and disabling all other loop > optimizations might be too problematic/error prone. > > E.g. you could try -fopenmp -O -fno-tree-loop-optimize -ftree-vectorize > or -fopenmp -O3 -fno-tree-loop-optimize etc. :( sorry, fixed. No ICE with these options. Regtested on x86_64-linux, ok now? 
2014-03-17 Marek Polacek PR middle-end/60534 * omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same as -fno-tree-loop-vectorize. (expand_omp_simd): Likewise. testsuite/ * gcc.dg/gomp/pr60534.c: New test. diff --git gcc/omp-low.c gcc/omp-low.c index 91c8656..24ef3c8 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -2929,6 +2929,7 @@ omp_max_vf (void) { if (!optimize || optimize_debug + || !flag_tree_loop_optimize || (!flag_tree_loop_vectorize && (global_options_set.x_flag_tree_loop_vectorize || global_options_set.x_flag_tree_vectorize))) @@ -6839,6 +6840,7 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd) if ((flag_tree_loop_vectorize || (!global_options_set.x_flag_tree_loop_vectorize && !global_options_set.x_flag_tree_vectorize)) + && flag_tree_loop_optimize && loop->safelen > 1) { loop->force_vect = true; diff --git gcc/testsuite/gcc.dg/gomp/pr60534.c gcc/testsuite/gcc.dg/gomp/pr60534.c index e69de29..f8a6bdc 100644 --- gcc/testsuite/gcc.dg/gomp/pr60534.c +++ gcc/testsuite/gcc.dg/gomp/pr60534.c @@ -0,0 +1,16 @@ +/* PR middle-end/60534 */ +/* { dg-do compile } */ +/* { dg-options "-fopenmp -O -fno-tree-loop-optimize" } */ + +extern int d[]; + +int +foo (int a) +{ + int c = 0; + int l; +#pragma omp simd reduction(+: c) + for (l = 0; l < a; ++l) +c += d[l]; + return c; +} Marek
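The patched condition in expand_omp_simd can be read as a small predicate over the option state. The sketch below restates it outside of GCC so the cases are easy to check by hand; struct opts and should_force_vect are hypothetical stand-ins, and only the flag names in the comments come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant pieces of GCC's option state.  */
struct opts
{
  bool loop_vectorize;		/* flag_tree_loop_vectorize */
  bool loop_optimize;		/* flag_tree_loop_optimize */
  bool set_loop_vectorize;	/* global_options_set.x_flag_tree_loop_vectorize */
  bool set_vectorize;		/* global_options_set.x_flag_tree_vectorize */
};

/* Mirrors the condition under which the patch sets loop->force_vect:
   vectorization is on, or at least was not explicitly configured,
   the loop optimization pipeline is enabled, and safelen > 1.  */
static bool
should_force_vect (const struct opts *o, int safelen)
{
  assert (o != NULL);
  return ((o->loop_vectorize
	   || (!o->set_loop_vectorize && !o->set_vectorize))
	  && o->loop_optimize
	  && safelen > 1);
}
```

The new term is the loop_optimize conjunct: with -fno-tree-loop-optimize the predicate is false no matter how the vectorizer flags are set, which matches the explanation that the loop would never be vectorized in that configuration anyway.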
Re: [PATCH] Expand OpenMP SIMD even with -fno-tree-loop-optimize (PR middle-end/60534)
On Mon, Mar 17, 2014 at 02:49:41PM +0100, Marek Polacek wrote: > 2014-03-17 Marek Polacek > > PR middle-end/60534 > * omp-low.c (omp_max_vf): Treat -fno-tree-loop-optimize the same > as -fno-tree-loop-vectorize. > (expand_omp_simd): Likewise. > testsuite/ > * gcc.dg/gomp/pr60534.c: New test. Ok, thanks. Jakub
Re: C++ PATCH for c++/58678 (devirt vs. KDE)
On 03/17/2014 04:39 AM, Jan Hubicka wrote: Thank you! I would prefer a different marker than cxa_pure_virtual in the vtable, most probably simply NULL. The reason is that __cxa_pure_virtual will appear as a possible target in the list and it will prevent devirtualization from happening when we end up with __cxa_pure_virtual and the real destructor in the list of possible targets. Hmm? __cxa_pure_virtual is not considered likely, so why wouldn't devirtualization choose the real function instead? gimple_get_virt_method_for_vtable knows that lookups in the vtable that do not result in FUNCTION_DECL should be translated to BUILTIN_UNREACHABLE and ipa-devirt drops these from the list of targets, unlike __cxa_pure_virtual that stays. I don't see the reason for that distinction; either way you get undefined behavior. The only purpose of __cxa_pure_virtual is to give a friendly diagnostic before terminating the program. Another problem with cxa_pure_virtual is that it needs external relocation. I sort of wondered if we don't want to produce a hidden comdat wrapper for it, so C++ programs are easier to relocate. Sure, that would make sense. What do you think of the following patch that makes ipa-devirt conclude that destructor calls are never done on types in construction? If the effect of doing so is undefined, I think it is safe to drop them from the list of targets and that really helps to reduce the lists. That looks good to me. Jason
Re: [PATCH] Fix PR c++/60391
OK. Jason
Re: [PATCH] Fix PR c++/60390
On 03/16/2014 04:44 PM, Adam Butcher wrote: + if (parser->num_classes_being_defined == 0) + while (scope->kind == sk_class) + { + parent_scope = scope; + scope = scope->level_chain; + } + else + while (scope->kind == sk_class + && !TYPE_BEING_DEFINED (scope->this_entity)) + { + parent_scope = scope; + scope = scope->level_chain; + } The special case for 0 seems like an unnecessary optimization. OK without it. Jason
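A sketch of the simplification being asked for, using stand-in types rather than the C++ front end's real ones: struct scope, SK_CLASS, being_defined and skip_completed_classes are all hypothetical, with being_defined playing the role of TYPE_BEING_DEFINED. The sketch assumes that when no class is being defined the flag is false for every class scope, in which case the general loop does the same walk as the special-cased one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum scope_kind { SK_BLOCK, SK_CLASS, SK_NAMESPACE };	/* hypothetical */

struct scope
{
  enum scope_kind kind;
  bool being_defined;		/* stands in for TYPE_BEING_DEFINED */
  struct scope *level_chain;	/* enclosing scope */
};

/* Walk outwards past class scopes whose class is not currently being
   defined, remembering the last scope skipped in *PARENT_SCOPE.  This
   is the second loop of the quoted hunk used unconditionally.  */
static struct scope *
skip_completed_classes (struct scope *scope, struct scope **parent_scope)
{
  assert (scope != NULL);
  while (scope->kind == SK_CLASS && !scope->being_defined)
    {
      *parent_scope = scope;
      scope = scope->level_chain;
    }
  return scope;
}
```

If the outer class is being defined the walk stops there; if nothing is being defined it continues to the first non-class scope, which is what the num_classes_being_defined == 0 branch did.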
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
Hi! On Sat, 8 Mar 2014 18:50:15 +0400, Ilya Verbin wrote: > --- a/libgomp/libgomp.map > +++ b/libgomp/libgomp.map > @@ -208,6 +208,7 @@ GOMP_3.0 { > > GOMP_4.0 { >global: > + GOMP_offload_register; > GOMP_barrier_cancel; > GOMP_cancel; > GOMP_cancellation_point; Now that the GOMP_4.0 symbol version is being used in GCC trunk, and will be in the GCC 4.9 release, can we still add new symbols to it here? (Jakub?) > --- a/libgomp/plugin-host.c > +++ b/libgomp/plugin-host.c > +const int TARGET_TYPE_HOST = 0; We'll have to see whether this (that is, libgomp/target.c:enum target_type) should live in a shared header file, but OK for the moment. > +void > +device_run (void *fn_ptr, void *vars) > +{ > +#ifdef DEBUG > + printf ("libgomp plugin: %s:%s (%p, %p)\n", __FILE__, __FUNCTION__, fn_ptr, > + vars); > +#endif > + > + void (*fn)(void *) = (void (*)(void *)) fn_ptr; > + > + fn (vars); > +} Why not make fn_ptr a proper function pointer? Ah, because of GOMP_target passing (void *) tgt_fn->tgt->tgt_start for the !TARGET_TYPE_HOST case... Would it make sense to have device_run return a value to make it able to indicate to libgomp that the function cannot be run on the device (for whatever reason), and libgomp should use host-fallback execution? (Probably that needs more thought and discussion, OK to defer.) > --- a/libgomp/target.c > +++ b/libgomp/target.c > +enum target_type { > + TARGET_TYPE_HOST, > + TARGET_TYPE_INTEL_MIC > +}; (As discussed above, but OK to defer.) > @@ -120,15 +140,26 @@ struct gomp_device_descr > TARGET construct. */ >int id; > > + /* This is the TYPE of device. */ > + int type; Use enum target_type instead of int? > +/* This function should be called from every offload image. It gets the > + descriptor of the host func and var tables HOST_TABLE, TYPE of the target, > + and TARGET_DATA needed by target plugin (target tables, etc.) 
*/ > +void > +GOMP_offload_register (void *host_table, int type, void *target_data) > +{ > + offload_images = realloc (offload_images, > + (num_offload_images + 1) > + * sizeof (struct offload_image_descr)); > + > + if (offload_images == NULL) > +return; Fail silently, or use gomp_realloc to fail loudly? > @@ -701,16 +836,25 @@ gomp_find_available_plugins (void) > - out: > +out: Emacs wants the space to be there, so I assume that's the coding standard to use. ;-) >if (dir) > closedir (dir); > + free (offload_images); I suggest setting offload_images = NULL, for clarity. > + num_offload_images = 0; > } We may need to revisit this later: currently it's not possible to register additional plugins after libgomp has initialized (gomp_target_init, gomp_find_available_plugins just executed once), but should that ever be made possible, we'd need to preserve offload_images. OK to commit, thanks! Grüße, Thomas
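On the "fail silently or loudly" question, here is a standalone sketch of the loud variant together with the suggested reset to NULL. It is not libgomp code: realloc_or_die, register_image and release_images are hypothetical names, and the descriptor fields are simply guessed from the parameters of GOMP_offload_register in the quoted patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical descriptor; the fields follow the parameters of
   GOMP_offload_register in the quoted patch.  */
struct offload_image_descr
{
  void *host_table;
  int type;
  void *target_data;
};

static struct offload_image_descr *offload_images;
static size_t num_offload_images;

/* Hypothetical "fail loudly" wrapper: report and abort instead of
   returning NULL to the caller.  */
static void *
realloc_or_die (void *old, size_t size)
{
  void *p;

  assert (size > 0);	/* realloc (p, 0) may legitimately return NULL */
  p = realloc (old, size);
  if (p == NULL)
    {
      fprintf (stderr, "out of memory allocating %lu bytes\n",
	       (unsigned long) size);
      abort ();
    }
  return p;
}

/* Append one image to the table, growing it by one element.  */
static void
register_image (void *host_table, int type, void *target_data)
{
  offload_images
    = realloc_or_die (offload_images,
		      (num_offload_images + 1) * sizeof *offload_images);
  offload_images[num_offload_images].host_table = host_table;
  offload_images[num_offload_images].type = type;
  offload_images[num_offload_images].target_data = target_data;
  num_offload_images++;
}

/* Free the table and leave no dangling pointer behind.  */
static void
release_images (void)
{
  free (offload_images);
  offload_images = NULL;
  num_offload_images = 0;
}
```

One thing the wrapper avoids: assigning the result of realloc straight back to offload_images and then returning on NULL loses the only pointer to the old table, so entries registered earlier would be leaked as well as dropped.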
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On Mon, Mar 17, 2014 at 4:53 AM, Kirill Yukhin wrote: > Hello, > Patch in the bottom allows to use ymmXX and zmmXX > register names in inline asm statements as well as > in `register` variables definitions. > > New tests pass. > Bootstrap pass. > > Is it ok for trunk? > Do we need to backport it to 4.8? > > gcc/ > * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add > ymm and zmm register names. > > testsuite/ > * gcc.target/i386/avx-additional-reg-names.c: New. > * gcc.target/i386/avx512f-additional-reg-names.c: Ditto. > > -- > Thanks, K > > commit c3884af93c105115bc1e4d02fa824d24420c5bbf > Author: Kirill Yukhin > Date: Mon Mar 17 14:56:06 2014 +0400 > > [AVX, AVX-512]. Extend ADDITIONAL_REGISTER_NAMES to Ymms and Zmms. > --- > gcc/config/i386/i386.h | 28 > +- > .../gcc.target/i386/avx-additional-reg-names.c | 9 +++ > .../gcc.target/i386/avx512f-additional-reg-names.c | 9 +++ > 3 files changed, 40 insertions(+), 6 deletions(-) > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > index c80878b..c5c1d58 100644 > --- a/gcc/config/i386/i386.h > +++ b/gcc/config/i386/i386.h > @@ -2016,12 +2016,28 @@ do { > \ > /* Table of additional register names to use in user input. 
*/ > > #define ADDITIONAL_REGISTER_NAMES \ > -{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 }, \ > - { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 }, \ > - { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 }, \ > - { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 }, \ > - { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 }, \ > - { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 } } > +{ { "eax", 0 }, { "edx", 1 }, { "ecx", 2 }, { "ebx", 3 }, \ > + { "esi", 4 }, { "edi", 5 }, { "ebp", 6 }, { "esp", 7 }, \ > + { "rax", 0 }, { "rdx", 1 }, { "rcx", 2 }, { "rbx", 3 }, \ > + { "rsi", 4 }, { "rdi", 5 }, { "rbp", 6 }, { "rsp", 7 }, \ > + { "al", 0 }, { "dl", 1 }, { "cl", 2 }, { "bl", 3 }, \ > + { "ah", 0 }, { "dh", 1 }, { "ch", 2 }, { "bh", 3 }, \ > + { "ymm0", 21}, { "ymm1", 22}, { "ymm2", 23}, { "ymm3", 24}, \ > + { "ymm4", 25}, { "ymm5", 26}, { "ymm6", 27}, { "ymm7", 28}, \ > + { "ymm8", 45}, { "ymm9", 46}, { "ymm10", 47}, { "ymm11", 48}, > \ > + { "ymm12", 49}, { "ymm13", 50}, { "ymm14", 51}, { "ymm15", 52}, \ > + { "ymm16", 53}, { "ymm17", 54}, { "ymm18", 55}, { "ymm19", 56}, \ > + { "ymm20", 57}, { "ymm21", 58}, { "ymm22", 59}, { "ymm23", 60}, \ > + { "ymm24", 61}, { "ymm25", 62}, { "ymm26", 63}, { "ymm27", 64}, \ > + { "ymm28", 65}, { "ymm29", 66}, { "ymm30", 67}, { "ymm31", 68}, \ > + { "zmm0", 21}, { "zmm1", 22}, { "zmm2", 23}, { "zmm3", 24}, \ > + { "zmm4", 25}, { "zmm5", 26}, { "zmm6", 27}, { "zmm7", 28}, \ > + { "zmm8", 45}, { "zmm9", 46}, { "zmm10", 47}, { "zmm11", 48}, > \ > + { "zmm12", 49}, { "zmm13", 50}, { "zmm14", 51}, { "zmm15", 52}, \ > + { "zmm16", 53}, { "zmm17", 54}, { "zmm18", 55}, { "zmm19", 56}, \ > + { "zmm20", 57}, { "zmm21", 58}, { "zmm22", 59}, { "zmm23", 60}, \ > + { "zmm24", 61}, { "zmm25", 62}, { "zmm26", 63}, { "zmm27", 64}, \ > + { "zmm28", 65}, { "zmm29", 66}, { "zmm30", 67}, { "zmm31", 68} } > > /* Note we are omitting these since currently I don't know how > to get gcc to use these, since they want the same but 
different > diff --git a/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c > b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c > new file mode 100644 > index 000..d984bff > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx-additional-reg-names.c > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx" } */ > + > +void foo () > +{ > + register int ymm_var asm ("ymm4"); > + > + __asm__ __volatile__("vxorpd %%ymm0, %%ymm0, %%ymm7\n" : : : "ymm7" ); > +} Doesn't GCC generate the same code with xmm? > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c > b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c > new file mode 100644 > index 000..1bd428a > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c > @@ -0,0 +1,9 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512f" } */ > + > +void foo () > +{ > + register int zmm_var asm ("zmm9"); > + > + __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" ); > +} Doesn't GCC generate the same code with xmm? -- H.J.
Re: [PATCH] Fix PR60505
On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener wrote: > On Fri, 14 Mar 2014, Cong Hou wrote: > >> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener wrote: >> > On Fri, 14 Mar 2014, Jakub Jelinek wrote: >> > >> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote: >> >> > > Consider this fact and if there are alias checks, we can safely remove >> >> > > the epilogue if the maximum trip count of the loop is less than or >> >> > > equal to the calculated threshold. >> >> > >> >> > You have to consider n % vf != 0, so an argument on only maximum >> >> > trip count or threshold cannot work. >> >> >> >> Well, if you only check if maximum trip count is <= vf and you know >> >> that for n < vf the vectorized loop + it's epilogue path will not be >> >> taken, >> >> then perhaps you could, but it is a very special case. >> >> Now, the question is when we are guaranteed we enter the scalar versioned >> >> loop instead for n < vf, is that in case of versioning for alias or >> >> versioning for alignment? >> > >> > I think neither - I have plans to do the cost model check together >> > with the versioning condition but didn't get around to implement that. >> > That would allow stronger max bounds for the epilogue loop. >> >> In vect_transform_loop(), check_profitability will be set to true if >> th >= VF-1 and the number of iteration is unknown (we only consider >> unknown trip count here), where th is calculated based on the >> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum >> value VF-1. If the loop needs to be versioned, then >> check_profitability with true value will be passed to >> vect_loop_versioning(), in which an enhanced loop bound check >> (considering cost) will be built. So I think if the loop is versioned >> and n < VF, then we must enter the scalar version, and in this case >> removing epilogue should be safe when the maximum trip count <= th+1. 
> > You mean exactly in the case where the profitability check ensures > that n % vf == 0? Thus effectively if n == maximum trip count? > That's quite a special case, no? Yes, it is a special case. But it is in this special case that those warnings are thrown out. Also, I think declaring an array with VF*N as length is not unusual. thanks, Cong > > Richard. > > -- > Richard Biener > SUSE / SUSE Labs > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 > GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
Re: Fwd: [RFC][gomp4] Offloading patches (2/3): Add tables generation
On Mon, Mar 17, 2014 at 04:00:11PM +0100, Thomas Schwinge wrote: > Hi! > > On Sat, 8 Mar 2014 18:50:15 +0400, Ilya Verbin wrote: > > --- a/libgomp/libgomp.map > > +++ b/libgomp/libgomp.map > > @@ -208,6 +208,7 @@ GOMP_3.0 { > > > > GOMP_4.0 { > >global: > > + GOMP_offload_register; > > GOMP_barrier_cancel; > > GOMP_cancel; > > GOMP_cancellation_point; > > Now that the GOMP_4.0 symbol version is being used in GCC trunk, and will > be in the GCC 4.9 release, can we still add new symbols to it here? > (Jakub?) If the GCC 4.9 release will not include that symbol, then it must be in a new symbol version, e.g. GOMP_4.1 (note that the GOMP_ symbol version matching the OpenMP standard version, as it does now, wasn't always true and might not always be true either), or we could use a GOMP_4.0.1 symver. Jakub
Re: [PATCH][AArch64] vqneg and vqabs intrinsics implementation
On 12 February 2014 10:54, Alex Velenko wrote: > Hi, > > This patch implements vqneg_s64, vqnegd_s64, vqabs_s64 and > vqabsd_s64 AArch64 intrinsics. Regression tests added. > Run full regression with no regressions. > > Is patch OK? > > Thanks, > Alex > > gcc/ > > 2014-02-12 Alex Velenko > > * gcc/config/aarch64/aarch64-simd.md (aarch64_s): > Pattern extended. > * config/aarch64/aarch64-simd-builtins.def (sqneg): Iterator > extended. > (sqabs): Likewise. > * config/aarch64/arm_neon.h (vqneg_s64): New intrinsic. > (vqnegd_s64): Likewise. > (vqabs_s64): Likewise. > (vqabsd_s64): Likewise. > > gcc/testsuite/ > > 2014-02-12 Alex Velenko > > *gcc.target/aarch64/vqneg_s64_1.c: New testcase. > *gcc.target/aarch64/vqabs_s64_1.c: New testcase. OK for stage-1 /Marcus
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu wrote: >> Patch in the bottom allows to use ymmXX and zmmXX >> register names in inline asm statements as well as >> in `register` variables definitions. >> >> New tests pass. >> Bootstrap pass. >> >> Is it ok for trunk? >> Do we need to backport it to 4.8? >> >> gcc/ >> * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add >> ymm and zmm register names. >> >> testsuite/ >> * gcc.target/i386/avx-additional-reg-names.c: New. >> * gcc.target/i386/avx512f-additional-reg-names.c: Ditto. > Doesn't GCC generate the same code with xmm? > >> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c >> b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c >> new file mode 100644 >> index 000..1bd428a >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c >> @@ -0,0 +1,9 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-mavx512f" } */ >> + >> +void foo () >> +{ >> + register int zmm_var asm ("zmm9"); >> + >> + __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" ); >> +} > > Doesn't GCC generate the same code with xmm? It does, but the situation is the same as with %eax vs. %rax names. So, I think the patch is OK for mainline, and similar patch involving only %ymm names for AVX-enabled branches. Uros.
Re: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk
On Fri, Mar 07, 2014 at 09:21:48PM +0100, Thomas Schwinge wrote: > Maybe it's just too late on a Friday evening, but I don't understand this > change, part of r204863. GF_OMP_FOR_KIND_FOR has the value zero; > shouldn't this comparison have remained unchanged? Is the following > (untested) patch OK for trunk? Does this need a test case? > > commit f3c7834ecbedc50e04223d24b1b671fc8a62c169 > Author: Thomas Schwinge > Date: Fri Mar 7 21:11:43 2014 +0100 > > Restore check for OpenMP for construct. > > gcc/ > * omp-low.c (lower_rec_input_clauses) : Restore > check for GF_OMP_FOR_KIND_FOR. Ok for trunk, sorry for the delay. > diff --git gcc/omp-low.c gcc/omp-low.c > index 4dc3956..713a4ae 100644 > --- gcc/omp-low.c > +++ gcc/omp-low.c > @@ -3915,7 +3915,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq > *ilist, gimple_seq *dlist, >/* Don't add any barrier for #pragma omp simd or >#pragma omp distribute. */ >if (gimple_code (ctx->stmt) != GIMPLE_OMP_FOR > - || gimple_omp_for_kind (ctx->stmt) & GF_OMP_FOR_KIND_FOR) > + || gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_FOR) > gimple_seq_add_stmt (ilist, build_omp_barrier (NULL_TREE)); > } > Jakub
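The reason the restored == matters is easy to show in isolation: masking with a constant whose value is zero can never be nonzero. In the sketch below only the fact that the "for" kind is zero comes from the message; the enum name, the other two values, and both helper functions are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative kind values.  The message states that
   GF_OMP_FOR_KIND_FOR is zero; the other values are assumptions.  */
enum for_kind
{
  KIND_FOR = 0,
  KIND_SIMD = 1,
  KIND_DISTRIBUTE = 2
};

/* The broken form: K & 0 is 0 for every K, so this never reports a
   plain "for" loop.  */
static bool
is_for_by_mask (enum for_kind k)
{
  bool hit = (k & KIND_FOR) != 0;

  assert (!hit);
  return hit;
}

/* The restored form: an equality test against the zero-valued kind.  */
static bool
is_for_by_equality (enum for_kind k)
{
  return k == KIND_FOR;
}
```

With the mask form, the second operand of the || in lower_rec_input_clauses is always false, so the barrier would only be added for statements that are not GIMPLE_OMP_FOR at all.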
Re: [PATCH][AARCH64]Amend AArch64 frame layout comment.
On 16/03/14 11:25, Renlin Li wrote: > Hi all, > > This is a simple patch to update the AArch64 frame layout comment in > the source code. > frame_pointer should point above the local_variables section as we > define FRAME_GROWS_DOWNWARD = 1. > > Is this Okay for stage-4? > OK. R.
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On Mon, Mar 17, 2014 at 9:52 AM, Uros Bizjak wrote: > On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu wrote: > >>> Patch in the bottom allows to use ymmXX and zmmXX >>> register names in inline asm statements as well as >>> in `register` variables definitions. >>> >>> New tests pass. >>> Bootstrap pass. >>> >>> Is it ok for trunk? >>> Do we need to backport it to 4.8? >>> >>> gcc/ >>> * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add >>> ymm and zmm register names. >>> >>> testsuite/ >>> * gcc.target/i386/avx-additional-reg-names.c: New. >>> * gcc.target/i386/avx512f-additional-reg-names.c: Ditto. > >> Doesn't GCC generate the same code with xmm? >> >>> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c >>> b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c >>> new file mode 100644 >>> index 000..1bd428a >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c >>> @@ -0,0 +1,9 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-mavx512f" } */ >>> + >>> +void foo () >>> +{ >>> + register int zmm_var asm ("zmm9"); >>> + >>> + __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" ); >>> +} >> >> Doesn't GCC generate the same code with xmm? > > It does, but the situation is the same as with %eax vs. %rax names. > So, I think the patch is OK for mainline, and similar patch involving > only %ymm names for AVX-enabled branches. > If I want to write codes with asm statements which can be compiled with GCC 4.6 and above, I will use xmm instead of ymm. It makes ymm less attractive. -- H.J.
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On Mon, Mar 17, 2014 at 10:11 AM, H.J. Lu wrote: > On Mon, Mar 17, 2014 at 9:52 AM, Uros Bizjak wrote: >> On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu wrote: >> Patch in the bottom allows to use ymmXX and zmmXX register names in inline asm statements as well as in `register` variables definitions. New tests pass. Bootstrap pass. Is it ok for trunk? Do we need to backport it to 4.8? gcc/ * config/i386/i386.h (ADDITIONAL_REGISTER_NAMES): Add ymm and zmm register names. testsuite/ * gcc.target/i386/avx-additional-reg-names.c: New. * gcc.target/i386/avx512f-additional-reg-names.c: Ditto. >> >>> Doesn't GCC generate the same code with xmm? >>> diff --git a/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c new file mode 100644 index 000..1bd428a --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512f-additional-reg-names.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512f" } */ + +void foo () +{ + register int zmm_var asm ("zmm9"); + + __asm__ __volatile__("vxorpd %%zmm0, %%zmm0, %%zmm7\n" : : : "zmm7" ); +} >>> >>> Doesn't GCC generate the same code with xmm? >> >> It does, but the situation is the same as with %eax vs. %rax names. >> So, I think the patch is OK for mainline, and similar patch involving >> only %ymm names for AVX-enabled branches. >> > > If I want to write codes with asm statements which can > be compiled with GCC 4.6 and above, I will use xmm > instead of ymm. It makes ymm less attractive. > BTW, in glibc, there are asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" ); and asm volatile ("vmovdqa %0, %%ymm0" : : "x" (ymm) : "xmm0" ); -- H.J.
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On 17 Mar 17:52, Uros Bizjak wrote: > On Mon, Mar 17, 2014 at 4:12 PM, H.J. Lu wrote: > > >> Is it ok for trunk? > >> Do we need to backport it to 4.8? > It does, but the situation is the same as with %eax vs. %rax names. > So, I think the patch is OK for mainline, and similar patch involving > only %ymm names for AVX-enabled branches. Thanks, Uroš! A couple of questions. First, are the AVX-enabled branches 4.8 and 4.7? I suspect that 4.6 is out of support. Second, I didn't understand HJ's point at all. Did you? (I'll try to reach him via internal IM.) -- Thanks, K
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On 17 Mar 10:16, H.J. Lu wrote: > BTW, in glibc, there are > > asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" ); Maybe. But I believe it is much clearer to have this instead: asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "zmm0" ); -- Thanks, K
Re: C++ PATCH for c++/58678 (devirt vs. KDE)
> On 03/17/2014 04:39 AM, Jan Hubicka wrote: > >Thank you! would prefer different marker than cxa_pure_virtual in the > >vtable, > >most probably simply NULL. > > > >The reason is that __cxa_pure_virtual will appear as a possible target in the > >list and it will prevent devirtualization to happen when we end up with > >__cxa_pure_virtual and real destructor in the list of possible targets. > > Hmm? __cxa_pure_virtual is not considered likely, so why wouldn't > devirtualization choose the real function instead? If you get a list like ~foo(), __cxa_pure_virtual, you will get speculative devirtualization to ~foo. If you get ~foo(), NULL, the NULL will get translated to BUILTIN_UNREACHABLE and that will be dropped from the list, so you will end up with an unconditional call of ~foo(). I think in general we cannot skip cxa_pure_virtual, since people want friendly diagnostics on broken programs instead of getting a devirtualized call to some random other function. I was under the impression that in this case we know that the virtual table entry won't be used, so full devirtualization would be possible. > > >gimple_get_virt_method_for_vtable knows that lookup in vtable that do not > >result in FUNCTION_DECL should be translated to BUILTIN_UNREACHABLE and > >ipa-devirt drops these from list of targets, unlike __cxa_pure_virtual that > >stays. > > I don't see the reason for that distinction; either way you get > undefined behavior. The only purpose of __cxa_pure_virtual is to > give a friendly diagnostic before terminating the program. I can drop the handling of cxa_pure_virtual if unconditional devirtualization is desirable, or perhaps do it under some switch. Target lists containing one cxa_pure_virtual and one extra function are common. > > >Other problem with cxa_pure_virtual is that it needs external relocation. > >I sort of wondered if we don't want to produce hidden comdat wrapper for > >it, so C++ programs are easier to relocate. > > Sure, that would make sense. 
> > >What do you think of the following patch that makes ipa-devirt to conclude > >that destructor calls are never done on types in construction. > >If effect of doing so is undefined, I think it is safe to drop them from > >list of targets and that really helps to reduce lists down. > > That looks good to me. Thanks. I am away for the next 4 days at an Alaska hut w/o electricity and will check my email afterwards. > > Jason
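The difference between a NULL slot and a __cxa_pure_virtual slot in the target list can be modelled with a toy C sketch. This is only an illustration of the reasoning in the message above, not ipa-devirt's code: `target_fn`, `foo_destructor`, `pure_virtual_stub` and `sole_target` are all hypothetical names, and the stub merely stands in for __cxa_pure_virtual.

```c
#include <stddef.h>

typedef void (*target_fn)(void);

/* Hypothetical stand-in for a real destructor such as ~foo().  */
void foo_destructor(void) {}

/* Hypothetical stand-in for __cxa_pure_virtual; the real function
   prints a diagnostic and terminates the program.  */
void pure_virtual_stub(void) {}

/* Drop NULL entries (modelling slots treated as unreachable) and return
   the single remaining target.  Return NULL when no target or more than
   one target remains, i.e. when an unconditional call is not possible
   in this model.  */
target_fn sole_target(const target_fn *targets, size_t count)
{
    target_fn found = NULL;

    for (size_t i = 0; i < count; i++) {
        if (targets[i] == NULL)
            continue;           /* unreachable slot: ignore it */
        if (found != NULL)
            return NULL;        /* second live target: cannot pick one */
        found = targets[i];
    }
    return found;
}
```

In this model the list { ~foo, NULL } collapses to a single target, while { ~foo, stub } keeps two live entries, which corresponds to the speculative rather than unconditional devirtualization described above.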
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On Mon, Mar 17, 2014 at 10:37 AM, Kirill Yukhin wrote: > On 17 Mar 10:16, H.J. Lu wrote: >> BTW, in glibc, there are >> >> asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" ); > Maybe. But I believe it is much clearer to have this instead: >asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "zmm0" ); > My issue is that this is a user-visible change. Code using ymm that works with GCC 4.9 won't work with the installed GCC 4.6/4.7/4.8. This change introduces GCC portability issues without significant benefit. -- H.J.
Re: [PATCH, i386, AVX, AVX-512] Extend ADDITION_REGISTER_NAMES to XMMs and YMMs.
On Mon, Mar 17, 2014 at 11:26:58AM -0700, H.J. Lu wrote: > On Mon, Mar 17, 2014 at 10:37 AM, Kirill Yukhin > wrote: > > On 17 Mar 10:16, H.J. Lu wrote: > >> BTW, in glibc, there are > >> > >> asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "xmm0" ); > > Maybe. But I believe it is much clearer to have this instead: > >asm volatile ("vmovdqa64 %0, %%zmm0" : : "x" (zmm) : "zmm0" ); > > > > My issue is that this is a user-visible change. Code using ymm that > works with GCC 4.9 won't work with the installed GCC 4.6/4.7/4.8. > This change introduces GCC portability issues without significant > benefit. It is up to the user to decide if they want to be portable to older compilers or not. But it is useful and more intuitive if we also allow specifying the ymm and zmm forms. Jakub
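For readers less familiar with clobber lists, here is a minimal sketch of the portable spelling discussed in this thread. The function name is hypothetical, and the example deliberately uses an SSE2 instruction with the "xmm0" clobber so that it builds with older compilers as well; the "ymm0" and "zmm0" spellings under discussion would name the same register once the patch is in, but are not used here because, per the thread, releases before 4.9 would reject them.

```c
/* Hypothetical helper: clear xmm0 in an asm statement, then add two
   integers.  The asm runs only on x86-64 with a GNU-compatible compiler;
   elsewhere the function is a plain addition.  */
int add_after_xmm0_clear(int a, int b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* The third colon-separated field is the clobber list.  Naming xmm0
       there tells the compiler that the statement overwrites that
       register, so no live value may be kept in it across the asm.  */
    __asm__ __volatile__("pxor %%xmm0, %%xmm0" : : : "xmm0");
#endif
    return a + b;
}
```

The trade-off debated above is purely about which spellings the compiler accepts in that clobber field, since the register allocator treats the narrower and wider names as the same hardware register.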
[PATCH] Fix -fsanitize=undefined -flto (PR sanitizer/60535)
Hi! Apparently rest_of_decl_compilation only calls varpool_finalize_decl if not in_lto_p, so this patch calls it explicitly after that call to make sure with -flto we register the newly created vars with varpool as well. Additionally, the patch gives name to a few further builtin types, so that the null-4.c and overflow-int128.c tests don't fail with -flto (without the lto-lang.c change they printed as type name). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-03-17 Jakub Jelinek PR sanitizer/60535 * ubsan.c (ubsan_type_descriptor, ubsan_create_data): Call varpool_finalize_decl after rest_of_decl_compilation. lto/ * lto-lang.c (lto_init): Add NAME_TYPE for int128_integer_type_node and complex_{float,{,long_}double}_type_node. testsuite/ * c-c++-common/ubsan/null-1.c: Don't skip if -flto. * c-c++-common/ubsan/null-2.c: Likewise. * c-c++-common/ubsan/null-3.c: Likewise. * c-c++-common/ubsan/null-4.c: Likewise. * c-c++-common/ubsan/null-5.c: Likewise. * c-c++-common/ubsan/null-6.c: Likewise. * c-c++-common/ubsan/null-7.c: Likewise. * c-c++-common/ubsan/null-8.c: Likewise. * c-c++-common/ubsan/null-9.c: Likewise. * c-c++-common/ubsan/null-10.c: Likewise. * c-c++-common/ubsan/null-11.c: Likewise. * c-c++-common/ubsan/overflow-1.c: Likewise. * c-c++-common/ubsan/overflow-2.c: Likewise. * c-c++-common/ubsan/overflow-add-1.c: Likewise. * c-c++-common/ubsan/overflow-add-2.c: Likewise. * c-c++-common/ubsan/overflow-int128.c: Likewise. * c-c++-common/ubsan/overflow-mul-1.c: Likewise. * c-c++-common/ubsan/overflow-mul-2.c: Likewise. * c-c++-common/ubsan/overflow-mul-3.c: Likewise. * c-c++-common/ubsan/overflow-mul-4.c: Likewise. * c-c++-common/ubsan/overflow-negate-1.c: Likewise. * c-c++-common/ubsan/overflow-negate-2.c: Likewise. * c-c++-common/ubsan/overflow-sub-1.c: Likewise. * c-c++-common/ubsan/overflow-sub-2.c: Likewise. * c-c++-common/ubsan/pr59333.c: Likewise. * c-c++-common/ubsan/pr59503.c: Likewise. * c-c++-common/ubsan/pr59667.c: Likewise. 
* c-c++-common/ubsan/undefined-1.c: Likewise. * g++.dg/ubsan/pr59250.C: Likewise. * g++.dg/ubsan/pr59306.C: Likewise. --- gcc/ubsan.c.jj 2014-01-08 17:45:06.0 +0100 +++ gcc/ubsan.c 2014-03-17 14:09:40.280376415 +0100 @@ -391,6 +391,7 @@ ubsan_type_descriptor (tree type, bool w TREE_STATIC (ctor) = 1; DECL_INITIAL (decl) = ctor; rest_of_decl_compilation (decl, 1, 0); + varpool_finalize_decl (decl); /* Save the VAR_DECL into the hash table. */ decl_for_type_insert (type, decl); @@ -502,6 +503,7 @@ ubsan_create_data (const char *name, loc TREE_STATIC (ctor) = 1; DECL_INITIAL (var) = ctor; rest_of_decl_compilation (var, 1, 0); + varpool_finalize_decl (var); return var; } --- gcc/lto/lto-lang.c.jj 2014-03-10 10:50:15.0 +0100 +++ gcc/lto/lto-lang.c 2014-03-17 15:49:10.592371589 +0100 @@ -1222,6 +1222,13 @@ lto_init (void) NAME_TYPE (long_double_type_node, "long double"); NAME_TYPE (void_type_node, "void"); NAME_TYPE (boolean_type_node, "bool"); + NAME_TYPE (complex_float_type_node, "complex float"); + NAME_TYPE (complex_double_type_node, "complex double"); + NAME_TYPE (complex_long_double_type_node, "complex long double"); +#if HOST_BITS_PER_WIDE_INT >= 64 + if (targetm.scalar_mode_supported_p (TImode)) +NAME_TYPE (int128_integer_type_node, "__int128"); +#endif #undef NAME_TYPE /* Initialize LTO-specific data structures. 
*/ --- gcc/testsuite/c-c++-common/ubsan/null-1.c.jj2013-11-19 21:56:24.566416519 +0100 +++ gcc/testsuite/c-c++-common/ubsan/null-1.c 2014-03-17 13:23:46.057000209 +0100 @@ -1,7 +1,6 @@ /* { dg-do run } */ /* { dg-options "-fsanitize=null -w" } */ /* { dg-shouldfail "ubsan" } */ -/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ int main (void) --- gcc/testsuite/c-c++-common/ubsan/null-2.c.jj2013-11-19 21:56:24.566416519 +0100 +++ gcc/testsuite/c-c++-common/ubsan/null-2.c 2014-03-17 13:23:46.06592 +0100 @@ -1,7 +1,6 @@ /* { dg-do run } */ /* { dg-options "-fsanitize=null -w" } */ /* { dg-shouldfail "ubsan" } */ -/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ int main (void) --- gcc/testsuite/c-c++-common/ubsan/null-3.c.jj2013-11-19 21:56:24.567416516 +0100 +++ gcc/testsuite/c-c++-common/ubsan/null-3.c 2014-03-17 13:23:46.063000958 +0100 @@ -1,7 +1,6 @@ /* { dg-do run } */ /* { dg-options "-fsanitize=null -w" } */ /* { dg-shouldfail "ubsan" } */ -/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ int foo (int *p) --- gcc/testsuite/c-c++-common/ubsan/null-4.c.jj2013-11-19 21:56:24.567416516 +0100 +++ gcc/testsuite/c-c++-common/ubsan/null-4.c 2014-03-17 15:37:15.977422737 +0100 @@ -1,7
Re: [PATCH] Fix up REG_CFA_ADJUST_CFA note creation in epilogue (PR target/60516)
On 03/17/2014 11:47 AM, Jakub Jelinek wrote: > 2014-03-17 Jakub Jelinek > > PR target/60516 > * config/i386/i386.c (ix86_expand_epilogue): Adjust REG_CFA_ADJUST_CFA > note creation for the 2010-08-31 changes. > > * gcc.target/i386/pr60516.c: New test. Ok. r~
[PATCH] Fix up REG_CFA_ADJUST_CFA note creation in epilogue (PR target/60516)
Hi! Since r163679 the pop pattern is no longer a PARALLEL, but uses POST_INC. That commit fixed another spot where REG_CFA_ADJUST_CFA note has been created from the pop insn pattern, but missed this spot which is rarely used (requires popping > 64KB arguments by callee). Bootstrapped/regtested on x86_64-linux and i686-linux, Kai has tested this on some mingw32 or what. Ok for trunk? 2014-03-17 Jakub Jelinek PR target/60516 * config/i386/i386.c (ix86_expand_epilogue): Adjust REG_CFA_ADJUST_CFA note creation for the 2010-08-31 changes. * gcc.target/i386/pr60516.c: New test. --- gcc/config/i386/i386.c.jj 2014-03-13 21:54:53.0 +0100 +++ gcc/config/i386/i386.c 2014-03-17 07:19:28.461411964 +0100 @@ -11708,8 +11708,9 @@ ix86_expand_epilogue (int style) m->fs.cfa_offset -= UNITS_PER_WORD; m->fs.sp_offset -= UNITS_PER_WORD; - add_reg_note (insn, REG_CFA_ADJUST_CFA, - copy_rtx (XVECEXP (PATTERN (insn), 0, 1))); + rtx x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD); + x = gen_rtx_SET (VOIDmode, stack_pointer_rtx, x); + add_reg_note (insn, REG_CFA_ADJUST_CFA, x); add_reg_note (insn, REG_CFA_REGISTER, gen_rtx_SET (VOIDmode, ecx, pc_rtx)); RTX_FRAME_RELATED_P (insn) = 1; --- gcc/testsuite/gcc.target/i386/pr60516.c.jj 2014-03-17 07:17:14.165158703 +0100 +++ gcc/testsuite/gcc.target/i386/pr60516.c 2014-03-17 07:16:59.343275735 +0100 @@ -0,0 +1,20 @@ +/* PR target/60516 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +struct S { char c[65536]; }; + +__attribute__((ms_abi, thiscall)) void +foo (void *x, struct S y) +{ +} + +__attribute__((ms_abi, fastcall)) void +bar (void *x, void *y, struct S z) +{ +} + +__attribute__((ms_abi, stdcall)) void +baz (struct S x) +{ +} Jakub
Re: [PATCH] BZ60501: Add addptr optab
On 2014-03-13, 7:37 AM, Andreas Krebbel wrote: On 13/03/14 12:25, Richard Biener wrote: On Thu, Mar 13, 2014 at 12:16 PM, Eric Botcazou wrote: --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -4720,6 +4720,17 @@ Add operand 2 and operand 1, storing the result in operand 0. All operands must have mode @var{m}. This can be used even on two-address machines, by means of constraints requiring operands 1 and 0 to be the same location. +@cindex @code{addptr@var{m}3} instruction pattern +@item @samp{addptr@var{m}3} +Like @code{addptr@var{m}3} but does never clobber the condition code. +It only needs to be defined if @code{add@var{m}3} either sets the +condition code or address calculations cannot be performed with the +normal add instructions due to other reasons. If adds used for +address calculations and normal adds are not compatible it is required +to expand a distinct pattern (e.g. using an unspec). The pattern is +used by LRA to emit address calculations. @code{add@var{m}3} is used +if @code{addptr@var{m}3} is not defined. I'm a bit skeptical of the "address calculations cannot be performed with the normal add instructions due to other reasons" part". Surely they can be performed on all architectures supported by GCC as of this writing, otherwise how would the compiler even work? And if it's really like @code{add@var{m}3}, why restricting it to addresses, i.e. why calling it @code{addptr@var{m}3}? Does that come from an implementation constraint on s390 that supports it only for a subset of the cases supported by @code{add@var{m}3}? Yeah, isn't it that you want a named pattern like add_nocc for an add that doesn't clobber flags? This would suggest that you can use the pattern also for performing a normal add in case the condition code is not needed afterwards but this isn't correct for s390 31 bit where an address calculation is actually something different. 
addptr is better I think because it is a pattern which is supposed to be implemented with a load address instruction and the middle-end guarantees to use it only on addresses. (I hope LRA is actually behaving that way). Perhaps loadptr or la or loadaddress would be a better name? It is complicated. There is no guarantee that it is used only for addresses. I need some time to think how to fix it. Meanwhile, you *should* commit the patch into the trunk because it solves the real problem. And I can work from this to make changes that the new pattern is only used for addresses. The patch is absolutely safe for all targets but s390. There is still a tiny possibility that it might result in some problems for s390 (now I see only one situation when a pseudo in a subreg changed by equiv plus expr needs a reload). In any case your patch solves real numerous failures and can be used as a base for further work. Thanks for working on this problem, Andreas. Sorry that I missed the PR60501.
[patch testsuite]: Fix some mingw testcases in gcc.dg
Hello, this patch fixes some regressions introduced by default-option -fms-extensions for mingw-targets. ChangeLog 2014-03-17 Kai Tietz * anon-struct-1.c: Add -fno-ms-extensions option for mingw targets. * anon-struct-11.c: Likewise. * anon-struct-2.c: Likewise. * c11-anon-struct-2.c: Likewise. * c11-anon-struct-3.c: Likewise. Tested for i686-w64-mingw32, and x86_64-unknown-linux-gnu. Ok for apply? Regards, Kai Index: anon-struct-1.c === --- anon-struct-1.c(Revision 208594) +++ anon-struct-1.c(Arbeitskopie) @@ -1,4 +1,5 @@ /* { dg-options "-std=iso9899:1990 -pedantic" } */ +/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */ /* In strict ISO C mode, we don't recognize the anonymous struct/union extension or any Microsoft extensions. */ Index: anon-struct-11.c === --- anon-struct-11.c(Revision 208594) +++ anon-struct-11.c(Arbeitskopie) @@ -3,6 +3,7 @@ /* No special options--in particular, turn off the default -pedantic-errors option. */ /* { dg-options "" } */ +/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */ /* When not using -fplan9-extensions, we don't support automatic conversion of pointer types, and we don't support referring to a Index: anon-struct-2.c === --- anon-struct-2.c(Revision 208594) +++ anon-struct-2.c(Arbeitskopie) @@ -1,4 +1,5 @@ /* { dg-options "-std=gnu89" } */ +/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */ /* In GNU C mode, we recognize the anonymous struct/union extension, but not Microsoft extensions. */ Index: c11-anon-struct-2.c === --- c11-anon-struct-2.c(Revision 208594) +++ c11-anon-struct-2.c(Arbeitskopie) @@ -2,6 +2,7 @@ cases. */ /* { dg-do compile } */ /* { dg-options "-std=c11 -pedantic-errors" } */ +/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */ typedef struct s0 { Index: c11-anon-struct-3.c === --- c11-anon-struct-3.c(Revision 208594) +++ c11-anon-struct-3.c(Arbeitskopie) @@ -2,6 +2,7 @@ cases: typedefs disallowed by N1549. 
*/ /* { dg-do compile } */ /* { dg-options "-std=c11 -pedantic-errors" } */ +/* { dg-additional-options "-fno-ms-extensions" { target *-*-mingw* } } */ typedef struct {
Re: [patch testsuite]: Fix some mingw testcases in gcc.dg
Hi Kai, > this patch fixes some regressions introduced by default-option > -fms-extensions for mingw-targets. you should state in your submissions *which* regressions were introduced/*which* problem you are fixing. While this may be obvious to you, it's often not so to reviewers. > ChangeLog > > 2014-03-17 Kai Tietz > > * anon-struct-1.c: Add -fno-ms-extensions option for mingw targets. > * anon-struct-11.c: Likewise. > * anon-struct-2.c: Likewise. > * c11-anon-struct-2.c: Likewise. > * c11-anon-struct-3.c: Likewise. gcc.dg/ prefix missing in ChangeLog entries. > Tested for i686-w64-mingw32, and x86_64-unknown-linux-gnu. Ok for apply? Ok with that change. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch testsuite]: Fix some mingw testcases in gcc.dg
2014-03-17 21:50 GMT+01:00 Rainer Orth : > Hi Kai, > >> this patch fixes some regressions introduced by default-option >> -fms-extensions for mingw-targets. > > you should state in your submissions *which* regressions were > introduced/*which* problem you are fixing. While this may be obvious to > you, it's often not so to reviewers. I did. "The regressions in testsuite are introduced by turning on the state of -fms-extensions." That is all; there is nothing more to tell. >> ChangeLog >> >> 2014-03-17 Kai Tietz >> >> * anon-struct-1.c: Add -fno-ms-extensions option for mingw targets. >> * anon-struct-11.c: Likewise. >> * anon-struct-2.c: Likewise. >> * c11-anon-struct-2.c: Likewise. >> * c11-anon-struct-3.c: Likewise. > > gcc.dg/ prefix missing in ChangeLog entries. > >> Tested for i686-w64-mingw32, and x86_64-unknown-linux-gnu. Ok for apply? > > Ok with that change. > > Rainer > > -- > - > Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch testsuite]: Fix some mingw testcases in gcc.dg
Kai Tietz writes: > 2014-03-17 21:50 GMT+01:00 Rainer Orth : >> Hi Kai, >> >>> this patch fixes some regressions introduced by default-option >>> -fms-extensions for mingw-targets. >> >> you should state in your submissions *which* regressions were >> introduced/*which* problem you are fixing. While this may be obvious to >> you, it's often not so to reviewers. > > I did. "The regressions in testsuite are introduced by turning on the > state of -fms-extensions." That is all; there is nothing more to tell. You didn't. *Which* regressions? What happens? I had to infer it from a comment in one of the changed testcases: /* In strict ISO C mode, we don't recognize the anonymous struct/union extension or any Microsoft extensions. */ If you had cited the compiler error you get for one of the testcases, everything would have been clear. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
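As background for the anonymous struct/union feature these testcases exercise, here is a small sketch of the form that is standard in C11. The type and function names are hypothetical. It does not reproduce the Microsoft extension or the mingw failure itself, since the thread does not quote the actual diagnostics.

```c
/* C11 anonymous union and struct members: the inner fields are accessed
   as if they were declared directly in the enclosing struct.  */
struct point_or_raw {
    union {
        struct { int x; int y; };   /* anonymous struct */
        int raw[2];                 /* alternative view, unused below */
    };
};

/* Store through the anonymous members and read the same members back.  */
int sum_point_fields(void)
{
    struct point_or_raw p;

    p.x = 3;
    p.y = 4;
    return p.x + p.y;
}
```

In strict ISO C90 mode with -pedantic the same declaration is expected to be diagnosed, which is the behavior the comment quoted above refers to.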
Re: [PATCH] x86: Define _mm*_undefined_*
On Mon, Mar 17, 2014 at 7:39 AM, Ilya Tocar wrote: > Do you know of any cases where xor is > generated (except for destination in gather/scatter) I don't have any code exhibiting this handy right now. I'll keep an eye out. > but it also clobbers > flags. Maybe just define it to setzero for now? What do you mean by "clobbers flags"? Do you have an example?
extending constants in rtl
So, to support things like this: (define_constants (C1_TEMP_REGNUM PROLOGUE_SCRATCH_1) (C1_TEMP2_REGNUM PROLOGUE_SCRATCH_2) I need the rtl reader to do less checking. When we turn off int validation, this then works, and we get: #define C1_TEMP_REGNUM PROLOGUE_SCRATCH_1 in insn-constants.h, which is what I wanted. The problem is that I choose a different scratch register based upon the CPU, and this is then used in a clobber in the rtl of a define_insn. I’d be happy to do this some other way, but I didn’t see another way to do it. Absent a better solution, I’d like to pursue this. The only question I have is whether to remove the checking outright or to let the target say that it doesn’t want the checking. diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c index c198b5b..ceef96c 100644 --- a/gcc/read-rtl.c +++ b/gcc/read-rtl.c @@ -807,8 +807,12 @@ validate_const_int (const char *string) valid = 0; break; } +#if 0 + /* In order to support defining the md constants in terms of CPP constants from tm.h, we + can't check this. */ if (!valid) fatal_with_file_and_line ("invalid decimal constant \"%s\"\n", string); +#endif } static void
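Purely as an illustration of a possible middle ground between the two options in the question above, a validator could keep rejecting malformed numbers while letting identifier-shaped strings through. The function below is a hypothetical stand-alone sketch; it is not GCC's validate_const_int and does not use any read-rtl.c internals.

```c
#include <ctype.h>

/* Hypothetical helper: accept either a decimal integer with an optional
   leading '-', or a C identifier such as a macro name from tm.h.
   Returns 1 if the string is accepted and 0 otherwise.  */
int is_decimal_or_identifier(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    if (*p == '\0')
        return 0;

    /* Identifier: a letter or underscore, then letters, digits, '_'.  */
    if (isalpha(*p) || *p == '_') {
        for (p++; *p != '\0'; p++)
            if (!isalnum(*p) && *p != '_')
                return 0;
        return 1;
    }

    /* Decimal constant: optional sign followed by at least one digit.  */
    if (*p == '-')
        p++;
    if (*p == '\0')
        return 0;
    for (; *p != '\0'; p++)
        if (!isdigit(*p))
            return 0;
    return 1;
}
```

Such a check would still catch typos like "12abc" while accepting PROLOGUE_SCRATCH_1; whether that is preferable to a target opt-out is exactly the open question.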
Ping^3 GCC trunk 4.9: documentation patch on plugins
On Sat, 2014-03-08 at 11:15 +0100, Basile Starynkevitch wrote: > I am pinging again this documentation patch > http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00074.html > (pinged at http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01002.html on > Feb. 17th 2014) and also pinged at http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00387.html on March 8th 2014 gcc/ChangeLog entry 2014-03-18 Basile Starynkevitch * doc/plugins.texi (Plugin callbacks): Mention PLUGIN_INCLUDE_FILE. Italicize plugin event names in description. Explain that PLUGIN_PRAGMAS has no sense for lto1. Explain PLUGIN_INCLUDE_FILE. Remind that no GCC functions should be called after PLUGIN_FINISH. Explain what pragmas with expansion are. the patch: Index: gcc/doc/plugins.texi === --- gcc/doc/plugins.texi(revision 207422) +++ gcc/doc/plugins.texi(working copy) @@ -209,6 +209,10 @@ PLUGIN_EARLY_GIMPLE_PASSES_END, /* Called when a pass is first instantiated. */ PLUGIN_NEW_PASS, +/* Called when a file is #include-d or given thru #line directive. + Could happen many times. The event data is the included file path, + as a const char* pointer. */ + PLUGIN_INCLUDE_FILE, PLUGIN_EVENT_FIRST_DYNAMIC/* Dummy event used for indexing callback array. */ @@ -229,15 +233,27 @@ @item @code{void *user_data}: Pointer to plugin-specific data. @end itemize -For the PLUGIN_PASS_MANAGER_SETUP, PLUGIN_INFO, PLUGIN_REGISTER_GGC_ROOTS -and PLUGIN_REGISTER_GGC_CACHES pseudo-events the @code{callback} should be -null, and the @code{user_data} is specific. +For the @i{PLUGIN_PASS_MANAGER_SETUP}, @i{PLUGIN_INFO}, +@i{PLUGIN_REGISTER_GGC_ROOTS} and @i{PLUGIN_REGISTER_GGC_CACHES} +pseudo-events the @code{callback} should be null, and the +@code{user_data} is specific. -When the PLUGIN_PRAGMAS event is triggered (with a null -pointer as data from GCC), plugins may register their own pragmas -using functions like @code{c_register_pragma} or -@code{c_register_pragma_with_expansion}. 
+When the @i{PLUGIN_PRAGMAS} event is triggered (with a null pointer as +data from GCC), plugins may register their own pragmas. Notice that +pragmas are not available from @file{lto1}, so plugins used with +@code{-flto} option to GCC during link-time optimization cannot use +pragmas and do not even see functions like @code{c_register_pragma} or +@code{pragma_lex}. +The @i{PLUGIN_INCLUDE_FILE} event, with a @code{const char*} file path as +GCC data, is triggered for processing of @code{#include} or +@code{#line} directives. + +The @i{PLUGIN_FINISH} event is the last time that plugins can call GCC +functions, notably emit diagnostics with @code{warning}, @code{error} +etc. + + @node Plugins pass @section Interacting with the pass manager @@ -376,10 +392,13 @@ @end smallexample -The @code{PLUGIN_PRAGMAS} callback is called during pragmas -registration. Use the @code{c_register_pragma} or -@code{c_register_pragma_with_expansion} functions to register custom -pragmas. +The @i{PLUGIN_PRAGMAS} callback is called once during pragmas +registration. Use the @code{c_register_pragma}, +@code{c_register_pragma_with_data}, +@code{c_register_pragma_with_expansion}, +@code{c_register_pragma_with_expansion_and_data} functions to register +custom pragmas and their handlers (which often want to call +@code{pragma_lex}) from @file{c-family/c-pragma.h}. @smallexample /* Plugin callback called during pragmas registration. Registered with @@ -397,7 +416,15 @@ It is suggested to pass @code{"GCCPLUGIN"} (or a short name identifying your plugin) as the ``space'' argument of your pragma. +Pragmas registered with @code{c_register_pragma_with_expansion} or +@code{c_register_pragma_with_expansion_and_data} are allowing +preprocessor expansions, like e.g. +@smallexample +#define NUMBER 10 +#pragma GCCPLUGIN foothreshold (NUMBER) +@end smallexample + @node Plugins recording @section Recording information about pass execution # Ok for 4.9? Regards