Re: [AArch64] Implement movmem for the benefit of inline memcpy
On Fri, Aug 1, 2014 at 2:21 AM, wrote: > > >> On Jun 6, 2014, at 1:50 AM, James Greenhalgh >> wrote: >> >> >> Hi, >> >> The move_by_pieces infrastructure performs a copy by repeatedly trying >> the largest safe copy it can make. So for a 15-byte copy we might see: >> >> offset amount bytes copied >> 08 0-7 >> 84 8-11 >> 12 2 12-13 >> 14 1 14 >> >> However, we can implement a 15-byte copy as so: >> >> offset amount bytes copied >> 08 0-7 >> 78 7-14 >> >> Which can prove more efficient for both space and speed. >> >> In this patch we set MOVE_RATIO low to avoid using move_by_pieces, and >> implement the movmem pattern name to expand small block copy cases. Note, >> this >> optimization does not apply for -mstrict-align targets, which must continue >> copying byte-by-byte. > > Why not change move_by_pieces instead of having a target specific code? This > seems like a better option. You can check is unaligned slow target macro to > see if you want to do this optimization too. As I mentioned in the other > email make sure you check the volatile ness of the from and to before doing > this optimization. Attached is the patch which does what I mentioned, I also changed store_by_pieces to implement a similar optimization there (for memset and strcpy). Also since I used SLOW_UNALIGNED_ACCESS, this is a generic optimization. I had tested an earlier version on x86_64-linux-gnu and I am in the middle of bootstrap/testing on this one. Thanks, Andrew Pinski * expr.c (move_by_pieces): Take the min of max_size and len to speed up things and to take advatage of the mode in move_by_pieces_1. (move_by_pieces_1): Read/write the leftovers using an overlapping memory locations to reduce the number of reads/writes. (store_by_pieces_1): Take the min of max_size and len to speed up things and to take advatage of the mode in store_by_pieces_2. (store_by_pieces_2): Write the leftovers using an overlapping memory locations to reduce the number of writes. > > Thanks, > Andrew > > >> >> Setting MOVE_RATIO low in this way causes a few tests to begin failing, >> both of these are documented in the test-case as expected to fail for >> low MOVE_RATIO targets, which do not allow certain Tree-Level optimizations. >> >> Bootstrapped on aarch64-unknown-linux-gnu with no issues. >> >> OK for trunk? >> >> Thanks, >> James >> >> --- >> gcc/ >> >> 2014-06-06 James Greenhalgh >> >>* config/aarch64/aarch64-protos.h (aarch64_expand_movmem): New. >>* config/aarch64/aarch64.c (aarch64_move_pointer): New. >>(aarch64_progress_pointer): Likewise. >>(aarch64_copy_one_part_and_move_pointers): Likewise. >>(aarch64_expand_movmen): Likewise. >>* config/aarch64/aarch64.h (MOVE_RATIO): Set low. >>* config/aarch64/aarch64.md (movmem): New. >> >> gcc/testsuite/ >> >> 2014-06-06 James Greenhalgh >> >>* gcc.dg/tree-ssa/pr42585.c: Skip for AArch64. >>* gcc.dg/tree-ssa/sra-12.c: Likewise. >> <0001-AArch64-Implement-movmem-for-the-benefit-of-inline-m.patch> Index: expr.c === --- expr.c (revision 213306) +++ expr.c (working copy) @@ -876,6 +876,9 @@ move_by_pieces (rtx to, rtx from, unsign if (data.reverse) data.offset = len; data.len = len; + /* Use the MIN of the length and the max size we can use. */ + max_size = max_size > (len + 1) ? (len + 1) : max_size; + /* If copying requires more than two move insns, copy addresses to registers (to make displacements shorter) and use post-increment if available. */ @@ -1073,6 +1076,32 @@ move_by_pieces_1 (insn_gen_fn genfun, ma data->len -= size; } + + /* If we have some data left and unalign accesses + are not slow, back up slightly and emit the move. */ + if (data->len > 0 + && !STRICT_ALIGNMENT + && !SLOW_UNALIGNED_ACCESS (mode, 1) + /* Not a stack push */ + && data->to + /* Neither side is volatile memory. */ + && !MEM_VOLATILE_P (data->to) + && !MEM_VOLATILE_P (data->from) + && ceil_log2 (data->len) == exact_log2 (size) + /* No incrementing of the to or from. */ + && data->explicit_inc_to == 0 + && data->explicit_inc_from == 0 + /* No auto-incrementing of the to or from. */ + && !data->autinc_to + && !data->autinc_from + && !data->reverse) +{ + unsigned offset = data->offset - (size - data->len); + to1 = adjust_address (data->to, mode, offset); + from1 = adjust_address (data->from, mode, offset); + emit_insn ((*genfun) (to1, from1)); + data->len = 0; +} } /* Emit code to move a block Y to a block X. This may be done with @@ -2636,6 +2665,9 @@ store_by_pieces_1 (struct store_by_piece if (data->reverse) data->offset = data->len; + /* Use the MIN of the length and the max size we can use. */ + max_size = max_size > (data->len + 1) ? (data->l
Re: [PATCH] Fix bootstrap failure because of -Wreturn-local-addr
On Mon, 4 Aug 2014, Jakub Jelinek wrote: > Hi! > > I've tried to bootstrap with r213515 because later revisions broke because > of the hash_map and Ada changes, but unfortunately even that revision > failus to bootstrap, the new -Wreturn-local-addr warning rightfully > warns about get_ivts_expr possibly returning address of a local variable. > > It seems that n_loc is always 1 and loc[0] is always 1, already since > 4.0 when these loop-unroll.c changes have been introduced, so I think the > best fix is just to remove those two fields. > > Bootstrapped/regtested (with r213515, rtl checking) on x86_64-linux and > i686-linux, ok for trunk? Ok. Thanks, Richard. > 2014-08-04 Jakub Jelinek > > * loop-unroll.c (struct iv_to_split): Remove n_loc and loc fields. > (analyze_iv_to_split_insn): Don't initialize them. > (get_ivts_expr): Removed. > (allocate_basic_variable, insert_base_initialization): Use > SET_SRC instead of *get_ivts_expr. > (split_iv): Use &SET_SRC instead of get_ivts_expr. > > --- gcc/loop-unroll.c.jj 2014-06-24 16:41:55.0 +0200 > +++ gcc/loop-unroll.c 2014-08-04 14:13:35.750917507 +0200 > @@ -79,11 +79,6 @@ struct iv_to_split > iterations are based. */ >rtx step; /* Step of the induction variable. */ >struct iv_to_split *next; /* Next entry in walking order. */ > - unsigned n_loc; > - unsigned loc[3]; /* Location where the definition of the induction > -variable occurs in the insn. For example if > -N_LOC is 2, the expression is located at > -XEXP (XEXP (single_set, loc[0]), loc[1]). */ > }; > > /* Information about accumulators to expand. */ > @@ -1942,8 +1937,6 @@ analyze_iv_to_split_insn (rtx insn) >ivts->base_var = NULL_RTX; >ivts->step = iv.step; >ivts->next = NULL; > - ivts->n_loc = 1; > - ivts->loc[0] = 1; > >return ivts; > } > @@ -2080,27 +2073,12 @@ determine_split_iv_delta (unsigned n_cop > } > } > > -/* Locate in EXPR the expression corresponding to the location recorded > - in IVTS, and return a pointer to the RTX for this location. */ > - > -static rtx * > -get_ivts_expr (rtx expr, struct iv_to_split *ivts) > -{ > - unsigned i; > - rtx *ret = &expr; > - > - for (i = 0; i < ivts->n_loc; i++) > -ret = &XEXP (*ret, ivts->loc[i]); > - > - return ret; > -} > - > /* Allocate basic variable for the induction variable chain. */ > > static void > allocate_basic_variable (struct iv_to_split *ivts) > { > - rtx expr = *get_ivts_expr (single_set (ivts->insn), ivts); > + rtx expr = SET_SRC (single_set (ivts->insn)); > >ivts->base_var = gen_reg_rtx (GET_MODE (expr)); > } > @@ -2111,7 +2089,7 @@ allocate_basic_variable (struct iv_to_sp > static void > insert_base_initialization (struct iv_to_split *ivts, rtx insn) > { > - rtx expr = copy_rtx (*get_ivts_expr (single_set (insn), ivts)); > + rtx expr = copy_rtx (SET_SRC (single_set (insn))); >rtx seq; > >start_sequence (); > @@ -2146,7 +2124,7 @@ split_iv (struct iv_to_split *ivts, rtx > } > >/* Figure out where to do the replacement. */ > - loc = get_ivts_expr (single_set (insn), ivts); > + loc = &SET_SRC (single_set (insn)); > >/* If we can make the replacement right away, we're done. */ >if (validate_change (insn, loc, expr, 0)) > > Jakub
[PATCH] Fix PR61672
When removing the MEM_ATTR unification hash I didn't grep for MEM_ATTR equality comares which now causes PR61672, missed optimizations at least in PRE. The following fixes that by exporting and using mem_attrs_eq_p. Bootstrapped and tested on x86_64-unknown-linux-gnu - as this is a regression on the 4.9 branch I'd like to eventually backport this. Richard. 2014-08-05 Richard Biener PR rtl-optimization/61672 * emit-rtl.h (mem_attrs_eq_p): Declare. * emit-rtl.c (mem_attrs_eq_p): Export. Handle NULL mem-attrs. * cse.c (exp_equiv_p): Use mem_attrs_eq_p. * cfgcleanup.c (merge_memattrs): Likewise. Include emit-rtl.h. Index: gcc/emit-rtl.h === --- gcc/emit-rtl.h (revision 208113) +++ gcc/emit-rtl.h (working copy) @@ -20,6 +20,9 @@ along with GCC; see the file COPYING3. #ifndef GCC_EMIT_RTL_H #define GCC_EMIT_RTL_H +/* Return whether two MEM_ATTRs are equal. */ +bool mem_attrs_eq_p (const struct mem_attrs *, const struct mem_attrs *); + /* Set the alias set of MEM to SET. */ extern void set_mem_alias_set (rtx, alias_set_type); Index: gcc/emit-rtl.c === --- gcc/emit-rtl.c (revision 208113) +++ gcc/emit-rtl.c (working copy) @@ -245,9 +245,13 @@ const_fixed_htab_eq (const void *x, cons /* Return true if the given memory attributes are equal. */ -static bool +bool mem_attrs_eq_p (const struct mem_attrs *p, const struct mem_attrs *q) { + if (p == q) +return true; + if (!p || !q) +return false; return (p->alias == q->alias && p->offset_known_p == q->offset_known_p && (!p->offset_known_p || p->offset == q->offset) Index: gcc/cse.c === --- gcc/cse.c (revision 208113) +++ gcc/cse.c (working copy) @@ -2680,7 +2680,7 @@ exp_equiv_p (const_rtx x, const_rtx y, i But because really all MEM attributes should be the same for equivalent MEMs, we just use the invariant that MEMs that have the same attributes share the same mem_attrs data structure. */ - if (MEM_ATTRS (x) != MEM_ATTRS (y)) + if (!mem_attrs_eq_p (MEM_ATTRS (x), MEM_ATTRS (y))) return 0; } break; Index: gcc/cfgcleanup.c === --- gcc/cfgcleanup.c(revision 208113) +++ gcc/cfgcleanup.c(working copy) @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3. #include "df.h" #include "dce.h" #include "dbgcnt.h" +#include "emit-rtl.h" #define FORWARDER_BLOCK_P(BB) ((BB)->flags & BB_FORWARDER_BLOCK) @@ -882,7 +883,7 @@ merge_memattrs (rtx x, rtx y) if (GET_MODE (x) != GET_MODE (y)) return; - if (code == MEM && MEM_ATTRS (x) != MEM_ATTRS (y)) + if (code == MEM && !mem_attrs_eq_p (MEM_ATTRS (x), MEM_ATTRS (y))) { if (! MEM_ATTRS (x)) MEM_ATTRS (y) = 0;
Re: [RFC: Patch, PR 60102] [4.9/4.10 Regression] powerpc fp-bit ices@dwf_regno
On Mon, Aug 04, 2014 at 11:51:34AM -0500, Edmar wrote: > Committed on trunk, revision 213596 > Committed on 4.9 branch, revision 213597 Note the ChangeLog entry was grossly misformated. I've fixed it up in gcc/ChangeLog on the trunk, but not on the branch nor in libgcc. There should be no space before :, all lines in ChangeLog entry should be just tab indented rather than tab + 2 spaces, and filenames, unless they are too long, shouldn't be alone on the lines. And testsuite entries shouldn't go into gcc/ChangeLog. > On 08/04/2014 05:25 AM, Ulrich Weigand wrote: > >David Edelsohn wrote: > >>On Fri, Aug 1, 2014 at 2:03 PM, rohitarul...@freescale.com > >> wrote: > >>>[libgcc] > >>>2014-07-31 Rohit > >>> * config/rs6000/linux-unwind.h (ppc_fallback_frame_state): Update > >>> based on change in SPE high register numbers and 3 HTM > >>> registers. > >>> > >>>[gcc] > >>>2014-07-31 Rohit > >>> * config/rs6000/rs6000.c > >>> (rs6000_reg_names) : Add SPE high register names. > >>> (alt_reg_names) : Likewise. > >>> (rs6000_dwarf_register_span) : For SPE high registers, replace > >>> dwarf register numbers with GCC hard register numbers. > >>> (rs6000_init_dwarf_reg_sizes_extra) : Likewise. > >>> (rs6000_dbx_register_number): For SPE high registers, return > >>> dwarf > >>> register number for the corresponding GCC hard register number. > >>> > >>> * config/rs6000/rs6000.h > >>> (FIRST_PSEUDO_REGISTER) : Update based on 32 newly added GCC > >>> hard > >>> register numbers for SPE high registers. > >>> (DWARF_FRAME_REGISTERS) : Likewise. > >>> (DWARF_REG_TO_UNWIND_COLUMN) : Likewise. > >>> (DWARF_FRAME_REGNUM) : Likewise. > >>> (FIXED_REGISTERS) : Likewise. > >>> (CALL_USED_REGISTERS) : Likewise. > >>> (CALL_REALLY_USED_REGISTERS) : Likewise. > >>> (REG_ALLOC_ORDER) : Likewise. > >>> (enum reg_class) : Likewise. > >>> (REG_CLASS_NAMES) : Likewise. > >>> (REG_CLASS_CONTENTS) : Likewise. > >>> (SPE_HIGH_REGNO_P) : New macro to identify SPE high registers. > >>> > >>> * gcc.target/powerpc/pr60102.c: New testcase. > >>The patch is okay with me if Uli is satisfied. > >Yes, this is fine with me. Jakub
Re: [PATCH] fix pr62009 use after free in redirect_edge_var_map_dup
On Mon, Aug 4, 2014 at 3:04 PM, wrote: > From: Trevor Saunders > > Hi, > > It used to be that edge_var_maps held pointers to embedded vectors, but > now it holds vectors. This means that now instead of copying the > address of the embedded vector from the table we keep a pointer into the > table. However that's incorrect because we may expand the table when > inserting new into the map in which case our pointer into the map points > at freed memory. > > gcc/ > > * tree-ssa.c (redirect_edge_var_map_dup): copy the value in the > map for old before inserting new. > > testing ongoing on x86_64-unknown-linux-gnu, ok? Umm... can you explore what it takes to instead store some kind of indexes? That is, how is the above not an issue for other 'head's that may be live somewhere? Or can you restore what "used to be"? Thanks, Richard. > Trev > > --- > gcc/tree-ssa.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c > index 920cbea..b949d48 100644 > --- a/gcc/tree-ssa.c > +++ b/gcc/tree-ssa.c > @@ -109,7 +109,11 @@ redirect_edge_var_map_dup (edge newe, edge olde) >if (!head) > return; > > - edge_var_maps->get_or_insert (newe).safe_splice (*head); > + /* Save what head points at because if we need to insert new into the map > we > + may end up expanding the table in which case head will no longer point > at > + valid memory. */ > + vec h = *head; > + edge_var_maps->get_or_insert (newe).safe_splice (h); > } > > > -- > 2.0.1 >
Re: Invalid gimple for nested functions
On Mon, Aug 4, 2014 at 10:37 PM, Bernd Schmidt wrote: > Some code I added for the ptx backend triggered tree-checking failures for > gimple created in tree-nested: we can end up taking the address of an > SSA_NAME, which seems invalid. The problem is that code in tree-nested wants > to change the rhs of an existing assignment, but just using > gimple_assign_set_rhs1 is insufficient: the previous rhs was an ADDR_EXPR, > and that code remains afterwards. > > Fixed with the following patch, tested (including Ada) on x86_64-linux. Ok? The better interface for this is gimple_assign_set_rhs_from_tree. But it seems the code can be re-structured to avoid re-setting the RHS by simply doing if (!is_gimple_reg (x) && is_gimple_reg_type (TREE_TYPE (x)) x = init_tmp_var (root, x, &gsi); right after x is assigned to? Ok with that change. Thanks, Richard. > > Bernd
Re: [PATCH][convert.c] PR 61876: Guard transformation to lrint by -fno-math-errno
On Mon, Aug 4, 2014 at 2:27 PM, Kyrill Tkachov wrote: > Hi all, > > Following up on Josephs' comments on PR 61876 this patch guards the rint + > cast -> lrint transformation on -fno-math-errno. > > Bootstrapped and tested on aarch64-linux and x86. > > Ok for trunk? Ok. Thanks, Richard. > Thanks, > Kyrill > > 2014-08-04 Kyrylo Tkachov > > * convert.c (convert_to_integer): Guard transformation to lrint by > -fno-math-errno.
Re: [AArch64] Some aarch64-builtins.c cleanup.
On 04/08/14 11:13, James Greenhalgh wrote: > > This patch removes the aarch64_simd_builtin_type enum. This is only really > used for indexing in to an array of strings and an array of machine_mode. > > We don't need that. Given all the macro pasting we presently do in this > file, just add another. Then we can store a proper machine_mode for each > builtin and save the headache of keeping everything in sync. > > Regression tested for aarch64-none-elf. > > OK? > OK. R.
[PATCH, ARM] Keep constants in register when expanding
Hi, For some large constants, ARM will split them during expanding, which makes impossible to hoist them out the loop or shared by different references (refer the test case in the patch). The patch keeps some constants in registers. If the constant can not be optimized, the cprop and combine passes can optimize them as what we do in current expand pass with define_insn_and_split "*arm_subsi3_insn" define_insn_and_split "*arm_andsi3_insn" define_insn_and_split "*iorsi3_insn" define_insn_and_split "*arm_xorsi3" The patch does not modify addsi3 since the define_insn_and_split "*arm_addsi3" is only valid when (reload_completed || !arm_eliminable_register (operands[1])). The cprop and combine passes can not optimize the large constant if we put it in register, which will lead to regression. For logic operators, the patch skips changes for constants: INTVAL (operands[2]) < 0 && const_ok_for_arm (-INTVAL (operands[2]) since expand pass always uses "sign-extend" to get the value (trunc_int_for_mode called from immed_wide_int_const) for rtl, and logs show most negative values are UNSIGNED when they are TREE node. And combine pass is smart enough to recover the negative value to positive value. Bootstrap and no make check regression on Chrome book. For coremark, dhrystone and eembcv1, no any code size and performance change on Cortex-M4. No any file in CSiBE has code size change for Cortex-A15 and Cortex-M4. No Spec2000 performance regression on Chrome book and dumped assemble codes only show very few difference. OK for trunk? Thanks! -Zhenqiang ChangeLog: 2014-08-05 Zhenqiang Chen * config/arm/arm.md (subsi3, andsi3, iorsi3, xorsi3): Keep some large constants in register other than split them. testsuite/ChangeLog: 2014-08-05 Zhenqiang Chen * gcc.target/arm/maskdata.c: New test. diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index bd8ea8f..c8b3001 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -1162,9 +1162,16 @@ { if (TARGET_32BIT) { - arm_split_constant (MINUS, SImode, NULL_RTX, - INTVAL (operands[1]), operands[0], - operands[2], optimize && can_create_pseudo_p ()); + if (!const_ok_for_arm (INTVAL (operands[1])) + && can_create_pseudo_p ()) + { + operands[1] = force_reg (SImode, operands[1]); + emit_insn (gen_subsi3 (operands[0], operands[1], operands[2])); + } + else + arm_split_constant (MINUS, SImode, NULL_RTX, +INTVAL (operands[1]), operands[0], operands[2], +optimize && can_create_pseudo_p ()); DONE; } else /* TARGET_THUMB1 */ @@ -2077,6 +2084,17 @@ emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0], operands[1])); } + else if (!(const_ok_for_arm (INTVAL (operands[2])) + || const_ok_for_arm (~INTVAL (operands[2])) + /* zero_extendhi instruction is efficient enough. */ + || INTVAL (operands[2]) == 0x + || (INTVAL (operands[2]) < 0 + && const_ok_for_arm (-INTVAL (operands[2] + && can_create_pseudo_p ()) + { + operands[2] = force_reg (SImode, operands[2]); + emit_insn (gen_andsi3 (operands[0], operands[1], operands[2])); + } else arm_split_constant (AND, SImode, NULL_RTX, INTVAL (operands[2]), operands[0], @@ -2882,9 +2900,20 @@ { if (TARGET_32BIT) { - arm_split_constant (IOR, SImode, NULL_RTX, - INTVAL (operands[2]), operands[0], operands[1], - optimize && can_create_pseudo_p ()); + if (!(const_ok_for_arm (INTVAL (operands[2])) + || (TARGET_THUMB2 + && const_ok_for_arm (~INTVAL (operands[2]))) + || (INTVAL (operands[2]) < 0 + && const_ok_for_arm (-INTVAL (operands[2] + && can_create_pseudo_p ()) + { + operands[2] = force_reg (SImode, operands[2]); + emit_insn (gen_iorsi3 (operands[0], operands[1], operands[2])); + } + else + arm_split_constant (IOR, SImode, NULL_RTX, +INTVAL (operands[2]), operands[0], operands[1], +optimize && can_create_pseudo_p ()); DONE; } else /* TARGET_THUMB1 */ @@ -3052,9 +3081,18 @@ { if (TARGET_32BIT) { - arm_split_constant (XOR, SImode, NULL_RTX, - INTVAL (operands[2]), operands[0], operands[1], - optimize && can_create_pseudo_p ()); + if (!(const_ok_for_arm (INTVAL (operands[2])) + || (INTVAL (operands[2]) < 0 + && const_ok_for_arm (-INTVAL (operands[2] + && can_create_pseudo_p ()) + { + operands[2] = force_reg (SImode, operands[2]); + emit_insn (gen_xorsi3 (operands[0], operands[1], operands[2])); + } + else + arm_split_constant (XOR, SImode, NULL_RTX, +INTVAL (operands[2]), operands[0], operands[1], +optimize && can_create_pseudo_p ()); DONE; } else /* TARGET_THUMB1 */ diff --git a/gcc/testsuite/gcc.target/arm/maskdata.c
Re: [GSoC][match-and-simplify] add pointerplus patterns
On Mon, Aug 4, 2014 at 4:13 PM, Prathamesh Kulkarni wrote: > Added patterns in associate_pointerplus and associate_pointerplus_diff. > > * genmatch.c (capture_max): Change value to 6. > (match-plusminus.pd): Add new patterns. > > [gcc/testsuite/gcc.dg/tree-ssa] > * match-plusminus.c (plusminus_9): New test-case. Any reason you match a conversion for associate_pointerplus? associate_pointerplus_diff misses to match that ptr1 == ptr1. I suggest to write it as /* associate_pointerplus_diff: ptr1 p+ (ptr2 - ptr1) -> ptr2 */ (match_and_simplify (pointer_plus @0 (convert (minus (convert @1) (convert @0 I have applied the patch with the two parts fixed. Thanks, Richard. > > Thanks, > Prathamesh
Re: [PATCH][ARM] Adjust clz, rbit and rev patterns for -mrestrict-it
On 04/08/14 13:22, Kyrill Tkachov wrote: > Hi all, > > While working on another patch and looking at the rbit and clz patterns, > I noticed that in aarch32 the relevant patterns are not adjusted for > -mrestrict-it. > A program like: > int > foo (int a, int b) > { >if (a + 5 == b) > return __builtin_ctz (a); > >return 1; > } > > compiled with -march=armv8-a -mthumb could generate clz and rbit > instructions inside an IT block, thus earning an assembler warning. > Whilst there I also noticed that the output templates for the arm_rev > pattern have a %? that is used to print out condition codes during > predication but the pattern is not marked as predicable. > So I set the predicable attribute there and adjusted it for > -mrestrict-it while at it. > > Ok for trunk? > > 2014-08-04 Kyrylo Tkachov > > * config/arm/arm.md (clzsi2): Set predicable_short_it attr to no. > (rbitsi2): Likewise. > (*arm_rev): Set predicable and predicable_short_it attributes. > OK. R.
[PATCH] FAIL: gcc.dg/torture/ftrapv-1.c -O0 (test for excess errors) on bare-metal targets
Hi all, The recently added -ftrapv test FAILS on bare metal targets because it requires the usage of fork. It does have a dg-require-fork guard but apparently that function takes an unused argument and if you don't give it that, DejaGNU will happily ignore the directive and not bother checking for the presence of fork :S The other places where I see dg-require-fork being used is in the libstdc++ testsuite. Perhaps it's worth just removing the argument from the definition of dg-require-fork in target-supports-dg.exp, but there is a whole host of functions that take an argument and then don't use it (dg-require-mkfifo is one, for example) so it seems like it would be an out-of-place change unless we do a big sweep across the testsuite library to clean that stuff up. This patch just fixes the dg-requires-fork invocation in gcc.dg/torture/ftrapv-1.c. Richard (and testsuite maintainers), is this ok? or would you prefer to remove the unused arguments for the dg-requires-* functions in testsuite/lib/ ? With this patch the test now appears as UNSUPPORTED on bare metal aarch64 and arm targets (that I've tested). Thanks, Kyrill 2014-08-05 Kyrylo Tkachov * gcc.dg/torture/ftrapv-1.c: Correct usage of dg-require-fork.commit f7e9661501f3ceb2221c78569ce6bf6e1140f040 Author: Kyrylo Tkachov Date: Mon Aug 4 15:38:40 2014 +0100 [test] Fix dg-require-fork usage diff --git a/gcc/testsuite/gcc.dg/torture/ftrapv-1.c b/gcc/testsuite/gcc.dg/torture/ftrapv-1.c index 4fdccd8..c74535f 100644 --- a/gcc/testsuite/gcc.dg/torture/ftrapv-1.c +++ b/gcc/testsuite/gcc.dg/torture/ftrapv-1.c @@ -1,7 +1,7 @@ /* { dg-do run } */ /* { dg-additional-options "-ftrapv" } */ /* { dg-require-effective-target trapping } */ -/* { dg-require-fork } */ +/* { dg-require-fork "" } */ #include #include
Re: [PATCH] FAIL: gcc.dg/torture/ftrapv-1.c -O0 (test for excess errors) on bare-metal targets
On Tue, Aug 5, 2014 at 11:47 AM, Kyrill Tkachov wrote: > Hi all, > > The recently added -ftrapv test FAILS on bare metal targets because it > requires the usage of fork. > It does have a dg-require-fork guard but apparently that function takes an > unused argument and if you don't give it that, DejaGNU will happily ignore > the directive and not bother checking for the presence of fork :S > > The other places where I see dg-require-fork being used is in the libstdc++ > testsuite. Perhaps it's worth just removing the argument from the definition > of dg-require-fork in target-supports-dg.exp, but there is a whole host of > functions that take an argument and then don't use it (dg-require-mkfifo is > one, for example) so it seems like it would be an out-of-place change unless > we do a big sweep across the testsuite library to clean that stuff up. This > patch just fixes the dg-requires-fork invocation in > gcc.dg/torture/ftrapv-1.c. > > Richard (and testsuite maintainers), is this ok? or would you prefer to > remove the unused arguments for the dg-requires-* functions in > testsuite/lib/ ? > > With this patch the test now appears as UNSUPPORTED on bare metal aarch64 > and arm targets (that I've tested). Eh, I thought I fixed that already... (must have failed to commit). OK. Thanks, Richard. > Thanks, > Kyrill > > 2014-08-05 Kyrylo Tkachov > > * gcc.dg/torture/ftrapv-1.c: Correct usage of dg-require-fork.
[PATCH][match-and-simplify] Initial drop of match-and-simplify.texi
Converted from my plain-text notes I did for writing the Cauldron presntation slides. Built and installed. Richard. 2014-08-05 Richard Biener * doc/match-and-simplify.texi: New file. * doc/gccint.texi: Include match-and-simplify.texi. * Makefile.in (TEXI_GCCINT_FILES): Add match-and-simplify.texi. Index: gcc/doc/match-and-simplify.texi === --- gcc/doc/match-and-simplify.texi (revision 0) +++ gcc/doc/match-and-simplify.texi (working copy) @@ -0,0 +1,202 @@ +@c Copyright (C) 2014 Free Software Foundation, Inc. +@c Free Software Foundation, Inc. +@c This is part of the GCC manual. +@c For copying conditions, see the file gcc.texi. + +@node Match and Simplify +@chapter Match and Simplify +@cindex Match and Simplify + +The GIMPLE and GENERIC pattern matching project match-and-simplify +tries to address several issues. + +@enumerate +@item unify expression simplifications currently spread and duplicated +over separate files like fold-const.c, gimple-fold.c and builtins.c +@item allow for a cheap way to implement building and simplifying +non-trivial GIMPLE expressions, avoiding the need to go through +building and simplifying GENERIC via fold_buildN and then +gimplifying via force_gimple_operand +@end enumerate + +To address these the project introduces a simple domain specific language +to write expression simplifications from which code targeting GIMPLE +and GENERIC is auto-generated. The GENERIC variant follows the +fold_buildN API while for the GIMPLE variant and to addres 2) new +APIs are introduced. + +@menu +* GIMPLE API:: +* The Language:: +@end menu + +@node GIMPLE API +@section GIMPLE API +@cindex GIMPLE API + +The main GIMPLE API entry to the expression simplifications mimics +that of the GENERIC fold_@{unary,binary,ternary@} API: + +@deftypefn +tree gimple_match_and_simplify (enum tree_code, tree, tree, +gimple_seq *, tree (*)(tree)); +@end deftypefn +@deftypefn +tree gimple_match_and_simplify (enum tree_code, tree, tree, tree, +gimple_seq *, tree (*)(tree)); +@end deftypefn +@deftypefn +tree gimple_match_and_simplify (enum tree_code, tree, tree, tree, tree, +gimple_seq *, tree (*)(tree)); +@end deftypefn +@deftypefn +tree gimple_match_and_simplify (enum built_in_function, tree, tree, +gimple_seq *, tree (*)(tree)); +@end deftypefn + +thus providing n-ary overloads for operation or function. The +additional arguments are a gimple_seq where built statements are +inserted on (if @code{NULL} then simplifications requiring new statements +are not performed) and a valueization hook that can be used to +tie simplifications to a SSA lattice. + +In addition to those APIs a fold_stmt-like interface is provided with + +@deftypefn +bool gimple_match_and_simplify (gimple_stmt_iterator *, tree (*)(tree)); +@end deftypefn + +which also has the additional valueization hook. + +Ontop of these a @code{fold_buildN}-like API for GIMPLE is introduced: + +@deftypefn +tree gimple_build (gimple_seq *, location_t, + enum tree_code, tree, tree, + tree (*valueize) (tree) = NULL); +@end deftypefn +@deftypefn +tree gimple_build (gimple_seq *, location_t, + enum tree_code, tree, tree, tree, + tree (*valueize) (tree) = NULL); +@end deftypefn +@deftypefn +tree gimple_build (gimple_seq *, location_t, + enum tree_code, tree, tree, tree, tree, + tree (*valueize) (tree) = NULL); +@end deftypefn +@deftypefn +tree gimple_build (gimple_seq *, location_t, + enum built_in_function, tree, tree, + tree (*valueize) (tree) = NULL); +@end deftypefn + +which is supposed to replace @code{force_gimple_operand (fold_buildN (...), ...)}. + + +@node The Language +@section The Language +@cindex The Language + +The language to write expression simplifications in resembles other +domain-specific languages GCC uses. Thus it is lispy. Lets start +with an example from the match.pd file on the branch: + +@smallexample +(match_and_simplify + (bit_and @@0 integer_all_onesp) + @@0) +@end smallexample + +This example contains all required parts of an expression simplification. +A simplification is wrapped inside a @code{(match_and_simplify ...)} expression. +That contains at least two operands - an expression that is matched +with the GIMPLE or GENERIC IL and a replacement expression that is +returned if the match was successful. + +Expressions have an ID, @code{bit_and} in this case. Expressions can +be lower-case tree codes with @code{_expr} stripped off or builtin +function code names in all-caps, like @code{BUILT_IN_SQRT}. + +@code{@@n} denotes a so-called capture. It captures the operand and lets +you refer to it in other places of the match-and-simplify. In t
Re: [PATCH AArch64] Prefer dup to zip for vec_perm_const; enable dup for bigendian; add testcase.
On 04/08/14 14:32, Alan Lawrence wrote: > At the moment, for two-element vectors, __builtin_shuffle (vector, (mask) {C, > C}) for identical constants C outputs a zip (with both argument vectors the > same) rather than a dup. Dup is more obvious and easier to read, so prefer it. > > For big-endian, aarch64_evpc_dup always aborts; however tests demonstrate it > works ok, so enable it. > > Finally, add a testcase (of execution results, this gives confidence that > evpc_dup is ok for bigendian - yes, a different element index is output than > for > little-endian). Note existing tests for zip are not affected, they always > have > the two arguments different. > > gcc/ChangeLog: > * config/aarch64/aarch64.c (aarch64_evpc_dup): Enable for bigendian. > (aarch64_expand_vec_perm_const): Check for dup before zip. > > gcc/testsuite/ChangeLog: > * gcc.target/aarch64/vdup_n_2.c: New test. > > OK. R.
Re: [PATCH][ARM/AArch64] Add CRC32 scheduling information to Cortex-A53 and Cortex-A57
On 04/08/14 15:01, Kyrill Tkachov wrote: > Hi all, > > Now that both backends have a way of generating CRC32 instructions this > patch adds the crc type to the scheduling information for the Cortex-A53 > and Cortex-A57 cores. > For both Cortex-A53 and Cortex-A57 they behave similarly to shifted > arithmetic instructions > > When scheduling for the Cortex-A57 we use the Cortex-A15 pipeline > description, so the crc type is added to cortex-a15.md. > CRC32 instructions can only be generated when targeting ARMv8-A so this > will not affect codegen when tuning for the ARMv7-A Cortex-A15 core. > > Ok for trunk? > > 2014-08-04 Kyrylo Tkachov > > * config/arm/cortex-a15.md (cortex_a15_alu_shift): Add crc type > to reservation. > * config/arm/cortex-a53.md (cortex_a53_alu_shift): Likewise. > > OK. R.
Re: [PATCH][AArch64] Implement some vmul*_lane*_f* intrinsics in arm_neon.h
On 04/08/14 17:31, Kyrill Tkachov wrote: > Hi all, > > As part of other intrinsics-related messing around due to the > float64x1_t changes I noticed these can be (re)implemented relatively > easily. > > Tested on aarch64-none-elf and aarch64_be-none-elf to make sure the > lane-wise intrinsics do the right thing. > > Ok for trunk? > > 2014-08-04 Kyrylo Tkachov > > * config/aarch64/arm_neon.h (vmul_f64): New intrinsic. > (vmuld_laneq_f64): Likewise. > (vmuls_laneq_f32): Likewise. > (vmul_n_f64): Likewise. > (vmuld_lane_f64): Reimplement in C. > (vmuls_lane_f32): Likewise. > > 2014-08-04 Kyrylo Tkachov > > * gcc.target/aarch64/simd/vmul_f64_1.c: New test. > * gcc.target/aarch64/simd/vmul_n_f64_1.c: Likewise. > * gcc.target/aarch64/simd/vmuld_lane_f64_1.c: Likewise. > * gcc.target/aarch64/simd/vmuld_laneq_f64_1.c: Likewise. > * gcc.target/aarch64/simd/vmuls_lane_f32_1.c: Likewise. > * gcc.target/aarch64/simd/vmuls_laneq_f32_1.c: Likewise. > > OK. R.
[PATCH][match-and-simplify] s/match_and_simplify/simplify/
This replaces match_and_simplify with just 'simplify' everywhere, in patterns and API. It's shorter and as descriptive. This also pushes two minor changes to not use APIs I consider internal from SCCVN or forwprop. Committed. Richard. 2014-08-05 Richard Biener * doc/match-and-simplify.texi: s/match_and_simplify/simplify/g * fold-const.c: Likewise. * genmatch.c: Likewise. * gimple-fold.c: Likewise. * gimple-fold.h: Likewise. * gimple-match-head.c: Likewise. * match-bitwise.pd: Likewise. * match-builtin.pd: Likewise. * match-constant-folding.pd: Likewise. * match-plusminus.pd: Likewise. * match-rotate.pd: Likewise. * match.pd: Likewise. * tree-ssa-forwprop.c: Likewise. * tree-ssa-sccvn.c: Likewise. * gimple-fold.c (fold_stmt): Add overload with valueize parameter. * gimple-fold.h (fold_stmt): Declare it. * tree-ssa-forwprop.c (pass_forwprop::execute): Use fold_stmt API. * tree-ssa-sccvn.c (try_to_simplify): Use gimple_fold_stmt_to_constant_1 API. * testsuite/gcc.dg/tree-ssa/match-1.c: s/match_and_simplified/simplified/g * testsuite/gcc.dg/tree-ssa/match-bitwise.c: Likewise * testsuite/gcc.dg/tree-ssa/match-plusminus.c: Likewise * testsuite/gcc.dg/tree-ssa/match-realimag.c: Likewise * testsuite/gcc.dg/tree-ssa/match-rotate.c: Likewise * testsuite/gcc.dg/tree-ssa/pr52631.c: Likewise Index: gcc/doc/match-and-simplify.texi === --- gcc/doc/match-and-simplify.texi (revision 213631) +++ gcc/doc/match-and-simplify.texi (working copy) @@ -38,19 +38,19 @@ The main GIMPLE API entry to the express that of the GENERIC fold_@{unary,binary,ternary@} API: @deftypefn -tree gimple_match_and_simplify (enum tree_code, tree, tree, +tree gimple_simplify (enum tree_code, tree, tree, gimple_seq *, tree (*)(tree)); @end deftypefn @deftypefn -tree gimple_match_and_simplify (enum tree_code, tree, tree, tree, +tree gimple_simplify (enum tree_code, tree, tree, tree, gimple_seq *, tree (*)(tree)); @end deftypefn @deftypefn -tree gimple_match_and_simplify (enum tree_code, tree, tree, tree, tree, +tree gimple_simplify (enum tree_code, tree, tree, tree, tree, gimple_seq *, tree (*)(tree)); @end deftypefn @deftypefn -tree gimple_match_and_simplify (enum built_in_function, tree, tree, +tree gimple_simplify (enum built_in_function, tree, tree, gimple_seq *, tree (*)(tree)); @end deftypefn @@ -63,7 +63,7 @@ tie simplifications to a SSA lattice. In addition to those APIs a fold_stmt-like interface is provided with @deftypefn -bool gimple_match_and_simplify (gimple_stmt_iterator *, tree (*)(tree)); +bool gimple_simplify (gimple_stmt_iterator *, tree (*)(tree)); @end deftypefn which also has the additional valueization hook. @@ -103,13 +103,13 @@ domain-specific languages GCC uses. Thu with an example from the match.pd file on the branch: @smallexample -(match_and_simplify +(simplify (bit_and @@0 integer_all_onesp) @@0) @end smallexample This example contains all required parts of an expression simplification. -A simplification is wrapped inside a @code{(match_and_simplify ...)} expression. +A simplification is wrapped inside a @code{(simplify ...)} expression. That contains at least two operands - an expression that is matched with the GIMPLE or GENERIC IL and a replacement expression that is returned if the match was successful. @@ -123,7 +123,7 @@ you refer to it in other places of the m above example it is refered to in the replacement expression. @smallexample -(match_and_simplify +(simplify (bit_xor @@0 @@0) @{ build_zero_cst (type); @}) @end smallexample @@ -134,7 +134,7 @@ operands written in C code. These can b replacements and are supposed to evaluate to a tree node. @smallexample -(match_and_simplify +(simplify (trunc_mod integer_zerop@@0 @@1) (if (!integer_zerop (@@1))) @@0) @@ -145,7 +145,7 @@ which is also predicated with @code{inte may be either expressions, predicates or captures. Captures can be unconstrained or capture expresions or predicates. -This example introduces an optional operand of match_and_simplify, +This example introduces an optional operand of simplify, the if-expression. This condition is evaluated after the expression matched in the IL and is required to evaluate to true to enable the replacement expression. The expression operand @@ -153,7 +153,7 @@ of the @code{if} is a standard C express to captures. @smallexample -(match_and_simplify +(simplify (bit_and:c integral_op_p@@0 (bit_ior:c (bit_not @@0) @@1)) (bit_and @@1 @@0)) @end smallexample @@ -176,7 +176,7 @@ Two more features exist to avoid too muc @smal
Re: [PATCH][AArch64] Implement some saturating math NEON intrinsics
On 04/08/14 17:45, Kyrill Tkachov wrote: > Hi all, > > This patch implements some saturating math *laneq_s* intrinsics. > The implementation is fairly straightforward, just use more general mode > iterators, add appropriate builtins etc. > > Some execution tests are added with some scan-assembly parts to make > sure we generate the correct lane number for both big and little endian > versions of the lanewise intrinsics. > > Tested aarch64-none-elf, aarch64_be-none-elf and bootstrapped on > aarch64-linux. > > Ok for trunk? > > 2014-08-04 Kyrylo Tkachov > > * config/aarch64/aarch64-simd.md (aarch64_sqdmulh_laneq): > Use VSDQ_HSI mode iterator. > (aarch64_sqrdmulh_laneq): Likewise. > (aarch64_sqdmulh_laneq_internal): New define_insn. > * config/aarch64/aarch64-simd-builtins.def (sqdmulh_laneq): > Use BUILTIN_VDQHS macro. > (sqrdmulh_laneq): Likewise. > * config/aarch64/arm_neon.h (vqdmlalh_laneq_s16): New intrinsic. > (vqdmlals_laneq_s32): Likewise. > (vqdmlslh_laneq_s16): Likewise. > (vqdmlsls_laneq_s32): Likewise. > (vqdmulhh_laneq_s16): Likewise. > (vqdmulhs_laneq_s32): Likewise. > (vqrdmulhh_laneq_s16): Likewise. > (vqrdmulhs_laneq_s32): Likewise. > > 2014-08-04 Kyrylo Tkachov > > * gcc.target/aarch64/simd/vqdmlalh_laneq_s16_1.c: New test. > * gcc.target/aarch64/simd/vqdmlals_laneq_s32_1.c: Likewise. > * gcc.target/aarch64/simd/vqdmlslh_laneq_s16_1.c: Likewise. > * gcc.target/aarch64/simd/vqdmlsls_laneq_s32_1.c: Likewise. > * gcc.target/aarch64/simd/vqdmulhh_laneq_s16_1.c: Likewise. > * gcc.target/aarch64/simd/vqdmulhs_laneq_s32_1.c: Likewise. > * gcc.target/aarch64/simd/vqrdmulhh_laneq_s16_1.c: Likewise. > * gcc.target/aarch64/simd/vqrdmulhs_laneq_s32_1.c: Likewise. > > OK. R.
Re: [PATCH][AArch64] Fix types for vqdmlals_lane_s32 and vqdmlsls_lane_s32 intrinsics
On 04/08/14 17:49, Kyrill Tkachov wrote: > Hi all, > > Now that uint64x1_t and uint64_t are not interchangeable these > intrinsics have an incorrect type. The reason is that these intrinsics > should have had the scalar type for some of their arguments and results > according to the NEON intrinsics spec. > > This patch fixes that and updates the relevant tests. > > Tested on aarch64-none-elf and bootstrapped on aarch64-linux. > > Ok for trunk? > > 2014-08-04 Kyrylo Tkachov > > * config/aarch64/arm_neon.h (vqdmlals_lane_s32): Use scalar types > rather than singleton vectors. > (vqdmlsls_lane_s32): Likewise. > > 2014-08-04 Kyrylo Tkachov > > * gcc.target/aarch64/scalar_intrinsics.c (test_vqdmlals_lane_s32): > Fix types. > (test_vqdmlsls_lane_s32): Likewise. > * gcc.target/aarch64/simd/vqdmlals_lane_s32.c: Likewise. > * gcc.target/aarch64/simd/vqdmlsls_lane_s32.c: Likewise. > > OK. R.
RE: [RFC: Patch, PR 60102] [4.9/4.10 Regression] powerpc fp-bit ices@dwf_regno
Jakub, > On Mon, Aug 04, 2014 at 11:51:34AM -0500, Edmar wrote: > > Committed on trunk, revision 213596 > > Committed on 4.9 branch, revision 213597 > > Note the ChangeLog entry was grossly misformated. > I've fixed it up in gcc/ChangeLog on the trunk, but not on the branch > nor in libgcc. There should be no space before :, all lines > in ChangeLog entry should be just tab indented rather than tab + 2 spaces, > and filenames, unless they are too long, shouldn't be alone on the lines. > And testsuite entries shouldn't go into gcc/ChangeLog. Sorry about that. I have updated the ChangeLog entries accordingly. Edmar, can you please commit these patches? Trunk: pr60102-ChangeLog-trunk.patch GCC v4.9 branch: pr60102-ChangeLog-v49.patch Regards, Rohit pr60102-ChangeLog-trunk.patch Description: pr60102-ChangeLog-trunk.patch pr60102-ChangeLog-v49.patch Description: pr60102-ChangeLog-v49.patch
[PATCH][AArch64] Use REG_P and CONST_INT_P instead of GET_CODE + comparison
Hi all, This is a cleanup to replace usages of GET_CODE (x) == CONST_INT with CONST_INT_P (x) and GET_CODE (x) == REG with REG_P (x). No functional changes. Tested on aarch64-none-elf and bootstrapped on aarch64-linux. Ok for trunk? Thanks, Kyrill 2014-08-05 Kyrylo Tkachov * config/aarch64/aarch64.c (aarch64_classify_address): Use REG_P and CONST_INT_P instead of GET_CODE and compare. (aarch64_select_cc_mode): Likewise. (aarch64_print_operand): Likewise. (aarch64_rtx_costs): Likewise. (aarch64_simd_valid_immediate): Likewise. (aarch64_simd_check_vect_par_cnst_half): Likewise. (aarch64_simd_emit_pair_result_insn): Likewise.commit 5b8fd4ebe8229c61159d3a18ef2cba5b8b78933a Author: Kyrylo Tkachov Date: Mon Aug 4 12:11:29 2014 +0100 [AArch64] Use REG_P and CONST_INT_P instead of GET_CODE diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index f434680..df55f61 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -3253,11 +3253,11 @@ aarch64_classify_address (struct aarch64_address_info *info, op1 = XEXP (x, 1); if (! strict_p - && GET_CODE (op0) == REG + && REG_P (op0) && (op0 == virtual_stack_vars_rtx || op0 == frame_pointer_rtx || op0 == arg_pointer_rtx) - && GET_CODE (op1) == CONST_INT) + && CONST_INT_P (op1)) { info->type = ADDRESS_REG_IMM; info->base = op0; @@ -3545,7 +3545,7 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y) the comparison will have to be swapped when we emit the assembly code. */ if ((GET_MODE (x) == SImode || GET_MODE (x) == DImode) - && (GET_CODE (y) == REG || GET_CODE (y) == SUBREG) + && (REG_P (y) || GET_CODE (y) == SUBREG) && (GET_CODE (x) == ASHIFT || GET_CODE (x) == ASHIFTRT || GET_CODE (x) == LSHIFTRT || GET_CODE (x) == ZERO_EXTEND || GET_CODE (x) == SIGN_EXTEND)) @@ -3554,7 +3554,7 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y) /* Similarly for a negated operand, but we can only do this for equalities. */ if ((GET_MODE (x) == SImode || GET_MODE (x) == DImode) - && (GET_CODE (y) == REG || GET_CODE (y) == SUBREG) + && (REG_P (y) || GET_CODE (y) == SUBREG) && (code == EQ || code == NE) && GET_CODE (x) == NEG) return CC_Zmode; @@ -3714,7 +3714,7 @@ aarch64_print_operand (FILE *f, rtx x, char code) { int n; - if (GET_CODE (x) != CONST_INT + if (!CONST_INT_P (x) || (n = exact_log2 (INTVAL (x) & ~7)) <= 0) { output_operand_lossage ("invalid operand for '%%%c'", code); @@ -3744,7 +3744,7 @@ aarch64_print_operand (FILE *f, rtx x, char code) int n; /* Print N such that 2^N == X. */ - if (GET_CODE (x) != CONST_INT || (n = exact_log2 (INTVAL (x))) < 0) + if (!CONST_INT_P (x) || (n = exact_log2 (INTVAL (x))) < 0) { output_operand_lossage ("invalid operand for '%%%c'", code); return; @@ -3756,7 +3756,7 @@ aarch64_print_operand (FILE *f, rtx x, char code) case 'P': /* Print the number of non-zero bits in X (a const_int). */ - if (GET_CODE (x) != CONST_INT) + if (!CONST_INT_P (x)) { output_operand_lossage ("invalid operand for '%%%c'", code); return; @@ -3767,7 +3767,7 @@ aarch64_print_operand (FILE *f, rtx x, char code) case 'H': /* Print the higher numbered register of a pair (TImode) of regs. */ - if (GET_CODE (x) != REG || !GP_REGNUM_P (REGNO (x) + 1)) + if (!REG_P (x) || !GP_REGNUM_P (REGNO (x) + 1)) { output_operand_lossage ("invalid operand for '%%%c'", code); return; @@ -3841,7 +3841,7 @@ aarch64_print_operand (FILE *f, rtx x, char code) case 'X': /* Print bottom 16 bits of integer constant in hex. */ - if (GET_CODE (x) != CONST_INT) + if (!CONST_INT_P (x)) { output_operand_lossage ("invalid operand for '%%%c'", code); return; @@ -5111,7 +5111,7 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, op1 = SUBREG_REG (op1); if ((GET_CODE (op1) == ZERO_EXTEND || GET_CODE (op1) == SIGN_EXTEND) - && GET_CODE (XEXP (op0, 1)) == CONST_INT + && CONST_INT_P (XEXP (op0, 1)) && (GET_MODE_BITSIZE (GET_MODE (XEXP (op1, 0))) >= INTVAL (XEXP (op0, 1 op1 = XEXP (op1, 0); @@ -7680,7 +7680,7 @@ aarch64_simd_valid_immediate (rtx op, enum machine_mode mode, bool inverse, unsigned HOST_WIDE_INT elpart; unsigned int part, parts; - if (GET_CODE (el) == CONST_INT) + if (CONST_INT_P (el)) { elpart = INTVAL (el); parts = 1; @@ -7986,7 +7986,7 @@ aarch64_simd_check_vect_par_cnst_half (rtx op, enum machine_mode mode, rtx elt_op = XVECEXP (op, 0, i); rtx elt_ideal = XVECEXP (ideal, 0, i); - if (GET_CODE (elt_op) != CONST_INT + if (!CONST_INT_P (elt_op) || INTVAL (elt_ideal) != INTVAL (elt_op)) return false; } @@ -7999,7 +7999,7 @@ void aarch64_simd_lane_bo
Re: [PATCH] fix pr62009 use after free in redirect_edge_var_map_dup
On Tue, Aug 05, 2014 at 10:45:38AM +0200, Richard Biener wrote: > On Mon, Aug 4, 2014 at 3:04 PM, wrote: > > From: Trevor Saunders > > > > Hi, > > > > It used to be that edge_var_maps held pointers to embedded vectors, but > > now it holds vectors. This means that now instead of copying the > > address of the embedded vector from the table we keep a pointer into the > > table. However that's incorrect because we may expand the table when > > inserting new into the map in which case our pointer into the map points > > at freed memory. > > > > gcc/ > > > > * tree-ssa.c (redirect_edge_var_map_dup): copy the value in the > > map for old before inserting new. > > > > testing ongoing on x86_64-unknown-linux-gnu, ok? > > Umm... can you explore what it takes to instead store some kind of > indexes? into what? > That is, how is the above not an issue for other 'head's that may be > live somewhere? what do you mean by other head's? Presumably nobody did something like void **p = pointer_map_contains (t, x); void **q = pointer_map_insert (t, y); *q = *p; because that would be broken, and that's the same problem we have here. So assuming I didn't break anything else nobody else uses a pointer into the table that they have after inserting elements. > Or can you restore what "used to be"? I thought I had to bail out early if the map didn't contain old, but when I actually look at the ddiff it looks like we delt with this before by just inserting new and then getting old so yeah I can just make things work more like they used to. Though maybe at some point we want to try and have less elements in the table with empty vectors. Trev > > Thanks, > Richard. > > > Trev > > > > --- > > gcc/tree-ssa.c | 6 +- > > 1 file changed, 5 insertions(+), 1 deletion(-) > > > > diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c > > index 920cbea..b949d48 100644 > > --- a/gcc/tree-ssa.c > > +++ b/gcc/tree-ssa.c > > @@ -109,7 +109,11 @@ redirect_edge_var_map_dup (edge newe, edge olde) > >if (!head) > > return; > > > > - edge_var_maps->get_or_insert (newe).safe_splice (*head); > > + /* Save what head points at because if we need to insert new into the > > map we > > + may end up expanding the table in which case head will no longer > > point at > > + valid memory. */ > > + vec h = *head; > > + edge_var_maps->get_or_insert (newe).safe_splice (h); > > } > > > > > > -- > > 2.0.1 > >
Re: Remove unnecessary and harmful fixincludes for Android
Hi Andrew, Joseph, thanks for looking at the patch. See my comments and updated patch below. 2014-08-05 0:54 GMT+04:00 Andrew Pinski : > On Mon, Aug 4, 2014 at 8:29 AM, Alexander Ivchenko wrote: >> Hi, >> >> The following patch disables "stdio_va_list" fix: stdio.h is already >> good in Android and, since ndk gcc is indented to be used with >> different Android sysroots, it is actually harmful, because without >> this fix only the version of stdio.h from the sysroot the compiler was >> built with will be used. > > Isn't that why fixincludes is installed to run after the fact on sysroot? > > Thanks, > Andrew Pinski > Are you saying that when you specify "--sysroot" option to an installed compiler, it will "fixinclude" the headers from that sysroot during the compilation process? I don't see that happening. we see that on Android stdio.h is taken only from the sysroot the compiler was built with. 2014-08-05 0:50 GMT+04:00 Joseph S. Myers : > On Mon, 4 Aug 2014, Alexander Ivchenko wrote: > >> +2014-08-04 Alexander Ivchenko >> + >> + * inclhack.def (stdio_va_list): Disable fix for *android*. > > Testing for *android* is less than ideal, because of the possibility of > configuring a *-linux* toolchain to have multilibs using various different > C libraries (with -mandroid being used to select the Android multilib). > So, specifying a bypass based on some relevant text that appears in the > header would be better. > > -- > Joseph S. Myers > jos...@codesourcery.com I've added check for "BIONIC" keyword. Hopefully it won't disappear. Updated patch: diff --git a/fixincludes/ChangeLog b/fixincludes/ChangeLog index f7effee..e05412e 100644 --- a/fixincludes/ChangeLog +++ b/fixincludes/ChangeLog @@ -1,3 +1,10 @@ +2014-08-04 Alexander Ivchenko + + * inclhack.def (stdio_va_list): Bypass fix for Bionic. + (complier_h_tradcpp): Remove. + * fixincl.x: Regenerate. + * tests/base/linux/compiler.h: Remove. + 2014-04-22 Rainer Orth * inclhack.def (math_exception): Bypass on *-*-solaris2.1[0-9]*. diff --git a/fixincludes/fixincl.x b/fixincludes/fixincl.x index dd45802..d555c51 100644 --- a/fixincludes/fixincl.x +++ b/fixincludes/fixincl.x @@ -2,11 +2,11 @@ * * DO NOT EDIT THIS FILE (fixincl.x) * - * It has been AutoGen-ed Tuesday January 7, 2014 at 12:02:54 PM MET + * It has been AutoGen-ed August 5, 2014 at 02:53:14 PM by AutoGen 5.12 * From the definitionsinclhack.def * and the template file fixincl */ -/* DO NOT SVN-MERGE THIS FILE, EITHER Tue Jan 7 12:02:54 MET 2014 +/* DO NOT SVN-MERGE THIS FILE, EITHER Tue Aug 5 14:53:14 MSK 2014 * * You must regenerate it. Use the ./genfixes script. * @@ -15,7 +15,7 @@ * certain ANSI-incompatible system header files which are fixed to work * correctly with ANSI C and placed in a directory that GNU C will search. * - * This file contains 224 fixup descriptions. + * This file contains 223 fixup descriptions. * * See README for more information. * @@ -2111,41 +2111,6 @@ int vfscanf(FILE *, const char *, __builtin_va_list) __asm__ (_BSD_STRING(__USER /* * * * * * * * * * * * * * * * * * * * * * * * * * * - * Description of Complier_H_Tradcpp fix - */ -tSCC zComplier_H_TradcppName[] = - "complier_h_tradcpp"; - -/* - * File name selection pattern - */ -tSCC zComplier_H_TradcppList[] = - "linux/compiler.h\0"; -/* - * Machine/OS name selection pattern - */ -#define apzComplier_H_TradcppMachs (const char**)NULL - -/* - * content selection pattern - do fix if pattern found - */ -tSCC zComplier_H_TradcppSelect0[] = - "#define __builtin_warning\\(x, y\\.\\.\\.\\) \\(1\\)"; - -#defineCOMPLIER_H_TRADCPP_TEST_CT 1 -static tTestDesc aComplier_H_TradcppTests[] = { - { TT_EGREP,zComplier_H_TradcppSelect0, (regex_t*)NULL }, }; - -/* - * Fix Command Arguments for Complier_H_Tradcpp - */ -static const char* apzComplier_H_TradcppPatch[] = { -"format", -"/* __builtin_warning(x, y...) is obsolete */", -(char*)NULL }; - -/* * * * * * * * * * * * * * * * * * * * * * * * * * - * * Description of Ctrl_Quotes_Def fix */ tSCC zCtrl_Quotes_DefName[] = @@ -7234,7 +7199,7 @@ tSCC* apzStdio_Va_ListMachs[] = { * content bypass pattern - skip fix if pattern found */ tSCC zStdio_Va_ListBypass0[] = - "__gnuc_va_list|_BSD_VA_LIST_|__DJ_va_list|_G_va_list"; + "__gnuc_va_list|_BSD_VA_LIST_|__DJ_va_list|_G_va_list|BIONIC"; #defineSTDIO_VA_LIST_TEST_CT 1 static tTestDesc aStdio_Va_ListTests[] = { @@ -9187,9 +9152,9 @@ static const char* apzX11_SprintfPatch[] = { * * List of all fixes */ -#define REGEX_COUNT 261 +#define REGEX_COUNT 260 #define MACH_LIST_SIZE_LIMIT 187 -#define FIX_COUNT224 +#define FIX_COUNT223 /* * Enumerate the fixes @@ -9242,7 +9207,6 @@ typedef enum { BROKEN_CABS_FIXIDX, BROKEN_NAN_FIXIDX, BSD_STDIO_ATTRS_CONFLICT_FIXIDX, -COMPLIER_H_TRADCPP_FIXIDX, CTRL_QUOTES_DEF_FIXIDX, CTRL_QUOTES_USE_FI
Re: Go 1.3 update broke Solaris bootstrap
Hi Ian, > On Tue, Jul 22, 2014 at 1:14 AM, Rainer Orth > wrote: >> The recent Go 1.3 update (which I couldn't find posted to gcc-patches) >> broke Solaris bootstrap: >> >> /vol/gcc/src/hg/trunk/local/libgo/runtime/mem.c:50:20: error: pointer >> targets in passing argument 1 of 'mincore' differ in signedness >> [-Werror=pointer-sign] >>errval = mincore((int8*)v + off, chunk, vec); >> ^ >> In file included from >> /vol/gcc/src/hg/trunk/local/libgo/runtime/runtime.h:22:0, >> from /vol/gcc/src/hg/trunk/local/libgo/runtime/mem.c:8: >> /usr/include/sys/mman.h:232:12: note: expected 'caddr_t' but argument is of >> type 'int8 *' >> extern int mincore(caddr_t, size_t, char *); >> ^ >> /vol/gcc/src/hg/trunk/local/libgo/runtime/mem.c:50:43: error: pointer >> targets in passing argument 3 of 'mincore' differ in signedness >> [-Werror=pointer-sign] >>errval = mincore((int8*)v + off, chunk, vec); >>^ >> In file included from >> /vol/gcc/src/hg/trunk/local/libgo/runtime/runtime.h:22:0, >> from /vol/gcc/src/hg/trunk/local/libgo/runtime/mem.c:8: >> /usr/include/sys/mman.h:232:12: note: expected 'char *' but argument is of >> type 'byte *' >> extern int mincore(caddr_t, size_t, char *); >> ^ >> >> The following patch restores it, though there are certainly other >> options (uint8* for the v cast, a void* cast for vec). > > Thanks. I had to adjust your patch to work on GNU/Linux, for which > the third argument to mincore is unsigned char *. This is what I've > committed to mainline after a bootstrap and testsuite run on > x86_64-unknown-linux-gnu. Thanks. This is exactly the patch I had in my local tree once I noticed the original one broke Linux bootstrap. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [C++ Patch/RFC] PR 43906
Hi, On 08/05/2014 03:58 AM, Jason Merrill wrote: On 08/04/2014 07:01 PM, Paolo Carlini wrote: In fact I wondered about that a few minutes after sending my message... And this is what I figured out: normally we have hard errors from composite_pointer_type (eg, try scalar types, class types), even for null values. The only exception I have been able to find earlier today is that of pointer to the same function type, eg: extern void z(); typedef void (*ptr)(); void i() { if ( z != (ptr)0 ); } but in this case the C front-end too doesn't warn. I don't see why we wouldn't want to warn in this case; it's still the case thet the comparison will always be false. In general, I agree of course. I was trying to understand if keeping the issue as minimally one of consistency with the C front-end simplified it. We can also see this situation for non-function pointers: void f() { int i; if (&i != (int*)0); } Sure. Then, however, we must be careful about the actual pointer types, otherwise we change hard errors to warnings. And void* is an exception to the general rule. I tried using ptr_reasonably_similar to avoid the explicit call of comptypes + separate VOID_TYPE_P, but unfortunately it appears too loose about at least pointers to function types. Thanks! Paolo. // Index: cp/typeck.c === --- cp/typeck.c (revision 213631) +++ cp/typeck.c (working copy) @@ -4353,12 +4353,14 @@ cp_build_binary_op (location_t location, && (code1 == INTEGER_TYPE || code1 == REAL_TYPE || code1 == COMPLEX_TYPE || code1 == ENUMERAL_TYPE)) short_compare = 1; - else if ((code0 == POINTER_TYPE && code1 == POINTER_TYPE) - || (TYPE_PTRDATAMEM_P (type0) && TYPE_PTRDATAMEM_P (type1))) - result_type = composite_pointer_type (type0, type1, op0, op1, - CPO_COMPARISON, complain); else if ((code0 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type0)) - && null_ptr_cst_p (op1)) + && (null_ptr_cst_p (op1) + /* Handle, eg, (void*)0 (c++/43906), and more. */ + || (TYPE_PTR_P (type1) && integer_zerop (op1) + && (VOID_TYPE_P (TREE_TYPE (type1)) + || comptypes (TYPE_MAIN_VARIANT (TREE_TYPE (type0)), +TYPE_MAIN_VARIANT (TREE_TYPE (type1)), +COMPARE_BASE | COMPARE_DERIVED) { if (TREE_CODE (op0) == ADDR_EXPR && decl_with_nonnull_addr_p (TREE_OPERAND (op0, 0))) @@ -4371,7 +4373,13 @@ cp_build_binary_op (location_t location, result_type = type0; } else if ((code1 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type1)) - && null_ptr_cst_p (op0)) + && (null_ptr_cst_p (op0) + /* Handle, eg, (void*)0 (c++/43906), and more. */ + || (TYPE_PTR_P (type0) && integer_zerop (op0) + && (VOID_TYPE_P (TREE_TYPE (type0)) + || comptypes (TYPE_MAIN_VARIANT (TREE_TYPE (type0)), +TYPE_MAIN_VARIANT (TREE_TYPE (type1)), +COMPARE_BASE | COMPARE_DERIVED) { if (TREE_CODE (op1) == ADDR_EXPR && decl_with_nonnull_addr_p (TREE_OPERAND (op1, 0))) @@ -4383,6 +4391,10 @@ cp_build_binary_op (location_t location, } result_type = type1; } + else if ((code0 == POINTER_TYPE && code1 == POINTER_TYPE) + || (TYPE_PTRDATAMEM_P (type0) && TYPE_PTRDATAMEM_P (type1))) + result_type = composite_pointer_type (type0, type1, op0, op1, + CPO_COMPARISON, complain); else if (null_ptr_cst_p (op0) && null_ptr_cst_p (op1)) /* One of the operands must be of nullptr_t type. */ result_type = TREE_TYPE (nullptr_node); Index: testsuite/g++.dg/warn/Waddress-1.C === --- testsuite/g++.dg/warn/Waddress-1.C (revision 0) +++ testsuite/g++.dg/warn/Waddress-1.C (working copy) @@ -0,0 +1,42 @@ +// PR c++/43906 +// { dg-options "-Waddress" } + +extern void z(); +typedef void (*ptrf) (); +typedef int (*ptrfn) (int); +int n; +const int m = 1; +struct S { }; +struct T : S { }; +struct U; +S s; +T t; +double d; + +void f() { if (z) z(); } // { dg-warning "address" } + +void gl() { if (z != 0) z(); } // { dg-warning "address" } +void hl() { if (z != (ptrf)0) z(); }// { dg-warning "address" } +void il() { if (z != (void*)0) z(); } // { dg-warning "address" } +void jl() { if (&n != (int*)0) z(); } // { dg-warning "address" } +void kl() { if (&m != (int*)0) z(); } // { dg-warning "address" } +void ll() { if (&s != (T*)0) z(); } // { dg-warning "address" } +void m
Re: [PATCH][testsuite] Don't run cproj-fails-with-broken-glibc.c for broken glibc
Hi Mike, >> Or do we go with the removal suggestion of Mike? > > I’ll let others weigh in. I’m fine either way. Original author likely > prefers the xfail, so I’m fine with the above. the test now XPASSes on Solaris, adding testsuite noise in the other direction on completely innocent systems. Given that we're not actually testing gcc behaviour or a workaround for a glibc bug here, but just alerting users to changed cproj semantics on old and newer glibc systems https://gcc.gnu.org/ml/gcc-patches/2010-04/msg01285.html I'd strongly suggest just removing the test and the supporting effective-target keywords. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Add D demangling support to libiberty
Attached revision #2 of the patch. --- Add D demangling support for version 2 of the ABI. include/ChangeLog 2014-08-05 Iain Buclaw * demangle.h (DMGL_DLANG): New macro. (DMGL_STYLE_MASK): Add DMGL_DLANG. (demangling_styles): Add dlang_demangling. (DLANG_DEMANGLING_STYLE_STRING): New macro. (DLANG_DEMANGLING): New macro. (dlang_demangle): New prototype. libiberty/ChangeLog 2014-08-05 Iain Buclaw * Makefile.in (CFILES): Add d-demangle.c. (REQUIRED_OFILES): Add d-demangle.o. * cplus-dem.c (libiberty_demanglers): Add dlang_demangling case. (cplus_demangle): Likewise. * d-demangle.c: New file. * testsuite/Makefile.in (really-check): Add check-d-demangle. * testsuite/d-demangle-expected: New file. --- diff --git a/include/demangle.h b/include/demangle.h index bbad71b..d2a6731 100644 --- a/include/demangle.h +++ b/include/demangle.h @@ -63,9 +63,10 @@ extern "C" { #define DMGL_EDG (1 << 13) #define DMGL_GNU_V3 (1 << 14) #define DMGL_GNAT (1 << 15) +#define DMGL_DLANG (1 << 16) /* If none of these are set, use 'current_demangling_style' as the default. */ -#define DMGL_STYLE_MASK (DMGL_AUTO|DMGL_GNU|DMGL_LUCID|DMGL_ARM|DMGL_HP|DMGL_EDG|DMGL_GNU_V3|DMGL_JAVA|DMGL_GNAT) +#define DMGL_STYLE_MASK (DMGL_AUTO|DMGL_GNU|DMGL_LUCID|DMGL_ARM|DMGL_HP|DMGL_EDG|DMGL_GNU_V3|DMGL_JAVA|DMGL_GNAT|DMGL_DLANG) /* Enumeration of possible demangling styles. @@ -87,7 +88,8 @@ extern enum demangling_styles edg_demangling = DMGL_EDG, gnu_v3_demangling = DMGL_GNU_V3, java_demangling = DMGL_JAVA, - gnat_demangling = DMGL_GNAT + gnat_demangling = DMGL_GNAT, + dlang_demangling = DMGL_DLANG } current_demangling_style; /* Define string names for the various demangling styles. */ @@ -102,6 +104,7 @@ extern enum demangling_styles #define GNU_V3_DEMANGLING_STYLE_STRING"gnu-v3" #define JAVA_DEMANGLING_STYLE_STRING "java" #define GNAT_DEMANGLING_STYLE_STRING "gnat" +#define DLANG_DEMANGLING_STYLE_STRING "dlang" /* Some macros to test what demangling style is active. */ @@ -115,6 +118,7 @@ extern enum demangling_styles #define GNU_V3_DEMANGLING (((int) CURRENT_DEMANGLING_STYLE) & DMGL_GNU_V3) #define JAVA_DEMANGLING (((int) CURRENT_DEMANGLING_STYLE) & DMGL_JAVA) #define GNAT_DEMANGLING (((int) CURRENT_DEMANGLING_STYLE) & DMGL_GNAT) +#define DLANG_DEMANGLING (((int) CURRENT_DEMANGLING_STYLE) & DMGL_DLANG) /* Provide information about the available demangle styles. This code is pulled from gdb into libiberty because it is useful to binutils also. */ @@ -169,6 +173,9 @@ java_demangle_v3 (const char *mangled); char * ada_demangle (const char *mangled, int options); +extern char * +dlang_demangle (const char *mangled, int options); + enum gnu_v3_ctor_kinds { gnu_v3_complete_object_ctor = 1, gnu_v3_base_object_ctor, diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in index 44e340f..9b87720 100644 --- a/libiberty/Makefile.in +++ b/libiberty/Makefile.in @@ -127,7 +127,7 @@ CFILES = alloca.c argv.c asprintf.c atexit.c\ basename.c bcmp.c bcopy.c bsearch.c bzero.c \ calloc.c choose-temp.c clock.c concat.c cp-demangle.c \ cp-demint.c cplus-dem.c crc32.c\ - dwarfnames.c dyn-string.c \ + d-demangle.c dwarfnames.c dyn-string.c\ fdmatch.c ffs.c fibheap.c filename_cmp.c floatformat.c \ fnmatch.c fopen_unlocked.c \ getcwd.c getopt.c getopt1.c getpagesize.c getpwd.c getruntime.c \ @@ -167,7 +167,7 @@ REQUIRED_OFILES = \ ./md5.$(objext) ./sha1.$(objext) ./alloca.$(objext) \ ./argv.$(objext) \ ./choose-temp.$(objext) ./concat.$(objext) \ - ./cp-demint.$(objext) ./crc32.$(objext)\ + ./cp-demint.$(objext) ./crc32.$(objext) ./d-demangle.$(objext) \ ./dwarfnames.$(objext) ./dyn-string.$(objext) \ ./fdmatch.$(objext) ./fibheap.$(objext)\ ./filename_cmp.$(objext) ./floatformat.$(objext) \ @@ -714,6 +714,14 @@ $(CONFIGURED_OFILES): stamp-picdir stamp-noasandir else true; fi $(COMPILE.c) $(srcdir)/dyn-string.c $(OUTPUT_OPTION) +./d-demangle.$(objext): $(srcdir)/d-demangle.c config.h $(INCDIR)/ansidecl.h \ + $(srcdir)/cp-demangle.h $(INCDIR)/demangle.h \ + $(INCDIR)/dyn-string.h $(INCDIR)/getopt.h $(INCDIR)/libiberty.h + if [ x"$(PICFLAG)" != x ]; then \ + $(COMPILE.c) $(PICFLAG) $(srcdir)/d-demangle.c -o pic/$@; \ + else true; fi + $(COMPILE.c) $(srcdir)/d-demangle.c $(OUTPUT_OPTION) + ./fdmatch.$(objext): $(srcdir)/fdmatch.c config.h $(INCDIR)/ansidecl.h \ $(INCDIR)/libiberty.h if [ x"$(PICFLAG)" != x ]; then \ diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c index 52767cc..c68b981 100644 --- a/libiberty/cplus-dem.c +++ b/libiberty/cplus-dem.c @@ -306,6 +306,12 @@ const struct demangler_engine libiberty_demanglers[] = } , { +DLANG_DEMANGLING_STYLE_STRING, +dlang_demangling, +"DLANG style demangling" + } + , + { NULL, unknown_demangling, NULL } }; @@ -870,6 +876,13 @@ cplus_demangle (cons
Re: [C++ Patch/RFC] PR 43906
.. a clarification. As I tried to briefly explain yesterday, this kind of change means that: extern void z(); void il() { if (z != (void*)0) z(); } doesn't trigger anymore the pedwarn at beginning of composite_pointer_type about the comparison itself, for the simple reason that we don't call it anymore. I suppose that's Ok. Otherwise we have to change something, it's a bit tricky tough. We could even decide that we don't want to handle the above for -Waddress because conceptually the issue with the comparison itself comes before, as again I tried to explain a bit yesterday, but that triggers a warning only with -pedantic (EDG has it by default). Paolo.
Re: [C++ Patch/RFC] PR 43906
On 08/05/2014 08:10 AM, Paolo Carlini wrote: .. a clarification. As I tried to briefly explain yesterday, this kind of change means that: extern void z(); void il() { if (z != (void*)0) z(); } doesn't trigger anymore the pedwarn at beginning of composite_pointer_type about the comparison itself, for the simple reason that we don't call it anymore. So let's keep calling it when the RHS is also a pointer? Jason
[PATCH v2] fix pr62009 use after free in redirect_edge_var_map_dup
From: Trevor Saunders hi, The change to get the entry for the old edge before inserting the new one was incorrect because if inserting the new one resized the table then the pointer to the entry for the old one would become invalid. gcc/ * tree-ssa.c (redirect_edge_var_map_dup): insert newe before getting olde. bootstrapping on and regtest on x86_64-unknown-linux-gnu and bootstrap on i686-linux-gnu ongoing, ok? Trev --- gcc/tree-ssa.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c index 920cbea..b6b3718 100644 --- a/gcc/tree-ssa.c +++ b/gcc/tree-ssa.c @@ -105,11 +105,12 @@ redirect_edge_var_map_dup (edge newe, edge olde) if (!edge_var_maps) return; - auto_vec *head = edge_var_maps->get (olde); - if (!head) + auto_vec *new_head = &edge_var_maps->get_or_insert (newe); + auto_vec *old_head = edge_var_maps->get (olde); + if (!old_head) return; - edge_var_maps->get_or_insert (newe).safe_splice (*head); + new_head->safe_splice (*old_head); } -- 2.0.1
Re: [PATCH] fix pr62009 use after free in redirect_edge_var_map_dup
On Tue, Aug 5, 2014 at 1:17 PM, Trevor Saunders wrote: > On Tue, Aug 05, 2014 at 10:45:38AM +0200, Richard Biener wrote: >> On Mon, Aug 4, 2014 at 3:04 PM, wrote: >> > From: Trevor Saunders >> > >> > Hi, >> > >> > It used to be that edge_var_maps held pointers to embedded vectors, but >> > now it holds vectors. This means that now instead of copying the >> > address of the embedded vector from the table we keep a pointer into the >> > table. However that's incorrect because we may expand the table when >> > inserting new into the map in which case our pointer into the map points >> > at freed memory. >> > >> > gcc/ >> > >> > * tree-ssa.c (redirect_edge_var_map_dup): copy the value in the >> > map for old before inserting new. >> > >> > testing ongoing on x86_64-unknown-linux-gnu, ok? >> >> Umm... can you explore what it takes to instead store some kind of >> indexes? > > into what? No idea - I didn't look into the affected code not do I remember it off-head. >> That is, how is the above not an issue for other 'head's that may be >> live somewhere? > > what do you mean by other head's? Presumably nobody did something like > > void **p = pointer_map_contains (t, x); > void **q = pointer_map_insert (t, y); > *q = *p; > > because that would be broken, and that's the same problem we have here. > So assuming I didn't break anything else nobody else uses a pointer into > the table that they have after inserting elements. Ah, now I understand. I thought somebody might hold onto 'p' to some other random element. At least if that was valid "before". If not the patch is ok. Thanks, Richard. >> Or can you restore what "used to be"? > > I thought I had to bail out early if the map didn't contain old, but > when I actually look at the ddiff it looks like we delt with this before > by just inserting new and then getting old so yeah I can just make > things work more like they used to. Though maybe at some point we want > to try and have less elements in the table with empty vectors. > > Trev > >> >> Thanks, >> Richard. >> >> > Trev >> > >> > --- >> > gcc/tree-ssa.c | 6 +- >> > 1 file changed, 5 insertions(+), 1 deletion(-) >> > >> > diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c >> > index 920cbea..b949d48 100644 >> > --- a/gcc/tree-ssa.c >> > +++ b/gcc/tree-ssa.c >> > @@ -109,7 +109,11 @@ redirect_edge_var_map_dup (edge newe, edge olde) >> >if (!head) >> > return; >> > >> > - edge_var_maps->get_or_insert (newe).safe_splice (*head); >> > + /* Save what head points at because if we need to insert new into the >> > map we >> > + may end up expanding the table in which case head will no longer >> > point at >> > + valid memory. */ >> > + vec h = *head; >> > + edge_var_maps->get_or_insert (newe).safe_splice (h); >> > } >> > >> > >> > -- >> > 2.0.1 >> >
Re: [PATCH v2] fix pr62009 use after free in redirect_edge_var_map_dup
On Tue, Aug 5, 2014 at 2:32 PM, wrote: > From: Trevor Saunders > > hi, > > The change to get the entry for the old edge before inserting the new > one was incorrect because if inserting the new one resized the table > then the pointer to the entry for the old one would become invalid. > > gcc/ > > * tree-ssa.c (redirect_edge_var_map_dup): insert newe before > getting olde. > > bootstrapping on and regtest on x86_64-unknown-linux-gnu and bootstrap on > i686-linux-gnu ongoing, ok? Also works for me. Richard. > Trev > > --- > gcc/tree-ssa.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c > index 920cbea..b6b3718 100644 > --- a/gcc/tree-ssa.c > +++ b/gcc/tree-ssa.c > @@ -105,11 +105,12 @@ redirect_edge_var_map_dup (edge newe, edge olde) >if (!edge_var_maps) > return; > > - auto_vec *head = edge_var_maps->get (olde); > - if (!head) > + auto_vec *new_head = &edge_var_maps->get_or_insert (newe); > + auto_vec *old_head = edge_var_maps->get (olde); > + if (!old_head) > return; > > - edge_var_maps->get_or_insert (newe).safe_splice (*head); > + new_head->safe_splice (*old_head); > } > > > -- > 2.0.1 >
Fix vec_extract_lo constraint.
Hi, I've noticed that vec_extract_lo_ pattern has vm/vm alternative when mask is not applied. This can lead to insn with 2 memory operands. Patch bellow fixes it. Ok for trunk? 2014-08-05 Ilya Tocar * common/config/i386/sse.md (vec_extract_lo_): Fix constraint. --- gcc/config/i386/sse.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 0f7ca27..85f48ab 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -5999,9 +5999,9 @@ (set_attr "mode" "")]) (define_insn "vec_extract_lo_" - [(set (match_operand: 0 "" "=") + [(set (match_operand: 0 "" "=v,") (vec_select: - (match_operand:V8FI 1 "nonimmediate_operand" "vm") + (match_operand:V8FI 1 "nonimmediate_operand" "vm,v") (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)])))] "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))" -- 1.8.3.1
Re: [PATCH, Pointer Bounds Checker 27/x] Strlen
eOn 31 Jul 14:07, Ilya Enkovich wrote: > 2014-06-11 12:22 GMT+04:00 Jakub Jelinek : > > On Wed, Jun 11, 2014 at 12:14:14PM +0400, Ilya Enkovich wrote: > >> This patch adds instrumented code support for strlen optimization. > >> > >> Bootstrapped and tested on linux-x86_64. > >> > >> Does it look OK? > > > > Ugh, this looks terribly ugly. So you are changing the meaning of arguments > > of every single builtin that has pointer arguments, so all spots that > > work with those builtins have to take into account > > gimple_call_with_bounds_p and do something different? > > That is very error prone and maintainance nightmare. > > > > Jakub > > Hi Jakub, > > I've tried an approach with internal functions usage as you suggested > on Cauldron. > > Shortly about used scheme: > - only a selected set of builtin functions is instrumented > - during instrumentation builtin call is replaced with internal function > call > > I tried to implement it and faced few issues: > > 1. On expand pass we need to expand all those internal function and > since they should be expanded as regular instrumented calls, I need to > create a CALL_EXPR tree for that. I have nothing to use as fndecl for > this call except the same instrumented builtin fndecl I use right now. > Therefore the problem of instrumented call with the same builtin > function code would still exist during expand pass. > 2. Some builtins actually may have body. If I instrument such call and > replace it with internal call then I loose a possibility to inline it. > If I do not replace it with internal call then I have to make an > instrumented call and again I have to create and use instrumented > builtin fndecl having the same situation I have now but this time only > until call is inlined. > 3. Usage of internal functions means we cannot support whole program > instrumentation (or we have to declare internal function for each > builtin) which may appear a very useful scheme with reduced memory and > performance penalties. > > Now I want to try another scheme. I want to change enum > built_in_function so that each builtin code has a paired code to be > used for instrumented version. I'm not going to actually declare > additional builtins for users, I just want to obtain special function > codes to be used for instrumented builtins and thus to not make > current code handling builtins to work with instrumented calls. > > Ilya I've tried to use different function codes for instrumented builtin calls and this scheme looks better to me. It allows to work with instrumented builtin calls as with regular calls which is convenient. At the same time all current builtin call handlers are not affected. Here is a core change required: diff --git a/gcc/tree-core.h b/gcc/tree-core.h index b70c262..5b8bb5a 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -168,6 +168,14 @@ enum built_in_class { enum built_in_function { #include "builtins.def" + BEGIN_CHKP_BUILTINS, + +#undef DEF_BUILTIN +#define DEF_BUILTIN(ENUM, N, C, T, LT, B, F, NA, AT, IM, COND) ENUM##_CHKP, +#include "builtins.def" + + END_CHKP_BUILTINS, + /* Complex division routines in libgcc. These are done via builtins because emit_library_call_value can't handle complex values. */ BUILT_IN_COMPLEX_MUL_MIN, @@ -1490,7 +1498,7 @@ struct GTY(()) tree_function_decl { DECL_FUNCTION_CODE. Otherwise unused. ??? The bitfield needs to be able to hold all target function codes as well. */ - ENUM_BITFIELD(built_in_function) function_code : 11; + ENUM_BITFIELD(built_in_function) function_code : 12; ENUM_BITFIELD(built_in_class) built_in_class : 2; unsigned static_ctor_flag : 1; @@ -1514,7 +1522,7 @@ struct GTY(()) tree_function_decl { unsigned has_debug_args_flag : 1; unsigned tm_clone_flag : 1; unsigned versioned_function : 1; - /* No bits left. */ + /* 31 bits left. */ }; struct GTY(()) tree_translation_unit_decl { Unfortunately tree_function_decl does not have any free bits to be used for increased enum size. Suppose some flag may be moved into tree_decl_with_vis which has some free bits. Please tell me if this approach with builtins looks OK and I would update affected patches then. Strlen patch in this thread with new approach transforms from stability to performance patch. Below is a modified version of this patch with new function codes introduced. Thanks, Ilya -- diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index f55b7ee..93e88f7 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -50,6 +50,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-pretty-print.h" #include "params.h" #include "expr.h" +#include "tree-chkp.h" /* A vector indexed by SSA_NAME_VERSION. 0 means unknown, positive value is an index into strinfo vector, negative value stands for @@ -426,6 +427,7 @@ get_string_length (strinfo si) if (si->stmt) { gimple stmt = si->stmt, le
Re: Fix vec_extract_lo constraint.
On Tue, Aug 5, 2014 at 2:43 PM, Ilya Tocar wrote: > Hi, > I've noticed that vec_extract_lo_ pattern has > vm/vm alternative when mask is not applied. This can lead to insn > with 2 memory operands. Patch bellow fixes it. > Ok for trunk? > > 2014-08-05 Ilya Tocar > > * common/config/i386/sse.md (vec_extract_lo_): Fix > constraint. > > --- > gcc/config/i386/sse.md | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index 0f7ca27..85f48ab 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -5999,9 +5999,9 @@ > (set_attr "mode" "")]) > > (define_insn "vec_extract_lo_" > - [(set (match_operand: 0 "" > "=") > + [(set (match_operand: 0 "" > "=v,") > (vec_select: > - (match_operand:V8FI 1 "nonimmediate_operand" "vm") > + (match_operand:V8FI 1 "nonimmediate_operand" "vm,v") > (parallel [(const_int 0) (const_int 1) > (const_int 2) (const_int 3)])))] >"TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))" I'd suggest op0: "=,v" and op1: "v,m". This would result in op0:"=vm,v" op1:"v,m" and op0:"=v,v" op1:"v,m". Uros.
Re: [PATCH, ivopt] Try aligned offset when get_address_cost
On Mon, Aug 4, 2014 at 11:09 AM, Zhenqiang Chen wrote: > > >> -Original Message- >> From: Bin.Cheng [mailto:amker.ch...@gmail.com] >> Sent: Monday, August 04, 2014 4:41 PM >> To: Zhenqiang Chen >> Cc: gcc-patches List >> Subject: Re: [PATCH, ivopt] Try aligned offset when get_address_cost >> >> On Mon, Aug 4, 2014 at 2:28 PM, Zhenqiang Chen >> wrote: >> > Hi, >> > >> > For some TARGET, like ARM THUMB1, the offset in load/store should be >> > nature aligned. But in function get_address_cost, when computing >> > max_offset, it only tries byte-aligned offsets: >> > >> > ((unsigned HOST_WIDE_INT) 1 << i) - 1 >> > >> > which can not meet thumb_legitimate_offset_p check called from >> > thumb1_legitimate_address_p for HImode and SImode. >> > >> > The patch adds additional try for aligned offset: >> > >> > ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode). >> > >> > Bootstrap and no make check regression on X86-64. >> > No make check regression on qemu for Cortex-m0 and Cortex-m3. >> > For Cortex-m0, no performance changes with coremark and dhrystone. >> > Coremark code size is ~0.44 smaller. And eembcv2 code size is ~0.22 >> > smaller. CSiBE code size is ~0.05% smaller. >> > >> > OK for trunk? >> > >> > Thanks! >> > -Zhenqiang >> > >> > ChangeLog >> > 2014-08-04 Zhenqiang Chen >> > >> > * tree-ssa-loop-ivopts.c (get_address_cost): Try aligned offset. >> > >> > testsuite/ChangeLog: >> > 2014-08-04 Zhenqiang Chen >> > >> > * gcc.target/arm/get_address_cost_aligned_max_offset.c: New > test. >> > >> > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c >> > index 3b4a6cd..562122a 100644 >> > --- a/gcc/tree-ssa-loop-ivopts.c >> > +++ b/gcc/tree-ssa-loop-ivopts.c >> > @@ -3308,6 +3308,18 @@ get_address_cost (bool symbol_present, bool >> > var_present, >> > XEXP (addr, 1) = gen_int_mode (off, address_mode); >> > if (memory_address_addr_space_p (mem_mode, addr, as)) >> > break; >> > + /* For some TARGET, like ARM THUMB1, the offset should be > nature >> > +aligned. Try an aligned offset if address_mode is not > QImode. >> > */ >> > + off = (address_mode == QImode) >> > + ? 0 >> > + : ((unsigned HOST_WIDE_INT) 1 << i) >> > + - GET_MODE_SIZE (address_mode); >> > + if (off > 0) >> > + { >> > + XEXP (addr, 1) = gen_int_mode (off, address_mode); >> > + if (memory_address_addr_space_p (mem_mode, addr, as)) >> > + break; >> > + } >> Hi, Why not just check "address_mode != QImode"? Set off to 0 then check > it >> seems unnecessary. > > Thanks for the comments. > > ((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE (address_mode) might be a > negative value except QImode. A negative value can not be max_offset. So we > do not need to check it. > > For QImode, "((unsigned HOST_WIDE_INT) 1 << i) - GET_MODE_SIZE > (address_mode)" == "((unsigned HOST_WIDE_INT) 1 << i) - 1". It is already > checked. So no need to check it again. > > I think the compiler can optimize the patch like > > diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c > index 3b4a6cd..213598a 100644 > --- a/gcc/tree-ssa-loop-ivopts.c > +++ b/gcc/tree-ssa-loop-ivopts.c > @@ -3308,6 +3308,19 @@ get_address_cost (bool symbol_present, bool > var_present, > XEXP (addr, 1) = gen_int_mode (off, address_mode); > if (memory_address_addr_space_p (mem_mode, addr, as)) > break; > + /* For some TARGET, like ARM THUMB1, the offset should be nature > +aligned. Try an aligned offset if address_mode is not QImode. > */ > + if (address_mode != QImode) > + { > + off = ((unsigned HOST_WIDE_INT) 1 << i) > + - GET_MODE_SIZE (address_mode); > + if (off > 0) > + { > + XEXP (addr, 1) = gen_int_mode (off, address_mode); > + if (memory_address_addr_space_p (mem_mode, addr, as)) > + break; > + } > + } > } >if (i == -1) > off = 0; But is off now guaranteed to be the max value? (1 << (i-1) ) - 1 for small i is larger than (1 << i) - GET_MODE_SIZE (address_mode). That is, I think you want to guard this with 1 << (i - 1) > GET_MODE_SIZE (address_mode)? You don't adjust the negative offset side - why? Richard. > >> Thanks, >> bin >> > } >> >if (i == -1) >> > off = 0; >> > diff --git >> > a/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c >> > b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset.c >> > new file mode 100644 >> > index 000..cc3e2f7 >> > --- /dev/null >> > +++ >> b/gcc/testsuite/gcc.target/arm/get_address_cost_aligned_max_offset >> > +++ .c >> > @@ -0,0 +1,28 @@ >> > +/* { dg-do compile } */ >> > +/* { dg-options "-mthumb -O2" } */ >> > +/* { d
Re: Remove unnecessary and harmful fixincludes for Android
Hi, On Tue, Aug 5, 2014 at 4:35 AM, Alexander Ivchenko wrote: >> Testing for *android* is less than ideal, because of the possibility of >> configuring a *-linux* toolchain to have multilibs using various different >> C libraries (with -mandroid being used to select the Android multilib). >> So, specifying a bypass based on some relevant text that appears in the >> header would be better. >> >> -- >> Joseph S. Myers >> jos...@codesourcery.com > > I've added check for "BIONIC" keyword. Hopefully it won't disappear. "based on some relevant text" I think that's important, too (that it be relevant). "BIONIC" is just some improbable text you found in the header. My guess is that testing for '*android*' would be more selective, and certainly less obscure. Who would ever guess that "BIONIC" implies "android"? > Updated patch: > > diff --git a/fixincludes/ChangeLog b/fixincludes/ChangeLog > index f7effee..e05412e 100644 > --- a/fixincludes/ChangeLog > +++ b/fixincludes/ChangeLog > @@ -1,3 +1,10 @@ > +2014-08-04 Alexander Ivchenko > + > + * inclhack.def (stdio_va_list): Bypass fix for Bionic. > + (complier_h_tradcpp): Remove. > + * fixincl.x: Regenerate. > + * tests/base/linux/compiler.h: Remove. > + > 2014-04-22 Rainer Orth > > * inclhack.def (math_exception): Bypass on *-*-solaris2.1[0-9]*. > diff --git a/fixincludes/fixincl.x b/fixincludes/fixincl.x [[generated text is not needed for approval]] > diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def > index 6a1136c..bf452c6 100644 > --- a/fixincludes/inclhack.def > +++ b/fixincludes/inclhack.def > @@ -1140,20 +1140,6 @@ fix = { > }; > > /* > - * Old Linux kernel's header breaks Traditional CPP > - */ > -fix = { > -hackname = complier_h_tradcpp; [[ OK ]] > @@ -3722,8 +3708,9 @@ fix = { > fix = { > hackname = stdio_va_list; > files= stdio.h; > -bypass = '__gnuc_va_list|_BSD_VA_LIST_|__DJ_va_list|_G_va_list'; > +bypass = '__gnuc_va_list|_BSD_VA_LIST_|__DJ_va_list|_G_va_list|BIONIC'; > /* > + * In Bionic va_list is always appropriately typedefed to __gnuc_va_list. And that typedef does not live in stdio.h either. If there is no better way to identify this file, then > Is it ok? It is "okay". (You may be left with little choice -- I can't see the header :).
Re: [PATCH 2/2] Enable elimination of zext/sext
On Fri, Aug 1, 2014 at 6:03 PM, Kugan wrote: if (rhs_uns) return wi::ge_p (min, 0); // if min >= 0 then range contains positive values else return wi::le_p (max, wi::max_value (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); // if max <= signed-max-of-type then range doesn't need sign-extension >>> >>> I think we will have to check that ssa has necessary sign/zero extension >>> when assigned to lhs_type. If PROMOTE_MODE tells us that ssa's type will >>> be interpreted differently, the value range of ssa also will have >>> corresponding range. In this cases, shouldn’t we have to check for >>> upper and lower limit for both min and max? >> >> Hmm? That's exactly what the check is testing... we know that >> min <= max thus if min >= 0 then max >= 0. >> >> zero_extension will never do anything on [0, INF] >> >> If max < MAX-SIGNED then sign-extension will not do anything. Ok, >> sign-extension will do sth for negative values still. So rather >> >> if (rhs_uns) >> return wi::geu_p (min, 0); >> else >> return wi::ges_p (min, 0) && wi::les_p (max, wi::max_value >> (TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED)); >> >> ? > > Thanks for the explanation. I agree. Don’t we have to however check this > on lhs_uns as this function is checking if ssa is promoted for lhs_sign > and lhs_mode? > > Here is an attempt based on this. I ran regression testing with > arm-none-linux-gnueabi on qemu-arm without any new regressions. > > Sine I am not comparing value ranges to see if it can be represented in > lhs_sigh, I can now skip the PROMOTED_MODE check. Now I'm lost. You call this function from two contexts: diff --git a/gcc/calls.c b/gcc/calls.c index a3e6faa..eac512f 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1484,7 +1484,10 @@ precompute_arguments (int num_actuals, struct arg_data *args) args[i].initial_value = gen_lowpart_SUBREG (mode, args[i].value); SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1; - SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); + if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp)) + SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp); and @@ -9527,7 +9587,10 @@ expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode, temp = gen_lowpart_SUBREG (mode, decl_rtl); SUBREG_PROMOTED_VAR_P (temp) = 1; - SUBREG_PROMOTED_SET (temp, unsignedp); + if (is_promoted_for_type (ssa_name, mode, !unsignedp)) + SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED); + else + SUBREG_PROMOTED_SET (temp, unsignedp); return temp; } what's the semantic of setting SRP_SIGNED_AND_UNSIGNED on the subreg? That is, for the created (subreg:lhs_mode (reg: N))? it seems that we need to verify that 'ssa', when promoted, does not have bits set above the target modes MSB when we know it is zero-extended (according to PROMOTE_MODE)? Or has all bits set to one and is sign-extended (according to PROMOTE_MODE)? Now it seems that the promotion is according to promote_{function,decl}_mode in expand_expr_real_1 and according to promote_mode in calls.c. The function comment above promoted_for_type_p needs to be more elaborate on what invariant it checks. As you pass in the subreg mode but you need to verify the larger mode is properly extended. > I am still using wide_int::from (instead of wi::max_value) to get the > limit as I have to match the precision with min, max precision. > otherwise wide_int comparisons will not work. Is there a better way for > this? I don't understand. wi::max_value takes a precision argument. > > /* Return TRUE if value in SSA is already zero/sign extended for lhs type >(type here is the combination of LHS_MODE and LHS_UNS) using value range >information stored. Return FALSE otherwise. */ > bool > promoted_for_type_p (tree ssa, enum machine_mode lhs_mode, bool lhs_uns) > { > wide_int min, max, limit; > tree lhs_type; > bool rhs_uns; > signop rhs_signop; > > if (ssa == NULL_TREE > || TREE_CODE (ssa) != SSA_NAME > || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))) > return false; > > /* Return FALSE if value_range is not recorded for SSA. */ > if (get_range_info (ssa, &min, &max) != VR_RANGE) > return false; > > rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa)); > rhs_signop = rhs_uns ? UNSIGNED : SIGNED; > lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns); > limit = wide_int::from (TYPE_MAX_VALUE (lhs_type), > TYPE_PRECISION (TREE_TYPE (ssa)), SIGNED); > > if (lhs_uns) > /* If min >= 0 then range contains positive values and doesnt need >zero-extension. */ > return wi::ge_p (min, 0, rhs_signop); > else > /* If min >= 0 and max <=
Re: [PATCH 2/2] Enable elimination of zext/sext
On Tue, Aug 05, 2014 at 04:17:41PM +0200, Richard Biener wrote: > what's the semantic of setting SRP_SIGNED_AND_UNSIGNED > on the subreg? That is, for the created (subreg:lhs_mode > (reg: N))? SRP_SIGNED_AND_UNSIGNED on a subreg should mean that the subreg is both zero and sign extended, which means that the topmost bit of the narrower mode is known to be zero, and all bits above it in the wider mode are known to be zero too. SRP_SIGNED means that the topmost bit of the narrower mode is either 0 or 1 and depending on that the above wider mode bits are either all 0 or all 1. SRP_UNSIGNED means that regardless of the topmost bit value, all above wider mode bits are 0. Jakub
Re: [C PATCH] Discard P - (P + CST) optimization in pointer_diff (PR c/61240)
On Mon, Aug 04, 2014 at 02:04:36PM +0200, Richard Biener wrote: > > Looks like .fre can optimize "q - (q - 1)" into 1: > >: > >q.0_3 = (long int) &MEM[(void *)&i + 4B]; > >_5 = (long int) &i; > > - _6 = q.0_3 - _5; > > - t.1_7 = _6 /[ex] 4; > > - t ={v} t.1_7; > > + t ={v} 1; > >i ={v} {CLOBBER}; > >return; > > > > But associate_plusminus doesn't optimize it: > > else if (code == MINUS_EXPR > >&& CONVERT_EXPR_CODE_P (def_code) > >&& TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME > >&& TREE_CODE (rhs2) == SSA_NAME) > > { > > /* (T)(P + A) - (T)P -> (T)A. */ > > becase gimple_assign_rhs1 (def_stmt) is not an SSA_NAME, but ADDR_EXPR (it's > > &MEM[(void *)&i + 4B]). Then there's transformation A - (A +- B) -> -+ B > > below, but that doesn't handle casts. > > > > So - should I try to handle it in associate_plusminus? > > Yes please, with a (few) testcase(s). Ok, so the following adds the (T)P - (T)(P + A) -> (T)-A transformation. It is based on code hunk that does the (T)(P + A) - (T)P -> (T)A transformation. The difference it makes is in the .optimized dump something like: int fn2(int, int) (int p, int i) { - unsigned int p.2_2; + unsigned int _3; int _4; - unsigned int _5; unsigned int _6; - int _7; : - p.2_2 = (unsigned int) p_1(D); - _4 = p_1(D) + i_3(D); - _5 = (unsigned int) _4; - _6 = p.2_2 - _5; - _7 = (int) _6; - return _7; + _6 = (unsigned int) i_2(D); + _3 = -_6; + _4 = (int) _3; + return _4; i.e., the PLUS_EXPR and MINUS_EXPR are gone, and NEGATE_EXPR is used instead. During bootstrap with --enable-languages=c,c++ this optimization triggered 238 times. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2014-08-05 Marek Polacek PR c/61240 * tree-ssa-forwprop.c (associate_plusminus): Add (T)P - (T)(P + A) -> (T)-A transformation. c/ * c-typeck.c (pointer_diff): Remove P - (P + CST) optimization. testsuite/ * gcc.dg/pr61240.c: New test. * gcc.dg/tree-ssa/forwprop-29.c: New test. diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index fe9440c..5f46944 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -3460,7 +3460,6 @@ pointer_diff (location_t loc, tree op0, tree op1) addr_space_t as0 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op0))); addr_space_t as1 = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (op1))); tree target_type = TREE_TYPE (TREE_TYPE (op0)); - tree con0, con1, lit0, lit1; tree orig_op1 = op1; /* If the operands point into different address spaces, we need to @@ -3490,7 +3489,6 @@ pointer_diff (location_t loc, tree op0, tree op1) else inttype = restype; - if (TREE_CODE (target_type) == VOID_TYPE) pedwarn (loc, OPT_Wpointer_arith, "pointer of type % used in subtraction"); @@ -3498,50 +3496,6 @@ pointer_diff (location_t loc, tree op0, tree op1) pedwarn (loc, OPT_Wpointer_arith, "pointer to a function used in subtraction"); - /* If the conversion to ptrdiff_type does anything like widening or - converting a partial to an integral mode, we get a convert_expression - that is in the way to do any simplifications. - (fold-const.c doesn't know that the extra bits won't be needed. - split_tree uses STRIP_SIGN_NOPS, which leaves conversions to a - different mode in place.) - So first try to find a common term here 'by hand'; we want to cover - at least the cases that occur in legal static initializers. */ - if (CONVERT_EXPR_P (op0) - && (TYPE_PRECISION (TREE_TYPE (op0)) - == TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0, 0) -con0 = TREE_OPERAND (op0, 0); - else -con0 = op0; - if (CONVERT_EXPR_P (op1) - && (TYPE_PRECISION (TREE_TYPE (op1)) - == TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op1, 0) -con1 = TREE_OPERAND (op1, 0); - else -con1 = op1; - - if (TREE_CODE (con0) == POINTER_PLUS_EXPR) -{ - lit0 = TREE_OPERAND (con0, 1); - con0 = TREE_OPERAND (con0, 0); -} - else -lit0 = integer_zero_node; - - if (TREE_CODE (con1) == POINTER_PLUS_EXPR) -{ - lit1 = TREE_OPERAND (con1, 1); - con1 = TREE_OPERAND (con1, 0); -} - else -lit1 = integer_zero_node; - - if (operand_equal_p (con0, con1, 0)) -{ - op0 = lit0; - op1 = lit1; -} - - /* First do the subtraction as integers; then drop through to build the divide operator. Do not do default conversions on the minus operator diff --git gcc/testsuite/gcc.dg/pr61240.c gcc/testsuite/gcc.dg/pr61240.c index e69de29..115e415 100644 --- gcc/testsuite/gcc.dg/pr61240.c +++ gcc/testsuite/gcc.dg/pr61240.c @@ -0,0 +1,13 @@ +/* PR c/61240 */ +/* { dg-do compile } */ + +void +foo (void) +{ + volatile __PTRDIFF_TYPE__ t; + int i; + int *p = &i; + int *q = &i + 1; + t = q - (q - 1); + t = p - (p - 1); +} diff --git gcc/testsuite/gcc.dg/tre
Re: Fix vec_extract_lo constraint.
> I'd suggest op0: "=,v" and op1: "v,m". This > would result in op0:"=vm,v" op1:"v,m" and op0:"=v,v" op1:"v,m". > > Uros. Done. 2014-08-05 Ilya Tocar * common/config/i386/sse.md (vec_extract_lo_): Fix constraint. --- gcc/config/i386/sse.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 0f7ca27..3337104 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -5999,9 +5999,9 @@ (set_attr "mode" "")]) (define_insn "vec_extract_lo_" - [(set (match_operand: 0 "" "=") + [(set (match_operand: 0 "" "=,v") (vec_select: - (match_operand:V8FI 1 "nonimmediate_operand" "vm") + (match_operand:V8FI 1 "nonimmediate_operand" "v,m") (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)])))] "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))" -- 1.8.3.1
Re: Fix vec_extract_lo constraint.
On Tue, Aug 5, 2014 at 4:36 PM, Ilya Tocar wrote: >> I'd suggest op0: "=,v" and op1: "v,m". This >> would result in op0:"=vm,v" op1:"v,m" and op0:"=v,v" op1:"v,m". >> >> Uros. > > Done. > > 2014-08-05 Ilya Tocar > > * common/config/i386/sse.md (vec_extract_lo_): Fix > constraint. OK wverywhere. Thanks, Uros.
Re: [RFC: Patch, PR 60102] [4.9/4.10 Regression] powerpc fp-bit ices@dwf_regno
Jackub, Thanks for point this up. I apologize for the sloppiness. I fixed and committed the ChangeLogs on the branch, revision 213639 Also fixed the libgcc ChangeLog on trunk. Revision 213640 Edmar On 08/05/2014 03:11 AM, Jakub Jelinek wrote: On Mon, Aug 04, 2014 at 11:51:34AM -0500, Edmar wrote: Committed on trunk, revision 213596 Committed on 4.9 branch, revision 213597 Note the ChangeLog entry was grossly misformated. I've fixed it up in gcc/ChangeLog on the trunk, but not on the branch nor in libgcc. There should be no space before :, all lines in ChangeLog entry should be just tab indented rather than tab + 2 spaces, and filenames, unless they are too long, shouldn't be alone on the lines. And testsuite entries shouldn't go into gcc/ChangeLog. On 08/04/2014 05:25 AM, Ulrich Weigand wrote: David Edelsohn wrote: On Fri, Aug 1, 2014 at 2:03 PM, rohitarul...@freescale.com wrote: [libgcc] 2014-07-31 Rohit * config/rs6000/linux-unwind.h (ppc_fallback_frame_state): Update based on change in SPE high register numbers and 3 HTM registers. [gcc] 2014-07-31 Rohit * config/rs6000/rs6000.c (rs6000_reg_names) : Add SPE high register names. (alt_reg_names) : Likewise. (rs6000_dwarf_register_span) : For SPE high registers, replace dwarf register numbers with GCC hard register numbers. (rs6000_init_dwarf_reg_sizes_extra) : Likewise. (rs6000_dbx_register_number): For SPE high registers, return dwarf register number for the corresponding GCC hard register number. * config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTER) : Update based on 32 newly added GCC hard register numbers for SPE high registers. (DWARF_FRAME_REGISTERS) : Likewise. (DWARF_REG_TO_UNWIND_COLUMN) : Likewise. (DWARF_FRAME_REGNUM) : Likewise. (FIXED_REGISTERS) : Likewise. (CALL_USED_REGISTERS) : Likewise. (CALL_REALLY_USED_REGISTERS) : Likewise. (REG_ALLOC_ORDER) : Likewise. (enum reg_class) : Likewise. (REG_CLASS_NAMES) : Likewise. (REG_CLASS_CONTENTS) : Likewise. (SPE_HIGH_REGNO_P) : New macro to identify SPE high registers. * gcc.target/powerpc/pr60102.c: New testcase. The patch is okay with me if Uli is satisfied. Yes, this is fine with me. Jakub .
Re: [C++ Patch/RFC] PR 43906
Hi, On 08/05/2014 02:32 PM, Jason Merrill wrote: On 08/05/2014 08:10 AM, Paolo Carlini wrote: .. a clarification. As I tried to briefly explain yesterday, this kind of change means that: extern void z(); void il() { if (z != (void*)0) z(); } doesn't trigger anymore the pedwarn at beginning of composite_pointer_type about the comparison itself, for the simple reason that we don't call it anymore. So let's keep calling it when the RHS is also a pointer? Indeed ;) Then I'm finishing testing the below. Note: I also rearranged the conditionals, splitting out the TYPE_PTRDATAMEM_P case, which was causing confusion in some cases: we were feeding a TYPE_PTRDATAMEM_P and a TYPE_PTR_P to composite_pointer_type, thus obtaining immediately verbose diagnostic, instead of the expected clean one talking about invalid operands to operator!=. Thanks, Paolo. / Index: cp/typeck.c === --- cp/typeck.c (revision 213631) +++ cp/typeck.c (working copy) @@ -4353,13 +4353,22 @@ cp_build_binary_op (location_t location, && (code1 == INTEGER_TYPE || code1 == REAL_TYPE || code1 == COMPLEX_TYPE || code1 == ENUMERAL_TYPE)) short_compare = 1; - else if ((code0 == POINTER_TYPE && code1 == POINTER_TYPE) - || (TYPE_PTRDATAMEM_P (type0) && TYPE_PTRDATAMEM_P (type1))) - result_type = composite_pointer_type (type0, type1, op0, op1, - CPO_COMPARISON, complain); - else if ((code0 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type0)) - && null_ptr_cst_p (op1)) + else if (((code0 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type0)) + && null_ptr_cst_p (op1)) + /* Handle, eg, (void*)0 (c++/43906), and more. */ + || (code0 == POINTER_TYPE + && TYPE_PTR_P (type1) && integer_zerop (op1) + && (VOID_TYPE_P (TREE_TYPE (type1)) + || comptypes (TYPE_MAIN_VARIANT (TREE_TYPE (type0)), +TYPE_MAIN_VARIANT (TREE_TYPE (type1)), +COMPARE_BASE | COMPARE_DERIVED { + if (TYPE_PTR_P (type1)) + result_type = composite_pointer_type (type0, type1, op0, op1, + CPO_COMPARISON, complain); + else + result_type = type0; + if (TREE_CODE (op0) == ADDR_EXPR && decl_with_nonnull_addr_p (TREE_OPERAND (op0, 0))) { @@ -4368,11 +4377,23 @@ cp_build_binary_op (location_t location, warning (OPT_Waddress, "the address of %qD will never be NULL", TREE_OPERAND (op0, 0)); } - result_type = type0; } - else if ((code1 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type1)) - && null_ptr_cst_p (op0)) + else if (((code1 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type1)) + && null_ptr_cst_p (op0)) + /* Handle, eg, (void*)0 (c++/43906), and more. */ + || (code1 == POINTER_TYPE + && TYPE_PTR_P (type0) && integer_zerop (op0) + && (VOID_TYPE_P (TREE_TYPE (type0)) + || comptypes (TYPE_MAIN_VARIANT (TREE_TYPE (type0)), +TYPE_MAIN_VARIANT (TREE_TYPE (type1)), +COMPARE_BASE | COMPARE_DERIVED { + if (TYPE_PTR_P (type0)) + result_type = composite_pointer_type (type0, type1, op0, op1, + CPO_COMPARISON, complain); + else + result_type = type1; + if (TREE_CODE (op1) == ADDR_EXPR && decl_with_nonnull_addr_p (TREE_OPERAND (op1, 0))) { @@ -4381,8 +4402,11 @@ cp_build_binary_op (location_t location, warning (OPT_Waddress, "the address of %qD will never be NULL", TREE_OPERAND (op1, 0)); } - result_type = type1; } + else if ((code0 == POINTER_TYPE && code1 == POINTER_TYPE) + || (TYPE_PTRDATAMEM_P (type0) && TYPE_PTRDATAMEM_P (type1))) + result_type = composite_pointer_type (type0, type1, op0, op1, + CPO_COMPARISON, complain); else if (null_ptr_cst_p (op0) && null_ptr_cst_p (op1)) /* One of the operands must be of nullptr_t type. */ result_type = TREE_TYPE (nullptr_node); Index: testsuite/g++.dg/warn/Waddress-1.C === --- testsuite/g++.dg/warn/Waddress-1.C (revision 0) +++ testsuite/g++.dg/warn/Waddress-1.C (working copy) @@ -0,0 +1,42 @@ +// PR c++/43906 +// { dg-options "-Waddress -pedantic" } + +extern void z(); +typedef void (*ptrf) (); +typedef int (*ptrfn) (int); +int n; +const int m = 1; +struct S { }; +struct T
[PATCH] Add recursion check to gcc-ar/ranlib/nm
From: Andi Kleen This patch adds a recursion check to gcc-ar/ranlib/nm. The program avoids to call itself, so it can be directly put into the $PATH as a wrapper for the normal ar etc. The recursion check will only work on Linux (or systems with Linux like /proc) for now. It should fall back gracefully if it doesn't exist. libiberty/: 2014-08-04 Andi Kleen * pex-unix.c (find_path): Add new function. (pex_unix_exec_child): Use find_path. --- libiberty/pex-unix.c | 49 ++--- 1 file changed, 42 insertions(+), 7 deletions(-) diff --git a/libiberty/pex-unix.c b/libiberty/pex-unix.c index addf8ee..ba8ee92 100644 --- a/libiberty/pex-unix.c +++ b/libiberty/pex-unix.c @@ -585,6 +585,41 @@ pex_unix_exec_child (struct pex_obj *obj ATTRIBUTE_UNUSED, } #else + +/* Search $PATH for EXE, but avoid recursion. */ + +static char * +find_path (char *exe) +{ + char *p; + char *prefix, *path; + int n; + char *fn; + char me[1024]; + + if (strchr (exe, '/')) +return xstrdup (exe); + p = getenv("PATH"); + if (!p) +return NULL; + memset(me, 0, sizeof me); + n = readlink ("/proc/self/exe", me, sizeof me - 1); + if (n < 0) +return xstrdup (exe); + p = path = xstrdup (p); + while ((prefix = strsep (&path, ":")) != NULL) +{ + fn = concat (prefix, "/", exe, NULL); + if (access (fn, X_OK) == 0 + && strcmp (fn, me) != 0) + break; + free (fn); + fn = NULL; +} + free (p); + return fn; +} + /* Implementation of pex->exec_child using standard vfork + exec. */ static pid_t @@ -668,15 +703,15 @@ pex_unix_exec_child (struct pex_obj *obj, int flags, const char *executable, if ((flags & PEX_SEARCH) != 0) { - execvp (executable, to_ptr32 (argv)); - pex_child_error (obj, executable, "execvp", errno); - } - else - { - execv (executable, to_ptr32 (argv)); - pex_child_error (obj, executable, "execv", errno); + char *exe = find_path (executable); + if (exe == NULL) +pex_child_error (obj, executable, "access", errno); + executable = exe; } + execv (executable, to_ptr32 (argv)); + pex_child_error (obj, executable, "execvp", errno); + /* NOTREACHED */ return (pid_t) -1; -- 2.0.1
Re: [PATCH] [gomp4] Initial support of OpenACC loop directive in C front-end.
On 07/29/2014 02:07 AM, Thomas Schwinge wrote: > On Thu, 20 Mar 2014 15:42:48 +0100, I wrote: >> On Tue, 18 Mar 2014 14:50:44 +0100, I wrote: >>> On Tue, 18 Mar 2014 16:37:24 +0400, Ilmir Usmanov >>> wrote: This patch introduces support of OpenACC loop directive (and combined directives) in C front-end up to GENERIC. Currently no clause is allowed. >>> >>> Thanks! I had worked on a simpler patch, not yet dealing with combined >>> clauses. Also, I have some work for the GIMPLE level, namely building on >>> GIMPLE_OMP_FOR, adding a new GF_OMP_FOR_KIND_OACC_LOOP. I'll post this >>> soon. >> >> Here are the patches, committed in r208702..4 to gomp-4_0-branch. > > Cesar, I hope I'm not confusing things here, but I remember that you once > pointed out that in Fortran OpenACC, we have to explicitly specify the > loop iteration variable in a private clause, whereas the OpenACC > specification says this needs to happen automatically (predetermined data > attribute). I see gcc/fortran/openmp.c:gfc_resolve_do_iterator add this > private clause, which I assume we're using also for OpenACC loops. > Looking at -fdump-tree-all output for the following two test cases > complied in OpenMP and OpenACC mode, I see that indeed an explicit > private clause is added for Fortran code, but not for C. Why is that > required? (I have not yet spent any time on figuring this out myself.) > > int > main(void) > { > int i; > > #pragma acc parallel > #pragma acc loop > #pragma omp parallel for > for (i = 0; i < 10; ++i) > ; > > return 0; > } > > program test > implicit none > integer :: i > > !$acc parallel > !$acc loop > !$omp parallel do > DO i = 1, 10 > ENDDO > !$omp end parallel do > !$acc end parallel > end > > In light of this, please also review whether the following > gimplify_omp_for changes of mine (when I originally added the OACC_LOOP > support) are correct. Evidently, this code still has TODO markers in it; > this was well before you added preliminary support for the private > clause. That is, should we use GOVD_PRIVATE also for OACC_LOOP? OMP has both a private and a shared clause. And those neither of those clauses can share variables, i.e. an index variable cannot be both private and shared. By default, index variables are private in OMP, so the gimplification pass makes those variables explicitly private so that later passes can check for errors. Because the fortran frontend needs to do a little more error handling early on, it makes the induction variables private. > Why does this nevertheless currently work for C for loops without any > private clause for the loop iteration variable? Maybe because in C, the > loop iteration variable ends up in a register and that is not shared > between threads? I do see a private clause being added during > gimplification for the OpenMP C test case, but not for OpenACC -- which I > assume is precisely due to the code I'm quoting below. OMP lowering pass makes the index variables local to the parallel function/kernel. So each thread effectively has its private copy of the index variable. > Once this has been resolved, please also remove the explicit private > clauses from the existing test cases (Fortran and also C, as applicable) > to make sure that the predermined data attributes are tested there. Considering that ACC lacks a shared clause, it should be harmless to revert that patch below. Do you want to revert it with your private/firstprivate patch? Cesar >> --- gcc/gimplify.c >> +++ gcc/gimplify.c >> @@ -6683,14 +6683,36 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) >>gimple_seq for_body, for_pre_body; >>int i; >>bool simd; >> + enum gimplify_omp_var_data govd_private; >> + enum omp_region_type ort; >>bitmap has_decl_expr = NULL; >> >>orig_for_stmt = for_stmt = *expr_p; >> >> - simd = TREE_CODE (for_stmt) == OMP_SIMD >> -|| TREE_CODE (for_stmt) == CILK_SIMD; >> - gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p, >> - simd ? ORT_SIMD : ORT_WORKSHARE); >> + switch (TREE_CODE (for_stmt)) >> +{ >> +case OMP_FOR: >> +case OMP_DISTRIBUTE: >> + simd = false; >> + govd_private = GOVD_PRIVATE; >> + ort = ORT_WORKSHARE; >> + break; >> +case OACC_LOOP: >> + simd = false; >> + govd_private = /* TODO */ GOVD_LOCAL; >> + ort = /* TODO */ ORT_WORKSHARE; >> + break; >> +case OMP_SIMD: >> +case CILK_SIMD: >> + simd = true; >> + govd_private = GOVD_PRIVATE; >> + ort = ORT_SIMD; >> + break; >> +default: >> + gcc_unreachable (); >> +} >> + >> + gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p, ort); >> >>/* Handle OMP_FOR_INIT. */ >>for_pre_body = NULL; >> @@ -6722,6 +6744,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) >> >>
Re: Remove unnecessary and harmful fixincludes for Android
On Tue, Aug 5, 2014 at 7:10 AM, Bruce Korb wrote: > Hi, > > On Tue, Aug 5, 2014 at 4:35 AM, Alexander Ivchenko wrote: >>> Testing for *android* is less than ideal, because of the possibility of >>> configuring a *-linux* toolchain to have multilibs using various different >>> C libraries (with -mandroid being used to select the Android multilib). >>> So, specifying a bypass based on some relevant text that appears in the >>> header would be better. >>> >>> -- >>> Joseph S. Myers >>> jos...@codesourcery.com >> >> I've added check for "BIONIC" keyword. Hopefully it won't disappear. > > "based on some relevant text" > I think that's important, too (that it be relevant). > "BIONIC" is just some improbable text you found in the header. > My guess is that testing for '*android*' would be more selective, > and certainly less obscure. Who would ever guess that > "BIONIC" implies "android"? > >> Updated patch: >> >> diff --git a/fixincludes/ChangeLog b/fixincludes/ChangeLog >> index f7effee..e05412e 100644 >> --- a/fixincludes/ChangeLog >> +++ b/fixincludes/ChangeLog >> @@ -1,3 +1,10 @@ >> +2014-08-04 Alexander Ivchenko >> + >> + * inclhack.def (stdio_va_list): Bypass fix for Bionic. >> + (complier_h_tradcpp): Remove. >> + * fixincl.x: Regenerate. >> + * tests/base/linux/compiler.h: Remove. >> + >> 2014-04-22 Rainer Orth >> >> * inclhack.def (math_exception): Bypass on *-*-solaris2.1[0-9]*. >> diff --git a/fixincludes/fixincl.x b/fixincludes/fixincl.x > > [[generated text is not needed for approval]] > >> diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def >> index 6a1136c..bf452c6 100644 >> --- a/fixincludes/inclhack.def >> +++ b/fixincludes/inclhack.def >> @@ -1140,20 +1140,6 @@ fix = { >> }; >> >> /* >> - * Old Linux kernel's header breaks Traditional CPP >> - */ >> -fix = { >> -hackname = complier_h_tradcpp; > > [[ OK ]] > >> @@ -3722,8 +3708,9 @@ fix = { >> fix = { >> hackname = stdio_va_list; >> files= stdio.h; >> -bypass = '__gnuc_va_list|_BSD_VA_LIST_|__DJ_va_list|_G_va_list'; >> +bypass = >> '__gnuc_va_list|_BSD_VA_LIST_|__DJ_va_list|_G_va_list|BIONIC'; >> /* >> + * In Bionic va_list is always appropriately typedefed to >> __gnuc_va_list. > > And that typedef does not live in stdio.h either. > If there is no better way to identify this file, then > >> Is it ok? > > It is "okay". (You may be left with little choice -- I can't see the header > :). you can see the current version of bionic's stdio.h here: https://android.googlesource.com/platform/bionic/+/master/libc/include/stdio.h i'm happy to add any string to the header file that makes things easier. if you want 'x-gcc-no-fixincludes' or whatever in there, just say :-) (you're correct that the string 'BIONIC' is currently there only as a side-effect; our FORTIFY_SOURCE implementation uses a __BIONIC_FORTIFY_INLINE macro.)
Re: [RFA] Introdue warning_n; fix singulars in the final keyword wanrings
On Sun, 3 Aug 2014, Jan Hubicka wrote: > * diagnostic.c (warning_n): New function. > * diagnostic-core.h (warning_n): Declare. > * ipa-devirt.c (ipa_devirt): Handle singulars correctly; > output dynamic counts when available. OK, though note the pre-existing issue that these diagnostics start with uppercase letters, contrary to GNU style. -- Joseph S. Myers jos...@codesourcery.com
Re: Patch for constexpr variable templates
Applied with a few formatting/comment tweaks, thanks! Jason
Re: Remove unnecessary and harmful fixincludes for Android
Hi, On Tue, Aug 5, 2014 at 10:36 AM, enh wrote: > you can see the current version of bionic's stdio.h here: > > https://android.googlesource.com/platform/bionic/+/master/libc/include/stdio.h > > i'm happy to add any string to the header file that makes things > easier. if you want 'x-gcc-no-fixincludes' or whatever in there, just > say :-) That would be great, but you could also add: /* this file depends on __gnuc_va_list being used for va_list */ and not bother changing fixincludes at all. :) But either of those two comments added to the header would be preferable to looking for "BIONIC". Thank you! With one of the two changes, the patch is approved. Thanks!
Re: [PATCH 5/5] add libcc1
> "Jeff" == Jeff Law writes: Jeff> Obviously if there are no objections and you check in the change, Jeff> please be on the lookout for any fallout. I'm particularly concerned Jeff> about AIX, Solaris and other non-linux platforms. I did a build on the AIX box (gcc111) in the compile farm and didn't have any issues. The plugin isn't built there as plugin support seems to be disabled. Jeff> Does this deserve a mention in the news file? I suppose so, I will get someone here to write it. Tom
Re: [PATCH v2] fix pr62009 use after free in redirect_edge_var_map_dup
On Tue, Aug 05, 2014 at 02:42:17PM +0200, Richard Biener wrote: > On Tue, Aug 5, 2014 at 2:32 PM, wrote: > > From: Trevor Saunders > > > > hi, > > > > The change to get the entry for the old edge before inserting the new > > one was incorrect because if inserting the new one resized the table > > then the pointer to the entry for the old one would become invalid. > > > > gcc/ > > > > * tree-ssa.c (redirect_edge_var_map_dup): insert newe before > > getting olde. > > > > bootstrapping on and regtest on x86_64-unknown-linux-gnu and bootstrap on > > i686-linux-gnu ongoing, ok? > > Also works for me. committed as r213644, thanks. Trev > > Richard. > > > Trev > > > > --- > > gcc/tree-ssa.c | 7 --- > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c > > index 920cbea..b6b3718 100644 > > --- a/gcc/tree-ssa.c > > +++ b/gcc/tree-ssa.c > > @@ -105,11 +105,12 @@ redirect_edge_var_map_dup (edge newe, edge olde) > >if (!edge_var_maps) > > return; > > > > - auto_vec *head = edge_var_maps->get (olde); > > - if (!head) > > + auto_vec *new_head = &edge_var_maps->get_or_insert (newe); > > + auto_vec *old_head = edge_var_maps->get (olde); > > + if (!old_head) > > return; > > > > - edge_var_maps->get_or_insert (newe).safe_splice (*head); > > + new_head->safe_splice (*old_head); > > } > > > > > > -- > > 2.0.1 > >
Re: Patch for constexpr variable templates
Hi, On 08/05/2014 08:26 PM, Jason Merrill wrote: Applied with a few formatting/comment tweaks, thanks! Great. I will double check but var-templ4.C fails for me with an ICE. Can anybody reproduce? Thanks! Paolo. /.../gcc/testsuite/g++.dg/cpp1y/var-templ4.C:8:17: internal compiler error: tree check: expected function_decl, have var_decl in check_explicit_specialization, at cp/pt.c:2822 0xde05a4 tree_check_failed(tree_node const*, char const*, int, char const*, ...) ../../trunk/gcc/tree.c:9174 0x5f9dd9 tree_check(tree_node*, char const*, int, char const*, tree_code) ../../trunk/gcc/tree.h:2729 0x5f9dd9 check_explicit_specialization(tree_node*, tree_node*, int, int) ../../trunk/gcc/cp/pt.c:2822 0x5af6e1 grokvardecl ../../trunk/gcc/cp/decl.c:8079 0x5af6e1 grokdeclarator(cp_declarator const*, cp_decl_specifier_seq*, decl_context, int, tree_node**) ../../trunk/gcc/cp/decl.c:10918 0x5b24b4 start_decl(cp_declarator const*, cp_decl_specifier_seq*, int, tree_node*, tree_node*, tree_node**) ../../trunk/gcc/cp/decl.c:4572 0x68ff5f cp_parser_init_declarator ../../trunk/gcc/cp/parser.c:16920 0x690b0a cp_parser_single_declaration ../../trunk/gcc/cp/parser.c:23494 0x69160d cp_parser_explicit_specialization ../../trunk/gcc/cp/parser.c:14418 0x69bfaf cp_parser_declaration ../../trunk/gcc/cp/parser.c:11152 0x69ab18 cp_parser_declaration_seq_opt ../../trunk/gcc/cp/parser.c:11085 0x69c3b3 cp_parser_translation_unit ../../trunk/gcc/cp/parser.c:4061 0x69c3b3 c_parse_file() ../../trunk/gcc/cp/parser.c:31954 0x7c0712 c_common_parse_file() ../../trunk/gcc/c-family/c-opts.c:1115
Re: [C PATCH] Discard P - (P + CST) optimization in pointer_diff (PR c/61240)
On 08/05/14 08:36, Marek Polacek wrote: On Mon, Aug 04, 2014 at 02:04:36PM +0200, Richard Biener wrote: Looks like .fre can optimize "q - (q - 1)" into 1: : q.0_3 = (long int) &MEM[(void *)&i + 4B]; _5 = (long int) &i; - _6 = q.0_3 - _5; - t.1_7 = _6 /[ex] 4; - t ={v} t.1_7; + t ={v} 1; i ={v} {CLOBBER}; return; But associate_plusminus doesn't optimize it: else if (code == MINUS_EXPR && CONVERT_EXPR_CODE_P (def_code) && TREE_CODE (gimple_assign_rhs1 (def_stmt)) == SSA_NAME && TREE_CODE (rhs2) == SSA_NAME) { /* (T)(P + A) - (T)P -> (T)A. */ becase gimple_assign_rhs1 (def_stmt) is not an SSA_NAME, but ADDR_EXPR (it's &MEM[(void *)&i + 4B]). Then there's transformation A - (A +- B) -> -+ B below, but that doesn't handle casts. So - should I try to handle it in associate_plusminus? Yes please, with a (few) testcase(s). Ok, so the following adds the (T)P - (T)(P + A) -> (T)-A transformation. It is based on code hunk that does the (T)(P + A) - (T)P -> (T)A transformation. The difference it makes is in the .optimized dump something like: int fn2(int, int) (int p, int i) { - unsigned int p.2_2; + unsigned int _3; int _4; - unsigned int _5; unsigned int _6; - int _7; : - p.2_2 = (unsigned int) p_1(D); - _4 = p_1(D) + i_3(D); - _5 = (unsigned int) _4; - _6 = p.2_2 - _5; - _7 = (int) _6; - return _7; + _6 = (unsigned int) i_2(D); + _3 = -_6; + _4 = (int) _3; + return _4; i.e., the PLUS_EXPR and MINUS_EXPR are gone, and NEGATE_EXPR is used instead. During bootstrap with --enable-languages=c,c++ this optimization triggered 238 times. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2014-08-05 Marek Polacek PR c/61240 * tree-ssa-forwprop.c (associate_plusminus): Add (T)P - (T)(P + A) -> (T)-A transformation. c/ * c-typeck.c (pointer_diff): Remove P - (P + CST) optimization. testsuite/ * gcc.dg/pr61240.c: New test. * gcc.dg/tree-ssa/forwprop-29.c: New test. So I'm all for delaying folding when possible, so I'm comfortable with the general direction this is going. My concern is the code we're removing discusses the need to simplify when these expressions are in static initializers. What's going to ensure that we're still simplifying instances which appear in static initializers? I don't see anything which tests that. And does it still work for targets which utilize PSImode? Jeff
Re: Replacement of isl_int by isl_val
On 08/03/14 17:44, Mircea Namolaru wrote: 2014-08-03 Mircea Namolaru Replacement of isl-int by isl_val * graphite-clast-to-gimple.c: include isl/val.h, isl/val_gmp.h (compute_bounds_for_param): use isl_val instead of isl_int (compute_bounds_for_loop): likewise * graphite-interchange.c: include isl/val.h, isl/val_gmp.h (build_linearized_memory_access): use isl_val instead of isl_int (pdr_stride_in_loop): likewise * graphite-optimize-isl.c: (getPrevectorMap): use isl_val instead of isl_int * graphite-poly.c: (pbb_number_of_iterations_at_time): use ils_val instead of isl_int graphite-sese-to-poly.c: include isl/val.h, isl/val_gmp.h (extern the_isl_ctx): declare (build_pbb_scattering_polyhedrons): use isl_val instead of isl_int (extract_affine_gmp): likewise (wrap): likewise (build_loop_iteration_domains): likewise (add_param_constraints): likewise (add_param_constraints): likewise This is good. Please install if you haven't already. jeff
Re: Fix build of *86*-linux-android with "--enable-shared"
On 08/04/14 00:08, Alexander Ivchenko wrote: Hi, libcilkrts is compiled with "-nostdlib", that means we have to explicitly specify the pthread library we should link with (e.g. we don't have such problem with libgomp, because it is C). And, indeed, "-lpthread" is hard-coded in the Makefile for cilkrts. For Android this doesn't work, because lpthread is absent and pthreads are part of libc. I also noticed, that configure check for "pthread_{,attr_}[sg]etaffinity_np" always fails, because at the point where it is placed in configure.ac, "-pthread" is not set. We just have to put this check after we added "-pthread" to CFLAGS. This patch addresses this as well. diff --git a/libcilkrts/ChangeLog b/libcilkrts/ChangeLog index 3881c82..ab10a0b 100644 --- a/libcilkrts/ChangeLog +++ b/libcilkrts/ChangeLog @@ -1,3 +1,15 @@ +2014-08-01 Alexander Ivchenko + + * configure.ac: Move pthread affinity test to the place where + '-pthread' passed to CFLAGS. Otherwise the test always fails. + (XCFLAGS): New variable for correctly passing + '-pthread'. + (XLDFLAGS): New variable for passing the correct pthread lib. + * configure: Regenerate. + * Makefile.am (AM_CFLAGS): Add $XCFLAGS. + (AM_LDFLAGS): Add $XLDFLAGS. + * Makefile.in: Regenerate. So can you confirm that you've bootstrapped this on x86_64-unknown-linux-gnu and that there were no regressions? Also double-check the indention in the ChangeLog entry, though it may just be your mailer that has mucked that up. Once the bootstrap and regression test are OK, this is OK. jeff
Re: [PATCH] Add statistical printout of rank_for_schedule decisions
On 08/03/14 22:51, Maxim Kuvyrkov wrote: On Jul 17, 2014, at 5:34 AM, Jeff Law wrote: On 07/13/14 22:17, Maxim Kuvyrkov wrote: Hi, This patch adds dump printouts for scheduling heuristics in rank_for_schedule. Rank_for_schedule is one of the cornerstones of haifa scheduler, yet its decisions are hard to track and debug. This patch adds statistical gathering for each branch of rank_for_schedule, and prints them out according to sched verbosity. This patch helped me track down several bugs in rank_for_schedule that result is stupid scheduling decisions. Bootstrapped and tested on x86_64-linux-gnu. OK to apply? Presmably you use the return increment, retval; construct to avoid the need for braces? I can see how it's useful here, but I don't think we've generally used comma operators like that and it's a style that I've never liked all that much. Could you go ahead and split it into two statements and add the necessary braces? Approved with that change. Code in rank_for_schedule is difficult to understand already, so I want to keep amount of "extra" code to a minimum. How about for following instead: "return rfs_result (RFS_xxx, )" ? Updated patch attached (the rest of the patch is unchanged). Yea, that seems reasonable. Good for the trunk. Thanks, Jeff
Re: [PATCH libstdc++ v3] - Add xmethods for std::vector and std::unique_ptr
Hi Jonathan, Thanks a lot for taking a look. The patch in question, and the GDB support, do not yet work with Python3. If that is a necessary requirement, I can make the changes and send a new version of the patch. Thanks, Siva Chandra On Mon, Aug 4, 2014 at 2:36 AM, Jonathan Wakely wrote: > On 25 July 2014 20:46, Siva Chandra wrote: >> The attached patch is identical to v2 except that I rebased it over >> the current head. >> >> To recollect, GDB now supports xmethods in its Python API: >> https://sourceware.org/gdb/current/onlinedocs/gdb/Xmethods-In-Python.html >> >> This feature will be available in GDB starting version 7.8 (which has >> not yet been released, but has been branched). The attached patch adds >> xmethods to the classes std::vector and std::unique_ptr. One can of >> course add xmethods to many other classes, but I am viewing this as >> the first patch in that series (though not a series yet) to get the >> basic infrastructure for adding more xmethods in place. >> >> Link to v1 posting: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg02339.html >> Link to v2 posting: https://gcc.gnu.org/ml/gcc-patches/2014-07/msg2.html >> >> ChangeLog >> >> 2014-07-25 Siva Chandra Reddy >> >> * python/libstdcxx/v6/xmethods.py: New file. >> * testsuite/lib/gdb-test.exp (gdb_version_check_xmethods): New >> function. >> (gdb-test): New optional argument LOAD_XMETHODS. Load xmethods >> python script if LOAD_XMETHODS is true. >> * testsuite/libstdc++-xmethods/unique_ptr.cc: New file. >> * testsuite/libstdc++-xmethods/vector.cc: New file. >> * testsuite/libstdc++-xmethods/xmethods.exp: New file. > > N.B. all patches for libstdc++ need to be sent to the libstdc++ > mailing list as well as the gcc-patches list, to ensure the right > reviewers see them. > > It looks like it should be OK, but do you know if the patch works with > both Python2 and Python3?
Re: [PATCH 01/50] Add rtl-iter.h
On 08/03/14 07:39, Richard Sandiford wrote: This patch adds the new iterators. gcc/ * rtl-iter.h: New file. * rtlanal.c: Include it. (rtx_all_subrtx_bounds, rtx_nonconst_subrtx_bounds): New variables. (generic_subrtx_iterator ::add_single_to_queue) (generic_subrtx_iterator ::add_subrtxes_to_queue) (generic_subrtx_iterator ::free_array): New functions. (generic_subrtx_iterator ::LOCAL_ELEMS): Define. (generic_subrtx_iterator ) (generic_subrtx_iterator (generic_subrtx_iterator ): Instantiate. (setup_reg_subrtx_bounds): New function. (init_rtlanal): Call it. OK. Just one nit... + +/* This structure describes the subrtxes of an rtx as follows: + + - if the rtx has no subrtxes, START and COUNT are both 0. Seems reasonable. +static inline bool +leaf_code_p (enum rtx_code code) +{ + return rtx_all_subrtx_bounds[code].count == 0; +} But we only check COUNT here. It's a minor inconsistency. Your call what (if anything) to do about it. Jeff
Re: [PATCH 02/50] alias.c:refs_newer_value_p
On 08/03/14 07:40, Richard Sandiford wrote: gcc/ * alias.c: Include rtl-iter.h. (refs_newer_value_cb): Delete. (refs_newer_value_p): Use FOR_EACH_SUBRTX instead of for_each_rtx. OK. Just a few notes. 1. I really like when we can make these callbacks go away :-) 2. When the callbacks go away, the opaque data we pass them goes away as well which is good for readability. 3. That opaque data is often a local and we had to take its address and thus force it to memory. With the direct uses in the updated code those locals don't have to live in memory, which is goodness. Anyway, just wanted to point those out for anyone looking at this in parallel or in the future. jeff
Re: [PATCH 04/50] caller-save.c:add_used_regs
On 08/03/14 07:48, Richard Sandiford wrote: As noted in https://gcc.gnu.org/ml/gcc-patches/2014-02/msg01391.html a bitmap-related cleanup turned add_used_regs_1 into a no-op for pseudo registers, because the result of: regno = reg_renumber[regno]; is never used. This patch does as Steven requested and adds an assert that no allocated pseudos are seen here. gcc/ * caller-save.c: Include rtl-iter.h. (add_used_regs_1): Delete. (add_used_regs): Use FOR_EACH_SUBRTX rather than for_each_rtx to iterate over subrtxes. Assert that any remaining pseudos have been spilled. OK. jeff
Re: [PATCH 05/50] calls.c:internal_arg_pointer_based_exp
On 08/03/14 07:48, Richard Sandiford wrote: gcc/ * calls.c: Include rtl-iter.h. (internal_arg_pointer_based_exp_1): Delete. (internal_arg_pointer_based_exp): Take a const_rtx. Use FOR_EACH_SUBRTX to iterate over subrtxes. OK. jeff
Re: [PATCH 06/50] combine.c:unmentioned_reg_p
On 08/03/14 07:50, Richard Sandiford wrote: gcc/ * combine.c: Include rtl-iter.h. (unmentioned_reg_p_1): Delete. (unmentioned_reg_p): Use FOR_EACH_SUBRTX rather than for_each_rtx. Don't handle null rtxes. OK. Jeff
Re: [PATCH 07/50] combine.c:record_truncated_values
On 08/03/14 07:50, Richard Sandiford wrote: gcc/ * combine.c (record_truncated_value): Turn from being a for_each_rtx callback to a function that takes an rtx and returns a bool (record_truncated_values): Use FOR_EACH_SUBRTX_VAR instead of for_each_rtx. OK. jeff
Re: [PATCH 09/50] cfgcleanup.c:mentions_nonequal_regs
On 08/03/14 07:53, Richard Sandiford wrote: gcc/ * cfgcleanup.c: Include rtl-iter.h. (mentions_nonequal_regs): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. (thread_jump): Update accordingly. OK. jeff
Re: [PATCH 10/50] cse.c:approx_reg_cost
On 08/03/14 07:54, Richard Sandiford wrote: gcc/ * cse.c: Include rtl-iter.h. (approx_reg_cost_1): Delete. (approx_reg_cost): Use FOR_EACH_SUBRTX instead of for_each_rtx. Don't handle null rtxes. OK. jeff
Re: [PATCH 13/50] cse.c:is_dead_debug_insn
On 08/03/14 07:57, Richard Sandiford wrote: gcc/ * cse.c (is_dead_reg): Change argument to const_rtx. (dead_debug_insn_data): Delete. (is_dead_debug_insn): Expand commentary. Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. Take the fields of dead_debug_insn_data as argument. (delete_trivially_dead_insns): Update call accordingly. OK. jeff
Re: [PATCH 14/50] cse.c:cse_change_cc_mode
On 08/03/14 07:58, Richard Sandiford wrote: gcc/ * cse.c (change_cc_mode_args): Delete. (cse_change_cc_mode): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. Take the fields of change_cc_mode_args as argument and return void. (cse_change_cc_mode_insn): Update calls accordingly. OK. jeff
Re: [PATCH 15/50] ddg.c:mark_mem_use
On 08/03/14 07:58, Richard Sandiford wrote: gcc/ * ddg.c: Include rtl-iter.h. (mark_mem_use_1): Rename to... (mark_mem_use): ...deleting old mark_mem_use. Use FOR_EACH_SUBRTX instead of for_each_rtx. (mem_read_insn_p): Update accordingly. OK. jeff
Re: [PATCH 16/50] ddg.c:insns_may_alias_p
On 08/03/14 07:59, Richard Sandiford wrote: gcc/ * ddg.c (walk_mems_2, walk_mems_1): Delete. (insns_may_alias_p): Use FOR_EACH_SUBRTX rather than for_each_rtx to iterate over subrtxes. Return a bool rather than an int. OK. Jeff
Re: [PATCH] Making it easier to set breakpoints in gdb on pass->execute methods
On Wed, 2014-07-30 at 10:07 +0200, Richard Biener wrote: > On Tue, Jul 29, 2014 at 10:44 PM, David Malcolm wrote: > > A complaint I heard at Cauldron with the C++ification of GCC passes is > > that it's become much more difficult to set breakpoints on the execute > > hooks of a pass, now that the passes are classes within anonymous > > namespaces. > > > > When this was first done, the execute methods were trivial > > implementations that called into the existing named functions, which are > > still easy to put a breakpoint on by name (assuming you know the name of > > the function), but some of these have now been converted so that the > > "execute" method is the body of the pass. > > > > I did some experimentation, on this box with > > gdb-7.6.50.20130731-19.fc20.x86_64 and gcc trunk r212913 (the latter > > from a week ago). > > > > You *can* set a breakpoint by name on such an execute method, but it's > > tedious to type: > > (gdb) break '(anonymous namespace)::pass_expand::execute' > > Breakpoint 7 at 0x655220: file ../../src/gcc/cfgexpand.c, line > > > > ...since tab-completion doesn't work well: > > > > (gdb) break '(a > > does tab complete to: > > (gdb) break '(anonymous namespace):: > > > > but typing anything else then hitting tab returns back to: > > (gdb) break '(anonymous namespace):: > > > > Is anyone else seeing this? > > Yes, I filed a gdb bug about this. > > > I had a go at implementing a workaround, for the lack of tab completion > > (and the general verbosity) using gdbhooks.py. > > > > Attached is a patch to gdbhooks.py which adds a new command > > "break-on-pass" to gdb; in particular, it locates and parses passes.def, > > so that it can tab-complete on pass classnames: > > > > Example of use from the script: putting a breakpoint on "final", i.e. > > classname "pass_final": > > > > (gdb) break-on-pass > > Press ; it autocompletes to "pass_": > > (gdb) break-on-pass pass_ > > Press : > > Display all 219 possibilities? (y or n) > > Press "n"; then type "f": > > (gdb) break-on-pass pass_f > > Press to autocomplete to pass classnames beginning with "pass_f": > > pass_fast_rtl_dce pass_fold_builtins > > pass_feedback_split_functions pass_forwprop > > pass_final pass_fre > > pass_fixup_cfg pass_free_cfg > > Type "in" to complete to "pass_final": > > (gdb) break-on-pass pass_final > > ...and hit : > > Breakpoint 6 at 0x8396ba: file ../../src/gcc/final.c, line 4526. > > ...and we have a breakpoint set; continue execution: > > (gdb) cont > > Continuing. > > Breakpoint 6, (anonymous namespace)::pass_final::execute (this=0x17fb990) > > at ../../src/gcc/final.c:4526 > > 4526virtual unsigned int execute (function *) { return > > rest_of_handle_final (); } > > > > This assumes you've suitably enabled gdbhooks.py, as documented at the > > top of that file. > > > > Thoughts? > > Works for me though I'm not able to review python ;) I've gone ahead and committed it to trunk as r213646; hope this reduces the pain somewhat. > Richard. > > > Dave > > > > gcc/ > > * gdbhooks.py (find_gcc_source_dir): New helper function. > > (class PassNames): New class, locating and parsing passes.def. > > (class BreakOnPass): New command "break-on-pass". > >
Re: [PATCH 18/50] dse.c:check_mem_read_use
On 08/03/14 08:02, Richard Sandiford wrote: gcc/ * dse.c: Include rtl-iter.h. (check_mem_read_rtx): Change void * parameter to real type. Remove return value. (check_mem_read_use): Fix comment. Use FOR_EACH_SUBRTX_PTR instead of for_each_rtx. Don't handle null rtxes. OK. Jeff
Re: [PATCH 21/50] Faster for_each_rtx-like iterators
On 08/03/14 08:07, Richard Sandiford wrote: The switch statement in the old code seemed overly cautious. It's well established elsewhere that the first operand of an RTX_AUTOINC is the automodified register. If anyone wanted to add a new code for which that wasn't true they should (a) reconsider or (b) go through all RTX_AUTOINCs as a precaution. gcc/ * emit-rtl.c: Include rtl-iter.h. (find_auto_inc): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. Assume the first operand to an RTX_AUTOINC is the automodified register. (try_split): Update call accordingly. OK. It seems like we ought to document somewhere that all autoincrement RTXs increment their first operand. If you can find a goodplace, please add that little doc update as a pre-approved patch. jeff
Re: [PATCH] Making it easier to set breakpoints in gdb on pass->execute methods
On Tue, Aug 05, 2014 at 05:02:57PM -0400, David Malcolm wrote: > On Wed, 2014-07-30 at 10:07 +0200, Richard Biener wrote: > > On Tue, Jul 29, 2014 at 10:44 PM, David Malcolm wrote: > > > A complaint I heard at Cauldron with the C++ification of GCC passes is > > > that it's become much more difficult to set breakpoints on the execute > > > hooks of a pass, now that the passes are classes within anonymous > > > namespaces. > > > > > > When this was first done, the execute methods were trivial > > > implementations that called into the existing named functions, which are > > > still easy to put a breakpoint on by name (assuming you know the name of > > > the function), but some of these have now been converted so that the > > > "execute" method is the body of the pass. > > > > > > I did some experimentation, on this box with > > > gdb-7.6.50.20130731-19.fc20.x86_64 and gcc trunk r212913 (the latter > > > from a week ago). > > > > > > You *can* set a breakpoint by name on such an execute method, but it's > > > tedious to type: > > > (gdb) break '(anonymous namespace)::pass_expand::execute' > > > Breakpoint 7 at 0x655220: file ../../src/gcc/cfgexpand.c, line > > > > > > ...since tab-completion doesn't work well: > > > > > > (gdb) break '(a > > > does tab complete to: > > > (gdb) break '(anonymous namespace):: > > > > > > but typing anything else then hitting tab returns back to: > > > (gdb) break '(anonymous namespace):: > > > > > > Is anyone else seeing this? > > > > Yes, I filed a gdb bug about this. > > > > > I had a go at implementing a workaround, for the lack of tab completion > > > (and the general verbosity) using gdbhooks.py. > > > > > > Attached is a patch to gdbhooks.py which adds a new command > > > "break-on-pass" to gdb; in particular, it locates and parses passes.def, > > > so that it can tab-complete on pass classnames: > > > > > > Example of use from the script: putting a breakpoint on "final", i.e. > > > classname "pass_final": > > > > > > (gdb) break-on-pass > > > Press ; it autocompletes to "pass_": > > > (gdb) break-on-pass pass_ > > > Press : > > > Display all 219 possibilities? (y or n) > > > Press "n"; then type "f": > > > (gdb) break-on-pass pass_f > > > Press to autocomplete to pass classnames beginning with "pass_f": > > > pass_fast_rtl_dce pass_fold_builtins > > > pass_feedback_split_functions pass_forwprop > > > pass_final pass_fre > > > pass_fixup_cfg pass_free_cfg > > > Type "in" to complete to "pass_final": > > > (gdb) break-on-pass pass_final > > > ...and hit : > > > Breakpoint 6 at 0x8396ba: file ../../src/gcc/final.c, line 4526. > > > ...and we have a breakpoint set; continue execution: > > > (gdb) cont > > > Continuing. > > > Breakpoint 6, (anonymous namespace)::pass_final::execute > > > (this=0x17fb990) at ../../src/gcc/final.c:4526 > > > 4526virtual unsigned int execute (function *) { return > > > rest_of_handle_final (); } > > > > > > This assumes you've suitably enabled gdbhooks.py, as documented at the > > > top of that file. > > > > > > Thoughts? > > > > Works for me though I'm not able to review python ;) > > I've gone ahead and committed it to trunk as r213646; hope this reduces > the pain somewhat. So, I was thinking about this more at some point. iirc the reason to put the classes in the anon namespace was to get stuff devirtualized. However I think we can just use __final if building with gcc to acomplish the same thing, and then remove the namespace {}s. It needs to be tested, but if it works it would be even better :-) Trev > > > Richard. > > > > > Dave > > > > > > gcc/ > > > * gdbhooks.py (find_gcc_source_dir): New helper function. > > > (class PassNames): New class, locating and parsing passes.def. > > > (class BreakOnPass): New command "break-on-pass". > > > > >
Re: [PATCH 24/50] fwprop.c:varying_mem_p
On 08/03/14 08:10, Richard Sandiford wrote: gcc/ * fwprop.c: Include rtl-iter.h. (varying_mem_p): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. (propagate_rtx): Update accordingly. OK. jeff
Re: [PATCH 22/50] final.c:mark_symbol_refs_as_used
On 08/03/14 08:08, Richard Sandiford wrote: gcc/ * final.c: Include rtl-iter.h. (mark_symbol_ref_as_used): Delete. (mark_symbol_refs_as_used): Use FOR_EACH_SUBRTX instead of for_each_rtx. OK. jeff
Re: [PATCH 25/50] ira.c:set_paradoxical_subreg
On 08/03/14 08:10, Richard Sandiford wrote: gcc/ * ira.c: Include rtl-iter.h. (set_paradoxical_subreg): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. Remove handling of null rtxes. (update_equiv_regs): Update call accordingly. OK. jeff
Re: [PATCH 26/50] jump.c:returnjump_p
On 08/03/14 08:11, Richard Sandiford wrote: gcc/ * jump.c: Include rtl-iter.h. (returnjump_p_1): Delete. (returnjump_p): Use FOR_EACH_SUBRTX rather than for_each_rtx. Remove handling of null rtxes. OK. Jeff
Re: [PATCH 27/50] jump.c:eh_returnjump_p
On 08/03/14 08:12, Richard Sandiford wrote: gcc/ * jump.c (eh_returnjump_p_1): Delete. (eh_returnjump_p): Use FOR_EACH_SUBRTX rather than for_each_rtx. Remove handling of null rtxes. OK. jeff
Re: [PATCH 31/50] lower-subreg.c:resolve_debug
On 08/03/14 08:15, Richard Sandiford wrote: gcc/ * lower-subreg.c (adjust_decomposed_uses): Delete. (resolve_debug): Use FOR_EACH_SUBRTX_PTR rather than for_each_rtx. Remove handling of null rtxes. OK. jeff
Re: [PATCH 29/50] loop-iv.c:altered_reg_used
On 08/03/14 08:14, Richard Sandiford wrote: gcc/ * loop-iv.c (altered_reg_used): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. (simplify_using_condition, simplify_using_initial_values): Update accordingly. Ok. jeff
Re: [PATCH 33/50] reg-stack.c:subst_all_stack_regs_in_debug_insn
On 08/03/14 08:18, Richard Sandiford wrote: gcc/ * reg-stack.c: Include rtl-iter.h. (subst_stack_regs_in_debug_insn): Delete. (subst_all_stack_regs_in_debug_insn): Use FOR_EACH_SUBRTX_PTR instead of for_each_rtx. OK. Jeff
Re: [PATCH 35/50] regcprop.c:cprop_find_used_regs
On 08/03/14 08:19, Richard Sandiford wrote: gcc/ * regcprop.c (cprop_find_used_regs_1): Delete. (cprop_find_used_regs): Use FOR_EACH_SUBRTX instead of for_each_rtx. OK. jeff
Re: [PATCH 34/50] regcprop.c:kill_autoinc_value
On 08/03/14 08:19, Richard Sandiford wrote: gcc/ * regcprop.c: Include rtl-iter.h. (kill_value): Take a const_rtx. (kill_autoinc_value): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. (copyprop_hardreg_forward_1): Update accordingly. OK. FWIW, I obviously focused on patches that fit on a page or so first. I'd hoped to make a full pass through today, then pick up the rest tomorrow, but family commitments are starting earlier than planned. I'm going to get through as many of the painfully trivial patches as I can before disappearing for the day. jeff
Re: [PATCH 39/50] rtlanal.c:record_hard_reg_uses
On 08/03/14 08:27, Richard Sandiford wrote: find_all_hard_regs seems like a useful function so I split it out of the note_uses callback and exposed it in rtl.h. I have (or had) other patches that make use of it. gcc/ * rtl.h (find_all_hard_regs): Declare. * rtlanal.c (find_all_hard_regs): New function. (record_hard_reg_uses_1): Delete. (record_hard_reg_uses): Use find_all_hard_regs. OK. Jeff
Re: [PATCH 41/50] rtlanal.c:tls_referenced_p
On 08/03/14 08:33, Richard Sandiford wrote: gcc/ * rtl.h (tls_referenced_p): Take a const_rtx rather than an rtx. * rtlanal.c (tls_referenced_p_1): Delete. (tls_referenced_p): Take a const_rtx rather than an rtx. Use FOR_EACH_SUBRTX rather than for_each_rtx. OK. Jeff
Re: [PATCH 43/50] store-motion.c:extract_mentioned_regs
On 08/03/14 08:34, Richard Sandiford wrote: gcc/ * store-motion.c: Include rtl-iter.h. (extract_mentioned_regs_1): Delete. (extract_mentioned_regs): Use FOR_EACH_SUBRTX_VAR rather than for_each_rtx to iterate over subrtxes. OK. jeff
Re: [PATCH 44/50] var-tracking.c:rtx_debug_expr_p
On 08/03/14 08:35, Richard Sandiford wrote: gcc/ * var-tracking.c: Include rtl-iter.h. (rtx_debug_expr_p): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. (use_type): Update accordingly. OK. jeff
Re: [PATCH 45/50] var-tracking.c:non_suitable_const
On 08/03/14 08:36, Richard Sandiford wrote: gcc/ * var-tracking.c (non_suitable_const): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. Remove handling of null rtxes. (add_uses): Update accordingly. OK. Jeff
Re: [PATCH 47/50] var-tracking.c:add_uses
On 08/03/14 08:38, Richard Sandiford wrote: gcc/ * var-tracking.c (add_uses): Take an rtx rather than an rtx *. Give real type of data parameter. Remove return value. (add_uses_1): Use FOR_EACH_SUBRTX_VAR rather than for_each_rtx to iterate over subrtxes. OK. jeff
Re: [PATCH 17/50] df-problems.c:find_memory
On 08/03/14 08:02, Richard Sandiford wrote: This also fixes what I think is a bug: find_memory used to stop at the first MEM it found. If that MEM was nonvolatile and nonconstant, we'd return MEMREF_NORMAL even if there was another volatile MEM. gcc/ * df-problems.c: Include rtl-iter.h. (find_memory): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. Continue to look for volatile references even after a nonvolatile one has been found. (can_move_insns_across): Update calls accordingly. OK. It'd probably be fairly difficult to test for that bug as most of our targets don't allow multiple memory operands in a single insn. But I agree with your assessment. Good catch. jeff
Re: [PATCH 20/50] dwarf2out.c:resolve_one_addr
On 08/03/14 08:04, Richard Sandiford wrote: gcc/ * dwarf2out.c (resolve_one_addr): Remove unused data parameter. Return a bool, inverting the result so that 0/false means "not ok". Use FOR_EACH_SUBRTX_PTR instead of for_each_rtx to iterate over subrtxes of a CONST. (mem_loc_descriptor, add_const_value_attribute) (resolve_addr_in_expr): Update calls accordingly. OK. jeff
Re: [PATCH 19/50] dwarf2out.c:const_ok_for_output
On 08/03/14 08:03, Richard Sandiford wrote: gcc/ * dwarf2out.c: Include rtl-iter.h. (const_ok_for_output_1): Take the rtx instead of a pointer to it. Remove unused data parameter. Return a bool, inverting the result so that 0/false means "not ok". (const_ok_for_output): Update accordingly. Use FOR_EACH_SUBRTX_VAR instead of for_each_rtx. OK. jeff
Re: [PATCH 28/50] loop-iv.c:replace_single_def_regs
On 08/03/14 08:13, Richard Sandiford wrote: gcc/ * loop-iv.c: Include rtl-iter.h. (find_single_def_src): New function. (replace_single_def_regs): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. (replace_in_expr, simplify_using_initial_values): Update accordingly. OK. Jeff
Re: [PATCH 36/50] reload1.c:note_reg_elim_costly
On 08/03/14 08:20, Richard Sandiford wrote: gcc/ * reload1.c: Include rtl-iter.h. (note_reg_elim_costly): Turn from being a for_each_rtx callback to being a function that examines each subrtx itself. (eliminate_regs_1, elimination_costs_in_insn): Update accordingly. OK. jeff
Re: [PATCH 37/50] rtlanal.c:rtx_referenced_p
On 08/03/14 08:22, Richard Sandiford wrote: The old function handled constant pool SYMBOL_REFs by going straight to the underlying constant, which meant you couldn't test for the SYMBOL_REF itself. gcc/ * rtl.h (get_pool_constant, rtx_referenced_p): Replace rtx parameters with const_rtx parameters. * varasm.c (get_pool_constant): Likewise. * rtlanal.c (rtx_referenced_p_1): Delete. (rtx_referenced_p): Use FOR_EACH_SUBRTX instead of for_each_rtx. Assert that the rtx we're looking for is nonnull. Allow searches for constant pool SYMBOL_REFs. OK. Jeff
Re: [PATCH 38/50] rtlanal.c:replace_label
On 08/03/14 08:25, Richard Sandiford wrote: The main change here is to handle ADDR_VEC and ADDR_DIFF_VECs specially, since they can have many elements and are a relatively important case for this function. This is for speed rather than correctness. gcc/ * rtl.h (replace_label_data): Delete. (replace_label): Take the old label, new label and update-nuses flag as direct arguments. Return void. * cfgcleanup.c (outgoing_edges_match): Update accordingly. * rtlanal.c (replace_label): Update interface as above. Handle JUMP_TABLE_DATA as a special case. Handle JUMPs outside the iterator. Use FOR_EACH_SUBRTX_PTR. OK. jeff
Re: [PATCH 42/50] sel-sched.c:count_occurrences_equiv
On 08/03/14 08:33, Richard Sandiford wrote: gcc/ * sel-sched.c: Include rtl-iter.h (count_occurrences_1): Delete. (count_occurrences_equiv): Turn rtxes into const_rtxes. Use FOR_EACH_SUBRTX rather than for_each_rtx. OK. Jeff
Re: [PATCH 50/50] varasm.c:compute_reloc_for_rtx
On 08/03/14 08:45, Richard Sandiford wrote: There's no point calling for_each_rtx/FOR_EACH_SUBRTX on a LABEL_REF or SYMBOL_REF. We can just handle them directly instead. gcc/ * varasm.c (compute_reloc_for_rtx_1): Take a const_rtx. Remove the pointer to the cumulative reloc value and return the value for this reloc instead. (compute_reloc_for_rtx): Take a const_rtx. Call compute_reloc_for_rtx_1 directly for SYMBOL_REF and LABEL_REF, avoiding any recursion. Use FOR_EACH_SUBRTX rather than for_each_rtx for the CONST case. OK. Jeff