Re: [Patch, Fortran, 66927, v2] [6 Regression] ICE in gfc_conf_procedure_call
Dear Andre, As far as I can see, the problems with PR57117 are specific to RESHAPE and need not affect committing your patch. To my surprise, the combination of your patch and mine for PR67171 fixes PR67044 in that the ICE no longer occurs. I have to get my head around how to write a testcase for it that tests the functionality though! You can commit this patch to trunk. As I said elsewhere, I will rename the testcase for PR67171. Many thanks for the patch. Paul On 23 October 2015 at 09:44, Paul Richard Thomas wrote: > Dear Andre, > > I will wait until you fix the problems that Dominique has pointed out. > However, if by Sunday afternoon (rain forecast!) you haven't found the > time, I will see if I can locate the source of these new problems. > > With best regards > > Paul > > On 7 October 2015 at 19:51, Dominique d'Humières wrote: >> This patch also fixes pr57117 comment 2, the original test and the test in >> comment 3 now give an ICE >> >> pr57117.f90:82:0: >> >>allocate(z(9), source=reshape(x, (/ 9 /))) >> 1 >> internal compiler error: Segmentation fault: 11 >> >> and pr67044. >> >> Thanks, >> >> Dominique >> > > > > -- > Outside of a dog, a book is a man's best friend. Inside of a dog it's > too dark to read. > > Groucho Marx
Re: [PATCH] PR/67682, break SLP groups up if only some elements match
On 23 October 2015 at 16:20, Alan Lawrence wrote: > diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > index ab54a48..b012d78 100644 > --- a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > @@ -16,12 +16,12 @@ main1 (unsigned int x, unsigned int y) >unsigned int *pout = &out[0]; >unsigned int a0, a1, a2, a3; > > - /* Non isomorphic. */ > + /* Non isomorphic, even 64-bit subgroups. */ >a0 = *pin++ + 23; > - a1 = *pin++ + 142; > + a1 = *pin++ * 142; >a2 = *pin++ + 2; >a3 = *pin++ * 31; Erm, oops, I seem to have posted a version without the corresponding change to result-checking in bb-slp-7.c... Also on second thoughts a small change to improve efficiency of the recursion by skipping some known-impossible bits, would not add much complexity. So I'll post a new version shortly with those changes. Thanks, Alan
Re: [PATCH] PR/67682, break SLP groups up if only some elements match
On Sun, Oct 25, 2015 at 7:51 PM, Alan Lawrence wrote: > On 23 October 2015 at 16:20, Alan Lawrence wrote: >> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c >> b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c >> index ab54a48..b012d78 100644 >> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c >> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c >> @@ -16,12 +16,12 @@ main1 (unsigned int x, unsigned int y) >>unsigned int *pout = &out[0]; >>unsigned int a0, a1, a2, a3; >> >> - /* Non isomorphic. */ >> + /* Non isomorphic, even 64-bit subgroups. */ >>a0 = *pin++ + 23; >> - a1 = *pin++ + 142; >> + a1 = *pin++ * 142; >>a2 = *pin++ + 2; >>a3 = *pin++ * 31; > > Erm, oops, I seem to have posted a version without the corresponding > change to result-checking in bb-slp-7.c... > > Also on second thoughts a small change to improve efficiency > of the recursion by skipping some known-impossible bits, would not add > much complexity. > > So I'll post a new version shortly with those changes. Maybe it is better to make a new test for the changed file and keep the old one with the updated result checking? That way the testcase is the same between versions and the only thing that changes is the result checking. Thanks, Andrew > > Thanks, Alan
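As an aside on why the bb-slp-7.c change matters, here is a minimal sketch in C. It is not the vectorizer's code: it models each statement only by its operator character, and the names `group_isomorphic` and `count_isomorphic_subgroups` are hypothetical. With the old statements (`+ + + *`) the first pair of 32-bit elements still forms an isomorphic 64-bit subgroup; with the new ones (`+ * + *`) no aligned pair does, which is what the updated comment "Non isomorphic, even 64-bit subgroups" describes.

```c
#include <stddef.h>

/* Return 1 if all n operators in ops[0..n-1] are identical, else 0.
   This is a toy stand-in for "the statements are isomorphic". */
int group_isomorphic(const char *ops, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (ops[i] != ops[0])
            return 0;
    return 1;
}

/* Count the aligned subgroups of size `sub` inside ops[0..n-1] that are
   isomorphic.  Assumes n is a multiple of sub. */
size_t count_isomorphic_subgroups(const char *ops, size_t n, size_t sub)
{
    size_t count = 0;
    for (size_t start = 0; start + sub <= n; start += sub)
        count += (size_t) group_isomorphic(ops + start, sub);
    return count;
}
```

Calling `count_isomorphic_subgroups("+++*", 4, 2)` models the old test body and `count_isomorphic_subgroups("+*+*", 4, 2)` the new one.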
Re: [Patch, Fortran, 66927, v2] [6 Regression] ICE in gfc_conf_procedure_call
Hi Paul, hi all, thanks for the review. Submitted as r229294. Regards, Andre On Sun, 25 Oct 2015 08:43:24 +0100 Paul Richard Thomas wrote: > Dear Andre, > > As far as I can see, the problems with PR57117 are specific to RESHAPE > and need not affect committing your patch. To my surprise, the > combination of your patch and mine for PR67171 fixes PR67044 in that > the ICE no longer occurs. I have to get my head around how to write a > testcase for it that tests the functionality though! > > You can commit this patch to trunk. As I said elsewhere, I will rename > the testcase for PR67171. > > Many thanks for the patch. > > Paul > > On 23 October 2015 at 09:44, Paul Richard Thomas > wrote: > > Dear Andre, > > > > I will wait until you fix the problems that Dominique has pointed out. > > However, if by Sunday afternoon (rain forecast!) you haven't found the > > time, I will see if I can locate the source of these new problems. > > > > With best regards > > > > Paul > > > > On 7 October 2015 at 19:51, Dominique d'Humières wrote: > >> This patch also fixes pr57117 comment 2, the original test and the test in > >> comment 3 now give an ICE > >> > >> pr57117.f90:82:0: > >> > >>allocate(z(9), source=reshape(x, (/ 9 /))) > >> 1 > >> internal compiler error: Segmentation fault: 11 > >> > >> and pr67044. > >> > >> Thanks, > >> > >> Dominique > >> > > > > > > > > -- > > Outside of a dog, a book is a man's best friend. Inside of a dog it's > > too dark to read. 
> > > > Groucho Marx > > > -- Andre Vehreschild * Email: vehre ad gmx dot de Index: gcc/fortran/trans.h === --- gcc/fortran/trans.h (Revision 229293) +++ gcc/fortran/trans.h (Arbeitskopie) @@ -378,7 +378,7 @@ void gfc_reset_vptr (stmtblock_t *, gfc_expr *); void gfc_reset_len (stmtblock_t *, gfc_expr *); tree gfc_get_vptr_from_expr (tree); -tree gfc_get_class_array_ref (tree, tree); +tree gfc_get_class_array_ref (tree, tree, tree); tree gfc_copy_class_to_class (tree, tree, tree, bool); bool gfc_add_finalizer_call (stmtblock_t *, gfc_expr *); bool gfc_add_comp_finalizer_call (stmtblock_t *, tree, gfc_component *, bool); Index: gcc/fortran/trans-array.c === --- gcc/fortran/trans-array.c (Revision 229293) +++ gcc/fortran/trans-array.c (Arbeitskopie) @@ -3250,7 +3250,7 @@ { type = gfc_get_element_type (type); tmp = TREE_OPERAND (cdecl, 0); - tmp = gfc_get_class_array_ref (offset, tmp); + tmp = gfc_get_class_array_ref (offset, tmp, NULL_TREE); tmp = fold_convert (build_pointer_type (type), tmp); tmp = build_fold_indirect_ref_loc (input_location, tmp); return tmp; @@ -7107,9 +7107,20 @@ } else if (GFC_ARRAY_TYPE_P (TREE_TYPE (desc)) || se->use_offset) { + bool toonebased; tmp = gfc_conv_array_lbound (desc, n); + toonebased = integer_onep (tmp); + // lb(arr) - from (- start + 1) tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (base), tmp, from); + if (onebased && toonebased) + { + tmp = fold_build2_loc (input_location, MINUS_EXPR, + TREE_TYPE (base), tmp, start); + tmp = fold_build2_loc (input_location, PLUS_EXPR, + TREE_TYPE (base), tmp, + gfc_index_one_node); + } tmp = fold_build2_loc (input_location, MULT_EXPR, TREE_TYPE (base), tmp, gfc_conv_array_stride (desc, n)); @@ -7183,12 +7194,13 @@ /* For class arrays add the class tree into the saved descriptor to enable getting of _vptr and the like. 
*/ if (expr->expr_type == EXPR_VARIABLE && VAR_P (desc) - && IS_CLASS_ARRAY (expr->symtree->n.sym) - && DECL_LANG_SPECIFIC (expr->symtree->n.sym->backend_decl)) + && IS_CLASS_ARRAY (expr->symtree->n.sym)) { gfc_allocate_lang_decl (desc); GFC_DECL_SAVED_DESCRIPTOR (desc) = - GFC_DECL_SAVED_DESCRIPTOR (expr->symtree->n.sym->backend_decl); + DECL_LANG_SPECIFIC (expr->symtree->n.sym->backend_decl) ? + GFC_DECL_SAVED_DESCRIPTOR (expr->symtree->n.sym->backend_decl) + : expr->symtree->n.sym->backend_decl; } if (!se->direct_byref || se->byref_noassign) { Index: gcc/fortran/trans-expr.c === --- gcc/fortran/trans-expr.c (Revision 229293) +++ gcc/fortran/trans-expr.c (Arbeitskopie) @@ -1039,9 +1039,10 @@ of the referenced element. */ tree -gfc_get_class_array_ref (tree index, tree class_decl) +gfc_get_class_array_ref (tree index, tree class_decl, tree data_comp) { - tree data = gfc_class_data_get (class_decl); + tree data = data_comp != NULL_TREE ? data_comp : + gfc_class_data_get (class_decl); tree size = gfc_class_vtab_size_get (class_decl); tree offset = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type, @@ -1075,6 +1076,7 @@ tree stdcopy; tree extcopy; tree
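The new third argument of gfc_get_class_array_ref can be pictured with a small sketch in plain C. This is only an illustration under assumptions: `struct class_desc` and `class_array_ref` are hypothetical names, and the sketch assumes that the element address is the data pointer plus index times the dynamic element size, which is what the visible part of the hunk (vtab size, MULT_EXPR) suggests but does not fully show.

```c
#include <stddef.h>

/* Hypothetical stand-in for a class container: a data pointer plus the
   dynamic element size that the real code reads from the vtable. */
struct class_desc {
    void  *data;       /* default data component */
    size_t elem_size;  /* size of the dynamic type in bytes */
};

/* Return the address of element `index`.  A non-NULL data_comp overrides
   the container's own data pointer, mirroring the optional argument that
   the patch adds (callers that do not need it pass NULL). */
void *class_array_ref(size_t index, const struct class_desc *cls,
                      void *data_comp)
{
    char *base = data_comp != NULL ? data_comp : cls->data;
    return base + index * cls->elem_size;
}
```

Passing NULL keeps the old behaviour, which matches how the trans-array.c caller in the diff passes NULL_TREE.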
Re: Re; [Patch, fortran] PR67171 - [6 regression] sourced allocation
> Le 24 oct. 2015 à 21:08, Dominique d'Humières a écrit : > > >> Le 24 oct. 2015 à 15:46, Dominique d'Humières a écrit : >> >> Dear Paul, >> >> AFAICT no patch! >> >> Dominique >> > > If I am not mistaken, your patch fixes pr67528 also. Confirmed with a clean trunk with your patch. It also fixes the ICEs in pr61829 and pr61830. Dominique > > Dominique >
Re: Re; [Patch, fortran] PR67171 - [6 regression] sourced allocation
> It also fixes the ICEs in pr61829 and pr61830. I meant pr61829 and not pr61829. Dominique
[PATCH, i386]: fix PR 68084, Inverted conditions generated for x86 inline assembly
Hello! Inverted condition is generated with =@ccae. 2015-10-25 Uros Bizjak PR target/68084 * config/i386/i386.c (ix86_md_asm_adjust) [case 'a']: Use NE code for =@ccae. testsuite/ChangeLog: 2015-10-25 Uros Bizjak PR target/68084 * gcc.target/i386/pr68084.c: New test. Bootstrapped and regression tested on x86_64-linux {,-m32}, committed to mainline SVN. Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 229293) +++ config/i386/i386.c (working copy) @@ -46934,7 +46934,7 @@ ix86_md_asm_adjust (vec &outputs, vec &/ if (con[1] == 0) mode = CCAmode, code = EQ; else if (con[1] == 'e' && con[2] == 0) - mode = CCCmode, code = EQ; + mode = CCCmode, code = NE; break; case 'b': if (con[1] == 0) Index: testsuite/gcc.target/i386/pr68084.c === --- testsuite/gcc.target/i386/pr68084.c (revision 0) +++ testsuite/gcc.target/i386/pr68084.c (working copy) @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O" } */ + +int x; + +void foo (void) +{ + char r; + + asm ("" : "=@ccae"(r)); + + if (!r) +x = 0; +} + +/* { dg-final { scan-assembler "jnc" } } */
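For readers unfamiliar with the flag names, a short sketch in portable C of the relationship the fix relies on: on x86, "above or equal" (ae) is true exactly when the carry flag is clear, which is why the test scans for `jnc`. The helper names below are hypothetical and the code only models the flag arithmetic; it does not use inline assembly.

```c
#include <stdint.h>

/* Carry flag after an x86-style `cmp a, b` (computes a - b): it is set
   when the subtraction borrows, i.e. when a < b as unsigned values. */
int carry_after_cmp(uint32_t a, uint32_t b)
{
    return a < b;
}

/* "Above or equal" holds exactly when the carry flag is clear. */
int cond_ae(int carry_flag)
{
    return !carry_flag;
}
```

So `cond_ae(carry_after_cmp(a, b))` agrees with the unsigned comparison `a >= b`.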
Re: Re; [Patch, fortran] PR67171 - [6 regression] sourced allocation
> Le 25 oct. 2015 à 14:13, Dominique d'Humières a écrit : > >> It also fixes the ICEs in pr61829 and pr61830. > > I meant pr61829 and not pr61829. Please read pr61819 and pr61830. Sorry for the noise. Dominique
[gomp4] Adjust UNIQUE ifn
I've applied this patch to gomp4 branch. It's the reworking of IFN_UNIQUE suggested by Richard & Jakub. 1) IFN_UNIQUE is a ctrl-altering call, and thus ends up at the end of a BB. 2) tracer only needs to check that stmt (and it's already looking at it for other reasons) 3) IFN_UNIQUE is no longer ECF_LEAF 4) Inserted a data dependency chain to the head & tail call sequence. The 2nd param is the result of the previous call in the chain. Preparing updated trunk patches now ... nathan 2015-10-25 Nathan Sidwell * internal-fn.def (IFN_UNIQUE): Not a leaf. (IFN_UNIQUE, IFN_GOACC_LOOP): Move sub codes to ... * internal-fn.h (enum ifn_unique_kind, enum ifn_goacc_loop_kind): ... here. New enums. * internal-fn.c (expand_UNIQUE): Deal with data dependency var. * tree-cfg.c (gimple_call_initialize_ctrl_altering): Check for unique internal fn call. * config/nvptx/nvptx.md (oacc_fork, oacc_join): Deal with data dependency src & dest. * config/nvptx/nvptx.c (nvptx_xform_fork_join): Rename to ... (nvptx_goacc_fork_join): ... here. Skip data dependency arg. * tracer.c (ignore_bb_p): Just look at last stmt for UNIQUE. * omp-low.c (lower_oacc_head_mark): Take data dependency arg. Use quick_push. (lower_oacc_loop_marker): Take data dependency arg. (lower_oacc_head_tail): Insert data dependency var. (new_oacc_loop): Adjust arg numbering. (dump_oacc_loop_part): Cope with block-straddling sequences. (oacc_loop_discover_walk): Likewise. (oacc_loop_xform_head_tail): Likewise. (execute_oacc_device_lower): Use two bools for scanning & deletion. 
Index: gcc/config/nvptx/nvptx.md === --- gcc/config/nvptx/nvptx.md (revision 229276) +++ gcc/config/nvptx/nvptx.md (working copy) @@ -1400,20 +1400,28 @@ ) (define_expand "oacc_fork" - [(unspec_volatile:SI [(match_operand:SI 0 "const_int_operand" "")] - UNSPECV_FORKED)] + [(set (match_operand:SI 0 "nvptx_nonmemory_operand" "") +(match_operand:SI 1 "nvptx_general_operand" "")) + (unspec_volatile:SI [(match_operand:SI 2 "const_int_operand" "")] + UNSPECV_FORKED)] "" { - nvptx_expand_oacc_fork (INTVAL (operands[0])); + if (operands[0] != const0_rtx) +emit_move_insn (operands[0], operands[1]); + nvptx_expand_oacc_fork (INTVAL (operands[2])); DONE; }) (define_expand "oacc_join" - [(unspec_volatile:SI [(match_operand:SI 0 "const_int_operand" "")] - UNSPECV_JOIN)] + [(set (match_operand:SI 0 "nvptx_nonmemory_operand" "") +(match_operand:SI 1 "nvptx_general_operand" "")) + (unspec_volatile:SI [(match_operand:SI 2 "const_int_operand" "")] + UNSPECV_JOIN)] "" { - nvptx_expand_oacc_join (INTVAL (operands[0])); + if (operands[0] != const0_rtx) +emit_move_insn (operands[0], operands[1]); + nvptx_expand_oacc_join (INTVAL (operands[2])); DONE; }) Index: gcc/config/nvptx/nvptx.c === --- gcc/config/nvptx/nvptx.c (revision 229276) +++ gcc/config/nvptx/nvptx.c (working copy) @@ -4296,10 +4296,10 @@ nvptx_dim_limit (unsigned axis) /* Determine whether fork & joins are needed. */ static bool -nvptx_xform_fork_join (gcall *call, const int dims[], +nvptx_goacc_fork_join (gcall *call, const int dims[], bool ARG_UNUSED (is_fork)) { - tree arg = gimple_call_arg (call, 1); + tree arg = gimple_call_arg (call, 2); unsigned axis = TREE_INT_CST_LOW (arg); /* We only care about worker and vector partitioning. 
*/ @@ -4844,7 +4844,7 @@ nvptx_use_anchors_for_symbol (const_rtx #define TARGET_GOACC_DIM_LIMIT nvptx_dim_limit #undef TARGET_GOACC_FORK_JOIN -#define TARGET_GOACC_FORK_JOIN nvptx_xform_fork_join +#define TARGET_GOACC_FORK_JOIN nvptx_goacc_fork_join #undef TARGET_GOACC_REDUCTION #define TARGET_GOACC_REDUCTION nvptx_goacc_reduction Index: gcc/tracer.c === --- gcc/tracer.c (revision 229276) +++ gcc/tracer.c (working copy) @@ -93,25 +93,20 @@ bb_seen_p (basic_block bb) static bool ignore_bb_p (basic_block bb) { - gimple_stmt_iterator gsi; - gimple *g; - if (bb->index < NUM_FIXED_BLOCKS) return true; if (optimize_bb_for_size_p (bb)) return true; - /* A transaction is a single entry multiple exit region. It must be - duplicated in its entirety or not at all. */ - g = last_stmt (CONST_CAST_BB (bb)); - if (g && gimple_code (g) == GIMPLE_TRANSACTION) -return true; - - /* Ignore blocks containing non-clonable function calls. */ - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + if (gimple *g = last_stmt (CONST_CAST_BB (bb))) { - g = gsi_stmt (gsi); + /* A transaction is a single entry multiple exit region. It + must be duplicated in its entirety or not at all. */ + if (gimple_code (g) == GIMPLE_TRANSACTION) + return true; + /* An IFN_UNIQUE call must be duplicated as part of its gro
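The data dependency chain from point 4 of this message can be sketched in plain C. This is an illustration under assumptions, not the gomp4 implementation: `unique_marker`, `head_tail_sequence` and the `marker_kind` values are hypothetical, and the order of the calls is illustrative only. What it shows is the shape described in the message: the second parameter of each call is the result of the previous call, and the marker simply passes it through, much like the src->dst move in the nvptx expanders.

```c
/* Hypothetical marker kinds; the real sub codes live in internal-fn.h. */
enum marker_kind { KIND_HEAD_MARK, KIND_FORK, KIND_JOIN, KIND_TAIL_MARK };

/* Hypothetical marker call.  `dep` is the value returned by the previous
   marker in the chain; returning it unchanged models a plain move.  The
   result exists only to order the calls relative to each other. */
int unique_marker(enum marker_kind kind, int dep, int axis)
{
    (void) kind;
    (void) axis;
    return dep;
}

/* Thread one dependency value through a sequence of marker calls. */
int head_tail_sequence(int seed, int axis)
{
    int dep = seed;
    dep = unique_marker(KIND_HEAD_MARK, dep, axis);
    dep = unique_marker(KIND_FORK, dep, axis);
    dep = unique_marker(KIND_JOIN, dep, axis);
    dep = unique_marker(KIND_TAIL_MARK, dep, axis);
    return dep;
}
```

Because every call consumes the previous result, a transformation that respects data dependencies cannot reorder the markers.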
Re: [OpenACC 4/11] C FE changes
On 10/23/15 17:25, Nathan Sidwell wrote: On 10/23/15 16:17, Cesar Philippidis wrote: Nathan, can you try out this patch with your updated patch set? I saw some test cases getting stuck when expanding expand_GOACC_DIM_SIZE on the host compiler, which is wrong. I don't see that happening in gomp-4_0-branch with this patch. Also, can you merge this patch along with the c++ and new test case patches to trunk? I'll handle the gomp4 backport. Wilco. Testing your patch on trunk along with my IFN_UNIQUE changes shows good results. nathan
Re: [OpenACC 1/11] UNIQUE internal function
Richard, Jakub, here is an updated patch. Changes from previous version 1) Moved the subcodes to an enumeration in internal-fn.h 2) Remove ECF_LEAF 3) Added check in initialize_ctrl_altering 4) tracer code now (continues) to only look in last stmt of block I looked at fnsplit and do not believe I need changes there. That's changing things like: if (cheap test) do cheap thing else do complex thing to break out the else part into a separate function. That's fine -- it'll copy the whole CFG of interest. I'll be posting an updated 7/11 patch shortly. comments? nathan 2015-10-25 Nathan Sidwell * internal-fn.c (expand_UNIQUE): New. * internal-fn.h (enum ifn_unique_kind): New. * internal-fn.def (IFN_UNIQUE): New. * gimple.h (gimple_call_internal_unique_p): New. * gimple.c (gimple_call_same_target_p): Check internal fn uniqueness. * tracer.c (ignore_bb_p): Check for IFN_UNIQUE call. * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts): Likewise. * tree-cfg.c (gimple_call_initialize_ctrl_altering): Likewise. Index: gcc/tree-ssa-threadedge.c === --- gcc/tree-ssa-threadedge.c (revision 229276) +++ gcc/tree-ssa-threadedge.c (working copy) @@ -283,6 +283,17 @@ record_temporary_equivalences_from_stmts && gimple_asm_volatile_p (as_a (stmt))) return NULL; + /* If the statement is a unique builtin, we can not thread + through here. */ + if (gimple_code (stmt) == GIMPLE_CALL) + { + gcall *call = as_a (stmt); + + if (gimple_call_internal_p (call) + && gimple_call_internal_unique_p (call)) + return NULL; + } + /* If duplicating this block is going to cause too much code expansion, then do not thread through this block. 
*/ stmt_count++; Index: gcc/internal-fn.def === --- gcc/internal-fn.def (revision 229276) +++ gcc/internal-fn.def (working copy) @@ -65,3 +65,10 @@ DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL) + +/* An unduplicable, uncombinable function. Generally used to preserve + a CFG property in the face of jump threading, tail merging or + other such optimizations. The first argument distinguishes + between uses. See internal-fn.h for usage. */ +DEF_INTERNAL_FN (UNIQUE, ECF_NOTHROW, NULL) Index: gcc/gimple.c === --- gcc/gimple.c (revision 229276) +++ gcc/gimple.c (working copy) @@ -1346,7 +1346,8 @@ gimple_call_same_target_p (const gimple { if (gimple_call_internal_p (c1)) return (gimple_call_internal_p (c2) - && gimple_call_internal_fn (c1) == gimple_call_internal_fn (c2)); + && gimple_call_internal_fn (c1) == gimple_call_internal_fn (c2) + && !gimple_call_internal_unique_p (as_a (c1))); else return (gimple_call_fn (c1) == gimple_call_fn (c2) || (gimple_call_fndecl (c1) Index: gcc/gimple.h === --- gcc/gimple.h (revision 229276) +++ gcc/gimple.h (working copy) @@ -2895,6 +2895,21 @@ gimple_call_internal_fn (const gimple *g return gimple_call_internal_fn (gc); } +/* Return true, if this internal gimple call is unique. */ + +static inline bool +gimple_call_internal_unique_p (const gcall *gs) +{ + return gimple_call_internal_fn (gs) == IFN_UNIQUE; +} + +static inline bool +gimple_call_internal_unique_p (const gimple *gs) +{ + const gcall *gc = GIMPLE_CHECK2 (gs); + return gimple_call_internal_unique_p (gc); +} + /* If CTRL_ALTERING_P is true, mark GIMPLE_CALL S to be a stmt that could alter control flow. 
*/ Index: gcc/internal-fn.c === --- gcc/internal-fn.c (revision 229276) +++ gcc/internal-fn.c (working copy) @@ -1958,6 +1958,30 @@ expand_VA_ARG (gcall *stmt ATTRIBUTE_UNU gcc_unreachable (); } +/* Expand the IFN_UNIQUE function according to its first argument. */ + +static void +expand_UNIQUE (gcall *stmt) +{ + rtx pattern = NULL_RTX; + int code = TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)); + + switch (code) +{ +default: + gcc_unreachable (); + +case IFN_UNIQUE_UNSPEC: +#ifdef HAVE_unique + pattern = gen_unique (); +#endif + break; +} + + if (pattern) +emit_insn (pattern); +} + /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: Index: gcc/internal-fn.h === --- gcc/internal-fn.h (revision 229276) +++ gcc/internal-fn.h (working copy) @@ -20,6 +20,11 @@ along with GCC; see the file COPYING3. #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H +/* INTEGER_CST value
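The gimple.c hunk in this patch makes two calls to the unique internal function never compare as "same target", so passes such as tail merging will not combine them. A toy model in plain C, with hypothetical names (`struct call`, `call_unique_p`, `call_same_target_p`) standing in for the GIMPLE accessors, may make the rule easier to see:

```c
/* Hypothetical internal-function identifiers for the sketch. */
enum fn_id { FN_ADD_OVERFLOW, FN_VA_ARG, FN_UNIQUE };

/* Hypothetical call statement reduced to the function it targets. */
struct call {
    enum fn_id fn;
};

/* Models gimple_call_internal_unique_p: is this the unique function? */
int call_unique_p(const struct call *c)
{
    return c->fn == FN_UNIQUE;
}

/* Models the patched comparison for internal calls: the targets match
   only if the functions are equal AND the function is not the unique
   one, so two unique calls are never treated as interchangeable. */
int call_same_target_p(const struct call *c1, const struct call *c2)
{
    return c1->fn == c2->fn && !call_unique_p(c1);
}
```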
Re: [PATCH v4] SH FDPIC backend support
On Fri, 2015-10-23 at 02:32 -0400, Rich Felker wrote: > Here's my updated version of the FDPIC patch with all requested > changes made and Changelog added. I've included all the original > authors. This is my first time writing such an extensive Changelog > entry so please let me know if there are things I got wrong. I took the liberty and fixed some minor formatting trivia and extracted functions sh_emit_storesi and sh_emit_storehi which are used in sh_trampoline_init to effectively memcpy code into the trampoline area. Can you please check it? If it's OK I'll commit the attached patch to trunk. Cheers, Oleg Index: gcc/config/sh/constraints.md === --- gcc/config/sh/constraints.md (revision 229290) +++ gcc/config/sh/constraints.md (working copy) @@ -25,6 +25,7 @@ ;; Bsc: SCRATCH - for the scratch register in movsi_ie in the ;; fldi0 / fldi0 cases ;; Cxx: Constants other than only CONST_INT +;; Ccl: call site label ;; Css: signed 16-bit constant, literal or symbolic ;; Csu: unsigned 16-bit constant, literal or symbolic ;; Csy: label or symbol @@ -233,6 +234,11 @@ hence mova is being used, hence do not select this pattern." (match_code "scratch")) +(define_constraint "Ccl" + "A call site label, for bsrf." + (and (match_code "unspec") + (match_test "XINT (op, 1) == UNSPEC_CALLER"))) + (define_constraint "Css" "A signed 16-bit constant, literal or symbolic." 
(and (match_code "const") Index: gcc/config/sh/linux.h === --- gcc/config/sh/linux.h (revision 229290) +++ gcc/config/sh/linux.h (working copy) @@ -67,7 +67,8 @@ #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux.so.2" #undef SUBTARGET_LINK_EMUL_SUFFIX -#define SUBTARGET_LINK_EMUL_SUFFIX "_linux" +#define SUBTARGET_LINK_EMUL_SUFFIX "%{mfdpic:_fd;:_linux}" + #undef SUBTARGET_LINK_SPEC #define SUBTARGET_LINK_SPEC \ "%{shared:-shared} \ Index: gcc/config/sh/sh-c.c === --- gcc/config/sh/sh-c.c (revision 229290) +++ gcc/config/sh/sh-c.c (working copy) @@ -137,6 +137,11 @@ builtin_define ("__HITACHI__"); if (TARGET_FMOVD) builtin_define ("__FMOVD_ENABLED__"); + if (TARGET_FDPIC) +{ + builtin_define ("__SH_FDPIC__"); + builtin_define ("__FDPIC__"); +} builtin_define (TARGET_LITTLE_ENDIAN ? "__LITTLE_ENDIAN__" : "__BIG_ENDIAN__"); Index: gcc/config/sh/sh-mem.cc === --- gcc/config/sh/sh-mem.cc (revision 229290) +++ gcc/config/sh/sh-mem.cc (working copy) @@ -108,29 +108,30 @@ rtx r4 = gen_rtx_REG (SImode, 4); rtx r5 = gen_rtx_REG (SImode, 5); - function_symbol (func_addr_rtx, "__movmemSI12_i4", SFUNC_STATIC); + rtx lab = function_symbol (func_addr_rtx, "__movmemSI12_i4", + SFUNC_STATIC).lab; force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 0), r5); - emit_insn (gen_block_move_real_i4 (func_addr_rtx)); + emit_insn (gen_block_move_real_i4 (func_addr_rtx, lab)); return true; } else if (! optimize_size) { - const char *entry_name; rtx func_addr_rtx = gen_reg_rtx (Pmode); - int dwords; rtx r4 = gen_rtx_REG (SImode, 4); rtx r5 = gen_rtx_REG (SImode, 5); rtx r6 = gen_rtx_REG (SImode, 6); - entry_name = (bytes & 4 ? "__movmem_i4_odd" : "__movmem_i4_even"); - function_symbol (func_addr_rtx, entry_name, SFUNC_STATIC); + rtx lab = function_symbol (func_addr_rtx, bytes & 4 + ? 
"__movmem_i4_odd" + : "__movmem_i4_even", + SFUNC_STATIC).lab; force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 0), r5); - dwords = bytes >> 3; + int dwords = bytes >> 3; emit_insn (gen_move_insn (r6, GEN_INT (dwords - 1))); - emit_insn (gen_block_lump_real_i4 (func_addr_rtx)); + emit_insn (gen_block_lump_real_i4 (func_addr_rtx, lab)); return true; } else @@ -144,10 +145,10 @@ rtx r5 = gen_rtx_REG (SImode, 5); sprintf (entry, "__movmemSI%d", bytes); - function_symbol (func_addr_rtx, entry, SFUNC_STATIC); + rtx lab = function_symbol (func_addr_rtx, entry, SFUNC_STATIC).lab; force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 0), r5); - emit_insn (gen_block_move_real (func_addr_rtx)); + emit_insn (gen_block_move_real (func_addr_rtx, lab)); return true; } @@ -161,7 +162,7 @@ rtx r5 = gen_rtx_REG (SImode, 5); rtx r6 = gen_rtx_REG (SImode, 6); - function_symbol (func_addr_rtx, "__movmem", SFUNC_STATIC); + rtx lab = function_symbol (func_addr_rtx, "__movmem", SFUNC_STATIC).lab; force_into (XEXP (operands[0], 0), r4); force_into (XEXP (operands[1], 0), r5); @@ -174,7 +175,7 @@ final_switch = 16 - ((bytes / 4) % 16); while_loop = ((bytes / 4) / 16 - 1) * 16;
Re: [OpenACC 7/11] execution model
Jakub, Richard, here's an updated version of patch 7, the early half of OpenACC lowering. I've addressed all of Jakub's earlier comments. The significant change is that now the head/tail unique markers are threaded on a data dependency variable. I'd not noticed its lack being a problem, but this is certainly more robust in showing the ordering dependency between calls. The dependency var is the 2nd parameter, and all others are simply shifted along by one. At RTL generation time the data dependency is exposed to the RTL expander, which in the PTX case simply does a src->dst move, which will eventually be deleted as unnecessary. comments? nathan 2015-10-25 Nathan Sidwell * internal-fn.def (IFN_GOACC_LOOP): New. * internal-fn.h (enum ifn_unique_kind): Add IFN_UNIQUE_OACC_FORK, IFN_UNIQUE_OACC_JOIN, IFN_UNIQUE_OACC_HEAD_MARK, IFN_UNIQUE_OACC_TAIL_MARK. (enum ifn_goacc_loop_kind): New. * internal-fn.c (expand_UNIQUE): Add IFN_UNIQUE_OACC_FORK, IFN_UNIQUE_OACC_JOIN cases. (expand_OACC_LOOP): New. (IFN_GOACC_LOOP_CHUNKS, IFN_GOACC_LOOP_STEP, IFN_GOACC_LOOP_OFFSET, IFN_GOACC_LOOP_BOUND): New. * internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_FORK, IFN_UNIQUE_OACC_JOIN. (expand_GOACC_DIM_SIZE, expand_GOACC_DIM_POS, expand_GOACC_LOOP): New. * omp-low.c (struct omp_context): Remove gwv_below, gwv_this fields. (enum oacc_loop_flags): New. (enclosing_target_ctx): May return NULL. (ctx_in_oacc_kernels_region): New. (is_oacc_parallel, is_oacc_kernels): New. (check_oacc_kernel_gwv): New. (oacc_loop_or_target_p): Delete. (scan_omp_for): Don't calculate gwv mask. Check parallel clause operands. Strip reductions from kernels. (scan_omp_target): Don't calculate gwv mask. (lower_oacc_head_mark, lower_oacc_loop_marker, lower_oacc_head_tail): New. (expand_omp_for_static_nochunk, expand_omp_for_static_chunk): Remove OpenACC handling. (struct oacc_collapse): New. (expand_oacc_collapse_init, expand_oacc_collapse_vars): New. (expand_oacc_for): New. 
(expand_omp_for): Call expand_oacc_for. (lower_omp_for): Call lower_oacc_head_tail. Index: gcc/internal-fn.def === --- gcc/internal-fn.def (revision 229276) +++ gcc/internal-fn.def (working copy) @@ -65,9 +65,12 @@ DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL) DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL) /* An unduplicable, uncombinable function. Generally used to preserve a CFG property in the face of jump threading, tail merging or other such optimizations. The first argument distinguishes between uses. See internal-fn.h for usage. */ DEF_INTERNAL_FN (UNIQUE, ECF_NOTHROW, NULL) + +/* OpenACC looping abstraction. See internal-fn.h for usage. */ +DEF_INTERNAL_FN (GOACC_LOOP, ECF_PURE | ECF_NOTHROW, NULL) Index: gcc/internal-fn.c === --- gcc/internal-fn.c (revision 229276) +++ gcc/internal-fn.c (working copy) @@ -1958,30 +1958,69 @@ expand_VA_ARG (gcall *stmt ATTRIBUTE_UNU gcc_unreachable (); } /* Expand the IFN_UNIQUE function according to its first argument. 
*/ static void expand_UNIQUE (gcall *stmt) { rtx pattern = NULL_RTX; int code = TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)); switch (code) { default: gcc_unreachable (); case IFN_UNIQUE_UNSPEC: #ifdef HAVE_unique pattern = gen_unique (); #endif break; + +case IFN_UNIQUE_OACC_FORK: +case IFN_UNIQUE_OACC_JOIN: + { + tree lhs = gimple_call_lhs (stmt); + rtx target = const0_rtx; + + if (lhs) + target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); + + rtx data_dep = expand_normal (gimple_call_arg (stmt, 1)); + rtx axis = expand_normal (gimple_call_arg (stmt, 2)); + + if (code == IFN_UNIQUE_OACC_FORK) + { +#ifdef HAVE_oacc_fork + pattern = gen_oacc_fork (target, data_dep, axis); +#else + gcc_unreachable (); +#endif + } + else + { +#ifdef HAVE_oacc_join + pattern = gen_oacc_join (target, data_dep, axis); +#else + gcc_unreachable (); +#endif + } + } + break; } if (pattern) emit_insn (pattern); } +/* This is expanded by oacc_device_lower pass. */ + +static void +expand_GOACC_LOOP (gcall *stmt ATTRIBUTE_UNUSED) +{ + gcc_unreachable (); +} + /* Routines to expand each internal function, indexed by function number. Each routine has the prototype: Index: gcc/internal-fn.h === --- gcc/internal-fn.h (revision 229276) +++ gcc/internal-fn.h (working copy) @@ -20,10 +20,52 @@ along with GCC; see the file COPYING3. #ifndef GCC_INTERNAL_FN_H #define GCC_INTERNAL_FN_H /* INTEGER_CST values for IFN_UNIQUE function arg-0. */ enum ifn_unique_kind { IFN_UNIQUE_UNSPEC, /*
[committed] Skip g++.dg/Wno-frame-address.C on hppa
It is not possible to access arbitrary stack frames on hppa. Committed to trunk. Dave -- John David Anglin dave.ang...@bell.net 2015-10-25 John David Anglin * g++.dg/Wno-frame-address.C: Skip on hppa*-*-*. Index: g++.dg/Wno-frame-address.C === --- g++.dg/Wno-frame-address.C (revision 229257) +++ g++.dg/Wno-frame-address.C (working copy) @@ -1,5 +1,5 @@ // { dg-do compile } -// { dg-skip-if "Cannot access arbitrary stack frames." { arm*-*-* } } +// { dg-skip-if "Cannot access arbitrary stack frames." { arm*-*-* hppa*-*-* } } // { dg-options "-Werror" } // Verify that -Wframe-address is not enabled by default by enabling
[committed] Canonicalize both function and method types for comparison
The attached change fixes a bug compiling the qtbase-opensource-src package. It compares pointers to method types. This failed as only FUNCTION_TYPES were canonicalized on hppa. On 32-bit hppa, pointers to functions including methods point to non unique function descriptors and need canonicalization prior to comparison. The attached change fixes this problem. 32-bit hppa is the only target that currently needs this canonicalization. Tested on hppa2.0w-hp-hpux11.11 and hppa-unknown-linux-gnu with no observed regressions. Committed to trunk. Dave -- John David Anglin dave.ang...@bell.net 2015-10-25 John David Anglin PR middle-end/68079 * dojump.c (do_compare_and_jump): Canonicalize both function and method types. Index: dojump.c === --- dojump.c (revision 229123) +++ dojump.c (working copy) @@ -1207,12 +1207,10 @@ If one side isn't, we want a noncanonicalized comparison. See PR middle-end/17564. */ if (targetm.have_canonicalize_funcptr_for_compare () - && TREE_CODE (TREE_TYPE (treeop0)) == POINTER_TYPE - && TREE_CODE (TREE_TYPE (TREE_TYPE (treeop0))) - == FUNCTION_TYPE - && TREE_CODE (TREE_TYPE (treeop1)) == POINTER_TYPE - && TREE_CODE (TREE_TYPE (TREE_TYPE (treeop1))) - == FUNCTION_TYPE) + && POINTER_TYPE_P (TREE_TYPE (treeop0)) + && POINTER_TYPE_P (TREE_TYPE (treeop1)) + && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (TREE_TYPE (treeop0))) + && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (TREE_TYPE (treeop1)))) { rtx new_op0 = gen_reg_rtx (mode); rtx new_op1 = gen_reg_rtx (mode);
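To show why non-unique function descriptors need canonicalization before comparison, here is a minimal model in portable C. It makes simplifying assumptions: `struct fdesc`, `raw_equal`, `canonicalize` and `canonical_equal` are hypothetical, and canonicalization is modelled as simply reading the code address out of the descriptor, whereas the real target uses its own mechanism.

```c
#include <stdint.h>

/* Hypothetical function descriptor: a code address plus a global
   pointer.  Several descriptors may exist for the same function. */
struct fdesc {
    uintptr_t code;
    uintptr_t gp;
};

/* Naive comparison of the descriptor addresses themselves.  Two
   descriptors for one function compare unequal here. */
int raw_equal(const struct fdesc *a, const struct fdesc *b)
{
    return a == b;
}

/* Reduce a descriptor to a value that is unique per function. */
uintptr_t canonicalize(const struct fdesc *d)
{
    return d->code;
}

/* Comparison after canonicalizing both sides. */
int canonical_equal(const struct fdesc *a, const struct fdesc *b)
{
    return canonicalize(a) == canonicalize(b);
}
```

The patch applies the same idea to pointers to methods as well as pointers to functions, since both go through descriptors on this target.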
Re: [Patch, fortran] PR67171 - [6 regression] sourced allocation
Hi Paul, I had a look at your patch, especially at the allocate() specific part and have nothing serious to complain about. I would love to see more comments to help beginners find their way in the code, but that's a thing I will never get. :-) Therefore, from my perspective ok for trunk and thanks for the patch. Regards, Andre On Sat, 24 Oct 2015 15:08:30 +0200 Paul Richard Thomas wrote: > Dear All, > > This patch does four things: > (i) On deallocating class components, the vptr is set to point to the > vtable of the declared type; > (ii) When digging out the last class reference, a NULL is returned if > the allocatable component is to the right of a part reference with > non-zero rank, so that the resulting ICE is removed. The previous > modification takes care of these cases for gfc_reset_vptr and > gfc_reset_len; > (iii) gfc_reset_vptr has been simplified by the use of > gfc_get_vptr_from_expr; and > (iv) All variable expressions for the source are passed to > gfc_trans_assignment, so that array sections work correctly. > > I see that Andre has already reserved the testcase > allocate_with_source_10, for the pending patch that I undertook to > review, so I will change this to #12 on submission > > OK for trunk? > > Cheers > > Paul > > 2015-01-24 Paul Thomas > > PR fortran/67171 > * trans-array.c (structure_alloc_comps): On deallocation of > class components, reset the vptr to the declared type vtable > and reset the _len field of unlimited polymorphic components. > * trans-expr.c (gfc_find_and_cut_at_last_class_ref): Bail out on > allocatable component references to the right of part reference > with non-zero rank and return NULL. > (gfc_reset_vptr): Simplify this function by using the function > gfc_get_vptr_from_expr. Return if the vptr is NULL_TREE. > (gfc_reset_len): If gfc_find_and_cut_at_last_class_ref returns > NULL return. 
> * trans-stmt.c (gfc_trans_allocate): Rely on the use of > gfc_trans_assignment if expr3 is a variable expression since > this deals correctly with array sections. > > 2015-01-24 Paul Thomas > > PR fortran/67171 > * gfortran.dg/allocate_with_source_10.f03: New test -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: [Patch, fortran] PR67171 - [6 regression] sourced allocation
Dear Andre,

I will gladly add some comments before I commit - no problem.

Thanks for the review.

Cheers

Paul

On 25 October 2015 at 16:47, Andre Vehreschild wrote:
> Hi Paul,
>
> I had a look at your patch, especially at the allocate() specific part
> and have nothing serious to complain about. I would love to see more
> comments to help beginners find their way in the code, but that's a
> thing I will never get. :-)
>
> Therefore, from my perspective ok for trunk and thanks for the patch.
>
> Regards,
> Andre
>
> On Sat, 24 Oct 2015 15:08:30 +0200
> Paul Richard Thomas wrote:
>
>> Dear All,
>>
>> This patch does four things:
>> (i) On deallocating class components, the vptr is set to point to the
>> vtable of the declared type;
>> (ii) When digging out the last class reference, a NULL is returned if
>> the allocatable component is to the right of a part reference with
>> non-zero rank, so that the resulting ICE is removed. The previous
>> modification takes care of these cases for gfc_reset_vptr and
>> gfc_reset_len;
>> (iii) gfc_reset_vptr has been simplified by the use of
>> gfc_get_vptr_from_expr; and
>> (iv) All variable expressions for the source are passed to
>> gfc_trans_assignment, so that array sections work correctly.
>>
>> I see that Andre has already reserved the testcase
>> allocate_with_source_10, for the pending patch that I undertook to
>> review, so I will change this to #12 on submission
>>
>> OK for trunk?
>>
>> Cheers
>>
>> Paul
>>
>> 2015-01-24  Paul Thomas
>>
>>     PR fortran/67171
>>     * trans-array.c (structure_alloc_comps): On deallocation of
>>     class components, reset the vptr to the declared type vtable
>>     and reset the _len field of unlimited polymorphic components.
>>     * trans-expr.c (gfc_find_and_cut_at_last_class_ref): Bail out on
>>     allocatable component references to the right of part reference
>>     with non-zero rank and return NULL.
>>     (gfc_reset_vptr): Simplify this function by using the function
>>     gfc_get_vptr_from_expr. Return if the vptr is NULL_TREE.
>>     (gfc_reset_len): If gfc_find_and_cut_at_last_class_ref returns
>>     NULL return.
>>     * trans-stmt.c (gfc_trans_allocate): Rely on the use of
>>     gfc_trans_assignment if expr3 is a variable expression since
>>     this deals correctly with array sections.
>>
>> 2015-01-24  Paul Thomas
>>
>>     PR fortran/67171
>>     * gfortran.dg/allocate_with_source_10.f03: New test
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de

--
Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read.

Groucho Marx
Re: TR1 Special Math
On 10/24/2015 11:38 PM, Jonathan Wakely wrote:
> On 8 May 2015 at 15:05, Ed Smith-Rowland <3dw...@verizon.net> wrote:
>> On 05/07/2015 12:06 PM, Jonathan Wakely wrote:
>>> Hi Ed,
>>>
>>> The C++ committee is considering the
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4437.pdf
>>> proposal to make C++17 include the contents of ISO 29124:2010 (the
>>> special math functions from TR1 that went into a separate standard,
>>> not into C++11).
>>>
>>> What is the status of our TR1 implementation? Is it complete? Good
>>> enough quality to move out of the tr1 sub-dir?
>>>
>>> Even if N4437 isn't accepted for C++17 we could move things around to
>>> turn the TR1 code into an iso29124 implementation, do you think that
>>> would make sense?
>>
>> That would make absolute sense.
>> I actually have a tree where I've done that.
>> All the functions are in there (29124 removed the hypergeometric
>> functions).
>> I'd like to keep those as extensions.
>> I have some bugfixes also.
>>
>> I have a better version of the Carlson elliptic functions (which are
>> used in the 29124 elliptic functions).
>>
>> Ed
>>
> Hi Ed, Florian,
>
> Here's a patch to re-use the TR1 math functions to implement IS 29124,
> what do you think of this approach? Ed, were you just going to copy
> the files and have duplicated code?
>
> We should probably uglify the names of the hypergeometric functions if
> they are not in the final standard.
>
> This doesn't include Florian's patch, which should be applied.
>
> (I want to get this done before stage 1 ends in a couple of weeks, so
> am posting this for review now, but I'll be unavailable for the next
> week or two and might not be able to actually commit anything until
> stage 3).

Hi all!

I am actually very aware of the stage 1 deadline and am working furiously!

This patch adds the hypergeometric and confluent hypergeometric functions that were actually stricken from TR29124.
I actually had a mind to add those back especially since the confluent one is actually pretty stable in its realm and is used in some statistics tests.
I expect that some people have ventured to use both and so TR29124 would not be a full replacement for TR1 without them.

I intend to post within the next few days. I have to realize that some of my hopes and dreams would be better done with these in tree! ;-)

Thank you for lighting a fire Jonathan!

Ed
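As a small illustration of the two functions under discussion, the sketch below calls `std::tr1::conf_hyperg` and `std::tr1::hyperg` and compares them against closed forms. It assumes libstdc++'s `<tr1/cmath>` header is available; the helper names `conf_hyperg_minus_exp` and `hyperg_minus_geometric` are invented for this example and are not part of any patch in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <tr1/cmath>  // assumption: building against libstdc++, which ships TR1

// 1F1(a; a; x) reduces to exp(x), which gives a cheap sanity check for the
// confluent hypergeometric function.  Returns the difference from exp(x).
double conf_hyperg_minus_exp(double a, double x) {
    return std::tr1::conf_hyperg(a, a, x) - std::exp(x);
}

// 2F1(1, b; b; x) reduces to 1 / (1 - x) for |x| < 1.  Returns the difference
// from that closed form.
double hyperg_minus_geometric(double b, double x) {
    return std::tr1::hyperg(1.0, b, b, x) - 1.0 / (1.0 - x);
}
```

Both differences should be close to zero for moderate arguments; the checks say nothing about accuracy in harder regions of the parameter space.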
[committed] Define EH_FRAME_THROUGH_COLLECT2 in config/pa/som.h
The change of Andrew Dixie and David Edelsohn on 2015-09-18 introduced a new define, EH_FRAME_THROUGH_COLLECT2, and broke EH frame handling on 32-bit hppa*-*-hpux*. We now need to define EH_FRAME_THROUGH_COLLECT2 on this target.

Tested on hppa2.0w-hp-hpux11.11. Committed to trunk.

Dave
--
John David Anglin  dave.ang...@bell.net

2015-10-25  John David Anglin

    * config/pa/som.h (EH_FRAME_THROUGH_COLLECT2): Define.

Index: config/pa/som.h
===
--- config/pa/som.h     (revision 229301)
+++ config/pa/som.h     (working copy)
@@ -340,6 +340,11 @@
    this suffix when generating constructor/destructor names.  */
 #define SHLIB_SUFFIX ".sl"
 
+/* We don't have named sections.  */
 #define TARGET_HAVE_NAMED_SECTIONS false
 
 #define TARGET_ASM_TM_CLONE_TABLE_SECTION pa_som_tm_clone_table_section
+
+/* Generate specially named labels to identify DWARF 2 frame unwind
+   information.  */
+#define EH_FRAME_THROUGH_COLLECT2
C++ PATCH for DR 1518 (c++/54835, c++/60417)
It seems to me that there's a discrepancy in handling explicit default constructors. Based on my tests, this works:

struct X {explicit X() {}};

void f(X) {}

int main()
{
    f({});
}

However, if the explicit constructor is defaulted, gcc accepts the code:

struct X {explicit X() = default;};

void f(X) {}

int main()
{
    f({});
}
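The two snippets above can also be probed at compile time. The following is a minimal sketch, not code from the patch: `accept`, `probe`, and `accepts_empty_braces` are hypothetical names, and the probe simply asks whether a call like `f({})` would be well-formed for a function taking `T` by value.

```cpp
#include <type_traits>

// Hypothetical stand-in for `void f(X)`; only ever used unevaluated.
template <typename T> void accept(T);

// Chosen when `accept<T>({})` is a valid expression.
template <typename T>
std::true_type probe(decltype(accept<T>({})) *);

// Fallback when substitution into the overload above fails.
template <typename T>
std::false_type probe(...);

template <typename T>
struct accepts_empty_braces : decltype(probe<T>(nullptr)) {};

struct Implicit { Implicit() {} };
struct ExplicitUserProvided { explicit ExplicitUserProvided() {} };
struct ExplicitDefaulted { explicit ExplicitDefaulted() = default; };
```

On a compiler that implements DR 1518, the trait would be expected to be true for `Implicit` and false for both explicit variants. A compiler showing the discrepancy described above would report true for `ExplicitDefaulted`.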
Re: TR1 Special Math
On 25 October 2015 at 17:46, Ed Smith-Rowland <3dw...@verizon.net> wrote:
> On 10/24/2015 11:38 PM, Jonathan Wakely wrote:
>>
>> On 8 May 2015 at 15:05, Ed Smith-Rowland <3dw...@verizon.net> wrote:
>>>
>>> On 05/07/2015 12:06 PM, Jonathan Wakely wrote:
>>>> Hi Ed,
>>>>
>>>> The C++ committee is considering the
>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4437.pdf
>>>> proposal to make C++17 include the contents of ISO 29124:2010 (the
>>>> special math functions from TR1 that went into a separate standard,
>>>> not into C++11).
>>>>
>>>> What is the status of our TR1 implementation? Is it complete? Good
>>>> enough quality to move out of the tr1 sub-dir?
>>>>
>>>> Even if N4437 isn't accepted for C++17 we could move things around to
>>>> turn the TR1 code into an iso29124 implementation, do you think that
>>>> would make sense?
>>>
>>> That would make absolute sense.
>>> I actually have a tree where I've done that.
>>> All the functions are in there (29124 removed the hypergeometric
>>> functions).
>>> I'd like to keep those as extensions.
>>> I have some bugfixes also.
>>>
>>> I have a better version of the Carlson elliptic functions (which are
>>> used in the 29124 elliptic functions).
>>>
>>> Ed
>>>
>> Hi Ed, Florian,
>>
>> Here's a patch to re-use the TR1 math functions to implement IS 29124,
>> what do you think of this approach? Ed, were you just going to copy
>> the files and have duplicated code?
>>
>> We should probably uglify the names of the hypergeometric functions if
>> they are not in the final standard.
>>
>> This doesn't include Florian's patch, which should be applied.
>>
>> (I want to get this done before stage 1 ends in a couple of weeks, so
>> am posting this for review now, but I'll be unavailable for the next
>> week or two and might not be able to actually commit anything until
>> stage 3).
>
> Hi all!
>
> I am actually very aware of the stage 1 deadline and am working furiously!
>
> This patch adds the hypergeometric and confluent hypergeometric functions
> that were actually stricken from TR29124.
> I actually had a mind to add those back especially since the confluent one
> is actually pretty stable in its realm and is used in some statistics
> tests.
> I expect that some people have ventured to use both and so TR29124 would not
> be a full replacement for TR1 without them.
>
> I intend to post within the next few days. I have to realize that some of
> my hopes and dreams would be better done with these in tree! ;-)
>
> Thank you for lighting a fire Jonathan!

Excellent, glad to hear you're on this, as you know the code and the specs, whereas I'm poking around blindly :-)
Re: [Patch, fortran] PR67171 - [6 regression] sourced allocation
Dear Andre,

Committed as revision 229303 with extra comments and tests for PR61819 and PR61830. I'll see what I can do to backport the fix for the latter, since it is a revision.

Thanks for the review

Paul

On 25 October 2015 at 16:47, Andre Vehreschild wrote:
> Hi Paul,
>
> I had a look at your patch, especially at the allocate() specific part
> and have nothing serious to complain about. I would love to see more
> comments to help beginners find their way in the code, but that's a
> thing I will never get. :-)
>
> Therefore, from my perspective ok for trunk and thanks for the patch.
>
> Regards,
> Andre
>
> On Sat, 24 Oct 2015 15:08:30 +0200
> Paul Richard Thomas wrote:
>
>> Dear All,
>>
>> This patch does four things:
>> (i) On deallocating class components, the vptr is set to point to the
>> vtable of the declared type;
>> (ii) When digging out the last class reference, a NULL is returned if
>> the allocatable component is to the right of a part reference with
>> non-zero rank, so that the resulting ICE is removed. The previous
>> modification takes care of these cases for gfc_reset_vptr and
>> gfc_reset_len;
>> (iii) gfc_reset_vptr has been simplified by the use of
>> gfc_get_vptr_from_expr; and
>> (iv) All variable expressions for the source are passed to
>> gfc_trans_assignment, so that array sections work correctly.
>>
>> I see that Andre has already reserved the testcase
>> allocate_with_source_10, for the pending patch that I undertook to
>> review, so I will change this to #12 on submission
>>
>> OK for trunk?
>>
>> Cheers
>>
>> Paul
>>
>> 2015-01-24  Paul Thomas
>>
>>     PR fortran/67171
>>     * trans-array.c (structure_alloc_comps): On deallocation of
>>     class components, reset the vptr to the declared type vtable
>>     and reset the _len field of unlimited polymorphic components.
>>     * trans-expr.c (gfc_find_and_cut_at_last_class_ref): Bail out on
>>     allocatable component references to the right of part reference
>>     with non-zero rank and return NULL.
>>     (gfc_reset_vptr): Simplify this function by using the function
>>     gfc_get_vptr_from_expr. Return if the vptr is NULL_TREE.
>>     (gfc_reset_len): If gfc_find_and_cut_at_last_class_ref returns
>>     NULL return.
>>     * trans-stmt.c (gfc_trans_allocate): Rely on the use of
>>     gfc_trans_assignment if expr3 is a variable expression since
>>     this deals correctly with array sections.
>>
>> 2015-01-24  Paul Thomas
>>
>>     PR fortran/67171
>>     * gfortran.dg/allocate_with_source_10.f03: New test
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de

--
Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read.

Groucho Marx
Re: C++ PATCH for DR 1518 (c++/54835, c++/60417)
On 25 October 2015 at 22:15, Ville Voutilainen wrote:
> It seems to me that there's a discrepancy in handling explicit
> default constructors. Based on my tests, this works:
>
> struct X {explicit X() {}};
>
> void f(X) {}
>
> int main()
> {
>     f({});
> }
>
> However, if the explicit constructor is defaulted, gcc accepts the code:
>
> struct X {explicit X() = default;};
>
> void f(X) {}
>
> int main()
> {
>     f({});
> }

And to clarify, I'd expect both of those snippets to be rejected, but only the former is.
Re: [c++-delayed-folding] Introduce convert_to_pointer_nofold
On 10/19/2015 05:33 AM, Marek Polacek wrote:
> +      if (fold_p)
> +        expr = fold_build1_loc (loc, NOP_EXPR, totype, expr);
> +      else
> +        expr = build1_loc (loc, NOP_EXPR, totype, expr);

Rather than duplicate code like this everywhere, maybe we should introduce a maybe_fold_build1_loc macro that takes fold_p as an argument.

Jason
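The suggestion above can be sketched outside of GCC with a toy model. Everything in the block is a stand-in: `build1_model` and `fold_build1_model` only record which path ran and are not the real tree builders, and `maybe_fold_build1_model` is a hypothetical function-shaped counterpart of the proposed macro.

```cpp
#include <cassert>
#include <string>

// Stand-in for build1_loc: builds the expression without folding.
std::string build1_model(const std::string &code, const std::string &operand) {
    return code + "(" + operand + ")";
}

// Stand-in for fold_build1_loc: builds and marks the expression as folded.
std::string fold_build1_model(const std::string &code, const std::string &operand) {
    return "folded:" + code + "(" + operand + ")";
}

// Hypothetical wrapper: the caller passes fold_p once instead of repeating
// the if/else at every call site.
std::string maybe_fold_build1_model(bool fold_p, const std::string &code,
                                    const std::string &operand) {
    return fold_p ? fold_build1_model(code, operand)
                  : build1_model(code, operand);
}
```

Whether this is better as a macro or an inline function in GCC itself is a separate question that the message leaves open.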
[PATCH] PR fortran/36192 -- Check for valid BT_INTEGER
The attached patch fixes a segfault in f951 for some poorly written invalid code. See the testcase for the code in question. Built and tested on i386-*-freebsd. Ok to commit?

2015-10-25  Steven G. Kargl

    PR fortran/36192
    * array.c (gfc_ref_dimen_size): Check for BT_INTEGER before calling
    mpz_set.

2015-10-25  Steven G. Kargl

    PR fortran/36192
    * gfortran.dg/pr36192.f90: New test.

--
Steve

Index: gcc/fortran/array.c
===
--- gcc/fortran/array.c (revision 229301)
+++ gcc/fortran/array.c (working copy)
@@ -2208,7 +2208,8 @@ gfc_ref_dimen_size (gfc_array_ref *ar, i
       if (ar->start[dimen] == NULL)
         {
           if (ar->as->lower[dimen] == NULL
-              || ar->as->lower[dimen]->expr_type != EXPR_CONSTANT)
+              || ar->as->lower[dimen]->expr_type != EXPR_CONSTANT
+              || ar->as->lower[dimen]->ts.type != BT_INTEGER)
             goto cleanup;
           mpz_set (lower, ar->as->lower[dimen]->value.integer);
         }
@@ -2222,7 +2223,8 @@ gfc_ref_dimen_size (gfc_array_ref *ar, i
       if (ar->end[dimen] == NULL)
         {
           if (ar->as->upper[dimen] == NULL
-              || ar->as->upper[dimen]->expr_type != EXPR_CONSTANT)
+              || ar->as->upper[dimen]->expr_type != EXPR_CONSTANT
+              || ar->as->upper[dimen]->ts.type != BT_INTEGER)
             goto cleanup;
           mpz_set (upper, ar->as->upper[dimen]->value.integer);
         }
Index: gcc/testsuite/gfortran.dg/pr36192.f90
===
--- gcc/testsuite/gfortran.dg/pr36192.f90       (revision 0)
+++ gcc/testsuite/gfortran.dg/pr36192.f90       (working copy)
@@ -0,0 +1,9 @@
+! { dg-do compile }
+! PR fortran/36192.f90
+!
+program three_body
+   real, parameter :: n = 2, d = 2
+   real, dimension(n,d) :: x  ! { dg-error "of INTEGER type|of INTEGER type" }
+   x(1,:) = (/ 1.0, 0.0 /)
+end program three_body
+! { dg-prune-output "have constant shape" }
Re: [PATCH 5/9] ENABLE_CHECKING refactoring: pool allocators
On 10/21/2015 01:57 PM, Richard Biener wrote:
> Ugh (stupid templates).
>
> @@ -387,10 +389,10 @@ base_pool_allocator <TBlockAllocator>::allocate ()
>        block = m_virgin_free_list;
>        header = (allocation_pool_list*) allocation_object::get_data (block);
>        header->next = NULL;
> -#ifdef ENABLE_CHECKING
> +
>        /* Mark the element to be free.  */
> -      ((allocation_object*) block)->id = 0;
> -#endif
> +      if (flag_checking)
> +        ((allocation_object*) block)->id = 0;
>
> just set id to zero unconditionally. That'll be faster than checking
> flag_checking.

I fixed this and other issues, and committed the attached patch.

--
Regards,
Mikhail Maltsev

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 81d0e1c..d8a22c3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2015-10-26  Mikhail Maltsev
+
+        * alloc-pool.h (base_pool_allocator::initialize, ::allocate): Remove
+        conditional compilation.
+        (base_pool_allocator::remove): Use flag_checking.
+
 2015-10-25  John David Anglin
 
         * config/pa/som.h (EH_FRAME_THROUGH_COLLECT2): Define.
diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h
index 70105ba..404b558 100644
--- a/gcc/alloc-pool.h
+++ b/gcc/alloc-pool.h
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #define ALLOC_POOL_H
 
 #include "memory-block.h"
+#include "options.h"       // for flag_checking
 
 extern void dump_alloc_pool_statistics (void);
 
@@ -275,7 +276,6 @@ base_pool_allocator <TBlockAllocator>::initialize ()
   m_elts_per_block = (TBlockAllocator::block_size - header_size) / size;
   gcc_checking_assert (m_elts_per_block != 0);
 
-#ifdef ENABLE_CHECKING
   /* Increase the last used ID and use it for this pool.
      ID == 0 is used for free elements of pool so skip it.  */
   last_id++;
@@ -283,7 +283,6 @@ base_pool_allocator <TBlockAllocator>::initialize ()
     last_id++;
 
   m_id = last_id;
-#endif
 }
 
 /* Free all memory allocated for the given memory pool.  */
@@ -387,10 +386,9 @@ base_pool_allocator <TBlockAllocator>::allocate ()
       block = m_virgin_free_list;
       header = (allocation_pool_list*) allocation_object::get_data (block);
       header->next = NULL;
-#ifdef ENABLE_CHECKING
+
       /* Mark the element to be free.  */
       ((allocation_object*) block)->id = 0;
-#endif
       VALGRIND_DISCARD (VALGRIND_MAKE_MEM_NOACCESS (header,size));
       m_returned_free_list = header;
       m_virgin_free_list += m_elt_size;
@@ -404,10 +402,8 @@ base_pool_allocator <TBlockAllocator>::allocate ()
   m_returned_free_list = header->next;
   m_elts_free--;
 
-#ifdef ENABLE_CHECKING
   /* Set the ID for element.  */
   allocation_object::get_instance (header)->id = m_id;
-#endif
   VALGRIND_DISCARD (VALGRIND_MAKE_MEM_UNDEFINED (header, size));
 
   return (void *)(header);
@@ -418,26 +414,23 @@ template <typename TBlockAllocator>
 inline void
 base_pool_allocator <TBlockAllocator>::remove (void *object)
 {
-  gcc_checking_assert (m_initialized);
-
-  allocation_pool_list *header;
-  int size ATTRIBUTE_UNUSED;
-  size = m_elt_size - offsetof (allocation_object, u.data);
-
-#ifdef ENABLE_CHECKING
-  gcc_assert (object
+  if (flag_checking)
+    {
+      gcc_assert (m_initialized);
+      gcc_assert (object
               /* Check if we free more than we allocated, which is Bad (TM).  */
               && m_elts_free < m_elts_allocated
               /* Check whether the PTR was allocated from POOL.  */
               && m_id == allocation_object::get_instance (object)->id);
 
-  memset (object, 0xaf, size);
+      int size = m_elt_size - offsetof (allocation_object, u.data);
+      memset (object, 0xaf, size);
+    }
 
   /* Mark the element to be free.  */
   allocation_object::get_instance (object)->id = 0;
-#endif
 
-  header = (allocation_pool_list*) object;
+  allocation_pool_list *header = (allocation_pool_list*) object;
   header->next = m_returned_free_list;
   m_returned_free_list = header;
   VALGRIND_DISCARD (VALGRIND_MAKE_MEM_NOACCESS (object, size));
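To make the split between unconditional ID bookkeeping and runtime-gated checks easier to follow, here is a toy model of the idea. It is not GCC's base_pool_allocator: `ToyPool` and its members are invented for this sketch, the `checking` constructor argument stands in for flag_checking, and a failed check returns false where the real code would hit gcc_assert.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class ToyPool {
public:
    struct Slot { unsigned id; int payload; };

    // pool_id must be nonzero, because ID 0 marks a free element.
    ToyPool(unsigned pool_id, std::size_t count, bool checking)
        : id_(pool_id), checking_(checking), slots_(count) {
        for (Slot &s : slots_)
            free_.push_back(&s);   // slots start zeroed, i.e. marked free
    }

    Slot *allocate() {
        if (free_.empty())
            return nullptr;
        Slot *s = free_.back();
        free_.pop_back();
        s->id = id_;               // stamped unconditionally: a plain store
        return s;
    }

    // Returns false when a check rejects the pointer.
    bool remove(Slot *s) {
        if (checking_ && (s == nullptr || s->id != id_))
            return false;          // foreign pointer or double free
        s->id = 0;                 // mark free unconditionally
        free_.push_back(s);
        return true;
    }

private:
    unsigned id_;
    bool checking_;
    std::vector<Slot> slots_;      // fixed size, so Slot pointers stay valid
    std::vector<Slot *> free_;
};
```

With `checking` set to false, the model (like the patched code) skips the validation but still keeps the IDs up to date, so turning checking on later needs no extra state.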
Possible patch for PR fortran/66056
The problem: Statement labels within a type declaration are put in the statement label tree belonging to the type declaration's namespace (instead of the current namespace). When the line is otherwise empty and an error is issued, gfc_free_st_label tries to delete the label from the label tree belonging to the current namespace and then frees the label structure, leaving an invalid statement label pointer in the type declaration's namespace's label tree. When that namespace is cleaned up, bad things can happen.

The attached patch stores a namespace pointer in the statement label structure so that if a label is deleted early for some reason, it will be deleted from the proper namespace.

Louis

empty_label_typedecl.f90
Description: Binary data

Index: gcc/fortran/gfortran.h
===
--- gcc/fortran/gfortran.h      (revision 229302)
+++ gcc/fortran/gfortran.h      (working copy)
@@ -1291,6 +1291,8 @@ typedef struct gfc_st_label
   tree backend_decl;
 
   locus where;
+
+  gfc_namespace *ns;
 }
 gfc_st_label;
 
Index: gcc/fortran/io.c
===
--- gcc/fortran/io.c    (revision 229302)
+++ gcc/fortran/io.c    (working copy)
@@ -28,7 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 
 gfc_st_label
 format_asterisk = {0, NULL, NULL, -1, ST_LABEL_FORMAT, ST_LABEL_FORMAT, NULL,
-                   0, {NULL, NULL}};
+                   0, {NULL, NULL}, NULL};
 
 typedef struct
 {
Index: gcc/fortran/symbol.c
===
--- gcc/fortran/symbol.c        (revision 229302)
+++ gcc/fortran/symbol.c        (working copy)
@@ -2195,7 +2195,7 @@ gfc_free_st_label (gfc_st_label *label)
   if (label == NULL)
     return;
 
-  gfc_delete_bbt (&gfc_current_ns->st_labels, label, compare_st_labels);
+  gfc_delete_bbt (&label->ns->st_labels, label, compare_st_labels);
 
   if (label->format != NULL)
     gfc_free_expr (label->format);
@@ -2260,6 +2260,7 @@ gfc_get_st_label (int labelno)
   lp->value = labelno;
   lp->defined = ST_LABEL_UNKNOWN;
   lp->referenced = ST_LABEL_UNKNOWN;
+  lp->ns = ns;
 
   gfc_insert_bbt (&ns->st_labels, lp, compare_st_labels);
 
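The ownership idea in the patch can be modelled in a few lines. This is only a sketch with invented names (`Namespace`, `Label`, `get_label`, `free_label`), using a `std::map` where gfortran uses its balanced binary tree; it shows why deleting through the label's own namespace pointer avoids a dangling entry.

```cpp
#include <cassert>
#include <map>

struct Namespace;

struct Label {
    int value;
    Namespace *ns;                   // owning namespace, like the new `ns` field
};

struct Namespace {
    std::map<int, Label *> labels;   // stand-in for the st_labels tree
};

// Look up or create a label in the given namespace and record the owner.
Label *get_label(Namespace &ns, int value) {
    auto it = ns.labels.find(value);
    if (it != ns.labels.end())
        return it->second;
    Label *lp = new Label{value, &ns};
    ns.labels[value] = lp;
    return lp;
}

// Remove the label from the namespace that owns it, whatever namespace
// happens to be "current" at the time, then free it.
void free_label(Label *label) {
    if (label == nullptr)
        return;
    label->ns->labels.erase(label->value);
    delete label;
}
```

Had `free_label` erased from some other namespace's map instead, the owner would keep a pointer to freed memory, which is the failure described in the message.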
[PATCH] rs6000: p8vector-builtin-8.c test requires int128
For 32-bit targets p8vector_ok does not imply we have int128.

Tested with -m32,-m32/-mpowerpc64,-m64; okay for trunk?

Segher

2015-10-26  Segher Boessenkool

gcc/testsuite/
    * gcc.target/powerpc/p8vector-builtin-8.c: Add "target int128".

---
 gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c
index bb5e182..b204d99 100644
--- a/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c
+++ b/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target int128 } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-options "-mpower8-vector -O2" } */
 
--
1.9.3
[PATCH] rs6000: Fix tests for xvmadd and xvnmsub
The patterns involved can create vmadd resp. vnmsub instructions instead. This patch changes the testcases to allow those.

Tested with -m32,-m32/-mpowerpc64,-m64; okay for trunk?

Segher

2015-10-26  Segher Boessenkool

gcc/testsuite/
    * gcc.target/powerpc/vsx-builtin-2.c: Allow vmadd and vnmsub as well
    as xvmadd and xvnmsub.
    * gcc.target/powerpc/vsx-vector-2.c: Allow vmadd as well as xvmadd.

---
 gcc/testsuite/gcc.target/powerpc/vsx-builtin-2.c | 4 ++--
 gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-2.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-2.c
index d5d1e2d..7b5ad7d 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-2.c
@@ -6,10 +6,10 @@
 /* { dg-final { scan-assembler "xvaddsp" } } */
 /* { dg-final { scan-assembler "xvsubsp" } } */
 /* { dg-final { scan-assembler "xvmulsp" } } */
-/* { dg-final { scan-assembler "xvmadd" } } */
+/* { dg-final { scan-assembler "vmadd" } } */
 /* { dg-final { scan-assembler "xvmsub" } } */
 /* { dg-final { scan-assembler "xvnmadd" } } */
-/* { dg-final { scan-assembler "xvnmsub" } } */
+/* { dg-final { scan-assembler "vnmsub" } } */
 /* { dg-final { scan-assembler "xvdivsp" } } */
 /* { dg-final { scan-assembler "xvmaxsp" } } */
 /* { dg-final { scan-assembler "xvminsp" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c b/gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c
index db3aa38..34dbd57 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-vector-2.c
@@ -7,7 +7,7 @@
 /* { dg-final { scan-assembler "xvsubsp" } } */
 /* { dg-final { scan-assembler "xvmulsp" } } */
 /* { dg-final { scan-assembler "xvdivsp" } } */
-/* { dg-final { scan-assembler "xvmadd" } } */
+/* { dg-final { scan-assembler "vmadd" } } */
 /* { dg-final { scan-assembler "xvmsub" } } */
 /* { dg-final { scan-assembler "xvsqrtsp" } } */
 /* { dg-final { scan-assembler "xvcpsgnsp" } } */
--
1.9.3
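The reason the shorter pattern covers both spellings is that the scan is an unanchored regular-expression search, so "vmadd" also matches inside "xvmadd...". The sketch below models that with `std::regex_search`; the function `scan_assembler_model` and the sample assembler lines are illustrative assumptions, and the real DejaGnu directive uses Tcl regular expressions rather than `std::regex`.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Simplified model of a scan-assembler check: succeed if the pattern matches
// anywhere in the assembler text.
bool scan_assembler_model(const std::string &asm_text, const std::string &pattern) {
    return std::regex_search(asm_text, std::regex(pattern));
}
```

For example, the pattern "vmadd" is found in both "xvmaddasp 34,35,36" and "vmaddfp 2,2,3,4", while "xvmadd" is found only in the first.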
Re: [PATCH GCC]Improve rtl loop inv cost by checking if the inv can be propagated to address uses
On Wed, Oct 21, 2015 at 11:55 AM, Bin.Cheng wrote:
> On Fri, Oct 9, 2015 at 8:04 PM, Bernd Schmidt wrote:
>> On 10/09/2015 02:00 PM, Bin.Cheng wrote:
>>>
>>> I further bootstrap and test attached patch on aarch64. Also three
>>> cases in spec2k6/fp are improved by 3~6%, two cases in spec2k6/fp are
>>> regressed by ~2%. Overall score is improved by ~0.8% for spec2k6/fp
>>> on aarch64 of my run. I may later analyze the regression.
>>>
>>> So is this patch OK?
>>
> Hi Bernd,
> Thanks for reviewing this patch. I further collected perf data for
> spec2k on AArch64. Three fp cases are improved by 3-5%, no obvious
> regression. As for int cases, perlbmk is improved by 8%, but crafty
> is regressed by 3.8%. Together with spec2k6 data, I think this patch
> is generally good. I scanned hot functions in crafty but didn't find
> obvious regression because lim hoist decision is very different
> because of this change. The regression could be caused by register
> pressure..
>
>>
>> I'll approve this with one change, but please keep an eye out for
>> performance regressions on other targets.
> Sure.
>
>>
>>> * loop-invariant.c (struct def): New field cant_prop_to_addr_uses.
>>> (inv_cant_prop_to_addr_use): New function.
>>
>>
>> I would like these to have switched truthvalues, i.e. can_prop_to_addr_uses,
>> inv_can_prop_to_addr_use. Otherwise we end up with double negations like
>> !def->cant_prop_to_addr_uses which can be slightly confusing.
>>
>> You'll probably slightly need to tweak the initialization when n_addr_uses
>> goes from zero to one.
> Here is the new version patch with your comments incorporated.
>

Given the patch was pre-approved and there are no other comments, I will apply it later.

Thanks,
bin