Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Wed, Nov 28, 2012 at 7:24 PM, Marek Polacek wrote: > On Wed, Nov 28, 2012 at 10:52:17AM +0100, Eric Botcazou wrote: >> No, I don't think that's the problem. The above messages are admittedly a >> bit >> terse, they should say: >> >> JUMP-BYPASS: Proved reg 59 in jump_insn 15 equals constant (const_int 3 >> [0x3]) >> when BB 4 is entered from BB 9. Redirect edge 9->4 to 5. >> >> so you can have different constants for BB 3 and BB 9. The patch to tweak >> the >> dump messages along these lines is pre-approved. > > Ouch. Okay, I'll post a separate patch for improving the message. > >> The ICE in merge_latch_edges means that the loop structure and the CFG aren't >> in sync anymore. Does the cprop pass modify the former without declaring it? > > I admit I'm not sure what to look at, maybe cprop should have in > properties_destroyed PROP_loops? Well, then we need to remove one > assert in loop-init.c. So something like: Definitely not - that means to not preserve loops until after cprop. The goal is to preserve loops everywhere! Richard. > --- gcc/cprop.c.mp 2012-11-28 16:55:03.520375191 +0100 > +++ gcc/cprop.c 2012-11-28 16:55:35.992246623 +0100 > @@ -1927,7 +1927,7 @@ struct rtl_opt_pass pass_rtl_cprop = >TV_CPROP, /* tv_id */ >PROP_cfglayout, /* properties_required */ >0,/* properties_provided */ > - 0,/* properties_destroyed */ > + PROP_loops, /* properties_destroyed */ >0,/* todo_flags_start */ >TODO_df_finish | TODO_verify_rtl_sharing | >TODO_verify_flow | TODO_ggc_collect /* todo_flags_finish */ > --- gcc/loop-init.c.mp 2012-11-28 16:55:08.924398879 +0100 > +++ gcc/loop-init.c 2012-11-28 16:55:17.684437276 +0100 > @@ -54,8 +54,6 @@ loop_optimizer_init (unsigned flags) > } >else > { > - gcc_assert (cfun->curr_properties & PROP_loops); > - >/* Ensure that the dominators are computed, like flow_loops_find does. > */ >calculate_dominance_info (CDI_DOMINATORS); > > This quashes the ICE. I've regtested it, it caused one > regression though: > FAIL: gcc.dg/unroll_5.c scan-rtl-dump-times loop2_unroll "realistic > bound: 299" 1 > > But there probably is something else. > > Thanks for the review, > > Marek
Re: patch to fix PR55512
On Wed, Nov 28, 2012 at 12:43:15PM -0500, Vladimir Makarov wrote: > The following patch fixes > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55512 > > The patch was successfully tested and bootstrapped on x86/x86-64. > > Committed as rev. 193901. > > 2012-11-28 Vladimir Makarov > > PR rtl-optimization/55512 > * lra-assigns.c (assign_by_spills): Assigned arbitrary hard regs > to failed reload pseudos instead of changing asm pattern. > * lra-constraints.c (MAX_CONSTRAINT_ITERATION_NUMBER): Increase > value. > > 2012-11-28 Vladimir Makarov > > PR rtl-optimization/55512 > * gcc.target/i386/pr55512-[1234].c: New tests. Thanks, I've tweaked the tests slightly, the #define __builtin_unreachable() was there just to put 4 different tests for 3 different ICEs into one test. 2012-11-29 Jakub Jelinek PR rtl-optimization/55512 * gcc.target/i386/pr55512-2.c: Remove unnecessary define. * gcc.target/i386/pr55512-4.c: Likewise. --- gcc.target/i386/pr55512-2.c (revision 193922) +++ gcc.target/i386/pr55512-2.c (revision 193923) @@ -1,8 +1,6 @@ /* { dg-do compile } */ /* { dg-options "-O2" } */ -#define __builtin_unreachable() do { } while (0) - int foo (int x) { @@ -10,7 +8,6 @@ foo (int x) "r" (x + 4), "r" (x + 5), "r" (x + 6), "r" (x + 7), "r" (x + 8), "r" (x + 9), "r" (x + 10), "r" (x + 11), "r" (x + 12), "r" (x + 13), "r" (x + 14), "r" (x + 15) : : lab); - __builtin_unreachable (); lab: return 0; } --- gcc.target/i386/pr55512-4.c (revision 193922) +++ gcc.target/i386/pr55512-4.c (revision 193923) @@ -1,8 +1,6 @@ /* { dg-do compile } */ /* { dg-options "-O2" } */ -#define __builtin_unreachable() do { } while (0) - int bar (int x) { @@ -11,7 +9,6 @@ bar (int x) "r" (x + 8), "r" (x + 9), "r" (x + 10), "r" (x + 11), "r" (x + 12), "r" (x + 13), "r" (x + 14), "r" (x + 15), "r" (x + 16) : : lab); - __builtin_unreachable (); lab: return 0; } Jakub
Re: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold
On Thu, Nov 29, 2012 at 5:18 AM, H.J. Lu wrote: > From: "H.J. Lu" > To: gcc-patches@gcc.gnu.org > Cc: Joseph Myers , Paolo Bonzini > Bcc: > Subject: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold > Reply-To: > > Hi, > > Binutils supports 2 linkers, ld.gold and ld.bfd. One of them is > configured as the default linker, ld, which is used by GCC. Sometimes, > we want to use the alternate linker with GCC at run-time. This patch > adds -fuse-ld=bfd and -fuse-ld=gold options to GCC driver. It changes > collect2.c to pick either ld.bfd or ld.gold. It also adds > ORIGINAL_LD_BFD_FOR_TARGET and ORIGINAL_LD_GOLD_FOR_TARGET to > exec-tool.in to add -fuse-ld=bfd and -fuse-ld=gold support to > collect-ld. Since ld.bfd nor ld.gold know the new options, you > will get > > # ./xgcc -B./ /tmp/x.c -fuse-ld=gold -v > ... > ./collect-ld --eh-frame-hdr -m elf_x86_64 -dynamic-linker > /lib64/ld-linux-x86-64.so.2 -fuse-ld=gold /lib/../lib64/crt1.o > /lib/../lib64/crti.o ./crtbegin.o -L. -L/lib/../lib64 -L/usr/lib/../lib64 > /tmp/cclVWcGz.o -v -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc > --as-needed -lgcc_s --no-as-needed ./crtend.o /lib/../lib64/crtn.o > GNU gold (Linux/GNU Binutils 2.23.51.0.7.20121127) 1.11 > /usr/local/bin/ld.gold: fatal error: -f/--auxiliary may not be used without > -shared > collect2: error: ld returned 1 exit status > > This is because we pass everything to ld and ld.bfd/ld.gold doesn't > understand -fuse-ld=bfd/-fuse-ld=gold. It isn't a problem for collect2 > since it will filter-out -fuse-ld=bfd/-fuse-ld=gold. I will submit a > binutils patch to ignore -fuse-ld=bfd/-fuse-ld=gold, similar to -flto > options. > > OK to install? Do we need to consider GNU ld and gold coming from different binutils versions and thus do we need two sets of linker feature tests at configure time? Eventually also if GNU ld and gold are not in feature-parity for the same binutils version? That is, isn't this going to create hard to debug issues for users? Thanks, Richard. > Thanks. > > > H.J. > --- > 2012-11-28 Nick Clifton > Matthias Klose > Doug Kwan > H.J. Lu > > PR driver/55470 > * collect2.c (main): Support -fuse-ld=bfd and -fuse-ld=gold. > > * common.opt: Add fuse-ld=bfd and fuse-ld=gold. > > * configure.ac (ORIGINAL_LD_BFD_FOR_TARGET): New AC_PATH_PROG. > (ORIGINAL_LD_GOLD_FOR_TARGET): Likewise. > * configure: Regenerated. > > * exec-tool.in (ORIGINAL_LD_BFD_FOR_TARGET): New variable. > (ORIGINAL_LD_GOLD_FOR_TARGET): Likewise. > Use $ORIGINAL_LD_BFD_FOR_TARGET for collect-ld when seeing > -fuse-ld=bfd. Use =$ORIGINAL_LD_GOLD_FOR_TARGET for collect-ld > when seeing -fuse-ld=gold. Issue an error if linker isn't > found. > > * gcc.c (LINK_COMMAND_SPEC): Pass -fuse-ld=* to collect2. > > * opts.c (comman_handle_option): Ignore -fuse-ld=bfd and > -fuse-ld=gold. > > * doc/invoke.texi: Document -fuse-ld=bfd and -fuse-ld=gold. > diff --git a/gcc/collect2.c b/gcc/collect2.c > index 49c4030..4e8cdf0 100644 > --- a/gcc/collect2.c > +++ b/gcc/collect2.c > @@ -842,8 +842,21 @@ maybe_run_lto_and_relink (char **lto_ld_argv, char > **object_lst, > int > main (int argc, char **argv) > { > - static const char *const ld_suffix = "ld"; > - static const char *const plugin_ld_suffix = PLUGIN_LD_SUFFIX; > + enum linker_select > +{ > + USE_DEFAULT_LD, > + USE_PLUGIN_LD, > + USE_GOLD_LD, > + USE_BFD_LD, > + USE_LD_MAX > +} selected_linker = USE_DEFAULT_LD; > + static const char *const ld_suffixes[USE_LD_MAX] = > +{ > + "ld", > + PLUGIN_LD_SUFFIX, > + "ld.gold", > + "ld.bfd" > +}; >static const char *const real_ld_suffix = "real-ld"; >static const char *const collect_ld_suffix = "collect-ld"; >static const char *const nm_suffix = "nm"; > @@ -854,16 +867,13 @@ main (int argc, char **argv) >static const char *const strip_suffix = "strip"; >static const char *const gstrip_suffix = "gstrip"; > > + const char *full_ld_suffixes[USE_LD_MAX]; > #ifdef CROSS_DIRECTORY_STRUCTURE >/* If we look for a program in the compiler directories, we just use > the short name, since these directories are already system-specific. > But it we look for a program in the system directories, we need to > qualify the program name with the target machine. */ > > - const char *const full_ld_suffix = > -concat(target_machine, "-", ld_suffix, NULL); > - const char *const full_plugin_ld_suffix = > -concat(target_machine, "-", plugin_ld_suffix, NULL); >const char *const full_nm_suffix = > concat (target_machine, "-", nm_suffix, NULL); >const char *const full_gnm_suffix = > @@ -877,13 +887,11 @@ main (int argc, char **argv) >const char *const full_gstrip_suffix = > concat (target_machine, "-", gstrip_suffix, NULL); > #else > - c
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Thu, Nov 29, 2012 at 9:34 AM, Richard Biener wrote: > On Wed, Nov 28, 2012 at 7:24 PM, Marek Polacek wrote: >> I admit I'm not sure what to look at, maybe cprop should have in >> properties_destroyed PROP_loops? Well, then we need to remove one >> assert in loop-init.c. So something like: > > Definitely not - that means to not preserve loops until after cprop. The goal > is to preserve loops everywhere! It'd be nice if this was documented somewhere... Ciao! Steven
Re: [PATCH, generic] Fix for define_subst
So, ok for commit this patch? Changelog: 2012-11-29 Michael Zolotukhin * gensupport.c (maybe_eval_c_test): Remove not-null check for expr. * read-rtl.c (apply_iterators): Initialize condition with "" instead of NULL. On 28 November 2012 23:46, Michael Zolotukhin wrote: >> Well, there does seem to be a mistake -- the use of NULL in the first >> place. It seems to me that the easiest fix is >> >> condition = ""; >> >> right at the beginning. > Yep, that'll work too, you're right. > > On 28 November 2012 22:36, Richard Henderson wrote: >> On 11/28/2012 09:20 AM, Michael Zolotukhin wrote: Where was the null condition created in the first place? >>> The reason it's happening is following. Before introduction of >>> define_subst, function apply_iterators had the following loop: >>> condition = NULL; >>> FOR_EACH_VEC_ELT (iterator_uses, i, iuse) >>> { >>> v = iuse->iterator->current_value; >>> iuse->iterator->group->apply_iterator (iuse->ptr, v->number); >>> condition = join_c_conditions (condition, v->string); >>> } >>> apply_attribute_uses (); >>> x = copy_rtx_for_iterators (original); >>> add_condition_to_rtx (x, condition); >>> >>> This loop always iterated at least once, so 'condition' always became >>> not-null (though, it could become empty-string ""). So, function >>> add_condition_to_rtx always had not-null string in the arguments. >>> >>> With subst-iterators this loop is looking like this: >>> condition = NULL; >>> FOR_EACH_VEC_ELT (iterator_uses, i, iuse) >>> { >>> if (iuse->iterator->group == &substs) >>> continue; >>> v = iuse->iterator->current_value; >>> iuse->iterator->group->apply_iterator (iuse->ptr, v->number); >>> condition = join_c_conditions (condition, v->string); >>> } >>> So, it's possible that join_c_condition wouldn't be called at all, and >>> 'condition' will remain NULL. Then it goes to add_condition_to_rtx and >>> we get the fail we've seen. >>> >>> There is no mistake in such behaviour, but now we should be aware of >>> possible NULL-string. It should be handled properly, and I see two >>> places where we could do that - in join_c_conditions or in add_c_tests >>> and maybe_eval_c_test. >>> >> >> Well, there does seem to be a mistake -- the use of NULL in the first >> place. It seems to me that the easiest fix is >> >> condition = ""; >> >> right at the beginning. >> >> >> r~ > > > > -- > --- > Best regards, > Michael V. Zolotukhin, > Software Engineer > Intel Corporation. -- --- Best regards, Michael V. Zolotukhin, Software Engineer Intel Corporation. null-condition-fix.patch Description: Binary data
Re: [Patch, Fortran, committed] PR52161 - Fix bound checking of SYNC IMAGES( n )
Tobias Burnus wrote: Committed as Rev. 193908 Looking at the code again, the condition for -fcoarray=lib was wrong and never triggered (-fcoarray=single was okay). Fixed by the following commit after building and regtesting. Tobias Index: gcc/fortran/ChangeLog === --- gcc/fortran/ChangeLog (Revision 193923) +++ gcc/fortran/ChangeLog (Arbeitskopie) @@ -1,6 +1,12 @@ 2012-11-28 Tobias Burnus PR fortran/52161 + * trans-stmt.c (gfc_trans_sync): Fix bound checking + for -fcoarray=lib. + +2012-11-28 Tobias Burnus + + PR fortran/52161 * trans-stmt.c (gfc_trans_sync): Fix bound checking. 2012-11-27 Tobias Burnus Index: gcc/fortran/trans-stmt.c === --- gcc/fortran/trans-stmt.c (Revision 193923) +++ gcc/fortran/trans-stmt.c (Arbeitskopie) @@ -784,12 +784,12 @@ gfc_trans_sync (gfc_code *code, gfc_exec_op type) else { tree cond2; - cond = fold_build2_loc (input_location, GE_EXPR, boolean_type_node, + cond = fold_build2_loc (input_location, GT_EXPR, boolean_type_node, images, gfort_gvar_caf_num_images); cond2 = fold_build2_loc (input_location, LT_EXPR, boolean_type_node, images, build_int_cst (TREE_TYPE (images), 1)); - cond = fold_build2_loc (input_location, TRUTH_AND_EXPR, + cond = fold_build2_loc (input_location, TRUTH_OR_EXPR, boolean_type_node, cond, cond2); } gfc_trans_runtime_check (true, false, cond, &se.pre, Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (Revision 193923) +++ gcc/testsuite/ChangeLog (Arbeitskopie) @@ -1,3 +1,8 @@ +2012-11-28 Tobias Burnus + + PR fortran/52161 + * coarray/sync_3.f90: Extend test. + 2012-11-29 Jakub Jelinek PR rtl-optimization/55512 Index: gcc/testsuite/gfortran.dg/coarray/sync_3.f90 === --- gcc/testsuite/gfortran.dg/coarray/sync_3.f90 (Revision 193923) +++ gcc/testsuite/gfortran.dg/coarray/sync_3.f90 (Arbeitskopie) @@ -1,5 +1,6 @@ ! { dg-do run } ! { dg-options "-fcheck=all" } +! { dg-shouldfail "Invalid image number -1 in SYNC IMAGES" } ! ! As sync_1, but with bounds checking enabled. ! PR fortran/52161 @@ -65,4 +66,10 @@ n = 5 sync images (*,errmsg=str,stat=n) if (n /= 0) call abort() +n = -1 +sync images ( num_images() ) +sync images (n) ! Invalid: "-1" + end + +! { dg-output "Fortran runtime error: Invalid image number -1 in SYNC IMAGES" }
Re: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold
On 2012.11.29 at 09:43 +0100, Richard Biener wrote: > On Thu, Nov 29, 2012 at 5:18 AM, H.J. Lu wrote: > > From: "H.J. Lu" > > To: gcc-patches@gcc.gnu.org > > Cc: Joseph Myers , Paolo Bonzini > > Bcc: > > Subject: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold > > Reply-To: > > > > Hi, > > > > Binutils supports 2 linkers, ld.gold and ld.bfd. One of them is > > configured as the default linker, ld, which is used by GCC. Sometimes, > > we want to use the alternate linker with GCC at run-time. This patch > > adds -fuse-ld=bfd and -fuse-ld=gold options to GCC driver. It changes > > collect2.c to pick either ld.bfd or ld.gold. It also adds > > ORIGINAL_LD_BFD_FOR_TARGET and ORIGINAL_LD_GOLD_FOR_TARGET to > > exec-tool.in to add -fuse-ld=bfd and -fuse-ld=gold support to > > collect-ld. Since ld.bfd nor ld.gold know the new options, you > > will get > > > > # ./xgcc -B./ /tmp/x.c -fuse-ld=gold -v > > ... > > ./collect-ld --eh-frame-hdr -m elf_x86_64 -dynamic-linker > > /lib64/ld-linux-x86-64.so.2 -fuse-ld=gold /lib/../lib64/crt1.o > > /lib/../lib64/crti.o ./crtbegin.o -L. -L/lib/../lib64 -L/usr/lib/../lib64 > > /tmp/cclVWcGz.o -v -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc > > --as-needed -lgcc_s --no-as-needed ./crtend.o /lib/../lib64/crtn.o > > GNU gold (Linux/GNU Binutils 2.23.51.0.7.20121127) 1.11 > > /usr/local/bin/ld.gold: fatal error: -f/--auxiliary may not be used without > > -shared > > collect2: error: ld returned 1 exit status > > > > This is because we pass everything to ld and ld.bfd/ld.gold doesn't > > understand -fuse-ld=bfd/-fuse-ld=gold. It isn't a problem for collect2 > > since it will filter-out -fuse-ld=bfd/-fuse-ld=gold. I will submit a > > binutils patch to ignore -fuse-ld=bfd/-fuse-ld=gold, similar to -flto > > options. > > > > OK to install? > > Do we need to consider GNU ld and gold coming from different binutils versions > and thus do we need two sets of linker feature tests at configure > time? Eventually > also if GNU ld and gold are not in feature-parity for the same binutils > version? > > That is, isn't this going to create hard to debug issues for users? Additionally, what is the rationale for this patch? IOW why isn't the following enough? x4 ~ # ln -f /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld.gold /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld x4 ~ # ld -v GNU gold (GNU Binutils 2.23.51.20121126) 1.11 x4 ~ # ln -f /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld.bfd /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld x4 ~ # ld -v GNU ld (GNU Binutils) 2.23.51.20121126 Or is this meant as a temporary workaround hack for gold bugs? -- Markus
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Thu, Nov 29, 2012 at 9:56 AM, Steven Bosscher wrote: > On Thu, Nov 29, 2012 at 9:34 AM, Richard Biener wrote: >> On Wed, Nov 28, 2012 at 7:24 PM, Marek Polacek wrote: >>> I admit I'm not sure what to look at, maybe cprop should have in >>> properties_destroyed PROP_loops? Well, then we need to remove one >>> assert in loop-init.c. So something like: >> >> Definitely not - that means to not preserve loops until after cprop. The >> goal >> is to preserve loops everywhere! > > It'd be nice if this was documented somewhere... On my TODO list for stage3 ;) Richard. > Ciao! > Steven
[PATCH,ARM] Subdivide alu into alu_reg and simple_alu_imm
For attribute named "type", subdivide "alu" into "alu_reg" and "simple_alu_imm". Set type attribute as appropriate in define_insn patterns with immediate operands. Update pipeline descriptions to use the new values of type attribute. No regression on qemu arm-none-eabi -mcpu=cortex-a15/cortex-a7. Bootstrap successful on Cortex-A15. No difference in generated assembly when compiling all of preprocessed sources of gcc 4.8 as a test in various configurations: -mcpu=cortex-a15 -march=armv6t2 -marm/-mthumb -O0/-O1/-O2/-O3/-Os. The motivation for this patch is cortex-a7 pipeline description, which will be submitted separately. Ok for trunk? Thanks, Greta ChangeLog gcc/ 2012-11-28 Ramana Radhakrishnan Greta Yorsh * config/arm/arm.md (type): Subdivide "alu" into "alu_reg" and "simple_alu_imm". (core_cycles): Use new names. (arm_addsi3): Set type attribute for patterns involving simple_alu_imm. (addsi3_compare0, addsi3_compare0_scratch): Likewise. (addsi3_compare_op1, addsi3_compare_op2, compare_addsi2_op0): Likewise. (compare_addsi2_op1, arm_subsi3_insn, subsi3_compare0): Likewise. (subsi3_compare, arm_decscc,arm_andsi3_insn): Likewise. (thumb1_andsi3_insn, andsi3_compare0_scratch): Likewise. (zeroextractsi_compare0_scratch, iorsi3_insn, iorsi3_compare0): Likewise. (iorsi3_compare0_scratch, arm_xorsi3, thumb1_xorsi3_insn): Likewise. (xorsi3_compare0, xorsi3_compare0_scratch, thumb1_zero_extendhisi2): Likewise. (arm_zero_extendhisi2_v6, thumb1_zero_extendqisi2_v): Likewise. (arm_zero_extendqisi2_v6, thumb1_extendhisi2, arm_extendqisi_v6): Likewise. (thumb1_extendqisi2, arm_movsi_insn): Likewise. (movsi_compare0, movhi_insn_arch4, movhi_bytes): Likewise. (arm_movqi_insn, thumb1_movqi_insn, arm_cmpsi_insn): Likewise. (movsicc_insn, if_plus_move, if_move_plus): Likewise. * config/arm/neon.md (neon_mov/VDX): Likewise. (neon_mov/VQXMOV): Likewise. * config/arm/arm1020e.md (1020alu_op): Likewise. * config/arm/fmp626.md (mp626_alu_op): Likewise. * config/arm/fa726te.md (726te_alu_op): Likewise. * config/arm/fa626te.md (626te_alu_op): Likewise. * config/arm/fa606te.md (606te_alu_op): Likewise. * config/arm/fa526.md (526_alu_op): Likewise. * config/arm/cortex-r4.md (cortex_r4_alu, cortex_r4_mov): Likewise. * config/arm/cortex-m4.md (cortex_m4_alu): Likewise. * config/arm/cortex-a9.md (cprtex_a9_dp): Likewise. * config/arm/cortex-a8.md (cortex_a8_alu, cortex_a8_mov): Likewise. * config/arm/cortex-a5.md (cortex_a5_alu): Likewise. * config/arm/cortex-a15.md (cortex_a15_alu): Likewise. * config/arm/arm926ejs.md (9_alu_op): Likewise. * config/arm/arm1136jfs.md (11_alu_op): Likewise. * config/arm/arm1026ejs.md (alu_op): Likewise. diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 7e92b69ad861fe90ed409494d451854f30888462.. 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -323,8 +323,14 @@ (define_attr "insn" ; Classification of each insn ; Note: vfp.md has different meanings for some of these, and some further ; types as well. See that file for details. -; alu any alu instruction that doesn't hit memory or fp -; regs or have a shifted source operand +; simple_alu_imm a simple alu instruction that doesn't hit memory or fp +; regs or have a shifted source operand and has an immediate +; operand. This currently only tracks very basic immediate +; alu operations. +; alu_reg any alu instruction that doesn't hit memory or fp +; regs or have a shifted source operand +; and does not have an immediate operand. This is +; also the default ; alu_shiftany data instruction that doesn't hit memory or fp ; regs, but has a source operand shifted by a constant ; alu_shift_regany data instruction that doesn't hit memory or fp @@ -354,7 +360,8 @@ (define_attr "insn" ; (define_attr "type" - "alu,\ + "simple_alu_imm,\ + alu_reg,\ alu_shift,\ alu_shift_reg,\ mult,\ @@ -398,7 +405,7 @@ (define_attr "type" (eq_attr "insn" "smulxy,smlaxy,smlalxy,smulwy,smlawx,mul,muls,mla,mlas,\ umull,umulls,umlal,umlals,smull,smulls,smlal,smlals") (const_string "mult") -(const_string "alu"))) +(const_string "alu_reg"))) ; Is this an (integer side) multiply with a 64-bit result? (define_attr "mul64" "no,yes" @@ -536,7 +543,7 @@ (define_attr "write_conflict" "no,yes" ; than one on the main cpu execution unit. (define_attr "core_cycles" "single,multi" (if_then_else (eq_attr "type" -"alu,alu_shift,float,fdivd,fdivs") +"simple_alu_imm,alu_reg,alu_shift,float,fdivd,fdivs
Re: [PATCH, RFC] PR 55415 : Pessimistic misalignment from eipa_sra pass
Hi, thanks for the review. When writing a reply I realized I indeed made a mistake or two in the part concerning prev_base and the code was not what it intended to be. I'll re-write it today. Nevertheless, I also have a question regarding a different place of the patch: On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: > On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor wrote: > > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 > > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 > > *** unmodified_by_ref_scalar_representative > > *** 3891,3902 > > return repr; > > } > > > > ! /* Return true iff this access precludes IPA-SRA of the parameter it is > > !associated with. */ > > > > static bool > > ! access_precludes_ipa_sra_p (struct access *access) > > { > > /* Avoid issues such as the second simple testcase in PR 42025. The > > problem > >is incompatible assign in a call statement (and possibly even in asm > >statements). This can be relaxed by using a new temporary but only > > for > > --- 3891,3903 > > return repr; > > } > > > > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is > > !associated with. REQ_ALIGN is the minimum required alignment. */ > > > > static bool > > ! access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) > > { > > + unsigned int exp_align; > > /* Avoid issues such as the second simple testcase in PR 42025. The > > problem > >is incompatible assign in a call statement (and possibly even in asm > >statements). This can be relaxed by using a new temporary but only > > for > > *** access_precludes_ipa_sra_p (struct acces > > *** 3908,3913 > > --- 3909,3918 > > || gimple_code (access->stmt) == GIMPLE_ASM)) > > return true; > > > > + exp_align = get_object_alignment (access->expr); > > Is access->expr from the callee or the caller? The callee. > > > + if (exp_align < req_align) > > + return true; > > + > > return false; > > } > > > > *** splice_param_accesses (tree parm, bool * > > *** 3943,3949 > > tree a1_alias_type; > > access = (*access_vec)[i]; > > modification = access->write; > > ! if (access_precludes_ipa_sra_p (access)) > > return NULL; > > a1_alias_type = reference_alias_ptr_type (access->expr); > > > > --- 3948,3954 > > tree a1_alias_type; > > access = (*access_vec)[i]; > > modification = access->write; > > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) > > access->type is not the type of the base object, right? No, it is the actual type of the scalar memory access. > Which means it is not correct here - the alignment of the access is that what > get_object_alignment returns, which looks at the base as to whether to > lower the alignment compared to TYPE_ALIGN of the access type itself. Oh, so I have to do something like (or even reuse) may_be_unaligned_p from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other uses of get_object_alignment get this right...) Thanks, Martin
[patch mingw]: Correct page-size granularity and retry for address-mapping on mingw-hosts.
Hi, this patch fixes a latent issue about page-size granularity for Windows OSes - like Vista, Win7, etc. - and avoid a race happening for mapping of memory for multiple instances of compiler. ChangeLog 2012-11-29 Kai Tietz * host-mingw32.c (va_granularity): Make none-const. (mingw32_gt_pch_alloc_granularity): Return OS' allocation granularity. (mingw32_gt_pch_use_address): Retry mapping of used address as multiple instances might interfer. Tested for i686-w64-mingw32, and x86_64-w64-mingw32. Will apply tomorrow if there aren't any objections. Regards, Kai Index: host-mingw32.c === --- host-mingw32.c (Revision 193925) +++ host-mingw32.c (Arbeitskopie) @@ -27,6 +27,7 @@ #define WIN32_LEAN_AND_MEAN /* Not so important if we have windows.h.gch. */ #include +#include static void * mingw32_gt_pch_get_address (size_t, int); static int mingw32_gt_pch_use_address (void *, size_t, int, size_t); @@ -45,7 +46,7 @@ static const size_t pch_VA_max_size = 128 * 1024 * 1024; /* Granularity for reserving address space. */ -static const size_t va_granularity = 0x1; +static size_t va_granularity = 0x1; /* Print out the GetLastError() translation. */ static inline void @@ -66,8 +67,14 @@ } /* Granularity for reserving address space. */ -static size_t mingw32_gt_pch_alloc_granularity (void) +static size_t +mingw32_gt_pch_alloc_granularity (void) { + SYSTEM_INFO si; + + GetSystemInfo (&si); + va_granularity = (size_t) si.dwAllocationGranularity; + return va_granularity; } @@ -132,6 +139,8 @@ and earlier, backslashes are invalid in object name. So, we need to check if we are on Windows2000 or higher. */ OSVERSIONINFO version_info; + int r; + version_info.dwOSVersionInfoSize = sizeof (version_info); if (size == 0) @@ -154,7 +163,6 @@ OBJECT_NAME_FMT "%lx", GetCurrentProcessId()); object_name = local_object_name; } - mmap_handle = CreateFileMappingA ((HANDLE) _get_osfhandle (fd), NULL, PAGE_WRITECOPY | SEC_COMMIT, 0, 0, object_name); @@ -164,8 +172,19 @@ w32_error (__FUNCTION__, __FILE__, __LINE__, "CreateFileMapping"); return -1; } - mmap_addr = MapViewOfFileEx (mmap_handle, FILE_MAP_COPY, 0, offset, - size, addr); + + /* Retry five times, as here might occure a race with multiple gcc's + instances at same time. */ + for (r = 0; r < 5; r++) + { + mmap_addr = MapViewOfFileEx (mmap_handle, FILE_MAP_COPY, 0, offset, + size, addr); + if (mmap_addr == addr) + break; + if (r != 4) +Sleep (500); + } + if (mmap_addr != addr) { w32_error (__FUNCTION__, __FILE__, __LINE__, "MapViewOfFileEx");
Re: [PATCH,ARM] Subdivide alu into alu_reg and simple_alu_imm
; ??? Check Thumb-2 split length (define_insn_and_split "*arm_subsi3_insn" - [(set (match_operand:SI 0 "s_register_operand" "=r,r,rk,r") - (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,k,?n") - (match_operand:SI 2 "reg_or_int_operand" "r,rI,r, r")))] + [(set (match_operand:SI 0 "s_register_operand" "=r,r,r,rk,r") + (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,r,k,?n") + (match_operand:SI 2 "reg_or_int_operand" "r,I,r,r, r")))] "TARGET_32BIT" "@ rsb%?\\t%0, %2, %1 sub%?\\t%0, %1, %2 sub%?\\t%0, %1, %2 + sub%?\\t%0, %1, %2 #" "&& (CONST_INT_P (operands[1]) && !const_ok_for_arm (INTVAL (operands[1])))" @@ -1270,8 +1295,9 @@ (define_insn_and_split "*arm_subsi3_insn INTVAL (operands[1]), operands[0], operands[2], 0); DONE; " - [(set_attr "length" "4,4,4,16") - (set_attr "predicable" "yes")] + [(set_attr "length" "4,4,4,4,16") + (set_attr "predicable" "yes") + (set_attr "type" "*,simple_alu_imm,*,*,*")] ) There's something wrong here. MINUS (reg, imm) should be canonicalized elsewhere to PLUS (reg, -imm), so the alternative you've split /should/ never match anything. On the other hand, you haven't split the first alternative (that generates RSB), which is a legitimate use of an immediate in MINUS. Otherwise, OK. R. On 29/11/12 09:57, Greta Yorsh wrote: For attribute named "type", subdivide "alu" into "alu_reg" and "simple_alu_imm". Set type attribute as appropriate in define_insn patterns with immediate operands. Update pipeline descriptions to use the new values of type attribute. No regression on qemu arm-none-eabi -mcpu=cortex-a15/cortex-a7. Bootstrap successful on Cortex-A15. No difference in generated assembly when compiling all of preprocessed sources of gcc 4.8 as a test in various configurations: -mcpu=cortex-a15 -march=armv6t2 -marm/-mthumb -O0/-O1/-O2/-O3/-Os. The motivation for this patch is cortex-a7 pipeline description, which will be submitted separately. Ok for trunk? Thanks, Greta ChangeLog gcc/ 2012-11-28 Ramana Radhakrishnan Greta Yorsh * config/arm/arm.md (type): Subdivide "alu" into "alu_reg" and "simple_alu_imm". (core_cycles): Use new names. (arm_addsi3): Set type attribute for patterns involving simple_alu_imm. (addsi3_compare0, addsi3_compare0_scratch): Likewise. (addsi3_compare_op1, addsi3_compare_op2, compare_addsi2_op0): Likewise. (compare_addsi2_op1, arm_subsi3_insn, subsi3_compare0): Likewise. (subsi3_compare, arm_decscc,arm_andsi3_insn): Likewise. (thumb1_andsi3_insn, andsi3_compare0_scratch): Likewise. (zeroextractsi_compare0_scratch, iorsi3_insn, iorsi3_compare0): Likewise. (iorsi3_compare0_scratch, arm_xorsi3, thumb1_xorsi3_insn): Likewise. (xorsi3_compare0, xorsi3_compare0_scratch, thumb1_zero_extendhisi2): Likewise. (arm_zero_extendhisi2_v6, thumb1_zero_extendqisi2_v): Likewise. (arm_zero_extendqisi2_v6, thumb1_extendhisi2, arm_extendqisi_v6): Likewise. (thumb1_extendqisi2, arm_movsi_insn): Likewise. (movsi_compare0, movhi_insn_arch4, movhi_bytes): Likewise. (arm_movqi_insn, thumb1_movqi_insn, arm_cmpsi_insn): Likewise. (movsicc_insn, if_plus_move, if_move_plus): Likewise. * config/arm/neon.md (neon_mov/VDX): Likewise. (neon_mov/VQXMOV): Likewise. * config/arm/arm1020e.md (1020alu_op): Likewise. * config/arm/fmp626.md (mp626_alu_op): Likewise. * config/arm/fa726te.md (726te_alu_op): Likewise. * config/arm/fa626te.md (626te_alu_op): Likewise. * config/arm/fa606te.md (606te_alu_op): Likewise. * config/arm/fa526.md (526_alu_op): Likewise. * config/arm/cortex-r4.md (cortex_r4_alu, cortex_r4_mov): Likewise. * config/arm/cortex-m4.md (cortex_m4_alu): Likewise. * config/arm/cortex-a9.md (cprtex_a9_dp): Likewise. * config/arm/cortex-a8.md (cortex_a8_alu, cortex_a8_mov): Likewise. * config/arm/cortex-a5.md (cortex_a5_alu): Likewise. * config/arm/cortex-a15.md (cortex_a15_alu): Likewise. * config/arm/arm926ejs.md (9_alu_op): Likewise. * config/arm/arm1136jfs.md (11_alu_op): Likewise. * config/arm/arm1026ejs.md (alu_op): Likewise. 1-split-alu-type-attr.v2.patch.txt diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 7e92b69ad861fe90ed409494d451854f30888462.. 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -323,8 +323,14 @@ (define_attr "insn" ; Classification of each insn ; Note: vfp.md has different meanings for some of these, and some further ; types as well. See that file for details. -; alu any alu instruction that doesn't hit memory or fp -; regs or have a shifted source op
[patch mingw]: Cleanup about static/shared libgcc
Hi, this patch synchronize handling - and fixes latent bug - of static/shared flags with cygwin-host. in static case prior code added to linker-script reference to shared libgcc, which is obviously wrong. Due link-order the shared version doesn't get actual linked, but nevertheless there is a latent issue. ChangeLog 2012-11-29 Kai Tietz * config/i386/mingw32.h (SHARED_LIBGCC_SPEC): Synchronize with cygwin-host. Tested for i686-w64-mingw32, and x86_64-w64-mingw32. I will apply tomorrow if there aren't any objections. Regards, Kai Index: mingw32.h === --- mingw32.h (Revision 193925) +++ mingw32.h (Arbeitskopie) @@ -121,15 +121,24 @@ /* Include in the mingw32 libraries with libgcc */ #ifdef ENABLE_SHARED_LIBGCC -#define SHARED_LIBGCC_SPEC "%{shared-libgcc:-lgcc_s} %{!shared-libgcc:-lgcc_eh}" +#define SHARED_LIBGCC_SPEC " \ + %{static|static-libgcc:-lgcc -lgcc_eh} \ + %{!static: \ + %{!static-libgcc: \ + %{!shared: \ + %{!shared-libgcc:-lgcc -lgcc_eh} \ + %{shared-libgcc:-lgcc_s -lgcc} \ + } \ + %{shared:-lgcc_s -lgcc} \ +} \ + } " #else -#define SHARED_LIBGCC_SPEC /*empty*/ +#define SHARED_LIBGCC_SPEC " -lgcc " #endif #undef REAL_LIBGCC_SPEC #define REAL_LIBGCC_SPEC \ "%{mthreads:-lmingwthrd} -lmingw32 \ "SHARED_LIBGCC_SPEC" \ - -lgcc \ -lmoldname -lmingwex -lmsvcrt" #undef STARTFILE_SPEC
Re: PR55124 - tentative patch for ICE in PRE
On Wed, Nov 28, 2012 at 3:05 PM, Tom de Vries wrote: > On 27/11/12 13:41, Richard Biener wrote: >> On Mon, Nov 19, 2012 at 3:33 AM, Tom de Vries wrote: >>> Richard, >>> >>> Consider the PR55124 example test.c: >>> ... >>> int a, b; >>> long c; >>> >>> static void inline >>> f2 (void) >>> { >>> unsigned long k = 1; >>> >>> foo (b ? k = 0 : 0); >>> >>> b = (((c = b) >>> ? (k ?: (c = 0)) >>> : a) >>>* c); >>> } >>> >>> void >>> f1 (void) >>> { >>> f2 (); >>> a = b | c; >>> } >>> ... >>> >>> when compiling with -O2, we're running into the following assertion in pre: >>> ... >>> test.c:18:1: internal compiler error: in find_or_generate_expression, at >>> tree-ssa-pre.c:2802 >>> f1 (void) >>> ^ >>> 0xcf41d3 find_or_generate_expression >>> gcc/tree-ssa-pre.c:2802 >>> 0xcf42f6 create_expression_by_pieces >>> gcc/tree-ssa-pre.c:2861 >>> 0xcf4193 find_or_generate_expression >>> gcc/tree-ssa-pre.c:2799 >>> 0xcf42f6 create_expression_by_pieces >>> gcc/tree-ssa-pre.c:2861 >>> 0xcf4e28 insert_into_preds_of_block >>> gcc/tree-ssa-pre.c:3096 >>> 0xcf5c7d do_regular_insertion >>> gcc/tree-ssa-pre.c:3386 >>> ... >>> >>> We're hitting the assert at the end of find_or_generate_expression: >>> ... >>> static tree >>> find_or_generate_expression (basic_block block, tree op, gimple_seq *stmts) >>> { >>> pre_expr expr = get_or_alloc_expr_for (op); >>> unsigned int lookfor = get_expr_value_id (expr); >>> pre_expr leader = bitmap_find_leader (AVAIL_OUT (block), lookfor); >>> if (leader) >>> { >>> if (leader->kind == NAME) >>> return PRE_EXPR_NAME (leader); >>> else if (leader->kind == CONSTANT) >>> return PRE_EXPR_CONSTANT (leader); >>> } >>> >>> /* It must be a complex expression, so generate it recursively. */ >>> bitmap exprset = VEC_index (bitmap, value_expressions, lookfor); >>> bitmap_iterator bi; >>> unsigned int i; >>> EXECUTE_IF_SET_IN_BITMAP (exprset, 0, i, bi) >>> { >>> pre_expr temp = expression_for_id (i); >>> if (temp->kind != NAME) >>> return create_expression_by_pieces (block, temp, stmts, >>> get_expr_type (expr)); >>> } >>> >>> gcc_unreachable (); >>> } >>> ... >>> >>> The state in which we're asserting is the following: >>> ... >>> #5 0x00cf41d4 in find_or_generate_expression (block=0x76210f08, >>> op=0x762384c8, stmts=0x7fffdb78) at gcc/tree-ssa-pre.c:2802 >>> 2802 gcc_unreachable (); >>> (gdb) p block.index >>> $13 = 4 >>> (gdb) call debug_generic_expr (op) >>> b.4_3 >>> (gdb) call debug_pre_expr (expr) >>> b.4_3 >>> (gdb) p lookfor >>> $11 = 7 >>> (gdb) call debug_bitmap_set (((bb_value_sets_t) ((block)->aux))->avail_out) >>> debug[0] := { b.4_8 (0012), a.10_13 (0013), _14 (0014), iftmp.5_15 (0015) } >>> (gdb) p leader >>> $12 = (pre_expr) 0x0 >>> (gdb) call debug_bitmap ( exprset ) >>> first = 0x21fb058 current = 0x21fb058 indx = 0 >>> 0x21fb058 next = (nil) prev = (nil) indx = 0 >>> bits = { 22 25 } >>> (gdb) call debug_pre_expr (expression_for_id (22)) >>> b.4_3 >>> (gdb) call debug_pre_expr (expression_for_id (25)) >>> b.4_31 >>> ... >>> We're trying to find or generate an expr with value-id 0007 in block 4. We >>> can't >>> find it (there's no leader) and we can't generate it because there are no >>> exprs >>> with that value that are not names. >>> >>> Higher up in the call stack and skipping create_expression_by_pieces, the >>> state >>> is as follows: >>> ... >>> #7 0x00cf4194 in find_or_generate_expression (block=0x76210f08, >>> op=0x76238558, stmts=0x7fffdb78) at gcc/tree-ssa-pre.c:2799 >>> 2799get_expr_type (expr)); >>> (gdb) call debug_generic_expr (op) >>> c.6_5 >>> (gdb) call debug_pre_expr (expr) >>> c.6_5 >>> (gdb) p lookfor >>> $14 = 9 >>> (gdb) p leader >>> $15 = (pre_expr) 0x0 >>> (gdb) call debug_bitmap ( exprset ) >>> first = 0x21fb0f8 current = 0x21fb0f8 indx = 0 >>> 0x21fb0f8 next = (nil) prev = (nil) indx = 0 >>> bits = { 23 24 26 27 } >>> (gdb) call debug_pre_expr (temp) >>> {nop_expr,b.4_3} >>> (gdb) call debug_pre_expr (expression_for_id (23)) >>> c.6_5 >>> (gdb) call debug_pre_expr (expression_for_id (24)) >>> {nop_expr,b.4_3} >>> (gdb) call debug_pre_expr (expression_for_id (26)) >>> c.6_32 >>> (gdb) call debug_pre_expr (expression_for_id (27)) >>> {mem_ref<0B>,addr_expr<&c>}@.MEM_28 >>> ... >>> We're trying to find or generate an expr with value-id 0009 (in block 4). We >>> can't find it. We're trying to generate it using {nop_expr,b.4_3}, but as >>> we've >>> seen above that won't work. The generation using expr 27 would work though. >>> >>> Again higher up in the call stack and skipping create_expression_by_pieces, >>> the >>> state is as follows: >>> ... >>> (gdb) up >>> #9 0x00cf4e29 in insert_into_preds_of_bloc
Re: [0/8] Add optabs alternatives for insv, extv and extzv
Eric Botcazou writes: >> This is because the structure we are given is: >> >> (mem/v/j:SI (reg/v/f:SI 110 [ t ]) [3 t_2(D)->a+0 S1 A32]) >> >> i.e. a 1-byte SImode reference. The strange size comes from the >> set_mem_attributes_minus_bitpos code that was mentioned earlier >> in the series, where we set the size based on the type even if >> it doesn't match the mode. > > I think that's wrong, we should have S4 and drop the MEM_EXPR as we would do > it with adjust_bitfield_address. In other words, if the size computed from > the tree is smaller than the mode size, we don't use it and clear MEM_EXPR. I agree that this kind of MEM is less than ideal, but I thought: set_mem_attributes_minus_bitpos (to_rtx, to, 0, bitpos); said that the attributes of TO_RTX will to be TO _once a pending offset-and- narrowing operation has been applied_. So we have: /* If we modified OFFSET based on T, then subtract the outstanding bit position offset. Similarly, increase the size of the accessed object to contain the negative offset. */ if (apply_bitpos) { gcc_assert (attrs.offset_known_p); attrs.offset -= apply_bitpos / BITS_PER_UNIT; if (attrs.size_known_p) attrs.size += apply_bitpos / BITS_PER_UNIT; } I didn't think we necessarily expected the width of the reference (TO_RTX) and the width of the type (TO) to match at this stage. That's different from adjust_bitfield_address, where the offset-and-narrowing operation itself is applied. The difference between the width of the reference and the width of T is what led to: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00262.html As things stand, APPLY_BITPOS is only nonzero if we set both the MEM_EXPR and MEM_SIZE from T. There are also cases (like this one) where we don't set the MEM_EXPR from T but do set the MEM_SIZE from T. The bitpos will be applied either way, so I thought MEM_SIZE should be the same in both cases. That doesn't fix this problem of course, it's just an argument that the relationship between the width of the reference mode, the MEM_SIZE and the width of T seems pretty complicated with the current interface. Maybe set_mem_attributes_minus_bitpos (but not set_mem_attributes) should only set the MEM_EXPR and leave the MEM_SIZE unchanged? Before submitting the patched linked above, I tried getting rid of set_mem_attributes_minus_bitpos and passing the tree down instead. Then we could set the attributes at the time of the offset-and-narrowing operation, where the size and offset of the final reference are known. That didn't seem like an easy change to make though, and became a bit of a distraction from the main patches. Anyway, given the breakage that this series has already caused, I'd prefer not to tackle stuff like this as well. I'd only used MEM_SIZE in the first attempted patch out of habit. I think the revised patch more obviously matches the *_fixed_bit_field functions and is more generally in keeping with the existing checks. (It's deliberately more conservative though, only using register bitfields if both the bit_field_mode_iterator and strict volatile bitfield rules are met.) Richard
Re: PR fortran/51727: make module files reproducible, question on C++ in gcc
Dear gfortraners, On 2012-10-14 23:35, Janne Blomqvist wrote: - I'd be vary wrt backporting, in my experience the module.c code is somewhat fragile and easily causes regressions. In any case, AFAICS PR 51727 is not a regression. Can this be reconsidered, as the benefits for users seem to be fairly large? According to bugzilla, the patch didn't cause any regressions after three weeks in the trunk. Joost has used this patch in hand-patched 4.7 compilers for a few weeks more with no negative effects. Cheers, - Tobi
Re: [PATCH] Instance information in DWARF discriminators
* Richard Biener, 2012-11-28 : > You need to stream the id with LTO, no? Also increasing the size > of line_map_ordinary might not be the most welcom change ... Thanks for your feedback Richard, I'll take care of adding the missing LTO pieces, and improve documentation of the flag. As for the size impact on line_map_ordinary, indeed that's an additional int here, but I'm not sure we allocate so many objects of that kind that the increase is significant. Now, an alternative might be to have an array of instance IDs stored at the struct line_maps level, with the same indices at the set of ordinary maps, which would be allocated only when flag_debug_instances is used; when it is not used the cost would then be a single NULL pointer in struct line_maps. Would that be acceptable? Side note: I first tried to implement this option using a vec, but it appears that vec.h cannot currently be used from within libcpp. So, I'll have to either change that (but that seems like a pretty significant change) or else manually manage this array through xrealloc. Thomas. -- Thomas Quinot, Ph.D. ** qui...@adacore.com ** Senior Software Engineer AdaCore -- Paris, France -- New York, USA
Re: [Patch, Fortran] PR55469 - fix I/O memory leaks in case of failure and iostat= being present
Tobias Burnus wrote: l_push_char allocates memory which is freed with free_line. However, currently, the memory is not always freed when calling generate_error. If one aborts, that's fine. However, generate_error can also set the iostat variable. Updated version: Corrected PR number - and ensured that if convert_real fails, free_saved is called (cf. additional test case in the PR). Build and regtested on x86-64-gnu-linux. OK for the trunk? Tobias 2012-11-29 Tobias Burnus PR fortran/55469 * io/list_read (parse_repeat, read_integer, read_character, parse_real, read_real, check_type, list_formatted_read_scalar, finish_list_read): Call list_free. diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c index 403e719..5063c36 100644 --- a/libgfortran/io/list_read.c +++ b/libgfortran/io/list_read.c @@ -617,6 +617,7 @@ parse_repeat (st_parameter_dt *dtp) free_saved (dtp); if (c == EOF) { + free_line (dtp); hit_eof (dtp); return 1; } @@ -905,11 +906,14 @@ read_integer (st_parameter_dt *dtp, int length) free_saved (dtp); if (c == EOF) { + free_line (dtp); hit_eof (dtp); return; } else if (c != '\n') eat_line (dtp); + + free_line (dtp); snprintf (message, MSGLEN, "Bad integer for item %d in list input", dtp->u.p.item_count); generate_error (&dtp->common, LIBERROR_READ_VALUE, message); @@ -1078,7 +1082,6 @@ read_character (st_parameter_dt *dtp, int length __attribute__ ((unused))) unget_char (dtp, c); eat_separator (dtp); dtp->u.p.saved_type = BT_CHARACTER; - free_line (dtp); } else { @@ -1087,10 +1090,12 @@ read_character (st_parameter_dt *dtp, int length __attribute__ ((unused))) dtp->u.p.item_count); generate_error (&dtp->common, LIBERROR_READ_VALUE, message); } + free_line (dtp); return; eof: free_saved (dtp); + free_line (dtp); hit_eof (dtp); } @@ -1283,11 +1288,14 @@ parse_real (st_parameter_dt *dtp, void *buffer, int length) free_saved (dtp); if (c == EOF) { + free_line (dtp); hit_eof (dtp); return 1; } else if (c != '\n') eat_line (dtp); + + free_line (dtp); snprintf (message, MSGLEN, "Bad floating point number for item %d", dtp->u.p.item_count); generate_error (&dtp->common, LIBERROR_READ_VALUE, message); @@ -1387,11 +1395,14 @@ eol_4: free_saved (dtp); if (c == EOF) { + free_line (dtp); hit_eof (dtp); return; } else if (c != '\n') eat_line (dtp); + + free_line (dtp); snprintf (message, MSGLEN, "Bad complex value in item %d of list input", dtp->u.p.item_count); generate_error (&dtp->common, LIBERROR_READ_VALUE, message); @@ -1624,7 +1635,10 @@ read_real (st_parameter_dt *dtp, void * dest, int length) eat_separator (dtp); push_char (dtp, '\0'); if (convert_real (dtp, dest, dtp->u.p.saved_string, length)) -return; +{ + free_saved (dtp); + return; +} free_saved (dtp); dtp->u.p.saved_type = BT_REAL; @@ -1762,12 +1776,14 @@ read_real (st_parameter_dt *dtp, void * dest, int length) free_saved (dtp); if (c == EOF) { + free_line (dtp); hit_eof (dtp); return; } else if (c != '\n') eat_line (dtp); + free_line (dtp); snprintf (message, MSGLEN, "Bad real number in item %d of list input", dtp->u.p.item_count); generate_error (&dtp->common, LIBERROR_READ_VALUE, message); @@ -1784,6 +1800,7 @@ check_type (st_parameter_dt *dtp, bt type, int len) if (dtp->u.p.saved_type != BT_UNKNOWN && dtp->u.p.saved_type != type) { + free_line (dtp); snprintf (message, MSGLEN, "Read type %s where %s was expected for item %d", type_name (dtp->u.p.saved_type), type_name (type), dtp->u.p.item_count); @@ -1797,6 +1814,7 @@ check_type (st_parameter_dt *dtp, bt type, int len) if (dtp->u.p.saved_length != len) { + free_line (dtp); snprintf (message, MSGLEN, "Read kind %d %s where kind %d is required for item %d", dtp->u.p.saved_length, type_name (dtp->u.p.saved_type), len, @@ -1970,7 +1988,10 @@ list_formatted_read_scalar (st_parameter_dt *dtp, bt type, void *p, cleanup: if (err == LIBERROR_END) -hit_eof (dtp); +{ + free_line (dtp); + hit_eof (dtp); +} return err; } @@ -2018,7 +2039,10 @@ finish_list_read (st_parameter_dt *dtp) err = eat_line (dtp); if (err == LIBERROR_END) -hit_eof (dtp); +{ + free_line (dtp); + hit_eof (dtp); +} } /* NAMELIST INPUT
Re: [PATCH, RFC] PR 55415 : Pessimistic misalignment from eipa_sra pass
On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor wrote: > Hi, > > thanks for the review. When writing a reply I realized I indeed made > a mistake or two in the part concerning prev_base and the code was not > what it intended to be. I'll re-write it today. > > Nevertheless, I also have a question regarding a different place of > the patch: > > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor wrote: >> > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 >> > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 >> > *** unmodified_by_ref_scalar_representative >> > *** 3891,3902 >> > return repr; >> > } >> > >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is >> > !associated with. */ >> > >> > static bool >> > ! access_precludes_ipa_sra_p (struct access *access) >> > { >> > /* Avoid issues such as the second simple testcase in PR 42025. The >> > problem >> >is incompatible assign in a call statement (and possibly even in asm >> >statements). This can be relaxed by using a new temporary but only >> > for >> > --- 3891,3903 >> > return repr; >> > } >> > >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is >> > !associated with. REQ_ALIGN is the minimum required alignment. */ >> > >> > static bool >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int >> > req_align) >> > { >> > + unsigned int exp_align; >> > /* Avoid issues such as the second simple testcase in PR 42025. The >> > problem >> >is incompatible assign in a call statement (and possibly even in asm >> >statements). This can be relaxed by using a new temporary but only >> > for >> > *** access_precludes_ipa_sra_p (struct acces >> > *** 3908,3913 >> > --- 3909,3918 >> > || gimple_code (access->stmt) == GIMPLE_ASM)) >> > return true; >> > >> > + exp_align = get_object_alignment (access->expr); >> >> Is access->expr from the callee or the caller? > > The callee. > >> >> > + if (exp_align < req_align) >> > + return true; >> > + >> > return false; >> > } >> > >> > *** splice_param_accesses (tree parm, bool * >> > *** 3943,3949 >> > tree a1_alias_type; >> > access = (*access_vec)[i]; >> > modification = access->write; >> > ! if (access_precludes_ipa_sra_p (access)) >> > return NULL; >> > a1_alias_type = reference_alias_ptr_type (access->expr); >> > >> > --- 3948,3954 >> > tree a1_alias_type; >> > access = (*access_vec)[i]; >> > modification = access->write; >> > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) >> >> access->type is not the type of the base object, right? > > No, it is the actual type of the scalar memory access. > >> Which means it is not correct here - the alignment of the access is that >> what >> get_object_alignment returns, which looks at the base as to whether to >> lower the alignment compared to TYPE_ALIGN of the access type itself. > > Oh, so I have to do something like (or even reuse) may_be_unaligned_p > from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other > uses of get_object_alignment get this right...) Well, I think you should use get_object_alignment here, too, and punt if that is less than TYPE_ALIGN (access->type) if you will end up using that. get_object_alignment (ref) is the authorative answer to alignment questions. Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted depends on whether get_object_alignment (ref) returns the same or bigger alignment. But if the type is all you can propagate in this case (and not an explicit alignment value) you have to punt if the information from the type isn't conservative. Richard. > Thanks, > > Martin
Re: [Patch, Fortran] PR55469 - fix I/O memory leaks in case of failure and iostat= being present
On Thu, Nov 29, 2012 at 1:38 PM, Tobias Burnus wrote: > Tobias Burnus wrote: >> >> l_push_char allocates memory which is freed with free_line. However, >> currently, the memory is not always freed when calling generate_error. If >> one aborts, that's fine. However, generate_error can also set the iostat >> variable. > > > Updated version: Corrected PR number - and ensured that if convert_real > fails, free_saved is called (cf. additional test case in the PR). > > Build and regtested on x86-64-gnu-linux. > OK for the trunk? IIRC this is supposed to be a cache than can be subsequently reused without freeing and allocating it again. So it might be better to free it once when the unit is closed. At some point this line buffer should be removed completely and replaced with the fbuf_*() machinery. But so far nobody has found the time to work on that. -- Janne Blomqvist
[patch c++]: 1 of 7 Fix for PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Hello, this trivial patch fixes a bootstrap issue on LLP64 hosts. cp/ PR target/53912 * class.c (dump_class_hierarchy_r): Cast from pointer via uintptr_t. (dump_vtable): Likewise. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-gnu-linux. Ok for apply? Regards, Kai Index: cp/class.c === --- cp/class.c (Revision 193925) +++ cp/class.c (Arbeitskopie) @@ -7817,7 +7817,7 @@ dump_class_hierarchy_r (FILE *stream, indented = maybe_indent_hierarchy (stream, indent, 0); fprintf (stream, "%s (0x%lx) ", type_as_string (BINFO_TYPE (binfo), TFF_PLAIN_IDENTIFIER), - (unsigned long) binfo); + (unsigned long) (uintptr_t) binfo); if (binfo != igo) { fprintf (stream, "alternative-path\n"); @@ -7842,7 +7842,7 @@ dump_class_hierarchy_r (FILE *stream, fprintf (stream, " primary-for %s (0x%lx)", type_as_string (BINFO_TYPE (BINFO_INHERITANCE_CHAIN (binfo)), TFF_PLAIN_IDENTIFIER), - (unsigned long)BINFO_INHERITANCE_CHAIN (binfo)); + (unsigned long) (uintptr_t) BINFO_INHERITANCE_CHAIN (binfo)); } if (BINFO_LOST_PRIMARY_P (binfo)) { @@ -7975,7 +7975,8 @@ dump_vtable (tree t, tree binfo, tree vtable) if (ctor_vtbl_p) { if (!BINFO_VIRTUAL_P (binfo)) - fprintf (stream, " (0x%lx instance)", (unsigned long)binfo); + fprintf (stream, " (0x%lx instance)", (unsigned long) +(uintptr_t) binfo); fprintf (stream, " in %s", type_as_string (t, TFF_PLAIN_IDENTIFIER)); } fprintf (stream, "\n");
[patch]: 2 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Hello, this trivial patch fixes a bootstrap issue on LLP64 hosts. ChangeLog 2012-11-29 Kai Tietz PR target/53912 * ggc-common.c (POINTER_HASH): Cast from pointer via intptr_t. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-gnu-linux. Ok for apply? Regards, Kai Index: ggc-common.c === --- ggc-common.c(Revision 193925) +++ ggc-common.c(Arbeitskopie) @@ -304,7 +304,7 @@ struct ptr_data enum gt_types_enum type; }; -#define POINTER_HASH(x) (hashval_t)((long)x >> 3) +#define POINTER_HASH(x) (hashval_t)((intptr_t)x >> 3) /* Register an object in the hash table. */
[patch]: 3 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Hello, this trivial patch fixes a bootstrap issue on LLP64 hosts. ChangeLog 2012-11-29 Kai Tietz PR target/53912 * pointer-set.c (hash1): Cast from pointer via uintptr_t. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-gnu-linux. Ok for apply? Regards, Kai Index: pointer-set.c === --- pointer-set.c (Revision 193925) +++ pointer-set.c (Arbeitskopie) @@ -64,7 +64,7 @@ hash1 (const void *p, unsigned long max, unsigned #endif const unsigned long shift = HOST_BITS_PER_LONG - logmax; - return ((A * (unsigned long) p) >> shift) & (max - 1); + return ((A * (uintptr_t) p) >> shift) & (max - 1); } /* Allocate an empty pointer set. */
[patch prefix.c]: 4 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Hello, this trivial patch fixes a bootstrap issue on LLP64 hosts. ChangeLog 2012-11-29 Kai Tietz PR target/53912 * prefix.c (lookup_key): Explicit cast return-type of xmalloc/xrealloc to char *. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-gnu-linux. Ok for apply? Regards, Kai Index: prefix.c === --- prefix.c(Revision 193925) +++ prefix.c(Arbeitskopie) @@ -157,12 +157,12 @@ lookup_key (char *key) } size = 32; - dst = xmalloc (size); + dst = (char *) xmalloc (size); res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); if (res == ERROR_MORE_DATA && type == REG_SZ) { - dst = xrealloc (dst, size); + dst = (char *) xrealloc (dst, size); res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); }
[patch print-tree.c]: 5 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Hello, this trivial patch fixes a bootstrap issue on LLP64 hosts. ChangeLog 2012-11-29 Kai Tietz PR target/53912 * print-tree.c (print_node): Cast from pointer via uintptr_t. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-gnu-linux. Ok for apply? Regards, Kai Index: print-tree.c === --- print-tree.c(Revision 193925) +++ print-tree.c(Arbeitskopie) @@ -255,7 +255,7 @@ print_node (FILE *file, const char *prefix, tree n /* Allow this function to be called if the table is not there. */ if (table) { - hash = ((unsigned long) node) % HASH_SIZE; + hash = ((uintptr_t) node) % HASH_SIZE; /* If node is in the table, just mention its address. */ for (b = table[hash]; b; b = b->next)
[patch stmt.c]: 6 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Hello, this trivial patch fixes a bootstrap issue on LLP64 hosts. ChangeLog 2012-11-29 Kai Tietz PR target/53912 * stmt.c (compute_cases_per_edge): Cast from pointer via intptr_t. (expand_case): Likewise. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-gnu-linux. Ok for apply? Regards, Kai Index: stmt.c === --- stmt.c (Revision 193925) +++ stmt.c (Arbeitskopie) @@ -2061,7 +2061,7 @@ compute_cases_per_edge (gimple stmt) tree lab = CASE_LABEL (elt); basic_block case_bb = label_to_block_fn (cfun, lab); edge case_edge = find_edge (bb, case_bb); - case_edge->aux = (void *)((long)(case_edge->aux) + 1); + case_edge->aux = (void *)((intptr_t)(case_edge->aux) + 1); } } @@ -2176,7 +2176,7 @@ expand_case (gimple stmt) edge case_edge = find_edge (bb, case_bb); case_list = add_case_node ( case_list, low, high, lab, - case_edge->probability / (long)(case_edge->aux), + case_edge->probability / (intptr_t)(case_edge->aux), case_node_pool); } pointer_set_destroy (seen_labels);
[patch tree-dump.c]: 7 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Hello, this trivial patch fixes a bootstrap issue on LLP64 hosts. ChangeLog 2012-11-29 Kai Tietz PR target/53912 * tree-dump.c (dump_pointer): Cast from pointer via intptr_t. Tested for i686-w64-mingw32, x86_64-w64-mingw32, and x86_64-unknown-gnu-linux. Ok for apply? Regards, Kai Index: tree-dump.c === --- tree-dump.c (Revision 193925) +++ tree-dump.c (Arbeitskopie) @@ -177,7 +177,8 @@ void dump_pointer (dump_info_p di, const char *field, void *ptr) { dump_maybe_newline (di); - fprintf (di->stream, "%-4s: %-8lx ", field, (unsigned long) ptr); + fprintf (di->stream, "%-4s: %-8lx ", field, + (unsigned long) (uintptr_t) ptr); di->column += 15; }
Re: PR55124 - tentative patch for ICE in PRE
On 29/11/12 11:26, Richard Biener wrote: > I'm continuing trying to move value-ids back to PRE land. Your patch > would be nice in a form that verifies the order is indeed topological, > maybe you can re-work it in a way that does this in > sorted_array_from_bitmap_set in a ENABLE_CHECKING piece? Richard, These are my current patches, tested together with tree-ssa.exp. The first patch checks the topological order in sorted_array_from_bitmap_set. Testing only this one with tree-ssa.exp gives 400 failures. Btw, I'm not 100% sure if this patch checks the required order. It's clear what topological order means if there is one expression per value. I've ran tree-ssa.exp with an assert that the number of expressions and values in the bitmap_set is equal in sorted_array_from_bitmap_set, and that passed, so that seems to be the case generally, but I don't know if that's by design. If there are more expressions with the same value, this patch is a 'weak' check, meaning a value is considered available if one expression with that value is available. A 'strong' check would consider a value available if all expressions with that value are available. I can imagine doing clean on a strongly or weakly ordered array could give different results. The second patch calculates value_id during pre. If you're working on the value_id part, I'll stop here. Thanks, - Tom 2012-11-29 Tom de Vries * tree-ssa-pre.c (sorted_array_from_bitmap_set): Use EXECUTE_IF_AND_IN_BITMAP instead of EXECUTE_IF_SET_IN_BITMAP. Check sort result ifdef ENABLE_CHECKING. Check if the sorted array has the same number of expressions as the bitmap_set. Index: gcc/tree-ssa-pre.c === --- gcc/tree-ssa-pre.c (revision 193497) +++ gcc/tree-ssa-pre.c (working copy) @@ -718,6 +781,9 @@ unsigned int i, j; bitmap_iterator bi, bj; VEC(pre_expr, heap) *result; +#ifdef ENABLE_CHECKING + bitmap values_done = BITMAP_ALLOC (NULL); +#endif /* Pre-allocate roughly enough space for the array. */ result = VEC_alloc (pre_expr, heap, bitmap_count_bits (&set->values)); @@ -735,13 +801,70 @@ If this is somehow a significant lose for some cases, we can choose which set to walk based on the set size. */ bitmap exprset = VEC_index (bitmap, value_expressions, i); - EXECUTE_IF_SET_IN_BITMAP (exprset, 0, j, bj) + EXECUTE_IF_AND_IN_BITMAP (exprset, &set->expressions, 0, j, bj) { - if (bitmap_bit_p (&set->expressions, j)) - VEC_safe_push (pre_expr, heap, result, expression_for_id (j)); -} + pre_expr expr = expression_for_id (j); + VEC_safe_push (pre_expr, heap, result, expr); + +#ifdef ENABLE_CHECKING + switch (expr->kind) + { + case NARY: + { + vn_nary_op_t nary = PRE_EXPR_NARY (expr); + unsigned int k; + for (k = 0; k < nary->length; k++) + { + tree op = nary->op[k]; + if (TREE_CODE (op) != SSA_NAME) + continue; + unsigned int v = VN_INFO (op)->value_id; + if (!bitmap_bit_p (values_done, v) + && bitmap_bit_p (&set->values, v)) + gcc_unreachable (); + } + } + break; + + case REFERENCE: + { + vn_reference_t ref = PRE_EXPR_REFERENCE (expr); + vn_reference_op_t vro; + + unsigned int k; + FOR_EACH_VEC_ELT (vn_reference_op_s, ref->operands, k, vro) + { + tree ops[3] = { vro->op0, vro->op1, vro->op2}; + unsigned int l; + for (l = 0; l < 3; l++) + { + tree op = ops[l]; + if (op == NULL_TREE + || TREE_CODE (op) != SSA_NAME) + continue; + unsigned int v = VN_INFO (op)->value_id; + if (!bitmap_bit_p (values_done, v) + && bitmap_bit_p (&set->values, v)) + gcc_unreachable (); + } + } + } + break; + + default: + break; + } + + bitmap_set_bit (values_done, get_expr_value_id (expr)); +#endif + } } + gcc_assert (bitmap_count_bits (&set->expressions) + == VEC_length (pre_expr, result)); +#ifdef ENABLE_CHECKING + BITMAP_FREE (values_done); +#endif return result; } Index: gcc/tree-ssa-sccvn.c === --- gcc/tree-ssa-sccvn.c (revision 193497) +++ gcc/tree-ssa-sccvn.c (working copy) @@ -3967,8 +3967,8 @@ /* Set the value ids in the valid hash tables. */ -static void -set_hashtable_value_ids (void) +void +vn_set_hashtable_value_ids (void) { htab_iterator hi; vn_nary_op_t vno; @@ -4000,7 +4000,6 @@ { size_t i; tree param; - bool changed = true; default_vn_walk_kind = default_vn_walk_kind_; @@ -4029,45 +4028,6 @@ } } - /* Initialize the value ids. */ - - for (i = 1; i < num_ssa_names; ++i) -{ - tree name = ssa_name (i); - vn_ssa_aux_t info; - if (!name) - continue; - info = VN_INFO (name); - if (info->valnum == name - || info->valnum == VN_TOP) - info->value_id = get_next_value_id (); - else if (is_gimple_
[tsan] instrument_expr improvements
Hi! As discussed earlier on mailing list and on IRC with Richard, pt_solution_includes can be used for testing of whether var might escape current function (for -O0 unfortunately returns true even for !may_be_aliased automatic vars, so I'm testing for that too), and there are IMHO tons of completely unnecessary tests and restrictions what we want to instrument and what not. instrument_expr is guarded with gimple_store_p or gimple_assign_load_p, so it shouldn't see SSA_NAMEs, and many other ops, basically just DECL_Ps, handled_components, MEM_REF or TARGET_MEM_REF where for handled_components the base address is one of DECL_P, MEM_REF or TARGET_MEM_REF. I'm actually surprised that insturment_expr handles also is_gimple_call in it, yet nothing calls it for calls, I bet for calls we also want to call it for gimple_store_p calls. Dmitry, could you please test this on something where the previous tsan code has been tested on (if anything)? All I've done was eyeball a few short testcases. TSAN will badly need similar optimization pass to what ASAN needs (after deferring expansion of the shadow memory checks), e.g. var++ right now results in __tsan_read4 (&var); followed soon by __tsan_write4 (&var);. With no intervening calls (we could ignore many string/memory builtins I guess) and no intervening atomics it should be fine to just use __tsan_write4 (&var); for that, right? Similarly for multiple accesses to the same var, where first use dominates the other accesses and there are no intervening calls. As for bitfields, I think we want to use DECL_BIT_FIELD_REPRESENTATIVE. 2012-11-29 Jakub Jelinek * tsan.c (is_load_of_const_p): Removed. (instrument_expr): Use result of get_inner_reference instead of get_base_address, avoid some unnecessary tests, use !pt_solution_includes and !may_be_aliased tests to check whether base might escape current function. --- gcc/tsan.c.jj 2012-11-23 15:27:38.0 +0100 +++ gcc/tsan.c 2012-11-29 12:41:26.816857288 +0100 @@ -84,65 +84,18 @@ is_vptr_store (gimple stmt, tree expr, b return NULL; } -/* Checks as to whether EXPR refers to constant var/field/param. - Don't bother to instrument them. */ - -static bool -is_load_of_const_p (tree expr, bool is_write) -{ - if (is_write) -return false; - if (TREE_CODE (expr) == COMPONENT_REF) -expr = TREE_OPERAND (expr, 1); - if (TREE_CODE (expr) == VAR_DECL - || TREE_CODE (expr) == PARM_DECL - || TREE_CODE (expr) == FIELD_DECL) -{ - if (TREE_READONLY (expr)) - return true; -} - return false; -} - /* Instruments EXPR if needed. If any instrumentation is inserted, return true. */ static bool instrument_expr (gimple_stmt_iterator gsi, tree expr, bool is_write) { - enum tree_code tcode; tree base, rhs, expr_type, expr_ptr, builtin_decl; basic_block bb; HOST_WIDE_INT size; gimple stmt, g; location_t loc; - base = get_base_address (expr); - if (base == NULL_TREE - || TREE_CODE (base) == SSA_NAME - || TREE_CODE (base) == STRING_CST) -return false; - - tcode = TREE_CODE (expr); - - /* Below are things we do not instrument - (no possibility of races or not implemented yet). */ - if (/* Compiler-emitted artificial variables. */ - (DECL_P (expr) && DECL_ARTIFICIAL (expr)) - /* The var does not live in memory -> no possibility of races. */ - || (tcode == VAR_DECL - && !TREE_ADDRESSABLE (expr) - && TREE_STATIC (expr) == 0) - /* Not implemented. */ - || TREE_CODE (TREE_TYPE (expr)) == RECORD_TYPE - /* Not implemented. */ - || tcode == CONSTRUCTOR - /* Not implemented. */ - || tcode == PARM_DECL - /* Load of a const variable/parameter/field. */ - || is_load_of_const_p (expr, is_write)) -return false; - size = int_size_in_bytes (TREE_TYPE (expr)); if (size == -1) return false; @@ -153,18 +106,29 @@ instrument_expr (gimple_stmt_iterator gs tree offset; enum machine_mode mode; int volatilep = 0, unsignedp = 0; - get_inner_reference (expr, &bitsize, &bitpos, &offset, - &mode, &unsignedp, &volatilep, false); - if (bitpos % (size * BITS_PER_UNIT) - || bitsize != size * BITS_PER_UNIT) + base = get_inner_reference (expr, &bitsize, &bitpos, &offset, + &mode, &unsignedp, &volatilep, false); + + /* No need to instrument accesses to decls that don't escape, + they can't escape to other threads then. */ + if (DECL_P (base)) +{ + struct pt_solution pt; + memset (&pt, 0, sizeof (pt)); + pt.escaped = 1; + pt.ipa_escaped = flag_ipa_pta != 0; + pt.nonlocal = 1; + if (!pt_solution_includes (&pt, base)) + return false; + if (!is_global_var (base) && !may_be_aliased (base)) + return false; +} + + if (TREE_READONLY (base)) return false; - /* TODO: handle other case: ARRAY_RANGE_REF. */ - if (
Re: [patch c++]: 1 of 7 Fix for PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
On 11/29/2012 12:03 PM, Kai Tietz wrote: > Hello, > > this trivial patch fixes a bootstrap issue on LLP64 hosts. But it's trivial at the expense of truncating the pointer. Looking around, I'd think that something like casting to hwi (after uintptr_t), and then use HOST_WIDE_INT_PRINT_HEX as format would be the gcc way. > Index: cp/class.c > === > --- cp/class.c (Revision 193925) > +++ cp/class.c (Arbeitskopie) > @@ -7817,7 +7817,7 @@ dump_class_hierarchy_r (FILE *stream, >indented = maybe_indent_hierarchy (stream, indent, 0); >fprintf (stream, "%s (0x%lx) ", >type_as_string (BINFO_TYPE (binfo), TFF_PLAIN_IDENTIFIER), > - (unsigned long) binfo); > + (unsigned long) (uintptr_t) binfo); -- Pedro Alves
Re: [PATCH, RFC] PR 55415 : Pessimistic misalignment from eipa_sra pass
Hi, On Thu, Nov 29, 2012 at 12:54:20PM +0100, Richard Biener wrote: > On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor wrote: > > Hi, > > > > thanks for the review. When writing a reply I realized I indeed made > > a mistake or two in the part concerning prev_base and the code was not > > what it intended to be. I'll re-write it today. > > > > Nevertheless, I also have a question regarding a different place of > > the patch: > > > > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: > >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor wrote: > >> > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 > >> > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 > >> > *** unmodified_by_ref_scalar_representative > >> > *** 3891,3902 > >> > return repr; > >> > } > >> > > >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it is > >> > !associated with. */ > >> > > >> > static bool > >> > ! access_precludes_ipa_sra_p (struct access *access) > >> > { > >> > /* Avoid issues such as the second simple testcase in PR 42025. The > >> > problem > >> >is incompatible assign in a call statement (and possibly even in > >> > asm > >> >statements). This can be relaxed by using a new temporary but > >> > only for > >> > --- 3891,3903 > >> > return repr; > >> > } > >> > > >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it is > >> > !associated with. REQ_ALIGN is the minimum required alignment. */ > >> > > >> > static bool > >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int > >> > req_align) > >> > { > >> > + unsigned int exp_align; > >> > /* Avoid issues such as the second simple testcase in PR 42025. The > >> > problem > >> >is incompatible assign in a call statement (and possibly even in > >> > asm > >> >statements). This can be relaxed by using a new temporary but > >> > only for > >> > *** access_precludes_ipa_sra_p (struct acces > >> > *** 3908,3913 > >> > --- 3909,3918 > >> > || gimple_code (access->stmt) == GIMPLE_ASM)) > >> > return true; > >> > > >> > + exp_align = get_object_alignment (access->expr); > >> > >> Is access->expr from the callee or the caller? > > > > The callee. > > > >> > >> > + if (exp_align < req_align) > >> > + return true; > >> > + > >> > return false; > >> > } > >> > > >> > *** splice_param_accesses (tree parm, bool * > >> > *** 3943,3949 > >> > tree a1_alias_type; > >> > access = (*access_vec)[i]; > >> > modification = access->write; > >> > ! if (access_precludes_ipa_sra_p (access)) > >> > return NULL; > >> > a1_alias_type = reference_alias_ptr_type (access->expr); > >> > > >> > --- 3948,3954 > >> > tree a1_alias_type; > >> > access = (*access_vec)[i]; > >> > modification = access->write; > >> > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN > >> > (access->type))) > >> > >> access->type is not the type of the base object, right? > > > > No, it is the actual type of the scalar memory access. > > > >> Which means it is not correct here - the alignment of the access is that > >> what > >> get_object_alignment returns, which looks at the base as to whether to > >> lower the alignment compared to TYPE_ALIGN of the access type itself. > > > > Oh, so I have to do something like (or even reuse) may_be_unaligned_p > > from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other > > uses of get_object_alignment get this right...) > > Well, I think you should use get_object_alignment here, too, and punt if > that is less than TYPE_ALIGN (access->type) if you will end up using that. > > get_object_alignment (ref) is the authorative answer to alignment questions. > Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted > depends on whether get_object_alignment (ref) returns the same or bigger > alignment. But if the type is all you can propagate in this case (and not > an explicit alignment value) you have to punt if the information from the > type isn't conservative. > This must be some sort of misunderstanding, that is exactly what the code is doing. access_precludes_ipa_sra_p called get_object_alignment on the reference and then returned false (i.e. prevented the transformation) if the result was smaller than the TYPE_ALIGN of the type that was going to be propagated to the callee. Thanks, Martin
Re: [patch c++]: 1 of 7 Fix for PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
2012/11/29 Pedro Alves : > On 11/29/2012 12:03 PM, Kai Tietz wrote: >> Hello, >> >> this trivial patch fixes a bootstrap issue on LLP64 hosts. > > But it's trivial at the expense of truncating the pointer. > Looking around, I'd think that something like casting to hwi (after > uintptr_t), > and then use HOST_WIDE_INT_PRINT_HEX as format would be the gcc way. > >> Index: cp/class.c >> === >> --- cp/class.c (Revision 193925) >> +++ cp/class.c (Arbeitskopie) >> @@ -7817,7 +7817,7 @@ dump_class_hierarchy_r (FILE *stream, >>indented = maybe_indent_hierarchy (stream, indent, 0); >>fprintf (stream, "%s (0x%lx) ", >>type_as_string (BINFO_TYPE (binfo), TFF_PLAIN_IDENTIFIER), >> - (unsigned long) binfo); >> + (unsigned long) (uintptr_t) binfo); > > -- > Pedro Alves Well, hwi format is of course the way to display pointer complete. On the other hand are in most cases lower 32-bit sufficent on debug-output. Kai
Re: [patch c++]: 1 of 7 Fix for PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
On 11/29/2012 12:45 PM, Kai Tietz wrote: > 2012/11/29 Pedro Alves : >> On 11/29/2012 12:03 PM, Kai Tietz wrote: >>> Hello, >>> >>> this trivial patch fixes a bootstrap issue on LLP64 hosts. >> >> But it's trivial at the expense of truncating the pointer. >> Looking around, I'd think that something like casting to hwi (after >> uintptr_t), >> and then use HOST_WIDE_INT_PRINT_HEX as format would be the gcc way. >> >>> Index: cp/class.c >>> === >>> --- cp/class.c (Revision 193925) >>> +++ cp/class.c (Arbeitskopie) >>> @@ -7817,7 +7817,7 @@ dump_class_hierarchy_r (FILE *stream, >>>indented = maybe_indent_hierarchy (stream, indent, 0); >>>fprintf (stream, "%s (0x%lx) ", >>>type_as_string (BINFO_TYPE (binfo), TFF_PLAIN_IDENTIFIER), >>> - (unsigned long) binfo); >>> + (unsigned long) (uintptr_t) binfo); >> > Well, hwi format is of course the way to display pointer complete. On > the other hand are in most cases lower 32-bit sufficent on > debug-output. But what's the point of not making it complete? It's not like we're talking about a large amount of work to get it right. It should be a two-line patch? -- Pedro Alves
Re: PR55124 - tentative patch for ICE in PRE
On Thu, Nov 29, 2012 at 1:14 PM, Tom de Vries wrote: > On 29/11/12 11:26, Richard Biener wrote: >> I'm continuing trying to move value-ids back to PRE land. Your patch >> would be nice in a form that verifies the order is indeed topological, >> maybe you can re-work it in a way that does this in >> sorted_array_from_bitmap_set in a ENABLE_CHECKING piece? > > Richard, > > These are my current patches, tested together with tree-ssa.exp. > > > The first patch checks the topological order in sorted_array_from_bitmap_set. > Testing only this one with tree-ssa.exp gives 400 failures. > > Btw, I'm not 100% sure if this patch checks the required order. It's clear > what > topological order means if there is one expression per value. I've ran > tree-ssa.exp with an assert that the number of expressions and values in the > bitmap_set is equal in sorted_array_from_bitmap_set, and that passed, so that > seems to be the case generally, but I don't know if that's by design. > If there are more expressions with the same value, this patch is a 'weak' > check, > meaning a value is considered available if one expression with that value is > available. A 'strong' check would consider a value available if all > expressions > with that value are available. I can imagine doing clean on a strongly or > weakly > ordered array could give different results. > > > The second patch calculates value_id during pre. If you're working on the > value_id part, I'll stop here. Thanks. For the moment I am fixing value-id assigns in tree-ssa-sccvn.c but there seems to be an issue with possibly assigning the same value-ids for conflicting topology: if () { a = b + c; d = a + 1; } else { f = g + h; i = f - 1; } if for some reason we value-number 'a' and 'i' the same and 'd' and 'f' the same then there is no value-id ordering that sorts correctly according to expression dependencies (some reason being for example g == b + c, h = 2). So I'm indeed not sure that simply sorting after value-id and assigning "proper" value-ids will work at all. patch that adjusts value-id assignment in SCCVN according to PREs expectations attached. Richard. > Thanks, > - Tom > > 2012-11-29 Tom de Vries > > * tree-ssa-pre.c (sorted_array_from_bitmap_set): Use > EXECUTE_IF_AND_IN_BITMAP instead of EXECUTE_IF_SET_IN_BITMAP. Check > sort result ifdef ENABLE_CHECKING. Check if the sorted array has the > same number of expressions as the bitmap_set. > > p2 Description: Binary data
Re: [patch c++]: 1 of 7 Fix for PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Ok, here is the tested adjusted version using (u)intptr_t and hwi. Ok for apply? Kai Index: cp/class.c === --- cp/class.c (Revision 193925) +++ cp/class.c (Arbeitskopie) @@ -7815,9 +7815,9 @@ dump_class_hierarchy_r (FILE *stream, int i; indented = maybe_indent_hierarchy (stream, indent, 0); - fprintf (stream, "%s (0x%lx) ", + fprintf (stream, "%s (0x" HOST_WIDE_INT_PRINT_HEX ") ", type_as_string (BINFO_TYPE (binfo), TFF_PLAIN_IDENTIFIER), - (unsigned long) binfo); + (HOST_WIDE_INT) (uintptr_t) binfo); if (binfo != igo) { fprintf (stream, "alternative-path\n"); @@ -7839,10 +7839,10 @@ dump_class_hierarchy_r (FILE *stream, if (BINFO_PRIMARY_P (binfo)) { indented = maybe_indent_hierarchy (stream, indent + 3, indented); - fprintf (stream, " primary-for %s (0x%lx)", + fprintf (stream, " primary-for %s (0x" HOST_WIDE_INT_PRINT_HEX ")", type_as_string (BINFO_TYPE (BINFO_INHERITANCE_CHAIN (binfo)), TFF_PLAIN_IDENTIFIER), - (unsigned long)BINFO_INHERITANCE_CHAIN (binfo)); + (HOST_WIDE_INT) (uintptr_t) BINFO_INHERITANCE_CHAIN (binfo)); } if (BINFO_LOST_PRIMARY_P (binfo)) { if (ctor_vtbl_p) { if (!BINFO_VIRTUAL_P (binfo)) - fprintf (stream, " (0x%lx instance)", (unsigned long)binfo); + fprintf (stream, " (0x" HOST_WIDE_INT_PRINT_HEX " instance)", +(HOST_WIDE_INT) (uintptr_t) binfo); fprintf (stream, " in %s", type_as_string (t, TFF_PLAIN_IDENTIFIER)); } fprintf (stream, "\n");
constexpr vector operations
Hello, this patch fixes several ICEs when using constexpr SIMD vectors. Support for subscripting is still missing though, which is why I am not adding some static_asserts to the testcase. I don't use build_vector_from_ctor because it doesn't check for non-constant elements and doesn't handle vectors in a constructor (the middle-end produces those, and I expect the front-ends will too eventually). Bootstrap+testsuite on x86_64-linux-gnu. 2012-11-29 Marc Glisse PR c++/53094 gcc/ * fold-const.c (fold): Replace a CONSTRUCTOR with a VECTOR_CST. gcc/cp/ * cvt.c (ocp_convert): Call convert_to_vector. gcc/testsuite/ * g++.dg/ext/vector20.C: New testcase. -- Marc GlisseIndex: gcc/cp/cvt.c === --- gcc/cp/cvt.c(revision 193922) +++ gcc/cp/cvt.c(working copy) @@ -683,20 +683,22 @@ ocp_convert (tree type, tree expr, int c might be expected, since if one of the types is a typedef; the comparison in fold is just equality of pointers, not a call to comptypes. We don't call fold in this case because that can result in infinite recursion; fold will call convert, which will call ocp_convert, etc. */ return e; /* For complex data types, we need to perform componentwise conversion. */ else if (TREE_CODE (type) == COMPLEX_TYPE) return fold_if_not_in_template (convert_to_complex (type, e)); + else if (TREE_CODE (type) == VECTOR_TYPE) + return fold_if_not_in_template (convert_to_vector (type, e)); else if (TREE_CODE (e) == TARGET_EXPR) { /* Don't build a NOP_EXPR of class type. Instead, change the type of the temporary. */ TREE_TYPE (e) = TREE_TYPE (TARGET_EXPR_SLOT (e)) = type; return e; } else { /* We shouldn't be treating objects of ADDRESSABLE type as Index: gcc/testsuite/g++.dg/ext/vector20.C === --- gcc/testsuite/g++.dg/ext/vector20.C (revision 0) +++ gcc/testsuite/g++.dg/ext/vector20.C (revision 0) @@ -0,0 +1,7 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c++11" } */ + +typedef long vec __attribute__((vector_size (2 * sizeof (long; +constexpr vec v = { 3, 4 }; +constexpr vec s = v + v; +constexpr vec w = __builtin_shuffle (v, v); Property changes on: gcc/testsuite/g++.dg/ext/vector20.C ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: gcc/fold-const.c === --- gcc/fold-const.c(revision 193922) +++ gcc/fold-const.c(working copy) @@ -14380,20 +14380,49 @@ fold (tree expr) && tree_int_cst_lt (op1, TREE_OPERAND (index, 0))) end = middle; else return (*elts)[middle].value; } } return t; } + /* Return a VECTOR_CST if possible. */ +case CONSTRUCTOR: + { + tree type = TREE_TYPE (t); + if (TREE_CODE (type) != VECTOR_TYPE) + return t; + + tree *vec = XALLOCAVEC (tree, TYPE_VECTOR_SUBPARTS (type)); + unsigned HOST_WIDE_INT idx, pos = 0; + tree value; + + FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (t), idx, value) + { + if (!CONSTANT_CLASS_P (value)) + return t; + if (TREE_CODE (value) == VECTOR_CST) + { + for (unsigned i = 0; i < VECTOR_CST_NELTS (value); ++i) + vec[pos++] = VECTOR_CST_ELT (value, i); + } + else + vec[pos++] = value; + } + for (; pos < TYPE_VECTOR_SUBPARTS (type); ++pos) + vec[pos] = build_zero_cst (TREE_TYPE (type)); + + return build_vector (type, vec); + } + case CONST_DECL: return fold (DECL_INITIAL (t)); default: return t; } /* switch (code) */ } #ifdef ENABLE_FOLD_CHECKING #undef fold
Re: [patch tree-dump.c]: 7 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Updated variant using HOST_WIDE_INT_PRINT. Tested for i686-w64-mingw32 and x86_64-w64-mingw32. Ok for apply? Kai Index: tree-dump.c === --- tree-dump.c (Revision 193925) +++ tree-dump.c (Arbeitskopie) @@ -177,7 +177,8 @@ void dump_pointer (dump_info_p di, const char *field, void *ptr) { dump_maybe_newline (di); - fprintf (di->stream, "%-4s: %-8lx ", field, (unsigned long) ptr); + fprintf (di->stream, "%-4s: %-8" HOST_WIDE_INT_PRINT "x ", field, + (HOST_WIDE_INT) (uintptr_t) ptr); di->column += 15; }
[PATCH] PRE TLC
When working on PR55124 I noticed we do some useless work in PRE. This cleans it up and makes it more obvious what we do in compute_avail. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-11-29 Richard Biener * tree-ssa-pre.c (get_expr_value_id): Do not add expr to the set of value expressions here. (add_to_exp_gen, make_values_for_phi): Fold into ... (compute_avail): ... here, and avoid useless work. Dump avail sets in processing order. (do_pre): Do not dump avail sets here. Index: gcc/tree-ssa-pre.c === --- gcc/tree-ssa-pre.c (revision 193932) +++ gcc/tree-ssa-pre.c (working copy) @@ -612,28 +612,28 @@ bitmap_set_new (void) static unsigned int get_expr_value_id (pre_expr expr) { + unsigned int id; switch (expr->kind) { case CONSTANT: - { - unsigned int id; - id = get_constant_value_id (PRE_EXPR_CONSTANT (expr)); - if (id == 0) - { - id = get_or_alloc_constant_value_id (PRE_EXPR_CONSTANT (expr)); - add_to_value (id, expr); - } - return id; - } + id = get_or_alloc_constant_value_id (PRE_EXPR_CONSTANT (expr)); + break; case NAME: - return VN_INFO (PRE_EXPR_NAME (expr))->value_id; + id = VN_INFO (PRE_EXPR_NAME (expr))->value_id; + break; case NARY: - return PRE_EXPR_NARY (expr)->value_id; + id = PRE_EXPR_NARY (expr)->value_id; + break; case REFERENCE: - return PRE_EXPR_REFERENCE (expr)->value_id; + id = PRE_EXPR_REFERENCE (expr)->value_id; + break; default: gcc_unreachable (); } + /* ??? We cannot assert that expr has a value-id (it can be 0), because + we assign value-ids only to expressions that have a result + in set_hashtable_value_ids. */ + return id; } /* Return a SCCVN valnum (SSA name or constant) for the PRE value-id VAL. */ @@ -3624,48 +3624,6 @@ insert (void) } -/* Add OP to EXP_GEN (block), and possibly to the maximal set. */ - -static void -add_to_exp_gen (basic_block block, tree op) -{ - pre_expr result; - - if (TREE_CODE (op) == SSA_NAME && ssa_undefined_value_p (op)) -return; - - result = get_or_alloc_expr_for_name (op); - bitmap_value_insert_into_set (EXP_GEN (block), result); -} - -/* Create value ids for PHI in BLOCK. */ - -static void -make_values_for_phi (gimple phi, basic_block block) -{ - tree result = gimple_phi_result (phi); - unsigned i; - - /* We have no need for virtual phis, as they don't represent - actual computations. */ - if (virtual_operand_p (result)) -return; - - pre_expr e = get_or_alloc_expr_for_name (result); - add_to_value (get_expr_value_id (e), e); - bitmap_value_insert_into_set (AVAIL_OUT (block), e); - bitmap_insert_into_set (PHI_GEN (block), e); - for (i = 0; i < gimple_phi_num_args (phi); ++i) -{ - tree arg = gimple_phi_arg_def (phi, i); - if (TREE_CODE (arg) == SSA_NAME) - { - e = get_or_alloc_expr_for_name (arg); - add_to_value (get_expr_value_id (e), e); - } -} -} - /* Compute the AVAIL set for all basic blocks. This function performs value numbering of the statements in each basic @@ -3703,6 +3661,14 @@ compute_avail (void) bitmap_value_insert_into_set (AVAIL_OUT (ENTRY_BLOCK_PTR), e); } + if (dump_file && (dump_flags & TDF_DETAILS)) +{ + print_bitmap_set (dump_file, TMP_GEN (ENTRY_BLOCK_PTR), + "tmp_gen", ENTRY_BLOCK); + print_bitmap_set (dump_file, AVAIL_OUT (ENTRY_BLOCK_PTR), + "avail_out", ENTRY_BLOCK); +} + /* Allocate the worklist. */ worklist = XNEWVEC (basic_block, n_basic_blocks); @@ -3731,7 +3697,19 @@ compute_avail (void) /* Generate values for PHI nodes. */ for (gsi = gsi_start_phis (block); !gsi_end_p (gsi); gsi_next (&gsi)) - make_values_for_phi (gsi_stmt (gsi), block); + { + tree result = gimple_phi_result (gsi_stmt (gsi)); + + /* We have no need for virtual phis, as they don't represent +actual computations. */ + if (virtual_operand_p (result)) + continue; + + pre_expr e = get_or_alloc_expr_for_name (result); + add_to_value (get_expr_value_id (e), e); + bitmap_value_insert_into_set (AVAIL_OUT (block), e); + bitmap_insert_into_set (PHI_GEN (block), e); + } BB_MAY_NOTRETURN (block) = 0; @@ -3775,7 +3753,12 @@ compute_avail (void) continue; FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE) - add_to_exp_gen (block, op); + { + if (ssa_undefined_value_p (op)) + continue; + pre_expr e = get_or_alloc_expr_for_name (op); + bitmap_value_insert_into_set (EXP_GEN (block), e); + } switch (gimple_code (
Re: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold
On Thu, Nov 29, 2012 at 1:10 AM, Markus Trippelsdorf wrote: > On 2012.11.29 at 09:43 +0100, Richard Biener wrote: >> On Thu, Nov 29, 2012 at 5:18 AM, H.J. Lu wrote: >> > From: "H.J. Lu" >> > To: gcc-patches@gcc.gnu.org >> > Cc: Joseph Myers , Paolo Bonzini >> > >> > Bcc: >> > Subject: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold >> > Reply-To: >> > >> > Hi, >> > >> > Binutils supports 2 linkers, ld.gold and ld.bfd. One of them is >> > configured as the default linker, ld, which is used by GCC. Sometimes, >> > we want to use the alternate linker with GCC at run-time. This patch >> > adds -fuse-ld=bfd and -fuse-ld=gold options to GCC driver. It changes >> > collect2.c to pick either ld.bfd or ld.gold. It also adds >> > ORIGINAL_LD_BFD_FOR_TARGET and ORIGINAL_LD_GOLD_FOR_TARGET to >> > exec-tool.in to add -fuse-ld=bfd and -fuse-ld=gold support to >> > collect-ld. Since ld.bfd nor ld.gold know the new options, you >> > will get >> > >> > # ./xgcc -B./ /tmp/x.c -fuse-ld=gold -v >> > ... >> > ./collect-ld --eh-frame-hdr -m elf_x86_64 -dynamic-linker >> > /lib64/ld-linux-x86-64.so.2 -fuse-ld=gold /lib/../lib64/crt1.o >> > /lib/../lib64/crti.o ./crtbegin.o -L. -L/lib/../lib64 -L/usr/lib/../lib64 >> > /tmp/cclVWcGz.o -v -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc >> > --as-needed -lgcc_s --no-as-needed ./crtend.o /lib/../lib64/crtn.o >> > GNU gold (Linux/GNU Binutils 2.23.51.0.7.20121127) 1.11 >> > /usr/local/bin/ld.gold: fatal error: -f/--auxiliary may not be used >> > without -shared >> > collect2: error: ld returned 1 exit status >> > >> > This is because we pass everything to ld and ld.bfd/ld.gold doesn't >> > understand -fuse-ld=bfd/-fuse-ld=gold. It isn't a problem for collect2 >> > since it will filter-out -fuse-ld=bfd/-fuse-ld=gold. I will submit a >> > binutils patch to ignore -fuse-ld=bfd/-fuse-ld=gold, similar to -flto >> > options. >> > >> > OK to install? >> >> Do we need to consider GNU ld and gold coming from different binutils >> versions >> and thus do we need two sets of linker feature tests at configure >> time? Eventually >> also if GNU ld and gold are not in feature-parity for the same binutils >> version? This is no different from run-time linker vs configure-time linker. If it is a link-time failure, we can check: [hjl@gnu-6 tmp]$ gcc x.o -v -Wl,-v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.7.2/lto-wrapper Target: x86_64-redhat-linux Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --disable-build-with-cxx --disable-build-poststage1-with-cxx --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --with-multilib-list=m32,m64,mx32 --build=x86_64-redhat-linux Thread model: posix gcc version 4.7.2 20120921 (Red Hat 4.7.2-2) (GCC) COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/4.7.2/:/usr/libexec/gcc/x86_64-redhat-linux/4.7.2/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/4.7.2/:/usr/lib/gcc/x86_64-redhat-linux/ LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/4.7.2/:/usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../:/lib/:/usr/lib/ COLLECT_GCC_OPTIONS='-v' '-mtune=generic' '-march=x86-64' /usr/libexec/gcc/x86_64-redhat-linux/4.7.2/collect2 --build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux/4.7.2/crtbegin.o -L/usr/lib/gcc/x86_64-redhat-linux/4.7.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../.. x.o -v -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-redhat-linux/4.7.2/crtend.o /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib64/crtn.o collect2 version 4.7.2 20120921 (Red Hat 4.7.2-2) /usr/local/bin/ld --build-id --no-add-needed --eh-frame-hdr --hash-style=gnu -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../lib64/crti.o /usr/lib/gcc/x86_64-re
Re: [tsan] instrument_expr improvements
On Thu, Nov 29, 2012 at 4:24 PM, Jakub Jelinek wrote: > Hi! > > As discussed earlier on mailing list and on IRC with Richard, > pt_solution_includes can be used for testing of whether var > might escape current function (for -O0 unfortunately returns > true even for !may_be_aliased automatic vars, so I'm testing > for that too), This optimization would be great. -O0 performance is completely irrelevant, it just needs to work. tsan should be used with -O1. > and there are IMHO tons of completely unnecessary tests > and restrictions what we want to instrument and what not. > instrument_expr is guarded with gimple_store_p or > gimple_assign_load_p, so it shouldn't see SSA_NAMEs, and many other > ops, basically just DECL_Ps, handled_components, MEM_REF or TARGET_MEM_REF > where for handled_components the base address is one of DECL_P, MEM_REF > or TARGET_MEM_REF. > > I'm actually surprised that insturment_expr handles also is_gimple_call > in it, yet nothing calls it for calls, I bet for calls we also want to call > it for gimple_store_p calls. Originally I wrote it for slightly different instrumentation with no prior knowledge about gcc. The the pass placement was changes, then instrumentation was changed. It has a lot of artefacts. > Dmitry, could you please test this on something where the previous tsan > code has been tested on (if anything)? All I've done was eyeball a few > short testcases. OK, I will try to build it and test on something. I was not yet tested it on anything at all. > TSAN will badly need similar optimization pass to what ASAN needs (after > deferring expansion of the shadow memory checks), e.g. var++ right now > results in __tsan_read4 (&var); followed soon by __tsan_write4 (&var);. > With no intervening calls (we could ignore many string/memory builtins > I guess) and no intervening atomics it should be fine to just use > __tsan_write4 (&var); for that, right? Right. > Similarly for multiple accesses > to the same var, where first use dominates the other accesses and there > are no intervening calls. We can do it in some cases. However, it dominating access is read, and the second is conditional write, then we need to preserve both. > As for bitfields, I think we want to use DECL_BIT_FIELD_REPRESENTATIVE. > > 2012-11-29 Jakub Jelinek > > * tsan.c (is_load_of_const_p): Removed. > (instrument_expr): Use result of get_inner_reference > instead of get_base_address, avoid some unnecessary tests, > use !pt_solution_includes and !may_be_aliased tests to > check whether base might escape current function. > > --- gcc/tsan.c.jj 2012-11-23 15:27:38.0 +0100 > +++ gcc/tsan.c 2012-11-29 12:41:26.816857288 +0100 > @@ -84,65 +84,18 @@ is_vptr_store (gimple stmt, tree expr, b >return NULL; > } > > -/* Checks as to whether EXPR refers to constant var/field/param. > - Don't bother to instrument them. */ > - > -static bool > -is_load_of_const_p (tree expr, bool is_write) > -{ > - if (is_write) > -return false; > - if (TREE_CODE (expr) == COMPONENT_REF) > -expr = TREE_OPERAND (expr, 1); > - if (TREE_CODE (expr) == VAR_DECL > - || TREE_CODE (expr) == PARM_DECL > - || TREE_CODE (expr) == FIELD_DECL) > -{ > - if (TREE_READONLY (expr)) > - return true; > -} > - return false; > -} > - > /* Instruments EXPR if needed. If any instrumentation is inserted, > return true. */ > > static bool > instrument_expr (gimple_stmt_iterator gsi, tree expr, bool is_write) > { > - enum tree_code tcode; >tree base, rhs, expr_type, expr_ptr, builtin_decl; >basic_block bb; >HOST_WIDE_INT size; >gimple stmt, g; >location_t loc; > > - base = get_base_address (expr); > - if (base == NULL_TREE > - || TREE_CODE (base) == SSA_NAME > - || TREE_CODE (base) == STRING_CST) > -return false; > - > - tcode = TREE_CODE (expr); > - > - /* Below are things we do not instrument > - (no possibility of races or not implemented yet). */ > - if (/* Compiler-emitted artificial variables. */ > - (DECL_P (expr) && DECL_ARTIFICIAL (expr)) > - /* The var does not live in memory -> no possibility of races. */ > - || (tcode == VAR_DECL > - && !TREE_ADDRESSABLE (expr) > - && TREE_STATIC (expr) == 0) > - /* Not implemented. */ > - || TREE_CODE (TREE_TYPE (expr)) == RECORD_TYPE > - /* Not implemented. */ > - || tcode == CONSTRUCTOR > - /* Not implemented. */ > - || tcode == PARM_DECL > - /* Load of a const variable/parameter/field. */ > - || is_load_of_const_p (expr, is_write)) > -return false; > - >size = int_size_in_bytes (TREE_TYPE (expr)); >if (size == -1) > return false; > @@ -153,18 +106,29 @@ instrument_expr (gimple_stmt_iterator gs >tree offset; >enum machine_mode mode; >int volatilep = 0, unsignedp = 0; > - get_inner_reference (expr, &bitsize, &bitpos, &offset, > -
Re: [tsan] instrument_expr improvements
On Thu, 29 Nov 2012, Jakub Jelinek wrote: > Hi! > > As discussed earlier on mailing list and on IRC with Richard, > pt_solution_includes can be used for testing of whether var > might escape current function (for -O0 unfortunately returns > true even for !may_be_aliased automatic vars, so I'm testing > for that too), and there are IMHO tons of completely unnecessary tests > and restrictions what we want to instrument and what not. > instrument_expr is guarded with gimple_store_p or > gimple_assign_load_p, so it shouldn't see SSA_NAMEs, and many other > ops, basically just DECL_Ps, handled_components, MEM_REF or TARGET_MEM_REF > where for handled_components the base address is one of DECL_P, MEM_REF > or TARGET_MEM_REF. > > I'm actually surprised that insturment_expr handles also is_gimple_call > in it, yet nothing calls it for calls, I bet for calls we also want to call > it for gimple_store_p calls. > > Dmitry, could you please test this on something where the previous tsan > code has been tested on (if anything)? All I've done was eyeball a few > short testcases. > > TSAN will badly need similar optimization pass to what ASAN needs (after > deferring expansion of the shadow memory checks), e.g. var++ right now > results in __tsan_read4 (&var); followed soon by __tsan_write4 (&var);. > With no intervening calls (we could ignore many string/memory builtins > I guess) and no intervening atomics it should be fine to just use > __tsan_write4 (&var); for that, right? Similarly for multiple accesses > to the same var, where first use dominates the other accesses and there > are no intervening calls. > > As for bitfields, I think we want to use DECL_BIT_FIELD_REPRESENTATIVE. >From a middle-end POV this looks ok. Thanks, Richard. > 2012-11-29 Jakub Jelinek > > * tsan.c (is_load_of_const_p): Removed. > (instrument_expr): Use result of get_inner_reference > instead of get_base_address, avoid some unnecessary tests, > use !pt_solution_includes and !may_be_aliased tests to > check whether base might escape current function. > > --- gcc/tsan.c.jj 2012-11-23 15:27:38.0 +0100 > +++ gcc/tsan.c2012-11-29 12:41:26.816857288 +0100 > @@ -84,65 +84,18 @@ is_vptr_store (gimple stmt, tree expr, b >return NULL; > } > > -/* Checks as to whether EXPR refers to constant var/field/param. > - Don't bother to instrument them. */ > - > -static bool > -is_load_of_const_p (tree expr, bool is_write) > -{ > - if (is_write) > -return false; > - if (TREE_CODE (expr) == COMPONENT_REF) > -expr = TREE_OPERAND (expr, 1); > - if (TREE_CODE (expr) == VAR_DECL > - || TREE_CODE (expr) == PARM_DECL > - || TREE_CODE (expr) == FIELD_DECL) > -{ > - if (TREE_READONLY (expr)) > - return true; > -} > - return false; > -} > - > /* Instruments EXPR if needed. If any instrumentation is inserted, > return true. */ > > static bool > instrument_expr (gimple_stmt_iterator gsi, tree expr, bool is_write) > { > - enum tree_code tcode; >tree base, rhs, expr_type, expr_ptr, builtin_decl; >basic_block bb; >HOST_WIDE_INT size; >gimple stmt, g; >location_t loc; > > - base = get_base_address (expr); > - if (base == NULL_TREE > - || TREE_CODE (base) == SSA_NAME > - || TREE_CODE (base) == STRING_CST) > -return false; > - > - tcode = TREE_CODE (expr); > - > - /* Below are things we do not instrument > - (no possibility of races or not implemented yet). */ > - if (/* Compiler-emitted artificial variables. */ > - (DECL_P (expr) && DECL_ARTIFICIAL (expr)) > - /* The var does not live in memory -> no possibility of races. */ > - || (tcode == VAR_DECL > - && !TREE_ADDRESSABLE (expr) > - && TREE_STATIC (expr) == 0) > - /* Not implemented. */ > - || TREE_CODE (TREE_TYPE (expr)) == RECORD_TYPE > - /* Not implemented. */ > - || tcode == CONSTRUCTOR > - /* Not implemented. */ > - || tcode == PARM_DECL > - /* Load of a const variable/parameter/field. */ > - || is_load_of_const_p (expr, is_write)) > -return false; > - >size = int_size_in_bytes (TREE_TYPE (expr)); >if (size == -1) > return false; > @@ -153,18 +106,29 @@ instrument_expr (gimple_stmt_iterator gs >tree offset; >enum machine_mode mode; >int volatilep = 0, unsignedp = 0; > - get_inner_reference (expr, &bitsize, &bitpos, &offset, > -&mode, &unsignedp, &volatilep, false); > - if (bitpos % (size * BITS_PER_UNIT) > - || bitsize != size * BITS_PER_UNIT) > + base = get_inner_reference (expr, &bitsize, &bitpos, &offset, > + &mode, &unsignedp, &volatilep, false); > + > + /* No need to instrument accesses to decls that don't escape, > + they can't escape to other threads then. */ > + if (DECL_P (base)) > +{ > + struct pt_solution pt; > + memset (&pt, 0, sizeof (pt)); > + pt.escaped = 1;
[0/3][ARM] AArch32 NEON vrint builtins and intrinsics
Hi all, This set of patches adds support for the AArch32 vrint NEON builtins. It also adds related NEON intrinsics. It is organised in 3 patches: * The first patch adds support for the builtin versions of the instructions. * The second is a small testsuite patch that adds the effective target check and appropriate option-adding procedure for v8 NEON support and is required by the new intrinsic tests (and any further AArch32 NEON tests that are added in the future). * The third patch adds the intrinsics i.e. it contains the changes to the .ml scripts that are required to generate the arm_neon.h file as well as the tests and documentation. The patches have been tested on arm-none-eabi with qemu/model. New tests pass, no regressions. Thanks, Kyrill
[PATCH][ARM][2/3] AArch32 NEON vrint builtins and intrinsics
Hi all, This patch adds an effective target check to the testsuite for ARMv8 NEON support. A corresponding add_options procedure is added. This is used by the AArch32 NEON intrinsics tests. An entry in the documentation is added as well. Ok for trunk? Thanks, Kyrill gcc/ChangeLog 2012-11-29 Kyrylo Tkachov * doc/sourcebuild.texi: Document arm_v8_neon_ok. gcc/testsuite/ChangeLog 2012-11-29 Kyrylo Tkachov * lib/target-supports.exp (check_effective_target_arm_v8_neon_ok): New procedure. (add_options_for_arm_v8_neon): Likewise.diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 0f29326..ae05681 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1560,6 +1560,10 @@ Some multilibs may be incompatible with these options. ARM target supports @code{-mfpu=fp-armv8 -mfloat-abi=softfp}. Some multilibs may be incompatible with these options. +@item arm_v8_neon_ok +ARM target supports @code{-mfpu=neon-fp-armv8 -mfloat-abi=softfp}. +Some multilibs may be incompatible with these options. + @item arm_prefer_ldrd_strd ARM target prefers @code{LDRD} and @code{STRD} instructions over @code{LDM} and @code{STM} instructions. diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 5935346..ca78808 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -2107,6 +2107,22 @@ proc check_effective_target_arm_v8_vfp_ok {} { } } +# Return 1 if this is an ARM target supporting -mfpu=neon-fp-armv8 +# -mfloat-abi=softfp +proc check_effective_target_arm_v8_neon_ok {} { +if { [check_effective_target_arm32] } { + return [check_no_compiler_messages arm_v8_neon_ok object { + int foo (void) + { +__asm__ volatile ("vrintn.f32 q0, q0"); + return 0; + } + } "-mfpu=neon-fp-armv8 -mfloat-abi=softfp"] +} else { + return 0 +} +} + # Return 1 if this is an ARM target supporting -mfpu=vfp # -mfloat-abi=hard. Some multilibs may be incompatible with these # options. @@ -2166,6 +2182,13 @@ proc add_options_for_arm_v8_vfp { flags } { return "$flags -mfpu=fp-armv8 -mfloat-abi=softfp" } +proc add_options_for_arm_v8_neon { flags } { +if { ! [check_effective_target_arm_v8_neon_ok] } { +return "$flags" +} +return "$flags -mfpu=neon-fp-armv8 -mfloat-abi=softfp" +} + # Add the options needed for NEON. We need either -mfloat-abi=softfp # or -mfloat-abi=hard, but if one is already specified by the # multilib, use it. Similarly, if a -mfpu option already enables
[PATCH][ARM][1/3] AArch32 NEON vrint builtins and intrinsics
Hi all, This patch adds support for the vrint builtins. It also gathers the unspec definitions in the various .md files in one file: unspecs.md. A new iterator is defined that iterates over some new unspecs, in a similar way to the scalar vrint implementation. No regressions on arm-none-eabi. Ok for trunk? Thanks, Kyrill gcc/ChangeLog 2012-11-29 Kyrylo Tkachov * config/arm/arm.c (neon_itype): Define NEON_RINT enum element. (neon_builtin_data): Register vrintn, vrinta, vrintp, vrintm, vrintz, vrintx neon builtins. (arm_init_neon_builtins): Handle NEON_RINT. (arm_expand_neon_builtin): Likewise. * config/arm/unspecs.md: New file. * config/arm/arm.md ("unspec"): Move to unspecs.md. * config/arm/iterators.md (NEON_VRINT): New int iterator. (nvrint_variant): New int attribute. * config/arm/neon.md (neon_vrint): New pattern. ("unspec"): Move to unspecs.md. * config/arm/iwmmxt2.md ("unspec"): Move to unspecs.md.diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 50dcff5..49feac9 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -19002,6 +19002,7 @@ typedef enum { NEON_GETLANE, NEON_SETLANE, NEON_CREATE, + NEON_RINT, NEON_DUP, NEON_DUPLANE, NEON_COMBINE, @@ -19201,6 +19202,12 @@ static neon_builtin_datum neon_builtin_data[] = VAR4 (FIXCONV, vcvt_n, v2si, v2sf, v4si, v4sf), VAR10 (SELECT, vbsl, v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di), + VAR2 (RINT, vrintn, v2sf, v4sf), + VAR2 (RINT, vrinta, v2sf, v4sf), + VAR2 (RINT, vrintp, v2sf, v4sf), + VAR2 (RINT, vrintm, v2sf, v4sf), + VAR2 (RINT, vrintz, v2sf, v4sf), + VAR2 (RINT, vrintx, v2sf, v4sf), VAR1 (VTBL, vtbl1, v8qi), VAR1 (VTBL, vtbl2, v8qi), VAR1 (VTBL, vtbl3, v8qi), @@ -19828,6 +19835,7 @@ arm_init_neon_builtins (void) is_store = 1; /* Fall through. */ case NEON_UNOP: + case NEON_RINT: case NEON_BINOP: case NEON_LOGICBINOP: case NEON_SHIFTINSERT: @@ -21015,6 +21023,7 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target) NEON_ARG_COPY_TO_REG, NEON_ARG_STOP); case NEON_DUP: +case NEON_RINT: case NEON_SPLIT: case NEON_REINTERP: return arm_expand_neon_args (target, icode, 1, type_mode, exp, fcode, diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index cff1148..c2328cc 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -58,112 +58,6 @@ ] ) -;; UNSPEC Usage: -;; Note: sin and cos are no-longer used. -;; Unspec enumerators for Neon are defined in neon.md. -;; Unspec enumerators for iwmmxt2 are defined in iwmmxt2.md - -(define_c_enum "unspec" [ - UNSPEC_PUSH_MULT ; `push multiple' operation: -; operand 0 is the first register, -; subsequent registers are in parallel (use ...) -; expressions. - UNSPEC_PIC_SYM; A symbol that has been treated properly for pic -; usage, that is, we will add the pic_register -; value to it before trying to dereference it. - UNSPEC_PIC_BASE ; Add PC and all but the last operand together, -; The last operand is the number of a PIC_LABEL -; that points at the containing instruction. - UNSPEC_PRLG_STK ; A special barrier that prevents frame accesses -; being scheduled before the stack adjustment insn. - UNSPEC_REGISTER_USE ; As USE insns are not meaningful after reload, -; this unspec is used to prevent the deletion of -; instructions setting registers for EH handling -; and stack frame generation. Operand 0 is the -; register to "use". - UNSPEC_CHECK_ARCH ; Set CCs to indicate 26-bit or 32-bit mode. - UNSPEC_WSHUFH ; Used by the intrinsic form of the iWMMXt WSHUFH instruction. - UNSPEC_WACC ; Used by the intrinsic form of the iWMMXt WACC instruction. - UNSPEC_TMOVMSK; Used by the intrinsic form of the iWMMXt TMOVMSK instruction. - UNSPEC_WSAD ; Used by the intrinsic form of the iWMMXt WSAD instruction. - UNSPEC_WSADZ ; Used by the intrinsic form of the iWMMXt WSADZ instruction. - UNSPEC_WMACS ; Used by the intrinsic form of the iWMMXt WMACS instruction. - UNSPEC_WMACU ; Used by the intrinsic form of the iWMMXt WMACU instruction. - UNSPEC_WMACSZ ; Used by the intrinsic form of the iWMMXt WMACSZ instruction. - UNSPEC_WMACUZ ; Used by the intrinsic form of the iWMMXt WMACUZ instruction. - UNSPEC_CLRDI ; Used by the intrinsic form of the iWMMXt CLRDI instruction. - UNSPEC_WALIGNI; Used by the intrinsic form of the iWMMXt WALIGN instruction. - UNSPEC_TLS; A sym
[PATCH][ARM][3/3] AArch32 NEON vrint builtins and intrinsics
Hi all, This patch adds the intrinsics support for the vrnd intrinsics that are implemented by the vrint instructions. The .ml scripts contain the new information and should used to regenerate the arm_neon.h header file, tests and documentation. In particular: * config/arm/arm_neon.h should be regenerated using config/arm/neon-gen.ml. * doc/arm-neon-intrinsics.texi should be regenerated using config/arm/neon-docgen.ml. * The tests in testsuite/gcc.target/arm/neon/ should be generated using config/arm/neon-testgen.ml. All three of these scripts should be linked against the compiled neon.ml file i.e: $ ocamlc -c neon.ml $ ocamlc -o neon-gen neon.cmo neon-gen.ml The following intrinsics are defined: vrnd_f32 (float32x2_t a) (generating a vrintz instruction) vrndq_f32 (float32x4_t a) (generating a vrintz instruction) vrnda_f32 (float32x2_t a) (generating a vrinta instruction) vrndqa_f32 (float32x4_t a) (generating a vrinta instruction) vrndm_f32 (float32x2_t a) (generating a vrintm instruction) vrndqm_f32 (float32x4_t a) (generating a vrintm instruction) vrndn_f32 (float32x2_t a) (generating a vrintn instruction) vrndqn_f32 (float32x4_t a) (generating a vrintn instruction) vrndp_f32 (float32x2_t a) (generating a vrintp instruction) vrndqp_f32 (float32x4_t a) (generating a vrintp instruction) Note that AArch32 NEON does not support double precision floats, so we don't have _f64 versions. Tested on arm-none-eabi. New tests pass, no regressions (once the effective target checks patch is added). Ok for trunk? Thanks, Kyrill 2012-11-29 Kyrylo Tkachov * config/arm/neon.ml (opcode): Add Vrintn, Vrinta, Vrintp, Vrintm, Vrintz to type. (type features): Add Requires_arch type constructor. (ops): Define Vrintn, Vrinta, Vrintp, Vrintm, Vrintz features. * config/arm/neon-docgen.ml (intrinsic_groups): Define Vrintn, Vrinta, Vrintp, Vrintm, Vrintz, Vrintx. * config/arm/neon-testgen.ml (effective_target): Define check for Requires_arch 8. * config/arm/neon-gen.ml (print_feature_test_start): Handle Requires_arch. (print_feature_test_end): Likewise. * doc/arm-neon-intrinsics.texi: Regenerate. * config/arm/arm_neon.h: Regenerate. gcc/testsuite/ChangeLog 2012-11-29 Kyrylo Tkachov * gcc.target/arm/neon/vrndaf32.c: New test. * gcc.target/arm/neon/vrndqaf32.c: Likewise. * gcc.target/arm/neon/vrndf32.c: Likewise. * gcc.target/arm/neon/vrndqf32.c: Likewise. * gcc.target/arm/neon/vrndmf32.c: Likewise. * gcc.target/arm/neon/vrndqmf32.c: Likewise. * gcc.target/arm/neon/vrndnf32.c: Likewise. * gcc.target/arm/neon/vrndqnf32.c: Likewise. * gcc.target/arm/neon/vrndpf32.c: Likewise. * gcc.target/arm/neon/vrndqpf32.c: Likewise.diff --git a/gcc/config/arm/neon-docgen.ml b/gcc/config/arm/neon-docgen.ml index 043b1e0..228de16 100644 --- a/gcc/config/arm/neon-docgen.ml +++ b/gcc/config/arm/neon-docgen.ml @@ -105,6 +105,11 @@ let intrinsic_groups = "Multiply-subtract", single_opcode Vmls; "Fused-multiply-accumulate", single_opcode Vfma; "Fused-multiply-subtract", single_opcode Vfms; +"Round to integral (to nearest, ties to even)", single_opcode Vrintn; +"Round to integral (to nearest, ties away from zero)", single_opcode Vrinta; +"Round to integral (towards +Inf)", single_opcode Vrintp; +"Round to integral (towards -Inf)", single_opcode Vrintm; +"Round to integral (towards 0)", single_opcode Vrintz; "Subtraction", single_opcode Vsub; "Comparison (equal-to)", single_opcode Vceq; "Comparison (greater-than-or-equal-to)", single_opcode Vcge; diff --git a/gcc/config/arm/neon-gen.ml b/gcc/config/arm/neon-gen.ml index 6c4e272..c5f0583 100644 --- a/gcc/config/arm/neon-gen.ml +++ b/gcc/config/arm/neon-gen.ml @@ -290,17 +290,21 @@ let print_feature_test_start features = try match List.find (fun feature -> match feature with Requires_feature _ -> true +| Requires_arch _ -> true | _ -> false) features with Requires_feature feature -> Format.printf "#ifdef __ARM_FEATURE_%s@\n" feature +| Requires_arch arch -> +Format.printf "#if __ARM_ARCH >= %d@\n" arch | _ -> assert false with Not_found -> assert true let print_feature_test_end features = let feature = List.exists (function Requires_feature x -> true -| _ -> false) features in + | Requires_arch x -> true + | _ -> false) features in if feature then Format.printf "#endif@\n" diff --git a/gcc/config/arm/neon-testgen.ml b/gcc/config/arm/neon-testgen.ml index 4645f39..f6c8d9a 100644 --- a/gcc/config/arm/neon-testgen.ml +++ b/
Re: [patch] Fix PR middle-end/55321
> Well, there's a call at the tree level for __sync_synchronize as well. > > So the question is: why does __s_s ICE but memcmp doesn't? I suppose > it's the mere rarity with which HAVE_cmpmemsi is defined, and expansion > of gen_cmpmemsi fails. > > There are plenty of other calls to emit_library_call_value that are > not subsequently protected by emit_libcall_block_1, but they are probably > sufficiently rare on common platforms that they'd be difficult to trigger. You're right, here's a patch that marks all no-throw libcalls as no-nonlocal. Tested on x86_64-suse-linux, OK for the mainline? 2012-11-29 Eric Botcazou PR middle-end/55321 * calls.c (emit_library_call_value_1): Mark as no-nonlocal if no-throw. -- Eric BotcazouIndex: calls.c === --- calls.c (revision 193921) +++ calls.c (working copy) @@ -4196,13 +4196,11 @@ emit_library_call_value_1 (int retval, r that it should complain if nonvolatile values are live. For functions that cannot return, inform flow that control does not fall through. */ - if (flags & ECF_NORETURN) { /* The barrier note must be emitted immediately after the CALL_INSN. Some ports emit more than just a CALL_INSN above, so we must search for it here. */ - rtx last = get_last_insn (); while (!CALL_P (last)) { @@ -4214,6 +4212,21 @@ emit_library_call_value_1 (int retval, r emit_barrier_after (last); } + /* Consider that "regular" libcalls, i.e. all of them except for LCT_THROW + and LCT_RETURNS_TWICE, cannot perform non-local gotos. */ + if (flags & ECF_NOTHROW) +{ + rtx last = get_last_insn (); + while (!CALL_P (last)) + { + last = PREV_INSN (last); + /* There was no CALL_INSN? */ + gcc_assert (last != before_call); + } + + make_reg_eh_region_note_nothrow_nononlocal (last); +} + /* Now restore inhibit_defer_pop to its actual original value. */ OK_DEFER_POP;
Re: [PATCH, RFC] PR 55415 : Pessimistic misalignment from eipa_sra pass
On Thu, Nov 29, 2012 at 1:33 PM, Martin Jambor wrote: > Hi, > > On Thu, Nov 29, 2012 at 12:54:20PM +0100, Richard Biener wrote: >> On Thu, Nov 29, 2012 at 11:06 AM, Martin Jambor wrote: >> > Hi, >> > >> > thanks for the review. When writing a reply I realized I indeed made >> > a mistake or two in the part concerning prev_base and the code was not >> > what it intended to be. I'll re-write it today. >> > >> > Nevertheless, I also have a question regarding a different place of >> > the patch: >> > >> > On Wed, Nov 28, 2012 at 02:11:51PM +0100, Richard Biener wrote: >> >> On Tue, Nov 27, 2012 at 9:37 PM, Martin Jambor wrote: >> >> > *** /tmp/Hp0Wyc_tree-sra.c Tue Nov 27 21:34:54 2012 >> >> > --- gcc/tree-sra.c Tue Nov 27 21:28:53 2012 >> >> > *** unmodified_by_ref_scalar_representative >> >> > *** 3891,3902 >> >> > return repr; >> >> > } >> >> > >> >> > ! /* Return true iff this access precludes IPA-SRA of the parameter it >> >> > is >> >> > !associated with. */ >> >> > >> >> > static bool >> >> > ! access_precludes_ipa_sra_p (struct access *access) >> >> > { >> >> > /* Avoid issues such as the second simple testcase in PR 42025. >> >> > The problem >> >> >is incompatible assign in a call statement (and possibly even in >> >> > asm >> >> >statements). This can be relaxed by using a new temporary but >> >> > only for >> >> > --- 3891,3903 >> >> > return repr; >> >> > } >> >> > >> >> > ! /* Return true iff this ACCESS precludes IPA-SRA of the parameter it >> >> > is >> >> > !associated with. REQ_ALIGN is the minimum required alignment. */ >> >> > >> >> > static bool >> >> > ! access_precludes_ipa_sra_p (struct access *access, unsigned int >> >> > req_align) >> >> > { >> >> > + unsigned int exp_align; >> >> > /* Avoid issues such as the second simple testcase in PR 42025. >> >> > The problem >> >> >is incompatible assign in a call statement (and possibly even in >> >> > asm >> >> >statements). This can be relaxed by using a new temporary but >> >> > only for >> >> > *** access_precludes_ipa_sra_p (struct acces >> >> > *** 3908,3913 >> >> > --- 3909,3918 >> >> > || gimple_code (access->stmt) == GIMPLE_ASM)) >> >> > return true; >> >> > >> >> > + exp_align = get_object_alignment (access->expr); >> >> >> >> Is access->expr from the callee or the caller? >> > >> > The callee. >> > >> >> >> >> > + if (exp_align < req_align) >> >> > + return true; >> >> > + >> >> > return false; >> >> > } >> >> > >> >> > *** splice_param_accesses (tree parm, bool * >> >> > *** 3943,3949 >> >> > tree a1_alias_type; >> >> > access = (*access_vec)[i]; >> >> > modification = access->write; >> >> > ! if (access_precludes_ipa_sra_p (access)) >> >> > return NULL; >> >> > a1_alias_type = reference_alias_ptr_type (access->expr); >> >> > >> >> > --- 3948,3954 >> >> > tree a1_alias_type; >> >> > access = (*access_vec)[i]; >> >> > modification = access->write; >> >> > ! if (access_precludes_ipa_sra_p (access, TYPE_ALIGN >> >> > (access->type))) >> >> >> >> access->type is not the type of the base object, right? >> > >> > No, it is the actual type of the scalar memory access. >> > >> >> Which means it is not correct here - the alignment of the access is that >> >> what >> >> get_object_alignment returns, which looks at the base as to whether to >> >> lower the alignment compared to TYPE_ALIGN of the access type itself. >> > >> > Oh, so I have to do something like (or even reuse) may_be_unaligned_p >> > from tree-ssa-loop-ivopts.c? (If so, I'm wondering whether a few other >> > uses of get_object_alignment get this right...) >> >> Well, I think you should use get_object_alignment here, too, and punt if >> that is less than TYPE_ALIGN (access->type) if you will end up using that. >> >> get_object_alignment (ref) is the authorative answer to alignment questions. >> Whether TYPE_ALIGN (TREE_TYPE (ref)) can be conservatively trusted >> depends on whether get_object_alignment (ref) returns the same or bigger >> alignment. But if the type is all you can propagate in this case (and not >> an explicit alignment value) you have to punt if the information from the >> type isn't conservative. >> > > This must be some sort of misunderstanding, that is exactly what the > code is doing. access_precludes_ipa_sra_p called get_object_alignment > on the reference and then returned false (i.e. prevented the > transformation) if the result was smaller than the TYPE_ALIGN of the > type that was going to be propagated to the callee. Hm, ok, then this was a confusion on my part. Richard. > Thanks, > > Martin >
Re: [RFA/ARM] Fix PR54974: Thumb literal pools don't handle PC rounding
On 24 November 2012 00:27, Ramana Radhakrishnan wrote: > On Wed, Nov 21, 2012 at 7:59 PM, Matthew Gretton-Dann > wrote: [snip] >> The fix is to decrease the pool_range of all insns by 2 when generating >> Thumb code. There is no need to change neg_pool_range values as rounding >> down here will reduce the distance of the literal pool. > > A comment about this fact around thumb2_pool_range would be appropriate. > [snip] >> >> Tested arm-none-linux-gnueabi cross, and with the testcase attached to the >> PR. No added testcase in the patch as this code is sensitive to other code >> generation and so it is not easy to generate a testcase which will reliably >> test this condition. >> >> OK for trunk, 4.7, and 4.6? > > > Ok for trunk today - please wait a few days before backporting into > 4.6 and 4.7 to see what the fallout is like . Watch out for any > fallout with the auto-testers. The attached is what was actually committed as revision 193930 (original patch + requested comment). Thanks, Matt -- Matthew Gretton-Dann Linaro Toolchain Working Group matthew.gretton-d...@linaro.org Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 193929) +++ gcc/ChangeLog (revision 193930) @@ -1,3 +1,33 @@ +2012-11-29 Matthew Gretton-Dann + + PR target/54974 + * config/arm/arm.md (thumb2_pool_range, pool_range): Add comment on + Thumb pool ranges. + (thumb1_extendhisi2): Reduce Thumb pool range. + (arm_movdi): Likewise. + (thumb1_movdi_insn): Likewise. + (thumb1_movsi_insn): Likewise. + (pic_load_addr_unified): Likewise. + (pic_load_addr_32bit): Likewise. + (pic_load_addr_thumb1): Likewise. + (thumb1_movhf): Likewise. + (arm_movsf_soft_insn): Likewise. + (thumb1_movsf_soft_insn): Likewise. + (movdf_soft_insn): Likewise. + (thumb1_movdf_soft_insn): Likewise. + * config/arm/neon.md (*neon_mov): Likewise. + (*neon_mov): Likwise. + * config/arm/thumb2.md: (*thumb2_movsi_insn): Likewise. + (*thumb2_movhi_insn): Likewise. + (*thumb2_extendqisi_v6): Likewise. + (*thumb2_zero_extendqisi_v6): Likewise. + (*thumb2_zero_extendqisi2_v6): Likewise. + * config/arm/vfp.md: (*thumb2_movsi_vfp): Likewise. + (*movdi_vfp): Likewise. + (*movdi_vfp_cortexa8): Likewise. + (*thumb2_movsf_vfp): Likewise. + (*thumb2_movdf_vfp): Likewise. + 2012-11-29 Kai Tietz PR target/55171 Index: gcc/config/arm/thumb2.md === --- gcc/config/arm/thumb2.md(revision 193929) +++ gcc/config/arm/thumb2.md(revision 193930) @@ -182,7 +182,7 @@ str%?\\t%1, %0" [(set_attr "type" "*,*,*,*,load1,load1,store1,store1") (set_attr "predicable" "yes") - (set_attr "pool_range" "*,*,*,*,1020,4096,*,*") + (set_attr "pool_range" "*,*,*,*,1018,4094,*,*") (set_attr "neg_pool_range" "*,*,*,*,0,0,*,*")] ) @@ -217,7 +217,7 @@ ldr%(h%)\\t%0, %1\\t%@ movhi" [(set_attr "type" "*,*,store1,load1") (set_attr "predicable" "yes") - (set_attr "pool_range" "*,*,*,4096") + (set_attr "pool_range" "*,*,*,4094") (set_attr "neg_pool_range" "*,*,*,250")] ) @@ -570,7 +570,7 @@ ldr%(sb%)\\t%0, %1" [(set_attr "type" "alu_shift,load_byte") (set_attr "predicable" "yes") - (set_attr "pool_range" "*,4096") + (set_attr "pool_range" "*,4094") (set_attr "neg_pool_range" "*,250")] ) @@ -583,7 +583,7 @@ ldr%(h%)\\t%0, %1" [(set_attr "type" "alu_shift,load_byte") (set_attr "predicable" "yes") - (set_attr "pool_range" "*,4096") + (set_attr "pool_range" "*,4094") (set_attr "neg_pool_range" "*,250")] ) @@ -596,7 +596,7 @@ ldr%(b%)\\t%0, %1\\t%@ zero_extendqisi2" [(set_attr "type" "alu_shift,load_byte") (set_attr "predicable" "yes") - (set_attr "pool_range" "*,4096") + (set_attr "pool_range" "*,4094") (set_attr "neg_pool_range" "*,250")] ) Index: gcc/config/arm/vfp.md === --- gcc/config/arm/vfp.md (revision 193929) +++ gcc/config/arm/vfp.md (revision 193930) @@ -123,7 +123,7 @@ (set_attr "type" "*,*,*,*,load1,load1,store1,store1,r_2_f,f_2_r,fcpys,f_loads,f_stores") (set_attr "neon_type" "*,*,*,*,*,*,*,*,neon_mcr,neon_mrc,neon_vmov,*,*") (set_attr "insn" "mov,mov,mvn,mov,*,*,*,*,*,*,*,*,*") - (set_attr "pool_range" "*,*,*,*,1020,4096,*,*,*,*,*,1020,*") + (set_attr "pool_range" "*,*,*,*,1018,4094,*,*,*,*,*,1018,*") (set_attr "neg_pool_range" "*,*,*,*, 0, 0,*,*,*,*,*,1008,*")] ) @@ -176,7 +176,8 @@ (const_int 8) (const_int 4))] (const_int 4))) - (set_attr "pool_range" "*,*,*,*,1020,4096,*,*,*,*,1020,*") + (set_attr "arm_pool_range" "*,*,*,*,1020,4096,*,*,*,*,1020,*") +
Re: [RFA/4.7/ARM] Backport arm-*-linux-gnueabihf triplet support to 4.7
PING On 20 November 2012 20:34, Matthew Gretton-Dann wrote: > All, > > This patch backports Matthais Klose's arm*-*-linux-gnueabihf triplet > support patch of 2012-10-15 to 4.7. > > The backport was not clean as 4.8 has obsoleted various arm*-*-* > triplets which are valid in 4.7. > > I have tested this cross with arm-none-linux-gnueabihf and > arm-none-linux-gnueabi. > > One question I do have having done this work - is there a canonical way to > test for the arm*-*-linux-gnueabi triplet (or variants)? Various configure > and testsuite files test for this, but there doesn't seem to be a consistent > method. > > OK for 4.7? > > Thanks, > > Matt > > 2012-11-08 Matthew Gretton-Dann > > Backport from mainline > 2012-10-15 Matthias Klose > > * config.gcc: Match arm*-*-linux-* for ARM Linux/GNU. > * doc/install.texi: Use arm-*-*linux-* instead of > arm-*-*linux-gnueabi. > > gcc/ada/ChangeLog: > 2012-11-08 Matthew Gretton-Dann > > Backport from mainline. > 2012-10-15 Matthias Klose > > * gcc-interface/Makefile.in: Match arm*-*-linux-*eabi* for > ARM Linux/GNU. > > gcc/testsuite/ChangeLog: > 2012-11-08 Matthew Gretton-Dann > > Backport from mainline > 2012-10-15 Matthias Klose > > * lib/target-supports.exp (check_profiling_available): Match > arm*-*-linux-* for ARM Linux/GNU. > * gfortran.dg/enum_10.f90: Likewise. > * gfortran.dg/enum_9.f90: Likewise. > * gcc.target/arm/synchronize.c: Likewise. > * g++.old-deja/g++.jason/enum6.C: Likewise. > * g++.old-deja/g++.law/enum9.C: Likewise. > * g++.old-deja/g++.other/enum4.C: Likewise. > > libgcc/ChangeLog: > 2012-11-08 Matthew Gretton-Dann > Backport from mainline. > 2012-10-15 Matthias Klose > > * config.host: Match arm*-*-linux-* for ARM Linux/GNU. > > libjava/ChangeLog: > 2012-11-08 Matthew Gretton-Dann > > Backport from mainline. > 2012-10-15 Matthias Klose > > * configure.ac: Match arm*-*-linux-* for ARM Linux/GNU. > * configure: Regenerate. > > libstdc++-v3/ChangeLog: > 2012-11-08 Matthew Gretton-Dann > > Backport from mainline > 2012-10-15 Matthias Klose > > * configure.host: Match arm*-*-linux-* for ARM Linux/GNU. > * testsuite/20_util/make_signed/requirements/typedefs-2.cc: Likewise. > * testsuite/20_util/make_unsigned/requirements/typedefs-2.cc: > Likewise. > > -- > Matthew Gretton-Dann > Linaro Toolchain Working Group > matthew.gretton-d...@linaro.org -- Matthew Gretton-Dann Linaro Toolchain Working Group matthew.gretton-d...@linaro.org
[patch] RFA: more fixes for PR55006
On Thu, Nov 29, 2012 at 12:53 AM, Steven Bosscher wrote: > Attached is a different fix. It splits DF_REF_IN_NOTE in two: One flag > for each kind of note. This allows the dead note removal code to > distinguish the source note for the EQ_USES. I needed to remove one > flag to keep the df_ref_flags 16-bit, but the DF_REF_SUBREG flag looks > completely unnecessary to me, and only ira.c uses it, so it wasn't to > hard to scavenge a single bit. > > > The patch also includes all places I've found so far where the > compiler could create self-referencing notes: > > 1. optabs.c: Not sure what it was trying to do, but now it just > refuses to add a note if TARGET is mentioned in one of the source > operands. > > 2. gcse.c: gcse_emit_move_after added notes, but none of them was very > useful as far as I could tell, and almost all of them turned > self-referencing after CPROP. So I propose we just never add notes in > this case. > > 3. cprop.c: It seems to me that the purpose here is to propagate > constants. If a reg could not be propagated, then the REG_EQUAL note > will not help much either. Propagating constants via REG_EQUAL notes > still helps folding comparisons sometimes, so I'm proposing we only > propagate those. As a bonus: less garbage RTL because a > cprop_constant_p can be shared. > > 4. fwprop.c: If the SET_DEST is a REG that is mentioned in the > SET_SRC, don' create a note. This one I'm not very happy with, because > it doesn't handle the case that the REG is somehow simplified out of > the SET_SRC, but being smarter about this would only complicate > things. I for one can't think of anything better for the moment, > anyway. > > > Finally, it makes sense to compute the NOTE problem for CSE. > > Bootstrap&testing in progress at the older revision I've been using to > debug the darwin10 and sparc bugs. I'll test trunk head on x86_64 and > powerpc later. In the mean time, let me hear what you think of this > one :-) So I've tested this patch: Bootstrapped&tested on x86_64-unknown-linux-gnu at r193394. Bootstrapped&tested on x86_64-unknown-linux-gnu at r193924. Bootstrapped&tested on powerpc64-unknown-linux-gnu at r193924. Of course, I also verified that the darwin10 and sparc issues that spawned this long thread are fixed. OK for trunk? Ciao! Steven PR55006_2.diff Description: Binary data
RE: [PATCH,ARM] Subdivide alu into alu_reg and simple_alu_imm
> -Original Message- > From: Richard Earnshaw > Sent: 29 November 2012 10:12 > To: Greta Yorsh > Cc: GCC Patches; Ramana Radhakrishnan; ni...@redhat.com; > p...@codesourcery.com > Subject: Re: [PATCH,ARM] Subdivide alu into alu_reg and simple_alu_imm > > > ; ??? Check Thumb-2 split length > (define_insn_and_split "*arm_subsi3_insn" > - [(set (match_operand:SI 0 "s_register_operand" > "=r,r,rk,r") > - (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,k,?n") > - (match_operand:SI 2 "reg_or_int_operand" "r,rI,r, r")))] > + [(set (match_operand:SI 0 "s_register_operand" > "=r,r,r,rk,r") > + (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,r,k,?n") > + (match_operand:SI 2 "reg_or_int_operand" "r,I,r,r, r")))] > "TARGET_32BIT" > "@ > rsb%?\\t%0, %2, %1 > sub%?\\t%0, %1, %2 > sub%?\\t%0, %1, %2 > + sub%?\\t%0, %1, %2 > #" > "&& (CONST_INT_P (operands[1]) > && !const_ok_for_arm (INTVAL (operands[1])))" > @@ -1270,8 +1295,9 @@ (define_insn_and_split "*arm_subsi3_insn > INTVAL (operands[1]), operands[0], operands[2], > 0); > DONE; > " > - [(set_attr "length" "4,4,4,16") > - (set_attr "predicable" "yes")] > + [(set_attr "length" "4,4,4,4,16") > + (set_attr "predicable" "yes") > + (set_attr "type" "*,simple_alu_imm,*,*,*")] > ) > > > There's something wrong here. MINUS (reg, imm) should be canonicalized > elsewhere to PLUS (reg, -imm), so the alternative you've split /should/ > never match anything. On the other hand, you haven't split the first > alternative (that generates RSB), which is a legitimate use of an > immediate in MINUS. The "rI" constraint on operand 1 in the first alternative is not split because RSB instruction it generates cannot dual-issue (even with an immediate operand) in the second slot on Cortex-A7, and so the alternative should not be marked as simple_alu_imm. Thanks, Greta > > Otherwise, OK. > > R. > > On 29/11/12 09:57, Greta Yorsh wrote: > > For attribute named "type", subdivide "alu" into "alu_reg" and > > "simple_alu_imm". > > Set type attribute as appropriate in define_insn patterns with > immediate > > operands. > > Update pipeline descriptions to use the new values of type attribute. > > > > No regression on qemu arm-none-eabi -mcpu=cortex-a15/cortex-a7. > > > > Bootstrap successful on Cortex-A15. > > > > No difference in generated assembly when compiling all of > preprocessed > > sources of gcc 4.8 as a test in various configurations: -mcpu=cortex- > a15 > > -march=armv6t2 -marm/-mthumb -O0/-O1/-O2/-O3/-Os. > > > > The motivation for this patch is cortex-a7 pipeline description, > which will > > be submitted separately. > > > > Ok for trunk? > > > > Thanks, > > Greta > > > > ChangeLog > > > > gcc/ > > > > 2012-11-28 Ramana Radhakrishnan > > Greta Yorsh > > > > * config/arm/arm.md (type): Subdivide "alu" into "alu_reg" > and > > "simple_alu_imm". > > (core_cycles): Use new names. > > (arm_addsi3): Set type attribute for patterns involving > > simple_alu_imm. > > (addsi3_compare0, addsi3_compare0_scratch): Likewise. > > (addsi3_compare_op1, addsi3_compare_op2, > compare_addsi2_op0): > > Likewise. > > (compare_addsi2_op1, arm_subsi3_insn, subsi3_compare0): > Likewise. > > (subsi3_compare, arm_decscc,arm_andsi3_insn): Likewise. > > (thumb1_andsi3_insn, andsi3_compare0_scratch): Likewise. > > (zeroextractsi_compare0_scratch, iorsi3_insn, > iorsi3_compare0): > > Likewise. > > (iorsi3_compare0_scratch, arm_xorsi3, thumb1_xorsi3_insn): > Likewise. > > (xorsi3_compare0, xorsi3_compare0_scratch, > thumb1_zero_extendhisi2): > > Likewise. > > (arm_zero_extendhisi2_v6, thumb1_zero_extendqisi2_v): > Likewise. > > (arm_zero_extendqisi2_v6, thumb1_extendhisi2, > arm_extendqisi_v6): > > Likewise. > > (thumb1_extendqisi2, arm_movsi_insn): Likewise. > > (movsi_compare0, movhi_insn_arch4, movhi_bytes): Likewise. > > (arm_movqi_insn, thumb1_movqi_insn, arm_cmpsi_insn): > Likewise. > > (movsicc_insn, if_plus_move, if_move_plus): Likewise. > > * config/arm/neon.md (neon_mov/VDX): Likewise. > > (neon_mov/VQXMOV): Likewise. > > * config/arm/arm1020e.md (1020alu_op): Likewise. > > * config/arm/fmp626.md (mp626_alu_op): Likewise. > > * config/arm/fa726te.md (726te_alu_op): Likewise. > > * config/arm/fa626te.md (626te_alu_op): Likewise. > > * config/arm/fa606te.md (606te_alu_op): Likewise. > > * config/arm/fa526.md (526_alu_op): Likewise. > > * config/arm/cortex-r4.md (cortex_r4_alu, cortex_r4_mov): > Likewise. > > * config/arm/cortex-m4.md (cortex_m4_alu): Likewise. > > * config/arm/cortex-a9.md (cprtex_a9_dp): Likewise. > > * config/arm/cortex-a8.md (cortex_a8_alu, cor
Re: [PATCH][AARCH64] Refactor constant generation
On 26/11/12 16:50, Sofiane Naci wrote: Hi, Constant building in the AArch64 backend spits out assembly code, which affects scheduling of the generated code. This patch rewrites the code to use RTL patterns. A full aarch64-none-elf regression run shows no issues. Thanks Sofiane - ChangeLog: 2012-11-26 Sofiane Naci * config/aarch64/aarch64.c (aarch64_build_constant): Update prototype. Call emit_move_insn instead of printing movi/movn/movz instructions. Call gen_insv_immdi instead of printing movk instruction. (aarch64_add_constant): Update prototype. Generate RTL instead of priting add/sub instructions. (aarch64_output_mi_thunk): Update calls to aarch64_build_constant and aarch64_add_constant. OK. Please back port to aarch64-4.7-branch. /Marcus
Re: [PATCH,ARM] Subdivide alu into alu_reg and simple_alu_imm
On 29/11/12 14:58, Greta Yorsh wrote: -Original Message- From: Richard Earnshaw Sent: 29 November 2012 10:12 To: Greta Yorsh Cc: GCC Patches; Ramana Radhakrishnan; ni...@redhat.com; p...@codesourcery.com Subject: Re: [PATCH,ARM] Subdivide alu into alu_reg and simple_alu_imm ; ??? Check Thumb-2 split length (define_insn_and_split "*arm_subsi3_insn" - [(set (match_operand:SI 0 "s_register_operand" "=r,r,rk,r") - (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,k,?n") - (match_operand:SI 2 "reg_or_int_operand" "r,rI,r, r")))] + [(set (match_operand:SI 0 "s_register_operand" "=r,r,r,rk,r") + (minus:SI (match_operand:SI 1 "reg_or_int_operand" "rI,r,r,k,?n") + (match_operand:SI 2 "reg_or_int_operand" "r,I,r,r, r")))] "TARGET_32BIT" "@ rsb%?\\t%0, %2, %1 sub%?\\t%0, %1, %2 sub%?\\t%0, %1, %2 + sub%?\\t%0, %1, %2 #" "&& (CONST_INT_P (operands[1]) && !const_ok_for_arm (INTVAL (operands[1])))" @@ -1270,8 +1295,9 @@ (define_insn_and_split "*arm_subsi3_insn INTVAL (operands[1]), operands[0], operands[2], 0); DONE; " - [(set_attr "length" "4,4,4,16") - (set_attr "predicable" "yes")] + [(set_attr "length" "4,4,4,4,16") + (set_attr "predicable" "yes") + (set_attr "type" "*,simple_alu_imm,*,*,*")] ) There's something wrong here. MINUS (reg, imm) should be canonicalized elsewhere to PLUS (reg, -imm), so the alternative you've split /should/ never match anything. On the other hand, you haven't split the first alternative (that generates RSB), which is a legitimate use of an immediate in MINUS. The "rI" constraint on operand 1 in the first alternative is not split because RSB instruction it generates cannot dual-issue (even with an immediate operand) in the second slot on Cortex-A7, and so the alternative should not be marked as simple_alu_imm. Thanks, Greta Ah! Um... OK then. R. Otherwise, OK. R. On 29/11/12 09:57, Greta Yorsh wrote: For attribute named "type", subdivide "alu" into "alu_reg" and "simple_alu_imm". Set type attribute as appropriate in define_insn patterns with immediate operands. Update pipeline descriptions to use the new values of type attribute. No regression on qemu arm-none-eabi -mcpu=cortex-a15/cortex-a7. Bootstrap successful on Cortex-A15. No difference in generated assembly when compiling all of preprocessed sources of gcc 4.8 as a test in various configurations: -mcpu=cortex- a15 -march=armv6t2 -marm/-mthumb -O0/-O1/-O2/-O3/-Os. The motivation for this patch is cortex-a7 pipeline description, which will be submitted separately. Ok for trunk? Thanks, Greta ChangeLog gcc/ 2012-11-28 Ramana Radhakrishnan Greta Yorsh * config/arm/arm.md (type): Subdivide "alu" into "alu_reg" and "simple_alu_imm". (core_cycles): Use new names. (arm_addsi3): Set type attribute for patterns involving simple_alu_imm. (addsi3_compare0, addsi3_compare0_scratch): Likewise. (addsi3_compare_op1, addsi3_compare_op2, compare_addsi2_op0): Likewise. (compare_addsi2_op1, arm_subsi3_insn, subsi3_compare0): Likewise. (subsi3_compare, arm_decscc,arm_andsi3_insn): Likewise. (thumb1_andsi3_insn, andsi3_compare0_scratch): Likewise. (zeroextractsi_compare0_scratch, iorsi3_insn, iorsi3_compare0): Likewise. (iorsi3_compare0_scratch, arm_xorsi3, thumb1_xorsi3_insn): Likewise. (xorsi3_compare0, xorsi3_compare0_scratch, thumb1_zero_extendhisi2): Likewise. (arm_zero_extendhisi2_v6, thumb1_zero_extendqisi2_v): Likewise. (arm_zero_extendqisi2_v6, thumb1_extendhisi2, arm_extendqisi_v6): Likewise. (thumb1_extendqisi2, arm_movsi_insn): Likewise. (movsi_compare0, movhi_insn_arch4, movhi_bytes): Likewise. (arm_movqi_insn, thumb1_movqi_insn, arm_cmpsi_insn): Likewise. (movsicc_insn, if_plus_move, if_move_plus): Likewise. * config/arm/neon.md (neon_mov/VDX): Likewise. (neon_mov/VQXMOV): Likewise. * config/arm/arm1020e.md (1020alu_op): Likewise. * config/arm/fmp626.md (mp626_alu_op): Likewise. * config/arm/fa726te.md (726te_alu_op): Likewise. * config/arm/fa626te.md (626te_alu_op): Likewise. * config/arm/fa606te.md (606te_alu_op): Likewise. * config/arm/fa526.md (526_alu_op): Likewise. * config/arm/cortex-r4.md (cortex_r4_alu, cortex_r4_mov): Likewise. * config/arm/cortex-m4.md (cortex_m4_alu): Likewise. * config/arm/cortex-a9.md (cprtex_a9_dp): Likewise. * config/arm/cortex-a8.md (cortex_a8_alu, cortex_a8_mov): Likewise. * config/arm/cortex-a5.md (cortex_a5_alu): Likewise. * config/arm/cortex-a15.md (cortex_a15_alu): Likewise. * config/arm/arm926ejs.md (9_alu_op): Likewise. * con
Re: [Patch AArch64-4.7] Backport - Refactor Advanced SIMD builtin initialisation.
On 28/11/12 11:07, James Greenhalgh wrote: Hi, The backport for this patch was not entirely clean, __builtin_thread_pointer was made a front end builtin on the 4.8 branch, but remains a back-end builtin on aarch64-4.7. Otherwise, there are no differences between this patch and the patch which was committed to mainline last week. OK to commit to AArch64-4.7? Thanks, James --- gcc/ChangeLog.aarch64 2012-11-28 James Greenhalgh Backport from mainline. 2012-11-20 James Greenhalgh Tejas Belagod * config/aarch64/aarch64-builtins.c (aarch64_simd_builtin_type_bits): Rename to... (aarch64_simd_builtin_type_mode): ...this, make sequential. (aarch64_simd_builtin_datum): Refactor members. (VAR1, VAR2, ..., VAR12): Update accordingly. (aarch64_simd_builtin_data): Include from aarch64-simd-builtins.def. (aarch64_builtins): Update accordingly. (init_aarch64_simd_builtins): Refactor, rename to... (aarch64_init_simd_builtins): ...this. (aarch64_simd_builtin_compare): Remove. (locate_simd_builtin_icode): Likewise. * config/aarch64/aarch64-protos.h (aarch64_init_builtins): New. (aarch64_expand_builtin): Likewise. (aarch64_load_tp): Likewise. * config/aarch64/aarch64-simd-builtins.def: New file. * config/aarch64/aarch64.c (aarch64_init_builtins): Move to aarch64-builtins.c. (aarch64_expand_builtin): Likewise. (aarch64_load_tp): Remove static designation. * config/aarch64/aarch64.h (aarch64_builtins): Move to aarch64-builtins.c. OK /Marcus
Re: [PATCH] Instance information in DWARF discriminators
> "Thomas" == Thomas Quinot writes: Thomas> As for the size impact on line_map_ordinary, indeed that's an additional Thomas> int here, but I'm not sure we allocate so many objects of that kind that Thomas> the increase is significant. Could you please measure it? My guess is that it is worse than you think. Thomas> Now, an alternative might be to have an array of instance IDs stored at Thomas> the struct line_maps level, with the same indices at the set of ordinary Thomas> maps, which would be allocated only when flag_debug_instances is used; Thomas> when it is not used the cost would then be a single NULL pointer in Thomas> struct line_maps. Would that be acceptable? I thought there was already a similar approach in place. See location_adhoc_data. The proliferation of ways to do this is mildly worrying. Tom
Re: constexpr vector operations
OK. Jason
Re: [patch c++]: 1 of 7 Fix for PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
OK. Jason
Re: [PATCH][AARCH64] Fix the name mangling of va_list
Please find the updated patch that improves the comment added to the test in the generic part of the testsuite. Thanks, Yufeng On 11/26/12 09:55, Marcus Shawcroft wrote: On 21/11/12 14:31, Yufeng Zhang wrote: Hi, This patch updates the AArch64 port to mangle __va_list as it is in namespace std in C++. This is specified in the AArch64 AAPCS64 ABI doc. OK for the trunk? Thanks, Yufeng gcc/ChangeLog 2012-11-21 Yufeng Zhang * config/aarch64/aarch64.c (aarch64_mangle_type): New function. (TARGET_MANGLE_TYPE): Define. The change to the AArch64 port itself is OK. gcc/testsuite/ChangeLog 2012-11-21 Yufeng Zhang * g++.dg/abi/arm_va_list.C: Also test on aarch64*-*-*. // AAPCS \S 7.1.4 requires that va_list be a typedef for "struct // __va_list". The mangling is as if it were "std::__va_list". +// So is required for the AArch64 target. The functional change in this test makes sense however the comment change is slightly confusing. The original comment refers to the procedure call standard for AArch32: IHI0042D_aapcs.pdf. The procedure call standard for AArch64 is defined in IHI0055A_aapcs.pdf. This document also discusses va_list in chapter 7.1.4. Perhaps a different form of words distinguishes between the two different PCS documents would be better? Perhaps something along these lines: // AAPCS \S 7.1.4 requires that va_list be a typedef for "struct // __va_list". The mangling is as if it were "std::__va_list". // AArch64 PCS IHI0055A_aapcs64.pdf \S 7.1.4 requires that va_list // be a typedef for "struct __va_list". The mangling is as if it // were "std::__va_list". In any case I don;t believe I can OK this change to the generic part of the test suite. Suggest you CC Mike Stump or Janis Johnson. Cheers /Marcus diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index d4708bf..42f3a40 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -5853,6 +5853,20 @@ aarch64_preferred_simd_mode (enum machine_mode mode) return word_mode; } +/* Implement TARGET_MANGLE_TYPE. */ + +const char * +aarch64_mangle_type (const_tree type) +{ + /* The AArch64 ABI documents say that "__va_list" has to be + managled as if it is in the "std" namespace. */ + if (lang_hooks.types_compatible_p (CONST_CAST_TREE (type), va_list_type)) +return "St9__va_list"; + + /* Use the default mangling. */ + return NULL; +} + /* Return the equivalent letter for size. */ static unsigned char sizetochar (int size) @@ -6778,6 +6792,9 @@ aarch64_c_mode_for_suffix (char suffix) #undef TARGET_LIBGCC_CMP_RETURN_MODE #define TARGET_LIBGCC_CMP_RETURN_MODE aarch64_libgcc_cmp_return_mode +#undef TARGET_MANGLE_TYPE +#define TARGET_MANGLE_TYPE aarch64_mangle_type + #undef TARGET_MEMORY_MOVE_COST #define TARGET_MEMORY_MOVE_COST aarch64_memory_move_cost diff --git a/gcc/testsuite/g++.dg/abi/arm_va_list.C b/gcc/testsuite/g++.dg/abi/arm_va_list.C index 45a426a..4f6f3a4 100644 --- a/gcc/testsuite/g++.dg/abi/arm_va_list.C +++ b/gcc/testsuite/g++.dg/abi/arm_va_list.C @@ -1,9 +1,10 @@ -// { dg-do compile } +// { dg-do compile { target { aarch64*-*-* arm*-*-* } } } // { dg-options "-Wno-abi" } -// { dg-require-effective-target arm_eabi } +// { dg-require-effective-target arm_eabi { target arm*-*-* } } // AAPCS \S 7.1.4 requires that va_list be a typedef for "struct // __va_list". The mangling is as if it were "std::__va_list". +// AAPCS64 \S 7.1.4 has the same requirement for AArch64 targets. // #include typedef __builtin_va_list va_list;
Re: [0/8] Add optabs alternatives for insv, extv and extzv
> I agree that this kind of MEM is less than ideal, but I thought: > > set_mem_attributes_minus_bitpos (to_rtx, to, 0, bitpos); > > said that the attributes of TO_RTX will to be TO _once a pending offset-and- > narrowing operation has been applied_. So we have: > > /* If we modified OFFSET based on T, then subtract the outstanding > bit position offset. Similarly, increase the size of the accessed > object to contain the negative offset. */ > if (apply_bitpos) > { > gcc_assert (attrs.offset_known_p); > attrs.offset -= apply_bitpos / BITS_PER_UNIT; > if (attrs.size_known_p) > attrs.size += apply_bitpos / BITS_PER_UNIT; > } > > I didn't think we necessarily expected the width of the reference > (TO_RTX) and the width of the type (TO) to match at this stage. > That's different from adjust_bitfield_address, where the > offset-and-narrowing operation itself is applied. I was essentially thinking of the size adjustment just above that one: if the mode size is known, setting a smaller size without further ado seems awkward. So the questionable MEM doesn't survive long? OK, maybe... > The difference between the width of the reference and the width > of T is what led to: > >http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00262.html > > As things stand, APPLY_BITPOS is only nonzero if we set both the > MEM_EXPR and MEM_SIZE from T. There are also cases (like this one) > where we don't set the MEM_EXPR from T but do set the MEM_SIZE from T. > The bitpos will be applied either way, so I thought MEM_SIZE should be > the same in both cases. That doesn't fix this problem of course, it's > just an argument that the relationship between the width of the reference > mode, the MEM_SIZE and the width of T seems pretty complicated with the > current interface. MEM_SIZE and MEM_EXPR are used alone by the aliasing machinery to disambiguate memory references, so they need be conservative wrt the actual memory access. > Maybe set_mem_attributes_minus_bitpos (but not set_mem_attributes) > should only set the MEM_EXPR and leave the MEM_SIZE unchanged? > > Before submitting the patched linked above, I tried getting rid > of set_mem_attributes_minus_bitpos and passing the tree down instead. > Then we could set the attributes at the time of the offset-and-narrowing > operation, where the size and offset of the final reference are known. > That didn't seem like an easy change to make though, and became a > bit of a distraction from the main patches. > > Anyway, given the breakage that this series has already caused, > I'd prefer not to tackle stuff like this as well. I'd only used > MEM_SIZE in the first attempted patch out of habit. I think the > revised patch more obviously matches the *_fixed_bit_field functions > and is more generally in keeping with the existing checks. > (It's deliberately more conservative though, only using register > bitfields if both the bit_field_mode_iterator and strict volatile > bitfield rules are met.) Well, rewriting the bitfield machinery of the middle-end is a once-in-a-decade undertaking, so some fallouts are to be expected. :-) That wasn't too bad in the end. But I agree with the cautious approach from now on. -- Eric Botcazou
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Thu, Nov 29, 2012 at 09:34:31AM +0100, Richard Biener wrote: > Definitely not - that means to not preserve loops until after cprop. The goal > is to preserve loops everywhere! Yikes, sorry, it wasn't clear to me what PROP_loops really does. Anyway, I think I have a better fix now. The problem is just that when removing BB 4 (which was a header), we have to zap ->header and ->latch. We already have code for this: if (current_loops != NULL && e->src->loop_father->latch == e->src) { /* ??? Now we are creating (or may create) a loop with multiple entries. Simply mark it for removal. Alternatively we could not do this threading. */ e->src->loop_father->header = NULL; e->src->loop_father->latch = NULL; } but the thing is that when there are multiple latch edges, then ->latch is NULL. So we need to keep track of how many latch edges the header has. Regtested/bootstrapped on x86_64, ok for trunk? Can I get rid of may_be_loop_header (and just use n_latch_edges > 0 instead at that one place) in a followup? 2012-11-29 Marek Polacek PR middle-end/54838 * cprop.c (bypass_block): Set header and latch to NULL when BB has more than one latch edge. (n_latches): New variable. * gcc.dg/pr54838.c: New test. --- gcc/cprop.c.mp 2012-11-29 15:49:53.120524295 +0100 +++ gcc/cprop.c 2012-11-29 15:50:01.421547832 +0100 @@ -1499,6 +1499,7 @@ bypass_block (basic_block bb, rtx setcc, int may_be_loop_header; unsigned removed_p; unsigned i; + unsigned n_latch_edges; edge_iterator ei; insn = (setcc != NULL) ? setcc : jump; @@ -1510,13 +1511,12 @@ bypass_block (basic_block bb, rtx setcc, if (note) find_used_regs (&XEXP (note, 0), NULL); - may_be_loop_header = false; + n_latch_edges = 0; FOR_EACH_EDGE (e, ei, bb->preds) if (e->flags & EDGE_DFS_BACK) - { - may_be_loop_header = true; - break; - } + n_latch_edges++; + + may_be_loop_header = n_latch_edges > 0; change = 0; for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ) @@ -1605,7 +1605,8 @@ bypass_block (basic_block bb, rtx setcc, && dest != EXIT_BLOCK_PTR) { if (current_loops != NULL - && e->src->loop_father->latch == e->src) + && (e->src->loop_father->latch == e->src + || n_latch_edges > 1)) { /* ??? Now we are creating (or may create) a loop with multiple entries. Simply mark it for --- gcc/testsuite/gcc.dg/pr54838.c.mp 2012-11-26 14:48:43.783980854 +0100 +++ gcc/testsuite/gcc.dg/pr54838.c 2012-11-26 14:49:51.051158719 +0100 @@ -0,0 +1,24 @@ +/* PR middle-end/54838 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-forward-propagate -ftracer" } */ + +void bar (void); + +void +foo (void *b, int *c) +{ +again: + switch (*c) +{ +case 1: + if (!b) + { + bar (); + return; + } + goto again; +case 3: + if (!b) + goto again; +} +} Marek
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Thu, Nov 29, 2012 at 04:38:52PM +0100, Marek Polacek wrote: > 2012-11-29 Marek Polacek > > PR middle-end/54838 > * cprop.c (bypass_block): Set header and latch to NULL when > BB has more than one latch edge. > (n_latches): New variable. Of course here should be (n_latch_edges). Marek
[PATCH] Fix for PR55492 : __atomic_load doesn't match ACQUIRE memory model
Hi, on ARMv7, the code generated for the __atomic_load builtins in the __ATOMIC_ACQUIRE memory model, puts a memory barrier before the load, whereas the semantic of the acquire memory model implies a barrier after. The issue seems to be in expand_atomic_load which puts a memory fence before the load in any memory model. The attached patch fixes the problem. Thanks, Yvan Fix-load-acquire.patch Description: Binary data
Re: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold
Am 29.11.2012 09:43, schrieb Richard Biener: > Do we need to consider GNU ld and gold coming from different binutils versions > and thus do we need two sets of linker feature tests at configure > time? both GNU ld and gold are built from the same sources. So it is likely that they come from the same binutils version (at least that is the current situation). > Eventually > also if GNU ld and gold are not in feature-parity for the same binutils > version? > > That is, isn't this going to create hard to debug issues for users? How would that be different than today's issues? If both versions are built, then any of them can be picked up once these are installed. How is today's situation? Debian/Ubuntu do build both linkers (and I assume other distributions do as well). One is the default (currently GNU ld), and you can make the other the system wide default using diversions (a distribution mechanism to change /usr/bin/ld). I have: $ ls -l /usr/lib/*-ld /usr/lib/compat-ld: lrwxrwxrwx 1 root root 16 Nov 26 08:13 ld -> ../../bin/ld.bfd /usr/lib/gold-ld: lrwxrwxrwx 1 root root 17 Nov 26 08:13 ld -> ../../bin/ld.gold so to explicitly use one linker you have to know the location, and then pass the appropriate -B option. Having this option makes this choice independent of the location. Having this location independence is my rationale for this patch. Matthias
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Thu, Nov 29, 2012 at 4:38 PM, Marek Polacek wrote: > On Thu, Nov 29, 2012 at 09:34:31AM +0100, Richard Biener wrote: >> Definitely not - that means to not preserve loops until after cprop. The >> goal >> is to preserve loops everywhere! > > Yikes, sorry, it wasn't clear to me what PROP_loops really does. Anyway, > I think I have a better fix now. The problem is just that when removing > BB 4 (which was a header), we have to zap ->header and ->latch. We > already have code for this: > > if (current_loops != NULL > && e->src->loop_father->latch == e->src) > { > /* ??? Now we are creating (or may create) a loop > with multiple entries. Simply mark it for > removal. Alternatively we could not do this > threading. */ > e->src->loop_father->header = NULL; > e->src->loop_father->latch = NULL; > } > > but the thing is that when there are multiple latch edges, then > ->latch is NULL. So we need to keep track of how many latch edges > the header has. Regtested/bootstrapped on x86_64, ok for trunk? > > Can I get rid of may_be_loop_header (and just use n_latch_edges > 0 > instead at that one place) in a followup? > > 2012-11-29 Marek Polacek <> > > PR middle-end/54838 > * cprop.c (bypass_block): Set header and latch to NULL when > BB has more than one latch edge. > (n_latches): New variable. You don't have to mention a new local variable in the ChangeLog. But FWIW, not all DFS back edges are latches. Maybe name it n_back_edges? > * gcc.dg/pr54838.c: New test. > > --- gcc/cprop.c.mp 2012-11-29 15:49:53.120524295 +0100 > +++ gcc/cprop.c 2012-11-29 15:50:01.421547832 +0100 > @@ -1499,6 +1499,7 @@ bypass_block (basic_block bb, rtx setcc, >int may_be_loop_header; >unsigned removed_p; >unsigned i; > + unsigned n_latch_edges; >edge_iterator ei; > >insn = (setcc != NULL) ? setcc : jump; > @@ -1510,13 +1511,12 @@ bypass_block (basic_block bb, rtx setcc, >if (note) > find_used_regs (&XEXP (note, 0), NULL); > > - may_be_loop_header = false; > + n_latch_edges = 0; >FOR_EACH_EDGE (e, ei, bb->preds) > if (e->flags & EDGE_DFS_BACK) > - { > - may_be_loop_header = true; > - break; > - } > + n_latch_edges++; > + > + may_be_loop_header = n_latch_edges > 0; > >change = 0; >for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ) > @@ -1605,7 +1605,8 @@ bypass_block (basic_block bb, rtx setcc, > && dest != EXIT_BLOCK_PTR) > { > if (current_loops != NULL > - && e->src->loop_father->latch == e->src) > + && (e->src->loop_father->latch == e->src > + || n_latch_edges > 1)) > { > /* ??? Now we are creating (or may create) a loop > with multiple entries. Simply mark it for It seems to me that this threading should just not happen. Creating loops with multiple entries is something to be avoided because most loop-based optimizations don't work on irreducible regions. So this affects all passes that run after CPROP, including unrolling, IRA, the scheduler, etc. There is already code that tries to avoid creating multi-entry loops: /* The irreducible loops created by redirecting of edges entering the loop from outside would decrease effectiveness of some of the following optimizations, so prevent this. */ if (may_be_loop_header && !(e->flags & EDGE_DFS_BACK)) { ei_next (&ei); continue; } Apparently your test case manages to slip through, and I wonder why. Ciao! Steven
Re: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold
Am 29.11.2012 10:10, schrieb Markus Trippelsdorf: > Additionally, what is the rationale for this patch? IOW why isn't the > following > enough? > > x4 ~ # ln -f /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld.gold > /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld > x4 ~ # ld -v > GNU gold (GNU Binutils 2.23.51.20121126) 1.11 > x4 ~ # ln -f /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld.bfd > /usr/x86_64-pc-linux-gnu/binutils-bin/git/ld > x4 ~ # ld -v > GNU ld (GNU Binutils) 2.23.51.20121126 because you have to find out about these hard-coded path for every single installation of binutils.
Re: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold
On Thu, Nov 29, 2012 at 12:43 AM, Richard Biener wrote: > On Thu, Nov 29, 2012 at 5:18 AM, H.J. Lu wrote: >> From: "H.J. Lu" >> To: gcc-patches@gcc.gnu.org >> Cc: Joseph Myers , Paolo Bonzini >> Bcc: >> Subject: [PATCH] Support -fuse-ld=bfd and -fuse-ld=gold >> Reply-To: >> >> Hi, >> >> Binutils supports 2 linkers, ld.gold and ld.bfd. One of them is >> configured as the default linker, ld, which is used by GCC. Sometimes, >> we want to use the alternate linker with GCC at run-time. This patch >> adds -fuse-ld=bfd and -fuse-ld=gold options to GCC driver. It changes >> collect2.c to pick either ld.bfd or ld.gold. It also adds >> ORIGINAL_LD_BFD_FOR_TARGET and ORIGINAL_LD_GOLD_FOR_TARGET to >> exec-tool.in to add -fuse-ld=bfd and -fuse-ld=gold support to >> collect-ld. Since ld.bfd nor ld.gold know the new options, you >> will get >> >> # ./xgcc -B./ /tmp/x.c -fuse-ld=gold -v >> ... >> ./collect-ld --eh-frame-hdr -m elf_x86_64 -dynamic-linker >> /lib64/ld-linux-x86-64.so.2 -fuse-ld=gold /lib/../lib64/crt1.o >> /lib/../lib64/crti.o ./crtbegin.o -L. -L/lib/../lib64 -L/usr/lib/../lib64 >> /tmp/cclVWcGz.o -v -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc >> --as-needed -lgcc_s --no-as-needed ./crtend.o /lib/../lib64/crtn.o >> GNU gold (Linux/GNU Binutils 2.23.51.0.7.20121127) 1.11 >> /usr/local/bin/ld.gold: fatal error: -f/--auxiliary may not be used without >> -shared >> collect2: error: ld returned 1 exit status >> >> This is because we pass everything to ld and ld.bfd/ld.gold doesn't >> understand -fuse-ld=bfd/-fuse-ld=gold. It isn't a problem for collect2 >> since it will filter-out -fuse-ld=bfd/-fuse-ld=gold. I will submit a >> binutils patch to ignore -fuse-ld=bfd/-fuse-ld=gold, similar to -flto >> options. >> >> OK to install? > > Do we need to consider GNU ld and gold coming from different binutils versions > and thus do we need two sets of linker feature tests at configure > time? Eventually > also if GNU ld and gold are not in feature-parity for the same binutils > version? > > That is, isn't this going to create hard to debug issues for users? > Actually, it helps to debug issues for users. At the moment, we don't know which linker is the default linker. With -fuse-ld, we can specify which linker to use. -- H.J.
Re: [PATCH] Add --with-build-config=bootstrap-asan support
On Thu, Nov 22, 2012 at 12:37:47PM -0800, H.J. Lu wrote: > 2012-11-21 H.J. Lu > > * Makefile.def (target_modules): Add bootstrap=true and > raw_cxx=true to libsanitizer. > * configure.ac (bootstrap_target_libs): Add libsanitizer. > * Makefile.in: Regenerated. > * configure: Likewise. > > config/ > > 2012-11-21 H.J. Lu > > * bootstrap-asan.mk: New file. Ok, thanks. Jakub
Re: [PATCH] Stream profile summary histogram through LTO files (issue6782131)
> This patch ensures that the histograms from the profile summary are streamed > through the LTO files so that the working set can be computed for use in > downstream optimizations. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? > > Thanks, > Teresa > > 2012-11-28 Teresa Johnson > > * lto-cgraph.c (output_profile_summary): Stream out sum_all > and histogram. > (input_profile_summary): Stream in sum_all and histogram. > (merge_profile_summaries): Merge sum_all and histogram. > (input_symtab): Call compute_working_sets after merging > summaries. > * gcov-io.c (gcov_histo_index): Make extern for compiler. > * gcov-io.h (gcov_histo_index): Ditto. > * profile.c (compute_working_sets): Remove static keyword. > * profile.h (compute_working_sets): Ditto. OK. > > Index: lto-cgraph.c > === > --- lto-cgraph.c (revision 193909) > +++ lto-cgraph.c (working copy) > @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-streamer.h" > #include "gcov-io.h" > #include "tree-pass.h" > +#include "profile.h" Please update dependencies in Makefile.in > + /* Count number of non-zero histogram entries, and fill in a bit vector > + of non-zero indices. */ > + counters. */ > + for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++) > +histo_bitvector[bv_ix] = 0; > + for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++) > +{ > + if (profile_info->histogram[h_ix].num_counters > 0) > +{ > + histo_bitvector[h_ix / 32] |= 1 << (h_ix % 32); > + h_cnt++; > +} I think this would be more readable if you just produced a bitpack instead of doing it by hand, like into gcov-io. > + lto_gcov_summary.sum_all = MAX (lto_gcov_summary.sum_all, > + (file_data->profile_info.sum_all > + * scale > + + REG_BR_PROB_BASE / 2) > + / REG_BR_PROB_BASE); Use RDIV for the scaling. > -#if IN_LIBGCOV || !IN_GCOV > +#if !IN_GCOV > /* Determine the index into histogram for VALUE. */ > > -static unsigned > +GCOV_LINKAGE unsigned I would probably go around the trouble of declaring this static in GCOV, so it is inlined at we do not bload libgcov more than needed. Thanks, Honza
Re: [PATCH] Stream profile summary histogram through LTO files (issue6782131)
On Thu, Nov 29, 2012 at 8:17 AM, Jan Hubicka wrote: >> This patch ensures that the histograms from the profile summary are streamed >> through the LTO files so that the working set can be computed for use in >> downstream optimizations. >> >> Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk? >> >> Thanks, >> Teresa >> >> 2012-11-28 Teresa Johnson >> >> * lto-cgraph.c (output_profile_summary): Stream out sum_all >> and histogram. >> (input_profile_summary): Stream in sum_all and histogram. >> (merge_profile_summaries): Merge sum_all and histogram. >> (input_symtab): Call compute_working_sets after merging >> summaries. >> * gcov-io.c (gcov_histo_index): Make extern for compiler. >> * gcov-io.h (gcov_histo_index): Ditto. >> * profile.c (compute_working_sets): Remove static keyword. >> * profile.h (compute_working_sets): Ditto. > > OK. >> >> Index: lto-cgraph.c >> === >> --- lto-cgraph.c (revision 193909) >> +++ lto-cgraph.c (working copy) >> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3. If not see >> #include "tree-streamer.h" >> #include "gcov-io.h" >> #include "tree-pass.h" >> +#include "profile.h" > > Please update dependencies in Makefile.in ok. >> + /* Count number of non-zero histogram entries, and fill in a bit >> vector >> + of non-zero indices. */ >> + counters. */ >> + for (bv_ix = 0; bv_ix < GCOV_HISTOGRAM_BITVECTOR_SIZE; bv_ix++) >> +histo_bitvector[bv_ix] = 0; >> + for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++) >> +{ >> + if (profile_info->histogram[h_ix].num_counters > 0) >> +{ >> + histo_bitvector[h_ix / 32] |= 1 << (h_ix % 32); >> + h_cnt++; >> +} > > I think this would be more readable if you just produced a bitpack instead of > doing it > by hand, like into gcov-io. I assume you mean use the bitpack streaming functionality used other places in lto-cgraph.c, and not change the way it is done in gcov-io.c where this was cloned from? I will change the lto-cgraph.c code to use the bitpacking. >> + lto_gcov_summary.sum_all = MAX (lto_gcov_summary.sum_all, >> + (file_data->profile_info.sum_all >> + * scale >> + + REG_BR_PROB_BASE / 2) >> + / REG_BR_PROB_BASE); > > Use RDIV for the scaling. ok. This was cloned from the way we do other scalings in the same function, I will change them all to use RDIV. >> -#if IN_LIBGCOV || !IN_GCOV >> +#if !IN_GCOV >> /* Determine the index into histogram for VALUE. */ >> >> -static unsigned >> +GCOV_LINKAGE unsigned > I would probably go around the trouble of declaring this static in GCOV, > so it is inlined at we do not bload libgcov more than needed. Do you mean leave it static when IN_LIBGCOV? It isn't included when IN_GCOV. I just need it extern when in the compiler. So do you mean make it static when IN_LIBGCOV and GCOV_LINKAGE (i.e. extern) when !IN_LIBGCOV? Thanks, Teresa > > Thanks, > Honza -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [PATCH] Stream profile summary histogram through LTO files (issue6782131)
> > I assume you mean use the bitpack streaming functionality used other > places in lto-cgraph.c, and not change the way it is done in gcov-io.c > where this was cloned from? I will change the lto-cgraph.c code to use > the bitpacking. Yes, we don't have the facility on gcov-io, so we need to do that by hand. > > >> + lto_gcov_summary.sum_all = MAX (lto_gcov_summary.sum_all, > >> + (file_data->profile_info.sum_all > >> + * scale > >> + + REG_BR_PROB_BASE / 2) > >> + / REG_BR_PROB_BASE); > > > > Use RDIV for the scaling. > > ok. This was cloned from the way we do other scalings in the same > function, I will change them all to use RDIV. Yes, thanks! > > >> -#if IN_LIBGCOV || !IN_GCOV > >> +#if !IN_GCOV > >> /* Determine the index into histogram for VALUE. */ > >> > >> -static unsigned > >> +GCOV_LINKAGE unsigned > > I would probably go around the trouble of declaring this static in GCOV, > > so it is inlined at we do not bload libgcov more than needed. > > Do you mean leave it static when IN_LIBGCOV? It isn't included when > IN_GCOV. I just need it extern when in the compiler. So do you mean > make it static when IN_LIBGCOV and GCOV_LINKAGE (i.e. extern) when > !IN_LIBGCOV? Yes, it should help inliner to do the right things for libgcov... Thanks, Honza > > Thanks, > Teresa > > > > > Thanks, > > Honza > > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
On Thu, Nov 29, 2012 at 04:50:19PM +0100, Steven Bosscher wrote: > > 2012-11-29 Marek Polacek <> > > > > PR middle-end/54838 > > * cprop.c (bypass_block): Set header and latch to NULL when > > BB has more than one latch edge. > > (n_latches): New variable. > > You don't have to mention a new local variable in the ChangeLog. Ok. > But FWIW, not all DFS back edges are latches. Maybe name it n_back_edges? Yeah, sure. > > @@ -1605,7 +1605,8 @@ bypass_block (basic_block bb, rtx setcc, > > && dest != EXIT_BLOCK_PTR) > > { > > if (current_loops != NULL > > - && e->src->loop_father->latch == e->src) > > + && (e->src->loop_father->latch == e->src > > + || n_latch_edges > 1)) > > { > > /* ??? Now we are creating (or may create) a loop > > with multiple entries. Simply mark it for > > It seems to me that this threading should just not happen. Creating > loops with multiple entries is something to be avoided because most > loop-based optimizations don't work on irreducible regions. So this > affects all passes that run after CPROP, including unrolling, IRA, the > scheduler, etc. > > There is already code that tries to avoid creating multi-entry loops: > > /* The irreducible loops created by redirecting of edges entering the > loop from outside would decrease effectiveness of some of the > following optimizations, so prevent this. */ > if (may_be_loop_header > && !(e->flags & EDGE_DFS_BACK)) > { > ei_next (&ei); > continue; > } > > Apparently your test case manages to slip through, and I wonder why. That's probably because even though BB 4 is a header, 3->4 and 9->4 are back edges (in the condition there's !(e->flags & EDGE_DFS_BACK), which in this case is 0). Note that the comment speaks about edges coming from outside of the loop. Updated patch: 2012-11-29 Marek Polacek PR middle-end/54838 * cprop.c (bypass_block): Set header and latch to NULL when BB has more than one latch edge. * gcc.dg/pr54838.c: New test. --- gcc/cprop.c.mp 2012-11-29 15:49:53.120524295 +0100 +++ gcc/cprop.c 2012-11-29 17:45:03.004041242 +0100 @@ -1499,6 +1499,7 @@ bypass_block (basic_block bb, rtx setcc, int may_be_loop_header; unsigned removed_p; unsigned i; + unsigned n_back_edges; edge_iterator ei; insn = (setcc != NULL) ? setcc : jump; @@ -1510,13 +1511,12 @@ bypass_block (basic_block bb, rtx setcc, if (note) find_used_regs (&XEXP (note, 0), NULL); - may_be_loop_header = false; + n_back_edges = 0; FOR_EACH_EDGE (e, ei, bb->preds) if (e->flags & EDGE_DFS_BACK) - { - may_be_loop_header = true; - break; - } + n_back_edges++; + + may_be_loop_header = n_back_edges > 0; change = 0; for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ) @@ -1605,7 +1605,8 @@ bypass_block (basic_block bb, rtx setcc, && dest != EXIT_BLOCK_PTR) { if (current_loops != NULL - && e->src->loop_father->latch == e->src) + && (e->src->loop_father->latch == e->src + || n_back_edges > 1)) { /* ??? Now we are creating (or may create) a loop with multiple entries. Simply mark it for --- gcc/testsuite/gcc.dg/pr54838.c.mp 2012-11-26 14:48:43.783980854 +0100 +++ gcc/testsuite/gcc.dg/pr54838.c 2012-11-29 17:43:19.397737779 +0100 @@ -0,0 +1,24 @@ +/* PR middle-end/54838 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-forward-propagate -ftracer" } */ + +void bar (void); + +void +foo (void *b, int *c) +{ +again: + switch (*c) +{ +case 1: + if (!b) + { + bar (); + return; + } + goto again; +case 3: + if (!b) + goto again; +} +} Marek
Re: [PATCH] Fix for PR55492 : __atomic_load doesn't match ACQUIRE memory model
On 11/29/2012 10:42 AM, Yvan Roux wrote: Hi, on ARMv7, the code generated for the __atomic_load builtins in the __ATOMIC_ACQUIRE memory model, puts a memory barrier before the load, whereas the semantic of the acquire memory model implies a barrier after. The issue seems to be in expand_atomic_load which puts a memory fence before the load in any memory model. The attached patch fixes the problem. Patch is fine. If you can confirm you have a copyright assignment on file I can take care of checking it in. Andrew
Re: [PATCH] Fix for PR55492 : __atomic_load doesn't match ACQUIRE memory model
> If you can confirm you have a copyright assignment on file I can take care > of checking it in. Yes, I am covered by the copyright assignement RT 211150 between STMicroelectronics and FSF. Thanks, Yvan
Re: [patch] Fix PR middle-end/55321
On 2012-11-29 06:29, Eric Botcazou wrote: > PR middle-end/55321 > * calls.c (emit_library_call_value_1): Mark as no-nonlocal if no-throw. Ok. r~
[ARM] Turning off 64bits ops in Neon and gfortran/modulo-scheduling problem
Hi, I have been working on a patch to avoid using Neon for 64 bits bitops when it's too expensive to move data between core and Neon registers. Benchmarking and validation look OK on the 4.7 branch (compiler configured for thumb and hard FP) - validation on cortex-a9 board OK - bencharking shows 10.5% improvement on spec2k's crafty bench. On other benches we are between -0.5% and +0.5%. On trunk I have noticed a regression in gfortran when using modulo scheduling: sms-1.f90 now fails, but I suspect it's not because of this patch since forcing compilation for armv5t makes the same test fail with and without my patch. Specifically, I have observed that the loop: 862e:3b01 subsr3, #1 8630:ef70 08a1 vadd.i64d16, d16, d17 8634:ec51 0b30 vmovr0, r1, d16 8638:e9e2 0102 strdr0, r1, [r2, #8]! 863c:d1f7 bne.n862e in transformed into: 862e:3901 subsr1, #1 8630:1912 addsr2, r2, r4 8632:eb43 0305 adc.wr3, r3, r5 8636:e9e0 2302 strdr2, r3, [r0, #8]! 863a:d1f8 bne.n862e with my patch. This is wrong because adds/adc clobber the flags used to control the loop. The patch is: 2012-11-28 Christophe Lyon gcc/ * config/arm/arm-protos.h (tune_params): Add prefer_neon_for_64bits field. * config/arm/arm.c (prefer_neon_for_64bits): New variable. (arm_slowmul_tune): Default prefer_neon_for_64bits to false. (arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto. (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto. (arm_cortex_a5_tune, arm_cortex_a15_tune): Ditto. (arm_cortex_a9_tune, arm_fa726te_tune): Ditto. (arm_option_override): Handle -mneon-for-64bits new option. * config/arm/arm.h (TARGET_PREFER_NEON_64BITS): New macro. (prefer_neon_for_64bits): Declare new variable. * config/arm/arm.md (arch): Rename neon_onlya8 and neon_nota8 to avoid_neon_for_64bits and neon_for_64bits. (arch_enabled): Handle new arch types. (one_cmpldi2): Use new arch names. * config/arm/neon.md (adddi3_neon, subdi3_neon, iordi3_neon) (anddi3_neon, xordi3_neon, ashldi3_neon, di3_neon): Use neon_for_64bits instead of nota8 and avoid_neon_for_64bits instead of onlya8. Is it OK for trunk? diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index d942c5b..c92f055 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -247,6 +247,8 @@ struct tune_params performance. The first element covers Thumb state and the second one is for ARM state. */ bool logical_op_non_short_circuit[2]; + /* Prefer Neon for 64-bit bitops. */ + bool prefer_neon_for_64bits; }; extern const struct tune_params *current_tune; diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 286a6c5..9efd215 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -816,6 +816,10 @@ int arm_arch_thumb2; int arm_arch_arm_hwdiv; int arm_arch_thumb_hwdiv; +/* Nonzero if we should use Neon to handle 64-bits operations rather + than core registers. */ +int prefer_neon_for_64bits = 0; + /* In case of a PRE_INC, POST_INC, PRE_DEC, POST_DEC memory reference, we must report the mode of the memory reference from TARGET_PRINT_OPERAND to TARGET_PRINT_OPERAND_ADDRESS. */ @@ -895,6 +899,7 @@ const struct tune_params arm_slowmul_tune = arm_default_branch_cost, false, /* Prefer LDRD/STRD. */ {true, true},/* Prefer non short circuit. */ + false /* Prefer Neon for 64-bits bitops. */ }; const struct tune_params arm_fastmul_tune = @@ -908,6 +913,7 @@ const struct tune_params arm_fastmul_tune = arm_default_branch_cost, false, /* Prefer LDRD/STRD. */ {true, true},/* Prefer non short circuit. */ + false /* Prefer Neon for 64-bits bitops. */ }; /* StrongARM has early execution of branches, so a sequence that is worth @@ -924,6 +930,7 @@ const struct tune_params arm_strongarm_tune = arm_default_branch_cost, false, /* Prefer LDRD/STRD. */ {true, true},/* Prefer non short circuit. */ + false /* Prefer Neon for 64-bits bitops. */ }; const struct tune_params arm_xscale_tune = @@ -937,6 +944,7 @@ const struct tune_params arm_xscale_tune = arm_default_branch_cost, false, /* Prefer LDRD/STRD. */ {true, true},/* Prefer non short circuit. */ + false /* Prefer Neon for 64-bits bitops. */ }; const struct tune_params arm_9e_t
Re: [PATCH] Add --with-build-config=bootstrap-asan support
H.J. Lu wrote: This patch adds --with-build-config=bootstrap-asan support. Tested on Linux/x86-64. OK to install? I think that patch has broken bootstrap for me. If I do a normal bootstrap, Stage1 fails with: libtool: compile: unrecognized option `-D_GNU_SOURCE' libtool: compile: Try `libtool --help' for more information. make[4]: *** [interception_linux.lo] Error 1 make[4]: Leaving directory `/home/burnus/gcc/build/x86_64-unknown-linux-gnu/libsanitizer/interception' Actually, I wonder why it is build in stage 1. Shouldn't it be a target library which only built later (except with bootstrap-asan)? Tobias
[PATCH Filter out -fsanitize=address if not in combined tree in lto-plugin
Hi, When GCC is configured with --with-build-config="bootstrap-asan", all -flto tests will fail since -fsanitize=address is used to compile liblto_plugin.so and linker isn't compiled with -fsanitize=address. This patch filters out -fsanitize=address from CFLAGS if we aren't in a combined tree with binutils. OK to install? H.J. --- --- lto-plugin/Makefile.am| 7 +++ lto-plugin/Makefile.in| 5 + lto-plugin/configure | 18 +++--- lto-plugin/configure.ac | 4 5 files changed, 39 insertions(+), 3 deletions(-) create mode 100644 lto-plugin/ChangeLog.asan 2012-11-21 H.J. Lu * Makefile.am (CFLAGS): Filter out -fsanitize=address if not in a combined tree of GCC and binutils. (LDFLAGS): Likewise. * configure.ac (COMBINED_TREE): New AM_CONDITIONAL. * Makefile.in: Regenerated. * configure: Likewise. diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am index b24015e..d90c555 100644 --- a/lto-plugin/Makefile.am +++ b/lto-plugin/Makefile.am @@ -18,6 +18,13 @@ in_gcc_libs = $(foreach lib, $(libexecsub_LTLIBRARIES), $(gcc_build_dir)/$(lib)) # Can be removed when libiberty becomes a normal convenience library Wc=-Wc, +if !COMBINED_TREE +# Filter out -fsanitize=address if we aren't in a combined tree with +# GCC and binutils. +override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS)) +override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS)) +endif + liblto_plugin_la_SOURCES = lto-plugin.c liblto_plugin_la_LIBADD = \ $(if $(wildcard ../libiberty/pic/libiberty.a),$(Wc)../libiberty/pic/libiberty.a,) diff --git a/lto-plugin/Makefile.in b/lto-plugin/Makefile.in index 0c8d89f..14e6f67 100644 --- a/lto-plugin/Makefile.in +++ b/lto-plugin/Makefile.in @@ -535,6 +535,11 @@ uninstall-am: uninstall-libexecsubLTLIBRARIES uninstall-libexecsubLTLIBRARIES +# Filter out -fsanitize=address if we aren't in a combined tree with +# GCC and binutils. +@COMBINED_TREE_FALSE@override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS)) +@COMBINED_TREE_FALSE@override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS)) + all-local: $(in_gcc_libs) $(in_gcc_libs) : $(gcc_build_dir)/%: % diff --git a/lto-plugin/configure b/lto-plugin/configure index 4900d80..80f3ac6 100755 --- a/lto-plugin/configure +++ b/lto-plugin/configure @@ -597,7 +597,9 @@ ac_includes_default="\ # include #endif" -ac_subst_vars='am__EXEEXT_FALSE +ac_subst_vars='COMBINED_TREE_FALSE +COMBINED_TREE_TRUE +am__EXEEXT_FALSE am__EXEEXT_TRUE LTLIBOBJS LIBOBJS @@ -10552,7 +10554,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 10555 "configure" +#line 10557 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -10658,7 +10660,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 10661 "configure" +#line 10663 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -13318,3 +13320,13 @@ if test -n "$ac_unrecognized_opts" && test "$enable_option_checking" != no; then $as_echo "$as_me: WARNING: unrecognized options: $ac_unrecognized_opts" >&2;} fi + +# Check if this is a combined tree with GCC and binutils. + if test -e ${ac_top_srcdir}/ld/ldmain.c -o -e ${ac_top_srcdir}/gold/version.cc; then + COMBINED_TREE_TRUE= + COMBINED_TREE_FALSE='#' +else + COMBINED_TREE_TRUE='#' + COMBINED_TREE_FALSE= +fi + diff --git a/lto-plugin/configure.ac b/lto-plugin/configure.ac index 9a418d2..526186a 100644 --- a/lto-plugin/configure.ac +++ b/lto-plugin/configure.ac @@ -16,3 +16,7 @@ AC_HEADER_SYS_WAIT AC_CONFIG_FILES(Makefile) AC_CONFIG_HEADERS(config.h) AC_OUTPUT + +# Check if this is a combined tree with GCC and binutils. +AM_CONDITIONAL(COMBINED_TREE, + test -e ${ac_top_srcdir}/ld/ldmain.c -o -e ${ac_top_srcdir}/gold/version.cc) -- 1.7.11.7
[PATCH] Filter out -fsanitize=address if not in combined tree in libiberty
Hi, When GCC is configured with --with-build-config="bootstrap-asan", all -flto tests will fail since -fsanitize=address is used to compile host libiberty, which is used to create liblto_plugin.so, and linker isn't compiled with -fsanitize=address. This patch filters out -fsanitize=address from CFLAGS if we aren't in a combined tree with binutils. OK to install? Thanks. H.J. --- 2012-11-21 H.J. Lu * Makefile.in (CFLAGS): Filter out -fsanitize=address if in GCC tree, but not in a combined tree with binutils. * configure.ac (COMBINED_TREE_FALSE): New AC_SUBST. * configure: Regenerated. diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in index 1ba8cf1..2d357d7 100644 --- a/libiberty/Makefile.in +++ b/libiberty/Makefile.in @@ -63,6 +63,10 @@ PERL = @PERL@ PICFLAG = @PICFLAG@ +# Filter out -fsanitize=address if we are in GCC tree, but aren't in a +# combined tree with binutils. +@COMBINED_TREE_FALSE@override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS)) + MAKEOVERRIDES = TARGETLIB = ./libiberty.a diff --git a/libiberty/configure b/libiberty/configure index 5367027..4869cd5 100755 --- a/libiberty/configure +++ b/libiberty/configure @@ -590,6 +590,7 @@ ac_includes_default="\ ac_subst_vars='LTLIBOBJS INSTALL_DEST +COMBINED_TREE_FALSE pexecute target_header_dir CHECK @@ -6917,6 +6918,20 @@ esac fi +# Check if this is in GCC tree, but aren't in a combined tree with +# binutils. +if test -e ${srcdir}/../gcc/gcc.c; then + if test -e ${srcdir}/../ld/ldmain.c -o \ + -e ${top_srcdir}/../gold/version.cc; then +COMBINED_TREE_FALSE='#' + else +COMBINED_TREE_FALSE='' + fi +else + COMBINED_TREE_FALSE='#' +fi + + # Install a library built with a cross compiler in $(tooldir) rather # than $(libdir). if test -z "${with_cross_host}"; then diff --git a/libiberty/configure.ac b/libiberty/configure.ac index c763894..7661752 100644 --- a/libiberty/configure.ac +++ b/libiberty/configure.ac @@ -670,6 +670,20 @@ AC_SUBST(pexecute) libiberty_AC_FUNC_STRNCMP +# Check if this is in GCC tree, but aren't in a combined tree with +# binutils. +if test -e ${srcdir}/../gcc/gcc.c; then + if test -e ${srcdir}/../ld/ldmain.c -o \ + -e ${top_srcdir}/../gold/version.cc; then +COMBINED_TREE_FALSE='#' + else +COMBINED_TREE_FALSE='' + fi +else + COMBINED_TREE_FALSE='#' +fi +AC_SUBST(COMBINED_TREE_FALSE) + # Install a library built with a cross compiler in $(tooldir) rather # than $(libdir). if test -z "${with_cross_host}"; then -- 1.7.11.7
[PATCH] Put a breakpoint on __asan_report_error for ASAN
Hi, This patch puts a breakpoint on __asan_report_error if CFLAGS contains -fsanitize=address, similar to fancy_abort and internal_error. OK to install? H.J. -- 2012-11-24 H.J. Lu * configure.ac: Append gdbasan.in to .gdbinit if CFLAGS contains -fsanitize=address. * configure: Regenerated. * gdbasan.in: New file. diff --git a/gcc/configure b/gcc/configure index e2c119e..004910f 100755 --- a/gcc/configure +++ b/gcc/configure @@ -27032,6 +27032,14 @@ if test "x$subdirs" != x; then fi echo "source ${srcdir}/gdbinit.in" >> .gdbinit +# Put a breakpoint on __asan_report_error to help with debugging buffer +# overflow. +case "$CFLAGS" in +*-fsanitize=address*) + echo "source ${srcdir}/gdbasan.in" >> .gdbinit + ;; +esac + gcc_tooldir='$(libsubdir)/$(libsubdir_to_prefix)$(target_noncanonical)' diff --git a/gcc/configure.ac b/gcc/configure.ac index c6f57bd..b9aaadb 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -5017,6 +5017,14 @@ if test "x$subdirs" != x; then fi echo "source ${srcdir}/gdbinit.in" >> .gdbinit +# Put a breakpoint on __asan_report_error to help with debugging buffer +# overflow. +case "$CFLAGS" in +*-fsanitize=address*) + echo "source ${srcdir}/gdbasan.in" >> .gdbinit + ;; +esac + gcc_tooldir='$(libsubdir)/$(libsubdir_to_prefix)$(target_noncanonical)' AC_SUBST(gcc_tooldir) AC_SUBST(dollar) diff --git a/gcc/gdbasan.in b/gcc/gdbasan.in new file mode 100644 index 000..cf05825 --- /dev/null +++ b/gcc/gdbasan.in @@ -0,0 +1,3 @@ +# Put a breakpoint on __asan_report_error to help with debugging buffer +# overflow. +b __asan_report_error -- 1.7.11.7
Re: [PATCH] Add --with-build-config=bootstrap-asan support
On Thu, Nov 29, 2012 at 9:36 AM, Tobias Burnus wrote: > H.J. Lu wrote: >> >> This patch adds --with-build-config=bootstrap-asan support. Tested on >> Linux/x86-64. OK to install? > > > I think that patch has broken bootstrap for me. If I do a normal bootstrap, > Stage1 fails with: > > libtool: compile: unrecognized option `-D_GNU_SOURCE' > libtool: compile: Try `libtool --help' for more information. > make[4]: *** [interception_linux.lo] Error 1 > make[4]: Leaving directory > `/home/burnus/gcc/build/x86_64-unknown-linux-gnu/libsanitizer/interception' > > > Actually, I wonder why it is build in stage 1. Shouldn't it be a target > library which only built later (except with bootstrap-asan)? > Please try this: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01530.html -- H.J.
Re: [PATCH] Don't bypass blocks with multiple latch edges (PR middle-end/54838)
> Yikes, sorry, it wasn't clear to me what PROP_loops really does. Anyway, > I think I have a better fix now. The problem is just that when removing > BB 4 (which was a header), we have to zap ->header and ->latch. We > already have code for this: > > if (current_loops != NULL > && e->src->loop_father->latch == e->src) > { > /* ??? Now we are creating (or may create) a loop > with multiple entries. Simply mark it for > removal. Alternatively we could not do this > threading. */ > e->src->loop_father->header = NULL; > e->src->loop_father->latch = NULL; > } > > but the thing is that when there are multiple latch edges, then > ->latch is NULL. So we need to keep track of how many latch edges > the header has. Regtested/bootstrapped on x86_64, ok for trunk? > > Can I get rid of may_be_loop_header (and just use n_latch_edges > 0 > instead at that one place) in a followup? > > 2012-11-29 Marek Polacek > > PR middle-end/54838 > * cprop.c (bypass_block): Set header and latch to NULL when > BB has more than one latch edge. > (n_latches): New variable. This looks good on principle, but can't we do better now that we have the loop structure? Can't we compute is_loop_header instead of may_be_loop_header and simplify the condition under which we mark the loop for removal? Or maybe we should call disambiguate_loops_with_multiple_latches on entry of the pass? Richard, what would be the "modern" approach to solving the problem here? -- Eric Botcazou
[PATCH] Filter out -fsanitize=address from PLUGINCFLAGS
Hi, All plugin tests failed when GCC is configured with --with-build-config=bootstrap-asan, since linker can't find libasan at link-time. -fsanitize=address isn't needed to test plugin support. This patch filters out -fsanitize=address from PLUGINCFLAGS. OK to install? Thanks. H.J. --- gcc/Makefile.in| 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) 2012-11-22 H.J. Lu PR testsuite/55440 * Makefile.in (PLUGINCFLAGS): Filter out -fsanitize=address. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index c7b8648..78a6007 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -329,7 +329,8 @@ enable_lto = @enable_lto@ # Compiler and flags needed for plugin support PLUGINCC = @CXX@ -PLUGINCFLAGS = @CXXFLAGS@ +# Filter out -fsanitize=address +PLUGINCFLAGS = $(filter-out -fsanitize=address,@CXXFLAGS@) # Libs and linker options needed for plugin support PLUGINLIBS = @pluginlibs@ -- 1.7.11.7
[patch, ARM] Fix pr55073
PR 55073 is a case where scheduling appears to mess up the order of instructions to the extent that they no-longer give the correct results. However, looking at the patterns, I think they are ill-defined. IIRC tied operands should tie a source to a destination, rather than a destination to a source. Failing to get this right can essentially result in an input value being clobbered and the compiler failing to detect this. Because all the patterns that appear to have this violation are named patterns that are called during expand (mainly of intrinsics), it's not trivial to simply rename the operands, or we get bogus rtl. Instead, I've taken the approach of splitting expand from match. gcc: * arm/neon.md (neon_vtrn_internal): Split into expand and insn patterns. Re-order insn arguments to tie inputs to outputs. (neon_vzip_internal): Likewise. (neon_vuzp_internal): Likewise. testsuite: * gcc.target/arm/pr55073.C: New test.--- config/arm/neon.md (revision 193005) +++ config/arm/neon.md (local) @@ -4225,16 +4225,29 @@ (define_insn "neon_vtbx4v8qi" [(set_attr "neon_type" "neon_bp_3cycle")] ) -(define_insn "neon_vtrn_internal" +(define_expand "neon_vtrn_internal" + [(parallel +[(set (match_operand:VDQW 0 "s_register_operand" "") + (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "") + (match_operand:VDQW 2 "s_register_operand" "")] + UNSPEC_VTRN1)) + (set (match_operand:VDQW 3 "s_register_operand" "") + (unspec:VDQW [(match_dup 1) (match_dup 2)] UNSPEC_VTRN2))])] + "TARGET_NEON" + "" +) + +;; Note: Different operand numbering to handle tied registers correctly. +(define_insn "*neon_vtrn_insn" [(set (match_operand:VDQW 0 "s_register_operand" "=w") (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "0") - (match_operand:VDQW 2 "s_register_operand" "w")] + (match_operand:VDQW 3 "s_register_operand" "2")] UNSPEC_VTRN1)) - (set (match_operand:VDQW 3 "s_register_operand" "=2") - (unspec:VDQW [(match_dup 1) (match_dup 2)] + (set (match_operand:VDQW 2 "s_register_operand" "=w") + (unspec:VDQW [(match_dup 1) (match_dup 3)] UNSPEC_VTRN2))] "TARGET_NEON" - "vtrn.\t%0, %3" + "vtrn.\t%0, %2" [(set (attr "neon_type") (if_then_else (match_test "") (const_string "neon_bp_simple") @@ -4252,16 +4265,29 @@ (define_expand "neon_vtrn" DONE; }) -(define_insn "neon_vzip_internal" +(define_expand "neon_vzip_internal" + [(parallel +[(set (match_operand:VDQW 0 "s_register_operand" "") + (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "") + (match_operand:VDQW 2 "s_register_operand" "")] + UNSPEC_VZIP1)) +(set (match_operand:VDQW 3 "s_register_operand" "") + (unspec:VDQW [(match_dup 1) (match_dup 2)] UNSPEC_VZIP2))])] + "TARGET_NEON" + "" +) + +;; Note: Different operand numbering to handle tied registers correctly. +(define_insn "*neon_vzip_insn" [(set (match_operand:VDQW 0 "s_register_operand" "=w") (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "0") - (match_operand:VDQW 2 "s_register_operand" "w")] + (match_operand:VDQW 3 "s_register_operand" "2")] UNSPEC_VZIP1)) - (set (match_operand:VDQW 3 "s_register_operand" "=2") -(unspec:VDQW [(match_dup 1) (match_dup 2)] + (set (match_operand:VDQW 2 "s_register_operand" "=w") +(unspec:VDQW [(match_dup 1) (match_dup 3)] UNSPEC_VZIP2))] "TARGET_NEON" - "vzip.\t%0, %3" + "vzip.\t%0, %2" [(set (attr "neon_type") (if_then_else (match_test "") (const_string "neon_bp_simple") @@ -4279,16 +4305,29 @@ (define_expand "neon_vzip" DONE; }) -(define_insn "neon_vuzp_internal" +(define_expand "neon_vuzp_internal" + [(parallel +[(set (match_operand:VDQW 0 "s_register_operand" "") + (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "") + (match_operand:VDQW 2 "s_register_operand" "")] + UNSPEC_VUZP1)) + (set (match_operand:VDQW 3 "s_register_operand" "") + (unspec:VDQW [(match_dup 1) (match_dup 2)] UNSPEC_VUZP2))])] + "TARGET_NEON" + "" +) + +;; Note: Different operand numbering to handle tied registers correctly. +(define_insn "*neon_vuzp_insn" [(set (match_operand:VDQW 0 "s_register_operand" "=w") (unspec:VDQW [(match_operand:VDQW 1 "s_register_operand" "0") - (match_operand:VDQW 2 "s_register_operand" "w")] + (match_operand:VDQW 3 "s_register_operand" "2")] UNSPEC_VUZP1)) - (set (match_operand:VDQW 3 "s_register_operand" "=2") -(unspec:VDQW [(match_dup 1) (match_dup 2)] + (set (match_operand:VDQW 2 "s_register_operand" "=w") +(unspec:VDQW [(match_dup 1) (match_dup 3)] UNSPEC_VUZP2))] "TARGET_NEO
Re: [PATCH] Add --with-build-config=bootstrap-asan support
H.J. Lu wrote: Please try this: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01530.html I just did locally "git log -2 -p|patch -p1 -R" - and wait until you sort that out with the patch reviewers, which seemingly didn't like that patch. BTW: I am not sure whether it is possible with the current build system, but I'd prefer: if libsanitizer is not needed (i.e. no bootstrap-asan is used), it is only build as target library in stage2/stage3 and not also in stage1 as host library. Tobias Burnus wrote: I think that patch has broken bootstrap for me. If I do a normal bootstrap, Stage1 fails with: I should have quoted more. Note that the "g++" or similar is missing before "-D_GNU_SOURCE": /bin/sh ../libtool --tag=CXX --mode=compile -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I. -I../../../../libsanitizer/interception -I ../../../../libsanitizer/include -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros -Wno-c99-extensions -g -O2 -D_GNU_SOURCE -MT interception_mac.lo -MD -MP -MF .deps/interception_mac.Tpo -c -o interception_mac.lo ../../../../libsanitizer/interception/interception_mac.cc Thus, for libsanitizer, the $(CXX) variable is not set, when invoked via build root's Makefile. Which gets set via 'CXX=$$(RAW_CXX_FOR_TARGET)' Tobias
Go patch committed: Track fields with tag go:"track"
This patch implements support for somewhat experimental field tracking. If a struct field has a tag go:"track", the compiler will track references to the field. A new function runtime.Fieldtrack will add the names of all tracked fields to a map. The idea is that linker garbage collection will remove all unreferenced functions, and those permit the program to determine at runtime which fields are actually referenced. The actual implementation is pretty ugly. The compiler inserts function calls for each field reference, passing a string describing the field. The function does nothing. To determine which fields are referenced, the runtime scans the data sections for strings that match the required format. The problem is that we want linker garbage collection to discard unused field references. The middle-end doesn't currently provide any way to associate arbitrary data with a function; it supports putting a switch table in a section associated with a function, but not other data. So I actually store the string in a global variable, and then let the linker garbage collect the function and the global variable. If this experiment turns out to be useful, then at some later point we can enhance the compiler to permit associating arbitrary data with a function. Then the linker garbage collection should do the right thing, discarding the data as well as the function. The data can be put into some other section, which will permit the runtime scan to only look at the relevant data, not the entire data section, and will avoid the remote possibility of a false positive. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2012-11-29 Ian Lance Taylor * go-gcc.cc: Include "output.h". (global_variable): Add is_unique_section parameter. (global_variable_set_init): Adjust unique section if necessary. * Make-lang.in (go/go-gcc.o): Add dependency on output.h. Index: gcc/go/go-gcc.cc === --- gcc/go/go-gcc.cc (revision 193595) +++ gcc/go/go-gcc.cc (working copy) @@ -28,6 +28,7 @@ #include "tree-iterator.h" #include "gimple.h" #include "toplev.h" +#include "output.h" #include "go-c.h" @@ -267,6 +268,7 @@ class Gcc_backend : public Backend Btype* btype, bool is_external, bool is_hidden, + bool in_unique_section, Location location); void @@ -1277,6 +1279,7 @@ Gcc_backend::global_variable(const std:: Btype* btype, bool is_external, bool is_hidden, + bool in_unique_section, Location location) { tree type_tree = btype->get_tree(); @@ -1308,6 +1311,9 @@ Gcc_backend::global_variable(const std:: } TREE_USED(decl) = 1; + if (in_unique_section) +resolve_unique_section (decl, 0, 1); + go_preserve_from_gc(decl); return new Bvariable(decl); @@ -1326,6 +1332,16 @@ Gcc_backend::global_variable_set_init(Bv if (var_decl == error_mark_node) return; DECL_INITIAL(var_decl) = expr_tree; + + // If this variable goes in a unique section, it may need to go into + // a different one now that DECL_INITIAL is set. + if (DECL_HAS_IMPLICIT_SECTION_NAME_P (var_decl)) +{ + DECL_SECTION_NAME (var_decl) = NULL_TREE; + resolve_unique_section (var_decl, + compute_reloc_for_constant (expr_tree), + 1); +} } // Make a local variable. Index: gcc/go/Make-lang.in === --- gcc/go/Make-lang.in (revision 193595) +++ gcc/go/Make-lang.in (working copy) @@ -1,6 +1,6 @@ # Make-lang.in -- Top level -*- makefile -*- fragment for gcc Go frontend. -# Copyright (C) 2009, 2010, 2011 Free Software Foundation, Inc. +# Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc. # This file is part of GCC. @@ -254,7 +254,7 @@ go/go-lang.o: go/go-lang.c $(CONFIG_H) $ GOINCLUDES = -I $(srcdir)/go -I $(srcdir)/go/gofrontend go/go-gcc.o: go/go-gcc.cc $(GO_SYSTEM_H) $(TREE_H) tree-iterator.h \ - $(GIMPLE_H) toplev.h $(GO_C_H) $(GO_GOGO_H) \ + $(GIMPLE_H) toplev.h output.h $(GO_C_H) $(GO_GOGO_H) \ go/gofrontend/backend.h $(CXX) -c $(GOINCLUDES) $(ALL_CPPFLAGS) $(ALL_CXXFLAGS) $< $(OUTPUT_OPTION) Index: gcc/go/gofrontend/gogo.cc === --- gcc/go/gofrontend/gogo.cc (revision 193874) +++ gcc/go/gofrontend/gogo.cc (working copy) @@ -3075,7 +3075,8 @@ Function::Function(Function_type* type, closure_var_(NULL), block_(block), location_(location), labels_(), local_type_count_(0), fndecl_(NULL), defer_stack_(NULL), results_are_named_(false), nointerface_(false), calls_recover_(false), -is_recover_thunk_(false), has_recover_thunk_(false) +is_recover_thunk_(false), has_recover_thunk_(false), +in_unique_section_(false) { } @@ -3896,7 +3897,7 @@ Variable::Variable(Type* type, Expressio seen_(false), init_is_low
Re: [patch prefix.c]: 4 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
On Thu, Nov 29, 2012 at 4:09 AM, Kai Tietz wrote: > > ChangeLog > > 2012-11-29 Kai Tietz > > PR target/53912 > * prefix.c (lookup_key): Explicit cast return-type of xmalloc/xrealloc > to char *. > > Tested for i686-w64-mingw32, x86_64-w64-mingw32, and > x86_64-unknown-gnu-linux. Ok for apply? > > Regards, > Kai > > Index: prefix.c > === > --- prefix.c(Revision 193925) > +++ prefix.c(Arbeitskopie) > @@ -157,12 +157,12 @@ lookup_key (char *key) > } > >size = 32; > - dst = xmalloc (size); > + dst = (char *) xmalloc (size); > >res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); >if (res == ERROR_MORE_DATA && type == REG_SZ) > { > - dst = xrealloc (dst, size); > + dst = (char *) xrealloc (dst, size); >res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); > } This code should be using the XNEWVEC and XRESIZEVEC macros. Ian
Re: [PATCH] Add --with-build-config=bootstrap-asan support
On 11/29/2012 06:36 PM, Tobias Burnus wrote: H.J. Lu wrote: This patch adds --with-build-config=bootstrap-asan support. Tested on Linux/x86-64. OK to install? I think that patch has broken bootstrap for me. If I do a normal bootstrap, Stage1 fails with: libtool: compile: unrecognized option `-D_GNU_SOURCE' libtool: compile: Try `libtool --help' for more information. make[4]: *** [interception_linux.lo] Error 1 make[4]: Leaving directory `/home/burnus/gcc/build/x86_64-unknown-linux-gnu/libsanitizer/interception' Likewise here. Would it be possible to revert the offending commit, in the meanwhile? Thanks, Paolo.
Re: [patch prefix.c]: 4 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
Fine. Tested patch using XNEWVEC/XRESIZEVEC for this. Ok for apply? Kai Index: prefix.c === --- prefix.c(Revision 193939) +++ prefix.c(Arbeitskopie) @@ -157,12 +157,12 @@ lookup_key (char *key) } size = 32; - dst = xmalloc (size); + dst = XNEWVEC (char, size); res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); if (res == ERROR_MORE_DATA && type == REG_SZ) { - dst = xrealloc (dst, size); + dst = XRESIZEVEC (char, dst, size); res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); }
[PATCH] Use explicit -I for libstdc++-v3 header files
Hi, Since libsanitizer is used for bootstrap and compiled with raw_cxx, we need to use explicit -I for libstdc++-v3 header files in libsanitizer. Otherwise, we will get libtool: compile: unrecognized option `-D_GNU_SOURCE' libtool: compile: Try `libtool --help' for more information. This patch fixes it. OK to install? Thanks. H.J. --- libsanitizer/Makefile.am | 2 -- libsanitizer/Makefile.in | 6 +++--- libsanitizer/aclocal.m4 | 1 + libsanitizer/asan/Makefile.am | 6 -- libsanitizer/asan/Makefile.in | 14 ++ libsanitizer/configure| 22 -- libsanitizer/configure.ac | 1 + libsanitizer/interception/Makefile.am | 6 -- libsanitizer/interception/Makefile.in | 14 ++ libsanitizer/sanitizer_common/Makefile.am | 6 -- libsanitizer/sanitizer_common/Makefile.in | 14 ++ libsanitizer/tsan/Makefile.am | 6 -- libsanitizer/tsan/Makefile.in | 13 + 14 files changed, 97 insertions(+), 31 deletions(-) create mode 100644 libsanitizer/ChangeLog.asan 2012-11-22 H.J. Lu * Makefile.am (AM_MAKEFLAGS): Remove CC and CXX. * configure.ac (ACX_NONCANONICAL_TARGET): New. * asan/Makefile.am (AM_CXXFLAGS): Add -I for libstdc++-v3 header files. (AM_MAKEFLAGS): Remove CC and CXX. * interception/Makefile.am: Likewise. * sanitizer_common/Makefile.am: Likewise. * tsan/Makefile.am: Likewise. * Makefile.in: Regenerated. * aclocal.m4: Likewise. * configure: Likewise. * asan/Makefile.in: Likewise. * interception/Makefile.in: Likewise. * sanitizer_common/Makefile.in: Likewise. * tsan/Makefile.in: Likewise. diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am index 64d3d2e..cd4e92d 100644 --- a/libsanitizer/Makefile.am +++ b/libsanitizer/Makefile.am @@ -37,8 +37,6 @@ AM_MAKEFLAGS = \ "includedir=$(includedir)" \ "AR=$(AR)" \ "AS=$(AS)" \ - "CC=$(CC)" \ - "CXX=$(CXX)" \ "LD=$(LD)" \ "LIBCFLAGS=$(LIBCFLAGS)" \ "NM=$(NM)" \ diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in index 21c2711..53e0be9 100644 --- a/libsanitizer/Makefile.in +++ b/libsanitizer/Makefile.in @@ -41,7 +41,8 @@ DIST_COMMON = $(am__configure_deps) $(srcdir)/../config.guess \ $(srcdir)/../mkinstalldirs $(srcdir)/Makefile.am \ $(srcdir)/Makefile.in $(top_srcdir)/configure ChangeLog ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 -am__aclocal_m4_deps = $(top_srcdir)/../config/depstand.m4 \ +am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \ + $(top_srcdir)/../config/depstand.m4 \ $(top_srcdir)/../config/lead-dot.m4 \ $(top_srcdir)/../config/multi.m4 \ $(top_srcdir)/../config/override.m4 \ @@ -236,6 +237,7 @@ sysconfdir = @sysconfdir@ target = @target@ target_alias = @target_alias@ target_cpu = @target_cpu@ +target_noncanonical = @target_noncanonical@ target_os = @target_os@ target_vendor = @target_vendor@ toolexecdir = @toolexecdir@ @@ -278,8 +280,6 @@ AM_MAKEFLAGS = \ "includedir=$(includedir)" \ "AR=$(AR)" \ "AS=$(AS)" \ - "CC=$(CC)" \ - "CXX=$(CXX)" \ "LD=$(LD)" \ "LIBCFLAGS=$(LIBCFLAGS)" \ "NM=$(NM)" \ diff --git a/libsanitizer/aclocal.m4 b/libsanitizer/aclocal.m4 index a52bc30..d6782f8 100644 --- a/libsanitizer/aclocal.m4 +++ b/libsanitizer/aclocal.m4 @@ -990,6 +990,7 @@ AC_SUBST([am__tar]) AC_SUBST([am__untar]) ]) # _AM_PROG_TAR +m4_include([../config/acx.m4]) m4_include([../config/depstand.m4]) m4_include([../config/lead-dot.m4]) m4_include([../config/multi.m4]) diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am index 3da1db3..45fb3b3 100644 --- a/libsanitizer/asan/Makefile.am +++ b/libsanitizer/asan/Makefile.am @@ -5,6 +5,10 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER) DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DASAN_HAS_EXCEPTIONS=1 -DASAN_FLEXIBLE_MAPPING_AND_OFFSET=0 -DASAN_NEEDS_SEGV=1 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros -Wno-c99-extensions +## We require libstdc++-v3 to be in the same build tree. +AM_CXXFLAGS += -I../../libstdc++-v3/include \ + -I../../libstdc++-v3/include/$(target_noncanonical) \ + -I$(srcdir)/../../libstdc++-v3/libsupc++ ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config toolexeclib_LTLIBRARIES = libasan.la @@ -64,8 +68,6 @@ AM_MAKEFLAGS = \ "includedir=$(includedir)" \ "AR=$(AR)" \ "AS=$(AS)" \ - "CC=$(CC)" \ - "CXX=$(CXX)" \ "LD=$(LD)" \
Re: [PATCH, generic] Fix for define_subst
On 2012-11-29 00:58, Michael Zolotukhin wrote: > 2012-11-29 Michael Zolotukhin > > * gensupport.c (maybe_eval_c_test): Remove not-null check for expr. > * read-rtl.c (apply_iterators): Initialize condition with "" instead > of NULL. Ok. r~
Re: [patch prefix.c]: 4 of 7 Fix of PR target/53912 bootstrap fails using default c++ mode in stage 2 and 3 for native x86_64-w64-mingw32
On Thu, Nov 29, 2012 at 10:27 AM, Kai Tietz wrote: > Fine. Tested patch using XNEWVEC/XRESIZEVEC for this. Ok for apply? This is OK with a ChangeLog entry. Thanks. Ian > Index: prefix.c > === > --- prefix.c(Revision 193939) > +++ prefix.c(Arbeitskopie) > @@ -157,12 +157,12 @@ lookup_key (char *key) > } > >size = 32; > - dst = xmalloc (size); > + dst = XNEWVEC (char, size); > >res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); >if (res == ERROR_MORE_DATA && type == REG_SZ) > { > - dst = xrealloc (dst, size); > + dst = XRESIZEVEC (char, dst, size); >res = RegQueryValueExA (reg_key, key, 0, &type, (LPBYTE) dst, &size); > }
Re: [PATCH] Add --with-build-config=bootstrap-asan support
On Thu, Nov 29, 2012 at 10:24 AM, Paolo Carlini wrote: > On 11/29/2012 06:36 PM, Tobias Burnus wrote: >> >> H.J. Lu wrote: >>> >>> This patch adds --with-build-config=bootstrap-asan support. Tested on >>> Linux/x86-64. OK to install? >> >> >> I think that patch has broken bootstrap for me. If I do a normal >> bootstrap, Stage1 fails with: >> >> libtool: compile: unrecognized option `-D_GNU_SOURCE' >> libtool: compile: Try `libtool --help' for more information. >> make[4]: *** [interception_linux.lo] Error 1 >> make[4]: Leaving directory >> `/home/burnus/gcc/build/x86_64-unknown-linux-gnu/libsanitizer/interception' > > Likewise here. Would it be possible to revert the offending commit, in the > meanwhile? > This is a known problem. Please try: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02480.html I updated my patch for the newly added tsan. It fixed bootstrap on Linux/ia32. -- H.J.
Re: [PATCH] Vtable pointer verification, gcc changes (patch 2 of 2)
On 11/16/2012 01:21 PM, Caroline Tice wrote: -start_objects (int method_type, int initp) +start_objects (int method_type, int initp, const char *extra_name) I don't think we want to mess with start_objects and such here. Can't you just use DECL_STATIC_CONSTRUCTOR (decl) = 1; SET_DECL_INIT_PRIORITY (decl, priority); on your initialization function instead? Jason
Re: [PATCH] Use explicit -I for libstdc++-v3 header files
On Thu, Nov 29, 2012 at 10:30 AM, H.J. Lu wrote: > Hi, > > Since libsanitizer is used for bootstrap and compiled with raw_cxx, > we need to use explicit -I for libstdc++-v3 header files in > libsanitizer. Otherwise, we will get > > libtool: compile: unrecognized option `-D_GNU_SOURCE' > libtool: compile: Try `libtool --help' for more information. > > This patch fixes it. OK to install? Is there a reason why your change to have libsanitizer be bootstrapped wait to be committed until this patch was approved? I think you should revert the change to have libsanitizer be bootstrapped and only commit that after this patch has been approved because right now nobody can do any work as bootstrap is broken. Thanks, Andrew Pinski > > Thanks. > > > H.J. > --- > libsanitizer/Makefile.am | 2 -- > libsanitizer/Makefile.in | 6 +++--- > libsanitizer/aclocal.m4 | 1 + > libsanitizer/asan/Makefile.am | 6 -- > libsanitizer/asan/Makefile.in | 14 ++ > libsanitizer/configure| 22 -- > libsanitizer/configure.ac | 1 + > libsanitizer/interception/Makefile.am | 6 -- > libsanitizer/interception/Makefile.in | 14 ++ > libsanitizer/sanitizer_common/Makefile.am | 6 -- > libsanitizer/sanitizer_common/Makefile.in | 14 ++ > libsanitizer/tsan/Makefile.am | 6 -- > libsanitizer/tsan/Makefile.in | 13 + > 14 files changed, 97 insertions(+), 31 deletions(-) > create mode 100644 libsanitizer/ChangeLog.asan > > 2012-11-22 H.J. Lu > > * Makefile.am (AM_MAKEFLAGS): Remove CC and CXX. > * configure.ac (ACX_NONCANONICAL_TARGET): New. > * asan/Makefile.am (AM_CXXFLAGS): Add -I for libstdc++-v3 header > files. > (AM_MAKEFLAGS): Remove CC and CXX. > * interception/Makefile.am: Likewise. > * sanitizer_common/Makefile.am: Likewise. > * tsan/Makefile.am: Likewise. > * Makefile.in: Regenerated. > * aclocal.m4: Likewise. > * configure: Likewise. > * asan/Makefile.in: Likewise. > * interception/Makefile.in: Likewise. > * sanitizer_common/Makefile.in: Likewise. > * tsan/Makefile.in: Likewise. > > diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am > index 64d3d2e..cd4e92d 100644 > --- a/libsanitizer/Makefile.am > +++ b/libsanitizer/Makefile.am > @@ -37,8 +37,6 @@ AM_MAKEFLAGS = \ > "includedir=$(includedir)" \ > "AR=$(AR)" \ > "AS=$(AS)" \ > - "CC=$(CC)" \ > - "CXX=$(CXX)" \ > "LD=$(LD)" \ > "LIBCFLAGS=$(LIBCFLAGS)" \ > "NM=$(NM)" \ > diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in > index 21c2711..53e0be9 100644 > --- a/libsanitizer/Makefile.in > +++ b/libsanitizer/Makefile.in > @@ -41,7 +41,8 @@ DIST_COMMON = $(am__configure_deps) > $(srcdir)/../config.guess \ > $(srcdir)/../mkinstalldirs $(srcdir)/Makefile.am \ > $(srcdir)/Makefile.in $(top_srcdir)/configure ChangeLog > ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 > -am__aclocal_m4_deps = $(top_srcdir)/../config/depstand.m4 \ > +am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \ > + $(top_srcdir)/../config/depstand.m4 \ > $(top_srcdir)/../config/lead-dot.m4 \ > $(top_srcdir)/../config/multi.m4 \ > $(top_srcdir)/../config/override.m4 \ > @@ -236,6 +237,7 @@ sysconfdir = @sysconfdir@ > target = @target@ > target_alias = @target_alias@ > target_cpu = @target_cpu@ > +target_noncanonical = @target_noncanonical@ > target_os = @target_os@ > target_vendor = @target_vendor@ > toolexecdir = @toolexecdir@ > @@ -278,8 +280,6 @@ AM_MAKEFLAGS = \ > "includedir=$(includedir)" \ > "AR=$(AR)" \ > "AS=$(AS)" \ > - "CC=$(CC)" \ > - "CXX=$(CXX)" \ > "LD=$(LD)" \ > "LIBCFLAGS=$(LIBCFLAGS)" \ > "NM=$(NM)" \ > diff --git a/libsanitizer/aclocal.m4 b/libsanitizer/aclocal.m4 > index a52bc30..d6782f8 100644 > --- a/libsanitizer/aclocal.m4 > +++ b/libsanitizer/aclocal.m4 > @@ -990,6 +990,7 @@ AC_SUBST([am__tar]) > AC_SUBST([am__untar]) > ]) # _AM_PROG_TAR > > +m4_include([../config/acx.m4]) > m4_include([../config/depstand.m4]) > m4_include([../config/lead-dot.m4]) > m4_include([../config/multi.m4]) > diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am > index 3da1db3..45fb3b3 100644 > --- a/libsanitizer/asan/Makefile.am > +++ b/libsanitizer/asan/Makefile.am > @@ -5,6 +5,10 @@ gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER) > > DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DASAN_HAS_EXCEPTIONS=1 > -DASAN_FLEXIBLE_MAPPING_AND_OFFSET=0 -DASAN_NEEDS_SEGV=1 > AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic > -Wno-long-long -fPIC -fno-builtin -fno-exc
Re: [PATCH, RFC] Dumping expanded MD files
On 2012-11-28 23:22, Michael Zolotukhin wrote: > +.PHONY: mddump > +mddump: $(BUILD_RTL) $(MD_DEPS) build/genmddump$(build_exeext) > + $(RUN_GEN) build/genmddump$(build_exeext) $(md_file) 2> tmp-mddump.md Why in the world are you dumping to stderr instead of stdout? r~
Re: [patch] PR middle-end/55401: uninstrument relevant path in TM clone
On 2012-11-28 17:05, Aldy Hernandez wrote: > Author: Aldy Hernandez > Date: Wed Nov 28 18:17:33 2012 -0600 > > PR middle-end/55401 > * trans-mem.c (get_tm_region_blocks): Exclude uninstrumented > blocks from vector if requested. > (collect_bb2reg): Pass new argument to > get_tm_region_blocks. > (get_bb_regions_instrumented): Add INCLUDE_UNINSTRUMENTED_P > argument, and pass it to expand_regions. > (execute_tm_mark): Pass new argument to > get_bb_regions_instrumented. > (execute_tm_edges): Same. Ok. r~
Re: [PATCH] Use explicit -I for libstdc++-v3 header files
On Thu, Nov 29, 2012 at 10:34 AM, Andrew Pinski wrote: > On Thu, Nov 29, 2012 at 10:30 AM, H.J. Lu wrote: >> Hi, >> >> Since libsanitizer is used for bootstrap and compiled with raw_cxx, >> we need to use explicit -I for libstdc++-v3 header files in >> libsanitizer. Otherwise, we will get >> >> libtool: compile: unrecognized option `-D_GNU_SOURCE' >> libtool: compile: Try `libtool --help' for more information. >> >> This patch fixes it. OK to install? > > Is there a reason why your change to have libsanitizer be bootstrapped > wait to be committed until this patch was approved? > I think you should revert the change to have libsanitizer be > bootstrapped and only commit that after this patch has been approved > because right now nobody can do any work as bootstrap is broken. > I posted my patch: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01530.html Since it didn't fix anything at the time, there was no incentive for anyone to review it. Please someone review my patch and get over with it. FWIW, libjava has the same issue with libstdc++ dependency as a bootstrap library. I applied the same solution to libsanitizer. -- H.J.
Re: [PATCH] Use explicit -I for libstdc++-v3 header files
On Thu, Nov 29, 2012 at 10:40 AM, H.J. Lu wrote: > On Thu, Nov 29, 2012 at 10:34 AM, Andrew Pinski wrote: >> On Thu, Nov 29, 2012 at 10:30 AM, H.J. Lu wrote: >>> Hi, >>> >>> Since libsanitizer is used for bootstrap and compiled with raw_cxx, >>> we need to use explicit -I for libstdc++-v3 header files in >>> libsanitizer. Otherwise, we will get >>> >>> libtool: compile: unrecognized option `-D_GNU_SOURCE' >>> libtool: compile: Try `libtool --help' for more information. >>> >>> This patch fixes it. OK to install? >> >> Is there a reason why your change to have libsanitizer be bootstrapped >> wait to be committed until this patch was approved? >> I think you should revert the change to have libsanitizer be >> bootstrapped and only commit that after this patch has been approved >> because right now nobody can do any work as bootstrap is broken. >> > > I posted my patch: > > http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01530.html > > Since it didn't fix anything at the time, there was no incentive > for anyone to review it. Please someone review my patch > and get over with it. So you broke the build to force someone to review your patch? That seems backwards. Thanks, Andrew > > FWIW, libjava has the same issue with libstdc++ dependency > as a bootstrap library. I applied the same solution to libsanitizer. > > > -- > H.J.
patch to fix PR55456
The following patch fixes gcc.gnu.org/bugzilla/show_bug.cgi?id=55456 The patch was successfully bootstrapped and tested on x86/x86-64. Committed as rev. 193948. 2012-11-28 Vladimir Makarov PR middle-end/55456 * lra-int.h (lra_new_regno_start): New external. * lra.c (lra_new_regno_start): New global. (lra): Set up lra_new_regno_start. * lra-constraints.c (match_reload): Sync values only for original pseudos. Index: lra.c === --- lra.c (revision 193901) +++ lra.c (working copy) @@ -2151,6 +2151,9 @@ update_inc_notes (void) /* Set to 1 while in lra. */ int lra_in_progress; +/* Start of pseudo regnos before the LRA. */ +int lra_new_regno_start; + /* Start of reload pseudo regnos before the new spill pass. */ int lra_constraint_new_regno_start; @@ -2235,7 +2238,7 @@ lra (FILE *f) so set up lra_constraint_new_regno_start before its call to permit changing reg classes for pseudos created by this simplification. */ - lra_constraint_new_regno_start = max_reg_num (); + lra_constraint_new_regno_start = lra_new_regno_start = max_reg_num (); remove_scratches (); scratch_p = lra_constraint_new_regno_start != max_reg_num (); Index: lra-constraints.c === --- lra-constraints.c (revision 193901) +++ lra-constraints.c (working copy) @@ -685,8 +685,10 @@ match_reload (signed char out, signed ch else new_out_reg = gen_rtx_SUBREG (outmode, reg, 0); /* If the input reg is dying here, we can use the same hard - register for REG and IN_RTX. */ - if (REG_P (in_rtx) + register for REG and IN_RTX. We do it only for original + pseudos as reload pseudos can die although original + pseudos still live where reload pseudos dies. */ + if (REG_P (in_rtx) && (int) REGNO (in_rtx) < lra_new_regno_start && find_regno_note (curr_insn, REG_DEAD, REGNO (in_rtx))) lra_reg_info[REGNO (reg)].val = lra_reg_info[REGNO (in_rtx)].val; } @@ -712,7 +714,9 @@ match_reload (signed char out, signed ch /* If SUBREG_REG is dying here and sub-registers IN_RTX and NEW_IN_REG are similar, we can use the same hard register for REG and SUBREG_REG. */ - if (REG_P (subreg_reg) && GET_MODE (subreg_reg) == outmode + if (REG_P (subreg_reg) + && (int) REGNO (subreg_reg) < lra_new_regno_start + && GET_MODE (subreg_reg) == outmode && SUBREG_BYTE (in_rtx) == SUBREG_BYTE (new_in_reg) && find_regno_note (curr_insn, REG_DEAD, REGNO (subreg_reg))) lra_reg_info[REGNO (reg)].val Index: lra-int.h === --- lra-int.h (revision 193901) +++ lra-int.h (working copy) @@ -295,6 +295,7 @@ extern lra_copy_t lra_get_copy (int); extern bool lra_former_scratch_p (int); extern bool lra_former_scratch_operand_p (rtx, int); +extern int lra_new_regno_start; extern int lra_constraint_new_regno_start; extern bitmap_head lra_inheritance_pseudos; extern bitmap_head lra_split_regs;
Re: [PATCH] Add --with-build-config=bootstrap-asan support
On Thu, Nov 29, 2012 at 10:32 AM, H.J. Lu wrote: > On Thu, Nov 29, 2012 at 10:24 AM, Paolo Carlini > wrote: >> On 11/29/2012 06:36 PM, Tobias Burnus wrote: >>> >>> H.J. Lu wrote: This patch adds --with-build-config=bootstrap-asan support. Tested on Linux/x86-64. OK to install? >>> >>> >>> I think that patch has broken bootstrap for me. If I do a normal >>> bootstrap, Stage1 fails with: >>> >>> libtool: compile: unrecognized option `-D_GNU_SOURCE' >>> libtool: compile: Try `libtool --help' for more information. >>> make[4]: *** [interception_linux.lo] Error 1 >>> make[4]: Leaving directory >>> `/home/burnus/gcc/build/x86_64-unknown-linux-gnu/libsanitizer/interception' >> >> Likewise here. Would it be possible to revert the offending commit, in the >> meanwhile? >> > > This is a known problem. Please try: > > http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02480.html > > I updated my patch for the newly added tsan. It fixed > bootstrap on Linux/ia32. > I checked in my patch to restore bootstrap after tested on Linux/ia32 and Linux/x86-64. I will revert both patches if my fix is incorrect. -- H.J.