RFA; MN10300: Fix AM33 clzsi2 pattern
Hi Alex, Hi Jeff, Hi Richard,

  The clzsi2/bsch patterns in the MN10300 backend do not work.  There
  are two problems - firstly the starting bit-search position for the
  BSCH instruction is not set.  Secondly the BSCH instruction returns
  the bit position of the highest set bit, not the number of leading
  zeros.

  The attached patch fixes both of these problems and shaves about 200
  unexpected failures off the gcc testsuite for the AM33 or AM33_2
  multilibs.

  OK to apply?

Cheers
  Nick

gcc/ChangeLog
2011-06-25  Nick Clifton

        * config/mn10300/mn10300.md (clzsi2): Remove unused const_int 0.
        (bsch): Remove unused second operand.  Initialise bit search
        starting position.  Convert located bit position into a zero count.

Index: gcc/config/mn10300/mn10300.md
===
--- gcc/config/mn10300/mn10300.md (revision 175370)
+++ gcc/config/mn10300/mn10300.md (working copy)
@@ -1812,21 +1812,25 @@
 ;; --
 
 (define_expand "clzsi2"
-  [(parallel [(set (match_operand:SI 0 "register_operand" "")
-                   (unspec:SI [(match_operand:SI 1 "register_operand" "")
-                               (const_int 0)] UNSPEC_BSCH))
+  [(parallel [(set (match_operand:SI 0 "register_operand")
+                   (unspec:SI [(match_operand:SI 1 "register_operand")]
+                              UNSPEC_BSCH))
               (clobber (reg:CC CC_REG))])]
   "TARGET_AM33"
 )
 
+;; The XOR in the instruction sequence below is there because the BSCH
+;; instruction returns the bit number of the highest set bit and we want
+;; the number of zero bits above that bit.  The AM33 does not have a
+;; reverse subtraction instruction, but we can use an xor instead since
+;; we know that the top 27 bits are clear.
 (define_insn "*bsch"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (unspec:SI [(match_operand:SI 1 "register_operand" "r")
-                    (match_operand:SI 2 "nonmemory_operand" "0")]
+  [(set (match_operand:SI 0 "register_operand" "=&r")
+        (unspec:SI [(match_operand:SI 1 "register_operand" "r")]
                    UNSPEC_BSCH))
    (clobber (reg:CC CC_REG))]
   "TARGET_AM33"
-  "bsch %1,%0"
+  "clr %0 ; bsch %1, %0; xor 31, %0"
 )
 
 ;; --
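A small illustration of why the xor can stand in for the reverse subtraction, for anyone reading along: for any bit position p in 0..31, 31 - p equals p ^ 31, because 31 has all of its low five bits set and the subtraction never borrows. The sketch below models the BSCH result with a hypothetical helper (highest_set_bit is an invented name, not a GCC or AM33 one) and assumes a nonzero 32-bit input. It is only a model of the arithmetic, not backend code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the BSCH result for a nonzero input: the bit
   number (0..31) of the highest set bit.  Not GCC or AM33 code.  */
static uint32_t highest_set_bit(uint32_t x)
{
    uint32_t pos = 0;
    while (x >>= 1)
        pos++;
    return pos;
}

/* Leading-zero count obtained the way the patch does it: xor the bit
   position with 31 instead of computing 31 - position.  */
static uint32_t clz_via_xor(uint32_t x)
{
    return highest_set_bit(x) ^ 31u;
}

/* Check that xor and reverse subtraction agree for every bit position.  */
static void check_xor_matches_reverse_subtract(void)
{
    for (uint32_t pos = 0; pos < 32; pos++)
        assert((pos ^ 31u) == 31u - pos);
}
```

Calling check_xor_matches_reverse_subtract() exercises all 32 positions; the identity only holds because the value is known to fit in five bits, which is what the comment in the patch relies on.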
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On Fri, Jun 24, 2011 at 6:58 PM, Stubbs, Andrew wrote:
> On 24/06/11 16:47, Richard Guenther wrote:
>>> > I can certainly add checks to make sure that the skipped operations
>>> > actually don't make any important changes to the value, but do I need to?
>> Yes.
>
> Ok, I'll go away and do that then.
>
> BTW, I see useless_type_conversion_p, but that's not quite what I want.
> Is there an equivalent existing function to determine whether a
> conversion changes the logical/arithmetic meaning of a type?
>
> I mean, conversion to a wider mode is not "useless", but it is harmless,
> whereas conversion to a narrower mode may truncate the value.

Well, you have to decide that for the concrete situation based on the
signedness and precision of the types involved.  All such conversions
change the logical/arithmetic meaning of a type if seen in the right
context.

Richard.

> Andrew
>
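To make the signedness/precision point concrete, here is a rough sketch of the kind of check being discussed. The function name and its parameters are made up for illustration; this is not an existing GCC predicate, and it only models plain integer types described by a bit precision and a signedness flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: return true if converting an integer of
   FROM_PREC bits to one of TO_PREC bits can never change the
   arithmetic value.  Not a GCC function.  */
static bool conversion_preserves_value(unsigned from_prec, bool from_unsigned,
                                       unsigned to_prec, bool to_unsigned)
{
    /* Narrowing may truncate.  */
    if (to_prec < from_prec)
        return false;

    /* Same precision: only safe if the signedness is unchanged.  */
    if (to_prec == from_prec)
        return from_unsigned == to_unsigned;

    /* Widening: an unsigned source always fits; a signed source only
       keeps negative values if the destination is signed too.  */
    return from_unsigned || !to_unsigned;
}
```

So widening a signed HImode value to an unsigned SImode type is already a case where "wider" is not automatically harmless, which is presumably the sort of context the reply has in mind.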
Re: PATCH TRUNK: better format output for time reports.
On Sat, Jun 25, 2011 at 8:40 AM, Basile Starynkevitch wrote:
> Hello All,
>
> When cc1 reports timing, the field for the timing variable name is too
> narrow for "df reg dead/unused notes":
>
> Execution times (seconds)
>  phase setup           :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1089 kB ( 1%) ggc
>  trivially dead code   :   0.02 ( 1%) usr   0.00 ( 0%) sys   0.03 ( 1%) wall       0 kB ( 0%) ggc
>  df scan insns         :   0.07 ( 2%) usr   0.00 ( 0%) sys   0.11 ( 2%) wall      42 kB ( 0%) ggc
>  df live regs          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 1%) wall       0 kB ( 0%) ggc
>  df reg dead/unused notes:   0.02 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1395 kB ( 1%) ggc
>  register information  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
>  alias analysis        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     393 kB ( 0%) ggc
>  rebuild jump labels   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
>  preprocessing         :   0.11 ( 3%) usr   0.10 (16%) sys   0.39 ( 8%) wall   11550 kB (10%) ggc
>  lexical analysis      :   0.02 ( 1%) usr   0.11 (17%) sys   0.31 ( 7%) wall       0 kB ( 0%) ggc
>
> The following trivial patch should fix that:
> ### patch against trunk 175201
> Index: gcc/timevar.c
> ===
> --- gcc/timevar.c (revision 175201)
> +++ gcc/timevar.c (working copy)
> @@ -478,7 +478,7 @@ timevar_print (FILE *fp)
>          continue;
>
>        /* The timing variable name.  */
> -      fprintf (fp, " %-22s:", tv->name);
> +      fprintf (fp, " %-24s:", tv->name);
>
>  #ifdef HAVE_USER_TIME
>        /* Print user-mode time for this process.  */
> ### gcc/ChangeLog entry
> 2011-06-25  Basile Starynkevitch
>
>         * timevar.c (timevar_print): Increase width for display of timevar
>         name.
> ###
>
> Ok for trunk?

Ok.

Thanks,
Richard

> Regards
> --
> Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mine, sont seulement les miennes} ***
>
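For reference, a minimal stand-alone sketch of what the width change does. format_timevar_name is a hypothetical helper, not part of timevar.c; it only reproduces the format string from the patch with the field width passed in as a parameter, so the two widths can be compared side by side.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a timevar name the way timevar_print
   does, but with a configurable field width.  Returns the number of
   characters produced, i.e. the column where the timings would start.  */
static int format_timevar_name(char *buf, size_t size, int width,
                               const char *name)
{
    /* "%-*s" left-justifies NAME in a field of at least WIDTH
       characters; longer names simply overflow the field.  */
    return snprintf(buf, size, " %-*s:", width, name);
}
```

The name "df reg dead/unused notes" is 24 characters long, so with a width of 22 it overflows by two columns while every shorter name is padded to 22; with a width of 24 all names in the report above end at the same column.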
Re: PATCH TRUNK: better format output for time reports.
On Sat, 25 Jun 2011 11:38:20 +0200 Richard Guenther wrote:
> > ### patch against trunk 175201
> > Index: gcc/timevar.c
> > ===
> > --- gcc/timevar.c (revision 175201)
> > +++ gcc/timevar.c (working copy)
> > @@ -478,7 +478,7 @@ timevar_print (FILE *fp)
> >          continue;
> >
> >        /* The timing variable name.  */
> > -      fprintf (fp, " %-22s:", tv->name);
> > +      fprintf (fp, " %-24s:", tv->name);
> >
> >  #ifdef HAVE_USER_TIME
> >        /* Print user-mode time for this process.  */
> > ### gcc/ChangeLog entry
> > 2011-06-25  Basile Starynkevitch
> >
> >         * timevar.c (timevar_print): Increase width for display of timevar
> >         name.
> > ###
> >
> > Ok for trunk?
>
> Ok.
>

Committed revision 175396.

Thanks

--
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***
Re: PATCH TRUNK: better format output for time reports.
> > Ok.
>
> Committed revision 175396.

Please avoid posting messages like this.  See http://gcc.gnu.org/svnwrite.html:

"When you have checked in a patch exactly as it has been approved, you do not
need to tell that to people -- including the approver.  People interested in
when a particular patch is committed can check SVN or the gcc-cvs list."

--
Eric Botcazou
Re: Simplify Solaris configuration
> All bootstraps completed without regressions, so I've installed the > patch. I'll address eventual issues and further simplifications as a > followup. I cannot bootstrap SPARC64/Solaris anymore though: /nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9/./gcc/xgcc -B/nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9/./gcc/ -B/nile.build/botcazou/gcc-head/install_sparc64/sparc64-sun-solaris2.9/bin/ -B/nile.build/botcazou/gcc-head/install_sparc64/sparc64-sun-solaris2.9/lib/ -isystem /nile.build/botcazou/gcc-head/install_sparc64/sparc64-sun-solaris2.9/include -isystem /nile.build/botcazou/gcc-head/install_sparc64/sparc64-sun-solaris2.9/sys-include -g -O2 -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -g -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -fbuilding-libgcc -fno-stack-protector -fPIC -I. -I. -I../.././gcc -I/nile.build/botcazou/gcc-head/src/libgcc -I/nile.build/botcazou/gcc-head/src/libgcc/. 
-I/nile.build/botcazou/gcc-head/src/libgcc/../gcc -I/nile.build/botcazou/gcc-head/src/libgcc/../include -DHAVE_CC_TLS -o _muldi3.o -MT _muldi3.o -MD -MP -MF _muldi3.dep -DL_muldi3 -c /nile.build/botcazou/gcc-head/src/libgcc/../gcc/libgcc2.c \ -fvisibility=hidden -DHIDE_EXPORTS In file included from /nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9/./gcc/include-fixed/sys/feature_tests.h:24:0, from /nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9/./gcc/include-fixed/iso/stdio_iso.h:44, from /nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9/./gcc/include-fixed/stdio.h:36, from /nile.build/botcazou/gcc-head/src/libgcc/../gcc/tsystem.h:87, from /nile.build/botcazou/gcc-head/src/libgcc/../gcc/libgcc2.c:29: /usr/include/sys/isa_defs.h:280:2: error: #error "SPARC Versions 8 and 9 are mutually exclusive choices" /usr/include/sys/isa_defs.h:376:2: error: #error "Both _ILP32 and _LP64 are defined" make[3]: *** [_muldi3.o] Error 1 make[3]: Leaving directory `/nfs/nile/nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9/sparc64-sun-solaris2.9/libgcc' make[2]: *** [all-stage1-target-libgcc] Error 2 make[2]: Leaving directory `/nfs/nile/nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9' make[1]: *** [stage1-bubble] Error 2 make[1]: Leaving directory `/nfs/nile/nile.build/botcazou/gcc-head/sparc64-sun-solaris2.9' make: *** [all] Error 2 -- Eric Botcazou
[patch, darwin, committed] fix PR49371
I committed the following patch as approved in the PR thread.

cheers
Iain

gcc/

        PR driver/49371
        * config/darwin.c (darwin_override_options): Improve warning when
        mdynamic-no-pic is given together with fPIC/fpic, also warn when it
        is given with fpie/fPIE.
        * config/darwin.h (PIE_SPEC): New,
        (LINK_SPEC): Use PIE_SPEC.
        * config/darwin9.h (PIE_SPEC): New.

Index: gcc/config/darwin.c
===
--- gcc/config/darwin.c (revision 175396)
+++ gcc/config/darwin.c (working copy)
@@ -2932,7 +2932,9 @@ darwin_override_options (void)
   if (MACHO_DYNAMIC_NO_PIC_P)
     {
       if (flag_pic)
-        warning (0, "-mdynamic-no-pic overrides -fpic or -fPIC");
+        warning_at (UNKNOWN_LOCATION, 0,
+                    "%<-mdynamic-no-pic%> overrides %<-fpic%>, %<-fPIC%>,"
+                    " %<-fpie%> or %<-fPIE%>");
       flag_pic = 0;
     }
   else if (flag_pic == 1)
Index: gcc/config/darwin.h
===
--- gcc/config/darwin.h (revision 175396)
+++ gcc/config/darwin.h (working copy)
@@ -226,6 +226,8 @@ extern GTY(()) int darwin_ms_struct;
 #define LINK_SYSROOT_SPEC "%{isysroot*:-syslibroot %*}"
 #endif
 
+#define PIE_SPEC "%{fpie|pie|fPIE:}"
+
 /* Please keep the random linker options in alphabetical order (modulo
    'Z' and 'no' prefixes). Note that options taking arguments may appear
    multiple times on a command line with different arguments each time,
@@ -290,7 +292,7 @@ extern GTY(()) int darwin_ms_struct;
      %:version-compare(< 10.5 mmacosx-version-min= -multiply_defined) \
      %:version-compare(< 10.5 mmacosx-version-min= suppress)}} \
    %{Zmultiplydefinedunused*:-multiply_defined_unused %*} \
-   %{fpie:-pie} \
+   " PIE_SPEC " \
    %{prebind} %{noprebind} %{nofixprebinding} %{prebind_all_twolevel_modules} \
    %{read_only_relocs} \
    %{sectcreate*} %{sectorder*} %{seg1addr*} %{segprot*} \
Index: gcc/config/darwin9.h
===
--- gcc/config/darwin9.h (revision 175396)
+++ gcc/config/darwin9.h (working copy)
@@ -35,6 +35,12 @@ along with GCC; see the file COPYING3.  If not see
 /* Tell collect2 to run dsymutil for us as necessary.  */
 #define COLLECT_RUN_DSYMUTIL 1
 
+#undef PIE_SPEC
+#define PIE_SPEC \
+  "%{fpie|pie|fPIE: \
+     %{mdynamic-no-pic: %n'-mdynamic-no-pic' overrides '-pie', '-fpie' or '-fPIE'; \
+      :-pie}}"
+
 #undef ASM_OUTPUT_ALIGNED_COMMON
 #define ASM_OUTPUT_ALIGNED_COMMON(FILE, NAME, SIZE, ALIGN) \
   do { \
Re: [PATCH (0/7)] Improve use of Widening Multiplies
On 06/23/11 16:34, Andrew Stubbs wrote:
> The patches provide a number of improvements:
>
>  * Support for instructions that widen by more than one mode
>    (e.g. from HImode to DImode).
>
>  * Use of widening multiplies even when the input mode is narrower than
>    the instruction uses. (e.g. Use HI->DI to do QI->DI).
>
>  * Use of signed widening multiplies (of a larger mode) where unsigned
>    multiplies are not available.
>
>  * Support for input operands with mis-matched signedness, with or
>    without usmul_widen_optab.
>
>  * Support for input operands with mis-matched mode [1].
>
>  * Improved pattern matching in the widening_mult pass.
>    * Recognition of true types, even if obscured by a cast.
>    * Insertion of extra gimple statements where the existing code was
>      incompatible with widening multiplies.
>    * Recognition of widening multiply-and-accumulate even where the
>      multiply expression was not widening.

That all sounds good, but missing from this list is something that
occurs on many CPUs - widening from the high part of a register.  The
current machinery only recognizes lowxlow widening multiplication, but
hardware often exists for highxlow and highxhigh.  For example, Blackfin
has "hisi_lh"/hl/hh instruction patterns; C6X also has a full set; ARM
has mulhisi3tb/bt/tt.

Do you think it will be possible to extend your new framework to handle
this case as well?


Bernd
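For readers less familiar with these instructions, the source-level shapes in question look roughly like the sketch below. The function names are invented for the example (they are not the Blackfin, C6X or ARM pattern names), and the sketch assumes GCC's usual two's complement wrap-around for the narrowing casts, which the C standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Signed low and high 16-bit halves of a 32-bit register value.  The
   conversion to int16_t is implementation-defined for out-of-range
   values; this sketch assumes it wraps modulo 2^16, as GCC does.  */
static int16_t low_half(int32_t x)
{
    return (int16_t)(uint16_t)((uint32_t)x & 0xffffu);
}

static int16_t high_half(int32_t x)
{
    return (int16_t)(uint16_t)((uint32_t)x >> 16);
}

/* low x low: the HI->SI widening multiply shape recognized today.  */
static int32_t mul_low_low(int32_t a, int32_t b)
{
    return (int32_t)low_half(a) * (int32_t)low_half(b);
}

/* high x low and high x high: the shapes the reply asks about.  */
static int32_t mul_high_low(int32_t a, int32_t b)
{
    return (int32_t)high_half(a) * (int32_t)low_half(b);
}

static int32_t mul_high_high(int32_t a, int32_t b)
{
    return (int32_t)high_half(a) * (int32_t)high_half(b);
}
```

In all three cases the multiply itself is a 16x16->32 widening multiply; only the selection of the input half differs, which is why hardware can fold the shift into the instruction.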
Re: PATCH [5/n]: Prepare x32: PR middle-end/48016: Inconsistency in non-local goto save area
On Thu, Jun 16, 2011 at 10:18 AM, H.J. Lu wrote:
> On Thu, Jun 16, 2011 at 12:56 AM, Richard Guenther wrote:
>> On Wed, Jun 15, 2011 at 9:55 PM, H.J. Lu wrote:
>>> On Wed, Jun 15, 2011 at 8:16 AM, Michael Matz wrote:
>>>> Hi,
>>>>
>>>> On Wed, 15 Jun 2011, H.J. Lu wrote:
>>>>
>>>> > >> +  /* FIXME: update_nonlocal_goto_save_area may pass SA in the wrong mode.  */
>>>> > >> +  if (GET_MODE (sa) != mode)
>>>> > >> +    {
>>>> > >> +      gcc_assert (ptr_mode != Pmode
>>>> > >> +                  && GET_MODE (sa) == ptr_mode
>>>> > >> +                  && mode == Pmode);
>>>> > >> +      sa = adjust_address (sa, mode, 0);
>>>> > >> +    }
>>>> > >
>>>> > > That may be appropriate for a branch, but trunk shouldn't contain FIXMEs
>>>> > > that explain how something should be fixed, instead that something should
>>>> > > be carried out.  I.e. just fix update_nonlocal_goto_save_area.
>>>> > >
>>>> >
>>>> > I don't know update_nonlocal_goto_save_area enough to fix it
>>>> > without breaking other targets.  This patch is the least invasive.
>>>> > Any suggestions how to properly fix it is appreciated.
>>>>
>>>> Well, the most obvious variant would be to move the above code right
>>>> before the call of emit_stack_save in update_nonlocal_goto_save_area
>>>> (using r_save and STACK_SAVEAREA_MODE (SAVE_NONLOCAL)).  All other callers
>>>> of emit_stack_save already make sure to pass an object of correct mode, so
>>>> this one should too.
>>>>
>>>> But I think it's better to just produce a correct array_ref from the
>>>> start.  get_nl_goto_field creates an array_type for the
>>>> nonlocal_goto_save_area of correct type (ptr_type_node or
>>>> lang_hooks.types.type_for_mode (Pmode, 1)), and we should use that.  So
>>>> something like this in update_nonlocal_goto_save_area:
>>>>
>>>>   t_save = build4 (ARRAY_REF,
>>>>                    TREE_TYPE (TREE_TYPE (cfun->nonlocal_goto_save_area)),
>>>>                    cfun->nonlocal_goto_save_area,
>>>>                    integer_one_node, NULL_TREE, NULL_TREE);
>>>>
>>>> instead of the current building of t_save.  Then r_save also should get
>>>> the correct mode automatically.
>>>
>>> Here is the updated patch.  OK for trunk?
>>
>> The explow.c change is ok.  For the function.c change I wonder why
>> convert_memory_address doesn't do the right thing - from its documentation
>> it definitely should, so it should be fixed instead of being replaced by
>> adjust_address with a zero offset.
>>
>
> convert_memory_address may return a pseudo register converted
> to Pmode.  But here what we want is the same memory address
> adjusted for Pmode.  I don't think the usage of convert_memory_address
>

Here is the code in question:

      r_save = convert_memory_address (Pmode, r_save);
      emit_move_insn (r_save, targetm.builtin_setjmp_frame_value ());

R_SAVE must be an lvalue, but the return value of convert_memory_address
isn't.  I am re-posting my patch here.  OK for trunk?

Thanks.

--
H.J.
---
2011-06-15  H.J. Lu

        PR middle-end/48016
        * explow.c (update_nonlocal_goto_save_area): Use proper mode
        for stack save area.
        * function.c (expand_function_start): Properly store frame
        pointer for non-local goto.

diff --git a/gcc/explow.c b/gcc/explow.c
index c7d8183..efe6c7e 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1102,7 +1097,9 @@ update_nonlocal_goto_save_area (void)
      first one is used for the frame pointer save; the rest are sized by
      STACK_SAVEAREA_MODE.  Create a reference to array index 1, the first
      of the stack save area slots.  */
-  t_save = build4 (ARRAY_REF, ptr_type_node, cfun->nonlocal_goto_save_area,
+  t_save = build4 (ARRAY_REF,
+                   TREE_TYPE (TREE_TYPE (cfun->nonlocal_goto_save_area)),
+                   cfun->nonlocal_goto_save_area,
                    integer_one_node, NULL_TREE, NULL_TREE);
   r_save = expand_expr (t_save, NULL_RTX, VOIDmode, EXPAND_WRITE);
 
diff --git a/gcc/function.c b/gcc/function.c
index 81c4d39..131bc09 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -4780,7 +4780,7 @@ expand_function_start (tree subr)
                        cfun->nonlocal_goto_save_area,
                        integer_zero_node, NULL_TREE, NULL_TREE);
       r_save = expand_expr (t_save, NULL_RTX, VOIDmode, EXPAND_WRITE);
-      r_save = convert_memory_address (Pmode, r_save);
+      r_save = adjust_address (r_save, Pmode, 0);
 
       emit_move_insn (r_save, targetm.builtin_setjmp_frame_value ());
       update_nonlocal_goto_save_area ();
PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD > sizeof (void *)
Hi, This patch introduces UNIQUE_UNWIND_CONTEXT and properly saves/stores registers with UNITS_PER_WORD > sizeof (void *) as suggested in http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01526.html OK for trunk? Thanks. H.J. --- 2011-04-09 H.J. Lu PR other/48007 * unwind-dw2.c (UNIQUE_UNWIND_CONTEXT): New. (_Unwind_Context): If UNIQUE_UNWIND_CONTEXT is defined, add dwarf_reg_size_table and value, remove version and by_value. (EXTENDED_CONTEXT_BIT): Don't define if UNIQUE_UNWIND_CONTEXT is defined. (_Unwind_IsExtendedContext): Likewise. (_Unwind_GetGR): Support UNIQUE_UNWIND_CONTEXT. (_Unwind_SetGR): Likewise. (_Unwind_GetGRPtr): Likewise. (_Unwind_SetGRPtr): Likewise. (_Unwind_SetGRValue): Likewise. (_Unwind_GRByValue): Likewise. (__frame_state_for): Initialize dwarf_reg_size_table field if UNIQUE_UNWIND_CONTEXT is defined. (uw_install_context_1): Likewise. Support UNIQUE_UNWIND_CONTEXT. diff --git a/gcc/unwind-dw2.c b/gcc/unwind-dw2.c index 25990b4..5fa2723 100644 --- a/gcc/unwind-dw2.c +++ b/gcc/unwind-dw2.c @@ -59,6 +59,12 @@ #define DWARF_REG_TO_UNWIND_COLUMN(REGNO) (REGNO) #endif +#ifndef UNIQUE_UNWIND_CONTEXT +#if defined __x86_64 && !defined __LP64__ +# define UNIQUE_UNWIND_CONTEXT +#endif +#endif + /* This is the register and unwind state for a particular frame. This provides the information necessary to unwind up past a frame and return to its caller. */ @@ -69,6 +75,15 @@ struct _Unwind_Context void *ra; void *lsda; struct dwarf_eh_bases bases; +#ifdef UNIQUE_UNWIND_CONTEXT + /* Used to check for unique _Unwind_Context. */ + void *dwarf_reg_size_table; + /* Signal frame context. */ +#define SIGNAL_FRAME_BIT ((_Unwind_Word) 1 >> 0) + _Unwind_Word flags; + _Unwind_Word args_size; + _Unwind_Word value[DWARF_FRAME_REGISTERS+1]; +#else /* Signal frame context. */ #define SIGNAL_FRAME_BIT ((~(_Unwind_Word) 0 >> 1) + 1) /* Context which has version/args_size/by_value fields. 
*/ @@ -79,6 +94,7 @@ struct _Unwind_Context _Unwind_Word version; _Unwind_Word args_size; char by_value[DWARF_FRAME_REGISTERS+1]; +#endif }; /* Byte size of every register managed by these routines. */ @@ -144,11 +160,13 @@ _Unwind_SetSignalFrame (struct _Unwind_Context *context, int val) context->flags &= ~SIGNAL_FRAME_BIT; } +#ifndef UNIQUE_UNWIND_CONTEXT static inline _Unwind_Word _Unwind_IsExtendedContext (struct _Unwind_Context *context) { return context->flags & EXTENDED_CONTEXT_BIT; } +#endif /* Get the value of register INDEX as saved in CONTEXT. */ @@ -168,8 +186,14 @@ _Unwind_GetGR (struct _Unwind_Context *context, int index) size = dwarf_reg_size_table[index]; ptr = context->reg[index]; +#ifdef UNIQUE_UNWIND_CONTEXT + gcc_assert (context->dwarf_reg_size_table == &dwarf_reg_size_table); + if (context->reg[index] == &context->value[index]) +return context->value[index]; +#else if (_Unwind_IsExtendedContext (context) && context->by_value[index]) return (_Unwind_Word) (_Unwind_Internal_Ptr) ptr; +#endif /* This will segfault if the register hasn't been saved. 
*/ if (size == sizeof(_Unwind_Ptr)) @@ -207,11 +231,20 @@ _Unwind_SetGR (struct _Unwind_Context *context, int index, _Unwind_Word val) gcc_assert (index < (int) sizeof(dwarf_reg_size_table)); size = dwarf_reg_size_table[index]; +#ifdef UNIQUE_UNWIND_CONTEXT + gcc_assert (context->dwarf_reg_size_table == &dwarf_reg_size_table); + if (context->reg[index] == &context->value[index]) +{ + context->value[index] = val; + return; +} +#else if (_Unwind_IsExtendedContext (context) && context->by_value[index]) { context->reg[index] = (void *) (_Unwind_Internal_Ptr) val; return; } +#endif ptr = context->reg[index]; @@ -230,8 +263,10 @@ static inline void * _Unwind_GetGRPtr (struct _Unwind_Context *context, int index) { index = DWARF_REG_TO_UNWIND_COLUMN (index); +#ifndef UNIQUE_UNWIND_CONTEXT if (_Unwind_IsExtendedContext (context) && context->by_value[index]) return &context->reg[index]; +#endif return context->reg[index]; } @@ -241,8 +276,10 @@ static inline void _Unwind_SetGRPtr (struct _Unwind_Context *context, int index, void *p) { index = DWARF_REG_TO_UNWIND_COLUMN (index); +#ifndef UNIQUE_UNWIND_CONTEXT if (_Unwind_IsExtendedContext (context)) context->by_value[index] = 0; +#endif context->reg[index] = p; } @@ -254,10 +291,15 @@ _Unwind_SetGRValue (struct _Unwind_Context *context, int index, { index = DWARF_REG_TO_UNWIND_COLUMN (index); gcc_assert (index < (int) sizeof(dwarf_reg_size_table)); +#ifdef UNIQUE_UNWIND_CONTEXT + gcc_assert (dwarf_reg_size_table[index] == sizeof (_Unwind_Word)); + context->value[index] = val; + context->reg[index] = &context->value[index]; +#else gcc_assert (dwarf_reg_size_table[index] == sizeof
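As far as the diff above shows, the idea is that a register stored "by value" no longer has its value squeezed into the pointer-sized reg[] slot; instead the value lives in a separate word-sized array and reg[index] points at that array element, which doubles as the "by value" marker. A much simplified model of that scheme follows. All names here (model_context, model_word, MODEL_NREGS and the two functions) are invented for the sketch and are not the unwind-dw2.c declarations.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NREGS 4            /* stands in for DWARF_FRAME_REGISTERS + 1 */

typedef uint64_t model_word;     /* stands in for _Unwind_Word */

/* Simplified stand-in for the UNIQUE_UNWIND_CONTEXT layout.  */
struct model_context {
    void *reg[MODEL_NREGS];          /* where each register was saved */
    model_word value[MODEL_NREGS];   /* storage for by-value registers */
};

/* Record a register by value: store the full word and make the
   pointer slot refer to it, so no bits are lost even if a pointer
   is narrower than a word.  */
static void model_set_gr_value(struct model_context *ctx, int index,
                               model_word val)
{
    ctx->value[index] = val;
    ctx->reg[index] = &ctx->value[index];
}

/* Read a register: a slot pointing into value[] is a by-value
   register, anything else is a pointer to the saved location.  */
static model_word model_get_gr(const struct model_context *ctx, int index)
{
    if (ctx->reg[index] == (const void *)&ctx->value[index])
        return ctx->value[index];
    return *(const model_word *)ctx->reg[index];
}
```

The old layout stored the value itself in reg[index] through a pointer cast, which cannot hold a full word once UNITS_PER_WORD exceeds sizeof (void *); that is the case the PR title describes.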
PATCH [9/n]: Prepare x32: PR middle-end/47383: ivopts miscompiles Pmode != ptr_mode
Hi, I was informed that MEM_REF only works in ptr_mode. This patch changes addr_for_mem_ref to use ptr_mode. OK for trunk? Thanks. H.J. --- 2011-06-25 H.J. Lu PR middle-end/47383 * tree-ssa-address.c (addr_for_mem_ref): Use ptr_mode instead of targetm.addr_space.address_mode. diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c index e3934e1..ddc6d58 100644 --- a/gcc/tree-ssa-address.c +++ b/gcc/tree-ssa-address.c @@ -188,12 +188,12 @@ rtx addr_for_mem_ref (struct mem_address *addr, addr_space_t as, bool really_expand) { - enum machine_mode address_mode = targetm.addr_space.address_mode (as); rtx address, sym, bse, idx, st, off; struct mem_addr_template *templ; if (addr->step && !integer_onep (addr->step)) -st = immed_double_int_const (tree_to_double_int (addr->step), address_mode); +st = immed_double_int_const (tree_to_double_int (addr->step), +ptr_mode); else st = NULL_RTX; @@ -201,7 +201,7 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as, off = immed_double_int_const (double_int_sext (tree_to_double_int (addr->offset), TYPE_PRECISION (TREE_TYPE (addr->offset))), -address_mode); +ptr_mode); else off = NULL_RTX; @@ -220,16 +220,16 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as, if (!templ->ref) { sym = (addr->symbol ? -gen_rtx_SYMBOL_REF (address_mode, ggc_strdup ("test_symbol")) +gen_rtx_SYMBOL_REF (ptr_mode, ggc_strdup ("test_symbol")) : NULL_RTX); bse = (addr->base ? -gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1) +gen_raw_REG (ptr_mode, LAST_VIRTUAL_REGISTER + 1) : NULL_RTX); idx = (addr->index ? -gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2) +gen_raw_REG (ptr_mode, LAST_VIRTUAL_REGISTER + 2) : NULL_RTX); - gen_addr_rtx (address_mode, sym, bse, idx, + gen_addr_rtx (ptr_mode, sym, bse, idx, st? const0_rtx : NULL_RTX, off? const0_rtx : NULL_RTX, &templ->ref, @@ -247,16 +247,16 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as, /* Otherwise really expand the expressions. */ sym = (addr->symbol -? 
expand_expr (addr->symbol, NULL_RTX, address_mode, EXPAND_NORMAL) +? expand_expr (addr->symbol, NULL_RTX, ptr_mode, EXPAND_NORMAL) : NULL_RTX); bse = (addr->base -? expand_expr (addr->base, NULL_RTX, address_mode, EXPAND_NORMAL) +? expand_expr (addr->base, NULL_RTX, ptr_mode, EXPAND_NORMAL) : NULL_RTX); idx = (addr->index -? expand_expr (addr->index, NULL_RTX, address_mode, EXPAND_NORMAL) +? expand_expr (addr->index, NULL_RTX, ptr_mode, EXPAND_NORMAL) : NULL_RTX); - gen_addr_rtx (address_mode, sym, bse, idx, st, off, &address, NULL, NULL); + gen_addr_rtx (ptr_mode, sym, bse, idx, st, off, &address, NULL, NULL); return address; }
Re: RFA; MN10300: Fix AM33 clzsi2 pattern
On 06/25/2011 01:18 AM, Nick Clifton wrote:
> Hi Alex, Hi Jeff, Hi Richard,
>
>   The clzsi2/bsch patterns in the MN10300 backend do not work.  There
>   are two problems - firstly the starting bit-search position for the
>   BSCH instruction is not set.

Yes it is.  What do you think that "unused" second operand does?

>   Secondly the BSCH instruction returns
>   the bit position of the highest set bit, not the number of leading
>   zeros.

Ah, I do see that.

I recommend you put the xor in the clz expander rather than cluttering
up the "bsch" pattern.


r~
PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const_int)))
Hi, When reload gets: (insn 588 587 589 28 (set (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0) (const_int 8 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8 A64]) (reg:DF 340 [ D.14980 ])) spooles.c:291 106 {*movdf_internal_rex64} (expr_list:REG_DEAD (reg:DF 340 [ D.14980 ]) (nil))) it generates: Reloads for insn # 588 Reload 0: reload_in (DI) = (reg/v/f:DI 182 [ b ]) GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0) reload_in_reg: (reg/v/f:DI 182 [ b ]) reload_reg_rtx: (reg:DI 1 dx) Reload 1: reload_in (DI) = (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0) (const_int 8 [0x8]))) GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0) reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0) (const_int 8 [0x8]))) reload_reg_rtx: (reg:DI 1 dx) Reload 2: reload_out (DF) = (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0) (const_int 8 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8 A64]) NO_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0) (const_int 8 [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8 A64]) leads to (insn 1017 587 1020 34 (set (reg:DI 1 dx) (mem/c:DI (plus:DI (reg/f:DI 7 sp) (const_int 112 [0x70])) [5 %sfp+-208 S8 A64])) spooles.c:291 62 {*movdi_internal_rex64} (nil)) (insn 1020 1017 1022 34 (set (reg:SI 1 dx) (const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal} (nil)) (insn 1022 1020 1023 34 (set (reg:SI 1 dx) (reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal} (nil)) (insn 1023 1022 1024 34 (set (reg:SI 1 dx) (plus:SI (reg:SI 1 dx) (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32} (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0) (const_int 8 [0x8])) (nil))) (insn 1024 1023 588 34 (set (reg:DI 1 dx) (zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112 {*zero_extendsidi2_rex64} (expr_list:REG_EQUIV 
(zero_extend:DI (plus:SI (subreg:SI (reg:DI 1 dx) 0) (const_int 8 [0x8]))) (nil))) (insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8 A64]) (reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106 {*movdf_internal_rex64} (nil)) gen_load has if (CONSTANT_P (op1) || MEM_P (op1) || GET_CODE (op1) == SUBREG || (REG_P (op1) For (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0) (const_int 8 [0x8])) it swaps SUBREG and CONST_INT. It leads to wrong code. This patch checks if OP0 is SUBREG before swapping. OK for trunk? Thanks. H.J. 2011-06-25 H.J. Lu PR rtl-optimization/49114 * reload1.c (gen_reload): Properly handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const_int))) diff --git a/gcc/reload1.c b/gcc/reload1.c index 4a697c2..d618a29 100644 --- a/gcc/reload1.c +++ b/gcc/reload1.c @@ -8528,7 +8528,9 @@ gen_reload (rtx out, rtx in, int opnum, enum reload_type type) code = optab_handler (add_optab, GET_MODE (out)); - if (CONSTANT_P (op1) || MEM_P (op1) || GET_CODE (op1) == SUBREG + if ((GET_CODE (op0) != SUBREG + && (CONSTANT_P (op1) || MEM_P (op1))) + || GET_CODE (op1) == SUBREG || (REG_P (op1) && REGNO (op1) >= FIRST_PSEUDO_REGISTER) || (code != CODE_FOR_nothing
PATCH [11/n]: Prepare x32: PR rtl-optimization/48155: Reload doesn't handle subreg properly
Hi,

This is the last target independent patch for x32.  I will start
submitting x86 specific patches in a week.

Given input:

(plus:SI (subreg:SI (plus:DI (reg/f:DI 7 sp)
            (const_int 16 [0x10])) 0)
    (const_int -1 [0x]))

reload tries to add

(subreg:SI (plus:DI (reg/f:DI 7 sp)
        (const_int 16 [0x10])) 0)

to

(reg:SI 1 dx)

This code:

  /* How to do this reload can get quite tricky.  Normally, we are being
     asked to reload a simple operand, such as a MEM, a constant, or a pseudo
     register that didn't get a hard register.  In that case we can just
     call emit_move_insn.

     We can also be asked to reload a PLUS that adds a register or a MEM to
     another register, constant or MEM.  This can occur during frame pointer
     elimination and while reloading addresses.  This case is handled by
     trying to emit a single insn to perform the add.  If it is not valid,
     we use a two insn sequence.

     Or we can be asked to reload an unary operand that was a fragment of
     an addressing mode, into a register.  If it isn't recognized as-is,
     we try making the unop operand and the reload-register the same:
     (set reg:X (unop:X expr:Y))
     -> (set reg:Y expr:Y) (set reg:X (unop:X reg:Y)).

     Finally, we could be called to handle an 'o' constraint by putting
     an address into a register.  In that case, we first try to do this
     with a named pattern of "reload_load_address".  If no such pattern
     exists, we just emit a SET insn and hope for the best (it will normally
     be valid on machines that use 'o').

     This entire process is made complex because reload will never
     process the insns we generate here and so we must ensure that
     they will fit their constraints and also by the fact that parts of
     IN might be being reloaded separately and replaced with spill registers.
     Because of this, we are, in some sense, just guessing the right approach
     here.  The one listed above seems to work.

     ??? At some point, this whole thing needs to be rethought.  */

  if (GET_CODE (in) == PLUS
      && (REG_P (XEXP (in, 0))
          || GET_CODE (XEXP (in, 0)) == SUBREG
          || MEM_P (XEXP (in, 0)))
      && (REG_P (XEXP (in, 1))
          || GET_CODE (XEXP (in, 1)) == SUBREG
          || CONSTANT_P (XEXP (in, 1))
          || MEM_P (XEXP (in, 1))))

doesn't check if XEXP (in, 0/1) is a SUBREG of REG.  This patch adds a
new function, reload_plus_ok, to check this condition.  OK for trunk?

Thanks.

H.J.
---
2011-06-25  H.J. Lu

        PR rtl-optimization/48155
        * reload1.c (reload_plus_ok): New.
        (gen_reload_chain_without_interm_reg_p): Use it.
        (gen_reload): Likewise.

diff --git a/gcc/reload1.c b/gcc/reload1.c
index e65503b..1864ae6 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -5544,6 +5544,54 @@ substitute (rtx *where, const_rtx what, rtx repl)
     }
 }
 
+/* Return TRUE if IN is a valid plus operation.  */
+
+static bool
+reload_plus_ok (rtx in)
+{
+  if (GET_CODE (in) == PLUS)
+    {
+      rtx op0 = XEXP (in, 0);
+      rtx op1 = XEXP (in, 1);
+      if ((REG_P (op0)
+           || GET_CODE (op0) == SUBREG
+           || MEM_P (op0))
+          && (REG_P (op1)
+              || GET_CODE (op1) == SUBREG
+              || CONSTANT_P (op1)
+              || MEM_P (op1)))
+        {
+          rtx subreg, other;
+          if (GET_CODE (op0) == SUBREG)
+            {
+              subreg = SUBREG_REG (op0);
+              other = op1;
+            }
+          else if (GET_CODE (op1) == SUBREG)
+            {
+              subreg = SUBREG_REG (op1);
+              other = op0;
+            }
+          else
+            return true;
+
+          /* Avoid
+             (plus (subreg (plus (reg)
+                                 (const_int NNN)))
+                   (const_int NNN))
+           */
+          if (GET_CODE (subreg) == PLUS
+              && (CONSTANT_P (XEXP (subreg, 0))
+                  || CONSTANT_P (XEXP (subreg, 1)))
+              && CONSTANT_P (other))
+            return false;
+
+          return true;
+        }
+    }
+  return false;
+}
+
 /* The function returns TRUE if chain of reload R1 and R2 (in any
    order) can be evaluated without usage of intermediate register for
    the reload containing another reload.  It is important to see
@@ -5596,14 +5644,7 @@ gen_reload_chain_without_interm_reg_p (int r1, int r2)
      opposite SUBREG on OUT.  Likewise for a paradoxical SUBREG on OUT.  */
   strip_paradoxical_subreg (&in, &out);
 
-  if (GET_CODE (in) == PLUS
-      && (REG_P (XEXP (in, 0))
-          || GET_CODE (XEXP (in, 0)) == SUBREG
-          || MEM_P (XEXP (in, 0)))
-      && (REG_P (XEXP (in, 1))
-          || GET_CODE (XEXP (in, 1)) == SUBREG
-          || CONSTANT_P (XEXP (in, 1))
-          || MEM_P (XEXP (in, 1))))
+  if (reload_plus_ok (in))
     {
       insn = emit_insn (gen_rtx_SET (VOIDmode, out, in));
       code = recog_memoized (insn);
@@ -8449,14 +8490,7 @@ gen_reload (rtx ou
Re: Patch ping
> [testsuite] > http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01069.html > PR tree-optimization/48377, PR middle-end/49191 > trunk/4.6 > non_strict_align testsuite support Mike, Rainer, can one of you two take a look at this? -- Eric Botcazou
Re: PR tree-optimize/49373 (IPA-PTA regression)
Hi, just for those who are interested, this is a quick&dirty patch adding another pass of local optimization passes at WPA time. I've added the early inliner and IPA-SRA because I was curious how many optimization opportunities we are missing by limiting those to the early pass. With early inlining it seems to be very little. We inline one extra call when building Mozilla in LTO mode. IPA SRA is a different story. While we do 579 IPA SRA clones in the early pass, the late pass produces 13014 clones (22 times more ;) suggesting that the pass might be interesting at IPA level after all. There are 78686 functions after inlining in Mozilla, so one out of 7 functions is touched. Size difference of libxul is not great, about 100Kb reduction. I will try benchmarking it eventually, too. Honza Index: cgraph.c === *** cgraph.c(revision 175350) --- cgraph.c(working copy) *** cgraph_release_function_body (struct cgr *** 1389,1396 } if (cfun->cfg) { ! gcc_assert (dom_computed[0] == DOM_NONE); ! gcc_assert (dom_computed[1] == DOM_NONE); clear_edges (); } if (cfun->value_histograms) --- 1393,1403 } if (cfun->cfg) { ! /*gcc_assert (dom_computed[0] == DOM_NONE); ! gcc_assert (dom_computed[1] == DOM_NONE);*/ ! free_dominance_info (CDI_DOMINATORS); ! free_dominance_info (CDI_POST_DOMINATORS); ! clear_edges (); } if (cfun->value_histograms) Index: tree-pass.h === *** tree-pass.h (revision 175350) --- tree-pass.h (working copy) *** extern struct simple_ipa_opt_pass pass_i *** 452,458 extern struct simple_ipa_opt_pass pass_ipa_function_and_variable_visibility; extern struct simple_ipa_opt_pass pass_ipa_tree_profile; ! extern struct simple_ipa_opt_pass pass_early_local_passes; extern struct ipa_opt_pass_d pass_ipa_whole_program_visibility; extern struct ipa_opt_pass_d pass_ipa_lto_gimple_out; --- 452,458 extern struct simple_ipa_opt_pass pass_ipa_function_and_variable_visibility; extern struct simple_ipa_opt_pass pass_ipa_tree_profile; ! 
extern struct simple_ipa_opt_pass pass_early_local_passes, pass_late_local_passes, pass_late_local_passes2; extern struct ipa_opt_pass_d pass_ipa_whole_program_visibility; extern struct ipa_opt_pass_d pass_ipa_lto_gimple_out; Index: ipa-inline-analysis.c === *** ipa-inline-analysis.c (revision 175350) --- ipa-inline-analysis.c (working copy) *** estimate_function_body_sizes (struct cgr *** 1535,1542 edge->call_stmt_cannot_inline_p = true; gimple_call_set_cannot_inline (stmt, true); } ! else ! gcc_assert (!gimple_call_cannot_inline_p (stmt)); } /* TODO: When conditional jump or swithc is known to be constant, but --- 1535,1542 edge->call_stmt_cannot_inline_p = true; gimple_call_set_cannot_inline (stmt, true); } ! /*else ! gcc_assert (!gimple_call_cannot_inline_p (stmt));*/ } /* TODO: When conditional jump or swithc is known to be constant, but Index: tree-inline.c === *** tree-inline.c (revision 175350) --- tree-inline.c (working copy) *** expand_call_inline (basic_block bb, gimp *** 3891,3897 id->src_cfun = DECL_STRUCT_FUNCTION (fn); id->gimple_call = stmt; ! gcc_assert (!id->src_cfun->after_inlining); id->entry_bb = bb; if (lookup_attribute ("cold", DECL_ATTRIBUTES (fn))) --- 3891,3897 id->src_cfun = DECL_STRUCT_FUNCTION (fn); id->gimple_call = stmt; ! /*gcc_assert (!id->src_cfun->after_inlining);*/ id->entry_bb = bb; if (lookup_attribute ("cold", DECL_ATTRIBUTES (fn))) Index: tree-optimize.c === *** tree-optimize.c (revision 175350) --- tree-optimize.c (working copy) *** struct simple_ipa_opt_pass pass_early_lo *** 123,128 --- 123,189 /* Gate: execute, or not, all of the non-trivial optimizations. */ static bool + gate_all_late_local_passes (void) + { + /* Don't bother doing anything if the program has errors. */ + return (!seen_error () && optimize); + } + + static unsigned int + execute_all_late_local_passes (void) + { + /* Once this pass (and its sub-passes) are complete, all functions + will be in SSA form. 
Technically this state change is happening + a tad late, since the sub-passes have not yet run, but since + none of the sub-passes are IPA passes and do not create new + func
PR 49337: Make Gnatmake look for libgnat.so if it doesn't find libgnat.a.
Hello! The maintainers of GCC in Fedora have split out libgnat.a and libgnarl.a to a separate subpackage which is by default not installed together with the Gnat tools. (Fedora has a policy to use only shared libraries as much as possible.) This causes Gnatmake to crash when it tries to find the directory that contains libgnat by looking for libgnat.a and the file isn't there. This patch is one possible way of improving Gnatmake to also look for libgnat.so if it doesn't find libgnat.a, so that it can work with only shared libraries installed. I'm not going to claim that it's the best way, but it is what I could come up with. Björn Persson Changelog entry: * mlib-tgt-specific-linux.adb (Libgnat_Ptr): Look for libgnat.so if libgnat.a is not installed. --- gcc/ada/mlib-tgt-specific-linux.adb~ +++ gcc/ada/mlib-tgt-specific-linux.adb @@ -50,6 +50,8 @@ function Is_Archive_Ext (Ext : String) return Boolean; + function Libgnat return String; + --- -- Build_Dynamic_Library -- --- @@ -142,7 +144,27 @@ return Ext = ".a" or else Ext = ".so"; end Is_Archive_Ext; + - + -- Libgnat -- + - + + function Libgnat return String is + Libgnat_A : constant String := "libgnat.a"; + Libgnat_So : constant String := "libgnat.so"; + + begin + Name_Len := Libgnat_A'Length; + Name_Buffer (1 .. Name_Len) := Libgnat_A; + + if Osint.Find_File (Name_Enter, Osint.Library) /= No_File then + return Libgnat_A; + else + return Libgnat_So; + end if; + end Libgnat; + begin Build_Dynamic_Library_Ptr := Build_Dynamic_Library'Access; - Is_Archive_Ext_Ptr := Is_Archive_Ext'Access; + Is_Archive_Ext_Ptr:= Is_Archive_Ext'Access; + Libgnat_Ptr := Libgnat'Access; end MLib.Tgt.Specific;
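The fallback described in this message (prefer libgnat.a, otherwise report libgnat.so) can be sketched in C as below. This is only an illustration of the selection logic, not the Ada code from the patch: `lib_exists_fn`, `choose_libgnat` and the two stub predicates are hypothetical names invented here, and the predicate stands in for the `Osint.Find_File` lookup on the library search path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: returns nonzero if NAME is found on the
   library search path.  Stands in for Osint.Find_File in the patch.  */
typedef int (*lib_exists_fn) (const char *name);

/* Prefer the static archive; otherwise fall back to the shared library.
   As in the patch, the fallback name is returned without checking that
   libgnat.so actually exists.  */
static const char *
choose_libgnat (lib_exists_fn exists)
{
  assert (exists != NULL);
  return exists ("libgnat.a") ? "libgnat.a" : "libgnat.so";
}

/* Stub predicate (assumption): a system with only shared libraries.  */
static int
only_shared_installed (const char *name)
{
  return strcmp (name, "libgnat.so") == 0;
}

/* Stub predicate (assumption): a system with both variants installed.  */
static int
both_installed (const char *name)
{
  (void) name;
  return 1;
}
```

With `only_shared_installed` the function yields "libgnat.so", and with `both_installed` it yields "libgnat.a", which matches the behaviour the changelog entry describes.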
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD > sizeof (void *)
On Sat, 25 Jun 2011, H.J. Lu wrote: > +#ifndef UNIQUE_UNWIND_CONTEXT The use of #ifndef seems to imply that this is a target macro, to be defined in libgcc_tm.h. In that case it should be documented, and poisoned in system.h under the "only used for code built for the target" case, and this: > +#if defined __x86_64 && !defined __LP64__ is inappropriate since you should instead put it in an appropriate header in libgcc/config/, rather than hardcoding an architecture-specific #if in an architecture-independent file. -- Joseph S. Myers jos...@codesourcery.com
Re: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD > sizeof (void *)
On Sat, Jun 25, 2011 at 1:19 PM, Joseph S. Myers wrote: > On Sat, 25 Jun 2011, H.J. Lu wrote: > >> +#ifndef UNIQUE_UNWIND_CONTEXT > > The use of #ifndef seems to imply that this is a target macro, to be > defined in libgcc_tm.h. In that case it should be documented, and > poisoned in system.h under the "only used for code built for the target" > case, and this: > >> +#if defined __x86_64 && !defined __LP64__ > > is inappropriate since you should instead put it in an appropriate header > in libgcc/config/, rather than hardcoding an architecture-specific #if in > an architecture-independent file. > Here is the updated patch. OK for trunk? Thanks. -- H.J. --- gcc/ 2011-06-25 H.J. Lu PR other/48007 * config.gcc (libgcc_tm_file): Add i386/unique-unwind.h for Linux/x86. * system.h (UNIQUE_UNWIND_CONTEXT): Poisoned. * unwind-dw2.c (_Unwind_Context): If UNIQUE_UNWIND_CONTEXT is defined, add dwarf_reg_size_table and value, remove version and by_value. (EXTENDED_CONTEXT_BIT): Don't define if UNIQUE_UNWIND_CONTEXT is defined. (_Unwind_IsExtendedContext): Likewise. (_Unwind_GetGR): Support UNIQUE_UNWIND_CONTEXT. (_Unwind_SetGR): Likewise. (_Unwind_GetGRPtr): Likewise. (_Unwind_SetGRPtr): Likewise. (_Unwind_SetGRValue): Likewise. (_Unwind_GRByValue): Likewise. (__frame_state_for): Initialize dwarf_reg_size_table field if UNIQUE_UNWIND_CONTEXT is defined. (uw_install_context_1): Likewise. Support UNIQUE_UNWIND_CONTEXT. * doc/tm.texi.in: Document UNIQUE_UNWIND_CONTEXT. * doc/tm.texi: Regenerated. libgcc/ 2011-06-25 H.J. Lu * config/i386/unique-unwind.h: New file.
diff --git a/gcc/config.gcc b/gcc/config.gcc index a1dbd1a..cdcabac 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -2627,6 +2648,7 @@ esac case ${target} in i[34567]86-*-linux* | x86_64-*-linux*) tmake_file="${tmake_file} i386/t-pmm_malloc i386/t-i386" + libgcc_tm_file="${libgcc_tm_file} i386/unique-unwind.h" ;; i[34567]86-*-* | x86_64-*-*) tmake_file="${tmake_file} i386/t-gmm_malloc i386/t-i386" diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 341628b..ad8543d 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -3701,6 +3701,13 @@ return @code{@var{regno}}. @end defmac +@defmac UNIQUE_UNWIND_CONTEXT + +Define this macro if the target only supports single unqiue unwind +context. The default is to support multiple unwind contexts. + +@end defmac + @node Elimination @subsection Eliminating Frame Pointer and Arg Pointer diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index f7c16e9..9847014 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -3687,6 +3687,13 @@ return @code{@var{regno}}. @end defmac +@defmac UNIQUE_UNWIND_CONTEXT + +Define this macro if the target only supports single unqiue unwind +context. The default is to support multiple unwind contexts. 
+ +@end defmac + @node Elimination @subsection Eliminating Frame Pointer and Arg Pointer diff --git a/gcc/system.h b/gcc/system.h index e02cbcd..e9771af 100644 --- a/gcc/system.h +++ b/gcc/system.h @@ -764,7 +764,7 @@ extern void fancy_abort (const char *, int, const char *) ATTRIBUTE_NORETURN; /* Target macros only used for code built for the target, that have moved to libgcc-tm.h or have never been present elsewhere. */ #pragma GCC poison DECLARE_LIBRARY_RENAMES LIBGCC2_GNU_PREFIX \ - MD_UNWIND_SUPPORT ENABLE_EXECUTE_STACK + MD_UNWIND_SUPPORT ENABLE_EXECUTE_STACK UNIQUE_UNWIND_CONTEXT /* Other obsolete target macros, or macros that used to be in target headers and were not used, and may be obsolete or may never have diff --git a/gcc/unwind-dw2.c b/gcc/unwind-dw2.c index 19da299..ed6d15f 100644 --- a/gcc/unwind-dw2.c +++ b/
Re: [PATCH] Only run pr48377.c testcase on i?86/x86_64
On Jun 14, 2011, at 8:53 AM, Jakub Jelinek wrote: > On Tue, Jun 14, 2011 at 04:52:18PM +0200, Eric Botcazou wrote: >>> Well, Steve has a patch for non_strict_align effective_target >>> in http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00673.html >>> (with s/strict_align/non_strict_align/g ), I was hoping it would be >>> reviewed and I'd just adjust the testcase to use it as well. >> >> Would it be applied to the 4.6 branch as well? If no, I think you should >> apply >> your patch to trunk and 4.6 branch and let Steve adjust it on trunk later. > > I'd say it should be applied there as well. > > Here is what I've just bootstrapped/regtested, Steve's patch with that > s/strict_align/non_strict_align/g plus a smallish change on top of that. > > Mike, is this ok for trunk/4.6? Ok. > 2011-06-14 Jakub Jelinek > > PR tree-optimization/48377 > * gcc.dg/vect/pr48377.c: Add dg-require-effective-target > non_strict_align. > > 2011-06-14 Steve Ellcey > > PR middle-end/49191 > * lib/target-supports.exp (check_effective_target_non_strict_align): > New. > * gcc.dg/memcpy-3.c: Add dg-require-effective-target non_strict_align.
Re: Patch ping
On Jun 25, 2011, at 11:53 AM, Eric Botcazou wrote: >> [testsuite] >> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01069.html >> PR tree-optimization/48377, PR middle-end/49191 >> trunk/4.6 >> non_strict_align testsuite support > > Mike, Rainer, can one of you two take a look at this? Done, reviewed and approved in original thread.
Re: [testsuite] dg-require-effective-target: skip unneeded checks
On Jun 21, 2011, at 2:02 PM, Janis Johnson wrote: > This patch causes dg-require-effective-target to return early if the > test is already being skipped, saving some work. There's already > similar code in dg-skip-if. > > OK for trunk, and later for 4.6? Ok. Ok for 4.6. I'm not a huge fan of upvar, but you didn't create the problem, so you don't have to fix it.
Re: [testsuite] skip ARM neon-fp16 tests for other -mcpu values
On Jun 15, 2011, at 5:57 PM, Janis Johnson wrote: > The bug was in my attempt to run the tests with other -mfpu values, so > I'm very glad you caught that. I tried again, getting rid of the neon > requirement along the way, and found a way to run the VFP fp16 tests > with any of the fp16 values that Joseph listed. > > This patch renames *arm_neon_fp16* to *arm_fp16* and skips tests if the > multilib does not support arm32, includes -mfpu that is not fp16, or > includes -mfloat-abi=soft. If the multilib uses -mfpu= with an fp16 > value then that is used, otherwise -mfpu=vfpv4 is used. Added flags > include -mfloat-abi=softfp in case the default is "soft". > OK for trunk, and for 4.6 a few days later? Ok. Ok for 4.6. For 4.6, please also ensure that the RMs don't have the branch locked down. General comment: I'm happy to have the front-end, target and library maintainers review and approve the normal additions to the .exp files to support testing their bits.
Re: [testsuite] scan-assembler variants to use 'unresolved' for missing file
On Jun 20, 2011, at 1:34 PM, Janis Johnson wrote: > Variants of scan-assembler, used in the dg-final steps of a test, do > not check that the output file exists, so it's reported as an error. > With this patch, if the file is missing then the check is reported as > unresolved using the same message as for pass/fail, and the reason for > the unresolved test is reported in the log file. This matches recent > changes for scan-dump and object-size. > > OK for trunk, and later for 4.6? Ok. Ok for 4.6. Please watch for any fallout; I don't expect any, but...
Re: RFA; MN10300: Fix AM33 clzsi2 pattern
Hi Richard,
>> The clzsi2/bsch patterns in the MN10300 backend do not work. There are two problems - firstly the starting bit-search position for the BSCH instruction is not set.
> Yes it is. What do you think that "unused" second operand does?
That actually works? Gross!
>> Secondly the BSCH instruction returns the bit position of the highest set bit, not the number of leading zeros.
> Ah, I do see that. I recommend you put the xor in the clz expander rather than cluttering up the "bsch" pattern.
OK - revised patch attached. Is this version OK to apply? Cheers Nick gcc/ChangeLog 2011-06-26 Nick Clifton * mn10300.md (clzsi2): Use XOR after BSCH to convert bit position of highest bit set into a count of the high zero bits. Index: gcc/config/mn10300/mn10300.md === --- gcc/config/mn10300/mn10300.md (revision 175395) +++ gcc/config/mn10300/mn10300.md (working copy) @@ -1811,10 +1811,24 @@ ;; MISCELANEOUS ;; -- +;; Note the use of the (const_int 0) when generating the insn that matches +;; the bsch pattern. This ensures that the destination register is +;; initialised with 0 which will make the BSCH instruction set searching +;; at bit 31. +;; +;; The XOR in the instruction sequence below is there because the BSCH +;; instruction returns the bit number of the highest set bit and we want +;; the number of zero bits above that bit. The AM33 does not have a +;; reverse subtraction instruction, but we can use a simple xor instead +;; since we know that the top 27 bits are clear. (define_expand "clzsi2" - [(parallel [(set (match_operand:SI 0 "register_operand" "") - (unspec:SI [(match_operand:SI 1 "register_operand" "") + [(parallel [(set (match_operand:SI 0 "register_operand") + (unspec:SI [(match_operand:SI 1 "register_operand") (const_int 0)] UNSPEC_BSCH)) + (clobber (reg:CC CC_REG))]) + (parallel [(set (match_dup 0) + (xor:SI (match_dup 0) + (const_int 31))) (clobber (reg:CC CC_REG))])] "TARGET_AM33" )
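The reasoning in the patch comment (XOR with 31 replaces the missing reverse subtraction because the bit number fits in the low 5 bits) can be checked with a small C sketch. This is an illustration on the host, not AM33 code: `highest_set_bit` is a hypothetical stand-in for what the message says BSCH returns, and `clz_via_bsch` and `xor_equals_reverse_subtract` are names invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the BSCH result as described in the thread:
   the bit number (0..31) of the highest set bit.  X must be nonzero.  */
static unsigned
highest_set_bit (uint32_t x)
{
  unsigned pos = 0;

  assert (x != 0);
  while (x >>= 1)
    pos++;
  return pos;
}

/* Leading-zero count obtained the way the patch does it: take the bit
   number and XOR it with 31 instead of computing 31 - bit number.  */
static unsigned
clz_via_bsch (uint32_t x)
{
  return highest_set_bit (x) ^ 31u;
}

/* Returns 1 if n ^ 31 equals 31 - n for every n in 0..31, i.e. whenever
   the top 27 bits of n are clear.  */
static int
xor_equals_reverse_subtract (void)
{
  unsigned n;

  for (n = 0; n < 32; n++)
    if ((n ^ 31u) != 31u - n)
      return 0;
  return 1;
}
```

For example, `clz_via_bsch (1)` gives 31, `clz_via_bsch (0x00010000)` gives 15 and `clz_via_bsch (0x80000000)` gives 0. The identity holds because subtracting a 5-bit value from 31 (binary 11111) never borrows, so it simply flips each of the low five bits.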