Re: [PATCH, PR 49495] Cgraph verifier must look through aliases
> Hi, > > PR 49495 is actually a bug in the verifier that does not look through > aliases at one point. Fixed wit the patch below (created a special > function, otherwise I just wasn't able to fit the 80 column limit). > Bootstrapped and tested on x86_64-linux. OK for trunk? > > Thanks, > > Martin > > > 2011-07-02 Martin Jambor > > PR middle-end/49495 > * cgraphunit.c (verify_edge_corresponds_to_fndecl): New function. > (verify_cgraph_node): Some functinality moved to > verify_edge_corresponds_to_fndecl, call it. This is OK. > > > Index: src/gcc/cgraphunit.c > === > --- src.orig/gcc/cgraphunit.c > +++ src/gcc/cgraphunit.c > @@ -450,6 +450,34 @@ cgraph_debug_gimple_stmt (struct functio >debug_gimple_stmt (stmt); > } > > +/* Verify that call graph edge E corresponds to DECL from the associated > + statement. Return true if the verification should fail. */ > + > +static bool > +verify_edge_corresponds_to_fndecl (struct cgraph_edge *e, tree decl) > +{ > + if (!e->callee->global.inlined_to > + && decl > + && cgraph_get_node (decl) > + && (e->callee->former_clone_of > + != cgraph_function_or_thunk_node (cgraph_get_node (decl), NULL)->decl) > + /* IPA-CP sometimes redirect edge to clone and then back to the former > + function. This ping-pong has to go, eventaully. */ > + && (cgraph_function_or_thunk_node (cgraph_get_node (decl), NULL) > + != cgraph_function_or_thunk_node (e->callee, NULL)) > + && !clone_of_p (cgraph_get_node (decl), > + e->callee)) > +{ > + error ("edge points to wrong declaration:"); > + debug_tree (e->callee->decl); > + fprintf (stderr," Instead of:"); > + debug_tree (decl); > + return true; > +} > + else > +return false; > +} > + > /* Verify cgraph nodes of given cgraph node. 
*/ > DEBUG_FUNCTION void > verify_cgraph_node (struct cgraph_node *node) > @@ -702,24 +730,8 @@ verify_cgraph_node (struct cgraph_node * > } > if (!e->indirect_unknown_callee) > { > - if (!e->callee->global.inlined_to > - && decl > - && cgraph_get_node (decl) > - && (e->callee->former_clone_of > - != cgraph_get_node (decl)->decl) > - /* IPA-CP sometimes redirect edge to clone and > then back to the former > -function. This ping-pong has to go, > eventaully. */ > - && (cgraph_function_or_thunk_node > (cgraph_get_node (decl), NULL) > - != cgraph_function_or_thunk_node > (e->callee, NULL)) > - && !clone_of_p (cgraph_get_node (decl), > - e->callee)) > - { > - error ("edge points to wrong declaration:"); > - debug_tree (e->callee->decl); > - fprintf (stderr," Instead of:"); > - debug_tree (decl); > - error_found = true; > - } > + if (verify_edge_corresponds_to_fndecl (e, decl)) > + error_found = true; Could you please move the error output here, somehow I like it better when all the diagnostic is output at single place... Honza > } > else if (decl) > {
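The request above, to keep all diagnostic output in one place, can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_edge`, `toy_edge_mismatches_decl` and `toy_verify_edges` are hypothetical names, declarations are modelled as plain integers, and none of this is the actual cgraph code. The helper is a pure predicate, and the caller is the only function that prints.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a call graph edge; decls are plain integers,
   with 0 meaning "no decl".  Not GCC's real data structure.  */
struct toy_edge
{
  int callee_decl;
};

/* Pure predicate: reports a mismatch but prints nothing.  */
static bool
toy_edge_mismatches_decl (const struct toy_edge *e, int decl)
{
  return decl != 0 && e->callee_decl != decl;
}

/* Caller: the only place that emits diagnostics.  Returns the number of
   mismatching edges.  */
static int
toy_verify_edges (const struct toy_edge *edges, const int *decls, int n,
                  FILE *out)
{
  int errors = 0;
  for (int i = 0; i < n; i++)
    if (toy_edge_mismatches_decl (&edges[i], decls[i]))
      {
        fprintf (out, "edge %d points to wrong declaration: %d instead of %d\n",
                 i, edges[i].callee_decl, decls[i]);
        errors++;
      }
  return errors;
}
```

With this split the predicate stays short enough for the 80 column limit, while the wording of the error lives next to the other verifier messages.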
Re: [testsuite] ARM test pr42093.c: thumb2 or thumb1
On 01/07/11 20:56, Janis Johnson wrote: > On 07/01/2011 02:02 AM, Richard Earnshaw wrote: >> On 24/06/11 14:18, Ramana Radhakrishnan wrote: >>> On 24/06/11 01:40, Janis Johnson wrote: Test gcc.target/arm/pr42093.c, added by Ramana, requires support for arm_thumb2 but fails for those targets. The patch for which it was added modified support for thumb1. Should the test instead require arm_thumb1_ok, as in this patch? >>> >>> No this is for a Thumb2 defect so the test is valid for Thumb2 - we >>> shouldn't be generating a tbb / tbh with signed offsets and that's what >>> was happening there. >>> >>> This test I think ends up being fragile because the generation of tbb / >>> tbh depends on how the blocks have been laid out . It would be >>> interesting to try and get a test that works reliably in T2 . >>> >>> cheers >>> Ramana >>> Janis >>> >>> >>> >> Perhaps -fno-reorder-blocks could be used to make it less fragile. >> >> R. >> > > It passes for all thumb2 targets with that option. > > Janis > > > Ok, so consider a patch to use that option pre-approved. R.
Re: [Ada] Fix parallel LTO bootstrap
> Not clear why this never showed up on the 4.6 branch, but this now prevents a
> parallel LTO bootstrap with Ada enabled from completing on the mainline.
>
> Parallel LTO-bootstrapped, applied on the mainline and 4.6 branch.
>
>
> 2011-07-01  Eric Botcazou
>
>         * gcc-interface/Make-lang.in (gnat1): Prepend '+' to the command.
>         (gnatbind): Likewise.

The change is obviously correct, but I wonder how the bootstrap dies without
the '+'.  IMO it should just prevent the parallelism and take longer.

Honza
Re: PATCH: PR target/49600: Bad SSE2 int->float split in i386.md
On Mon, Jul 4, 2011 at 7:13 AM, H.J. Lu wrote: In one SSE2 int->float split, when TARGET_USE_VECTOR_CONVERTS is true, TARGET_INTER_UNIT_MOVES is false and GENERAL_REG_P (op1) is true. we will get gcc_unreachable. This patch removes TARGET_INTER_UNIT_MOVES check. OK for trunk? >>> >>> This will result in register allocation failure. Operand 0 of > > That particular sse2_loadld insn matches: > > (insn 49 22 50 5 (set (reg:V4SI 21 xmm0 [83]) > (vec_merge:V4SI (vec_duplicate:V4SI (reg/v:SI 1 dx [orig:64 > test ] [64])) > (const_vector:V4SI [ > (const_int 0 [0]) > (const_int 0 [0]) > (const_int 0 [0]) > (const_int 0 [0]) > ]) > (const_int 1 [0x1]))) x.i:11 1365 {vec_setv4si_0} > (nil)) > Yes, but it should not be generated for !TARGET_INTER_UNIT_MOVES. The constraint should be Yi, but then we don't shadow other alternatives correctly. >>> sse2_loadld pattern has conditional constraint Yi that depends on >>> TARGET_INTER_UNIT_MOVES, so we can't blindly generate sse2_loadld >>> after reload. I'm testing attached patch. >>> >>> BTW: Do you perhaps have a testcase for this problem? >> >> I have a testcase. But it needs a new x86 optimization we are working on it. >> >>> 2011-07-03 Uros Bizjak >>> >>> PR target/49600 >>> * config/i386/i386.md (SSE2 int->float split): Push operand 1 in >>> general register to memory for !TARGET_INTER_UNIT_MOVES. >>> >> >> I will give it a try. >> > > It doesn't work: I still got Yes, I later noticed that I have changed the wrong pattern (the one with memory clobber) ;( . Attached is the correct patch. Uros. 
Index: config/i386/i386.md === --- config/i386/i386.md (revision 175786) +++ config/i386/i386.md (working copy) @@ -5022,11 +5022,20 @@ if (GET_CODE (op1) == SUBREG) op1 = SUBREG_REG (op1); - if (GENERAL_REG_P (op1) && TARGET_INTER_UNIT_MOVES) + if (GENERAL_REG_P (op1)) { operands[4] = simplify_gen_subreg (V4SImode, operands[0], mode, 0); - emit_insn (gen_sse2_loadld (operands[4], - CONST0_RTX (V4SImode), operands[1])); + if (TARGET_INTER_UNIT_MOVES) + emit_insn (gen_sse2_loadld (operands[4], + CONST0_RTX (V4SImode), operands[1])); + else + { + operands[5] = ix86_force_to_memory (GET_MODE (operands[1]), + operands[1]); + emit_insn (gen_sse2_loadld (operands[4], + CONST0_RTX (V4SImode), operands[5])); + ix86_free_from_memory (GET_MODE (operands[1])); + } } /* We can ignore possible trapping value in the high part of SSE register for non-trapping math. */
Ping #1: [testsuite, AVR]: Add some progmem test cases
Georg-Johann Lay wrote: > Some runtime and checks for error/warning for C/C++. Note that some tests fail because of pending http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html Johann testsuite/ * gcc.target/avr/avr.exp: Run over cpp files, too. * gcc.target/avr/torture/avr-torture.exp: Ditto. * gcc.target/avr/progmem.h: New file. * gcc.target/avr/exit-abort.h: New file. * gcc.target/avr/progmem-error-1.c: New file. * gcc.target/avr/progmem-error-1.cpp: New file. * gcc.target/avr/progmem-warning-1.c: New file. * gcc.target/avr/torture/progmem-1.c: New file. * gcc.target/avr/torture/progmem-1.cpp: New file. Index: gcc.target/avr/avr.exp === --- gcc.target/avr/avr.exp (revision 175628) +++ gcc.target/avr/avr.exp (working copy) @@ -34,7 +34,7 @@ if ![info exists DEFAULT_CFLAGS] then { dg-init # Main loop. -dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \ +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.{\[cCS\],cpp}]] \ "" $DEFAULT_CFLAGS # All done. Index: gcc.target/avr/torture/avr-torture.exp === --- gcc.target/avr/torture/avr-torture.exp (revision 175628) +++ gcc.target/avr/torture/avr-torture.exp (working copy) @@ -52,7 +52,7 @@ set-torture-options $AVR_TORTURE_OPTIONS # Main loop. -gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] $DEFAULT_CFLAGS +gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.{\[cS\],cpp}]] $DEFAULT_CFLAGS # Finalize use of torture lists. 
torture-finish Index: gcc.target/avr/torture/progmem-1.c === --- gcc.target/avr/torture/progmem-1.c (revision 0) +++ gcc.target/avr/torture/progmem-1.c (revision 0) @@ -0,0 +1,30 @@ +/* { dg-do run } */ + +#include "../exit-abort.h" +#include "../progmem.h" + +const char strA[] PROGMEM = "@A"; +const char strc PROGMEM = 'c'; + +unsigned int volatile s = 2; + +int main() +{ +char c; + +c = pgm_read_char (&strA[s-1]); +if (c != 'A') +abort(); + +c = pgm_read_char (&PSTR ("@@B")[s]); +if (c != 'B') +abort(); + +c = pgm_read_char (&strc); +if (c != 'c') +abort(); + +exit (0); + +return 0; +} Index: gcc.target/avr/torture/progmem-1.cpp === --- gcc.target/avr/torture/progmem-1.cpp (revision 0) +++ gcc.target/avr/torture/progmem-1.cpp (revision 0) @@ -0,0 +1,2 @@ +/* { dg-do run } */ +#include "progmem-1.c" Index: gcc.target/avr/exit-abort.h === --- gcc.target/avr/exit-abort.h (revision 0) +++ gcc.target/avr/exit-abort.h (revision 0) @@ -0,0 +1,8 @@ +#ifdef __cplusplus +extern "C" { +#endif + extern void exit (int); + extern void abort (void); +#ifdef __cplusplus +} +#endif Index: gcc.target/avr/progmem-warning-1.c === --- gcc.target/avr/progmem-warning-1.c (revision 0) +++ gcc.target/avr/progmem-warning-1.c (revision 0) @@ -0,0 +1,6 @@ +/* { dg-do compile } */ +/* { dg-options "-Wuninitialized" } */ + +#include "progmem.h" + +const char c PROGMEM; /* { dg-warning "uninitialized variable 'c' put into program memory area" } */ Index: gcc.target/avr/progmem-error-1.c === --- gcc.target/avr/progmem-error-1.c (revision 0) +++ gcc.target/avr/progmem-error-1.c (revision 0) @@ -0,0 +1,5 @@ +/* { dg-do compile } */ + +#include "progmem.h" + +char str[] PROGMEM = "Hallo"; /* { dg-error "must be const" } */ Index: gcc.target/avr/progmem-error-1.cpp === --- gcc.target/avr/progmem-error-1.cpp (revision 0) +++ gcc.target/avr/progmem-error-1.cpp (revision 0) @@ -0,0 +1,5 @@ +/* { dg-do compile } */ + +#include "progmem.h" + +char str[] PROGMEM = "Hallo"; /* { dg-error "must be const" } 
*/ Index: gcc.target/avr/progmem.h === --- gcc.target/avr/progmem.h (revision 0) +++ gcc.target/avr/progmem.h (revision 0) @@ -0,0 +1,14 @@ +#define PROGMEM __attribute__((progmem)) + +#define PSTR(s) \ +(__extension__({\ +static const char __c[] PROGMEM = (s); \ +&__c[0];})) + +#define pgm_read_char(addr) \ +(__extension__({\ +unsigned int __addr16 = (unsigned int)(addr); \ +char __result; \ +__asm__ ("lpm %0, %a1" \ + : "=r" (__result) : "z" (__addr16)); \ +__result; }))
Ping #1: [Patch, AVR, 4.6+trunk]: PR44643 addendum
Georg-Johann Lay wrote:

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html

> avr_insert_attributes uses TREE_READONLY to get the readonlyness of a node.
>
> That does not work for C++ arrays: it gives the false error
> "variable must be const in order to be put into read-only section by
> means of '__attribute__((progmem))'".
>
> This patch peels arrays and uses TYPE_READONLY.
>
> I did not open a separate PR for this and tagged it as an addendum to
> PR44643 instead.
>
> Lightly tested on my own code.  There is no 'progmem' in the testsuite, so
> from the testsuite's perspective that code is dead anyway...
>
> Johann
>
>         PR target/44643
>         * config/avr/avr.c (avr_insert_attributes): Use TYPE_READONLY
>         instead of TREE_READONLY.
Re: [Ada] Fix parallel LTO bootstrap
> The change is obviously correct, but I wonder how the bootstrap dies w/o
> '+'.  It should IMO just prevent the parallelism and take longer.

It dies with the same cryptic error as PR driver/46750.

-- 
Eric Botcazou
Re: [wwwdocs] Document IRIX 6.5, Tru64 UNIX V5.1 obsoletion
On Fri, 1 Jul 2011, Rainer Orth wrote: > I don't need approval for the patch, but would be grateful for > improvements to wording. I find it quite clear, thanks. If you'd like, "is not" instead of "isn't" is the only suggestion I found. Gerald
[PATCH] Fix bootstrap on OpenBSD, PR48851
It happens that OpenBSD suffers from a bogus fixinclude that changes its perfectly valid NULL define from (void *)0 to 0. The fix itself appears to be very old and is completely bogus - it replaces (void *)0 with 0 under the assumption the former is invalid for C++ - which is true - but 0 is inappropriate for C which is much worse. Thus, I propose to remove the fix altogether. Platform maintainers can arrange for a new fix if the platforms still need fixing (which I seriously doubt after so many years and platform obsoletion). This restores bootstrap on OpenBSD. Ok for trunk and active branches? Thanks, Richard. 2011-07-04 Richard Guenther PR bootstrap/48851 * inclhack.def (void_null): Remove bogus fix. * fixincl.x: Regenerated. Index: fixincludes/inclhack.def === --- fixincludes/inclhack.def(revision 175800) +++ fixincludes/inclhack.def(working copy) @@ -4399,32 +4399,6 @@ fix = { /* - * AIX and Interix headers define NULL to be cast to a void pointer, - * which is illegal in ANSI C++. - */ -fix = { -hackname = void_null; -files = curses.h; -files = dbm.h; -files = locale.h; -files = stdio.h; -files = stdlib.h; -files = string.h; -files = time.h; -files = unistd.h; -files = sys/dir.h; -files = sys/param.h; -files = sys/types.h; -/* avoid changing C++ friendly NULL */ -bypass= __cplusplus; -select= "^#[ \t]*define[ \t]+NULL[ \t]+\\(\\(void[ \t]*\\*\\)0\\)"; -c_fix = format; -c_fix_arg = "#define NULL 0"; -test_text = "# define\tNULL \t((void *)0) /* typed NULL */"; -}; - - -/* * Make VxWorks header which is almost gcc ready fully gcc ready. 
*/ fix = { Index: fixincludes/fixincl.x === --- fixincludes/fixincl.x (revision 175800) +++ fixincludes/fixincl.x (working copy) @@ -2,11 +2,11 @@ * * DO NOT EDIT THIS FILE (fixincl.x) * - * It has been AutoGen-ed Sunday June 5, 2011 at 09:04:54 PM CDT + * It has been AutoGen-ed Monday July 4, 2011 at 12:59:38 PM CEST * From the definitionsinclhack.def * and the template file fixincl */ -/* DO NOT SVN-MERGE THIS FILE, EITHER Sun Jun 5 21:04:54 CDT 2011 +/* DO NOT SVN-MERGE THIS FILE, EITHER Mon Jul 4 12:59:38 CEST 2011 * * You must regenerate it. Use the ./genfixes script. * @@ -15,7 +15,7 @@ * certain ANSI-incompatible system header files which are fixed to work * correctly with ANSI C and placed in a directory that GNU C will search. * - * This file contains 211 fixup descriptions. + * This file contains 210 fixup descriptions. * * See README for more information. * @@ -8199,48 +8199,6 @@ static const char* apzVa_I960_MacroPatch /* * * * * * * * * * * * * * * * * * * * * * * * * * * - * Description of Void_Null fix - */ -tSCC zVoid_NullName[] = - "void_null"; - -/* - * File name selection pattern - */ -tSCC zVoid_NullList[] = - "curses.h\0dbm.h\0locale.h\0stdio.h\0stdlib.h\0string.h\0time.h\0unistd.h\0sys/dir.h\0sys/param.h\0sys/types.h\0"; -/* - * Machine/OS name selection pattern - */ -#define apzVoid_NullMachs (const char**)NULL - -/* - * content selection pattern - do fix if pattern found - */ -tSCC zVoid_NullSelect0[] = - "^#[ \t]*define[ \t]+NULL[ \t]+\\(\\(void[ \t]*\\*\\)0\\)"; - -/* - * content bypass pattern - skip fix if pattern found - */ -tSCC zVoid_NullBypass0[] = - "__cplusplus"; - -#defineVOID_NULL_TEST_CT 2 -static tTestDesc aVoid_NullTests[] = { - { TT_NEGREP, zVoid_NullBypass0, (regex_t*)NULL }, - { TT_EGREP,zVoid_NullSelect0, (regex_t*)NULL }, }; - -/* - * Fix Command Arguments for Void_Null - */ -static const char* apzVoid_NullPatch[] = { -"format", -"#define NULL 0", -(char*)NULL }; - -/* * * * * * * * * * * * * * * * * * * * * * * * * * - * 
* Description of Vxworks_Gcc_Problem fix */ tSCC zVxworks_Gcc_ProblemName[] = @@ -8591,9 +8549,9 @@ static const char* apzX11_SprintfPatch[] * * List of all fixes */ -#define REGEX_COUNT 250 +#define REGEX_COUNT 248 #define MACH_LIST_SIZE_LIMIT 181 -#define FIX_COUNT211 +#define FIX_COUNT210 /* * Enumerate the fixes @@ -8801,7 +8759,6 @@ typedef enum { ULTRIX_CONST_FIXIDX, ULTRIX_CONST2_FIXIDX, VA_I960_MACRO_FIXIDX, -VOID_NULL_FIXIDX, VXWORKS_GCC_PROBLEM_FIXIDX, VXWORKS_NEEDS_VXTYPES_FIXIDX, VXWORKS_NEEDS_VXWORKS_FIXIDX, @@ -9823,11 +9780,6 @@ tFixDesc fixDescList[ FIX_COUNT ] = { VA_I960_MACRO_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE, aVa_I960_MacroTests, apzVa_I960_MacroPatch, 0 }, - { zVoid_NullName,zVoid_NullList, - apzVoid_NullMachs, - VOID_NULL_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE, - aVoid_NullTests, apzVoid_NullPatch, 0 }, - { zVxworks_Gcc_ProblemNa
Re: Ping #1: [testsuite, AVR]: Add some progmem test cases
2011/7/4 Georg-Johann Lay :
> Georg-Johann Lay wrote:
>> Some runtime and checks for error/warning for C/C++.
>
> Note that some tests fail because of pending
>
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html
>
> Johann
>
> testsuite/
>         * gcc.target/avr/avr.exp: Run over cpp files, too.
>         * gcc.target/avr/torture/avr-torture.exp: Ditto.
>         * gcc.target/avr/progmem.h: New file.
>         * gcc.target/avr/exit-abort.h: New file.
>         * gcc.target/avr/progmem-error-1.c: New file.
>         * gcc.target/avr/progmem-error-1.cpp: New file.
>         * gcc.target/avr/progmem-warning-1.c: New file.
>         * gcc.target/avr/torture/progmem-1.c: New file.
>         * gcc.target/avr/torture/progmem-1.cpp: New file.

I don't know who must approve tests.  If it is me, then: approved.

Denis.
[PATCH] Fix PR49518
Handling of negative steps broke one of the many asserts in the vectorizer. The following patch drops one that I can't make sense of. I think all asserts need comments - especially this one would, as I can't see why using vf is correct to test against and not nelements (and why <= vf and not < vf). Well, ok? Thanks, Richard. 2011-07-04 Richard Guenther PR tree-optimization/49518 * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Drop assert. * gcc.dg/torture/pr49518.c: New testcase. Index: gcc/tree-vect-data-refs.c === --- gcc/tree-vect-data-refs.c (revision 175800) +++ gcc/tree-vect-data-refs.c (working copy) @@ -1552,7 +1552,6 @@ vect_enhance_data_refs_alignment (loop_v for (j = 0; j < possible_npeel_number; j++) { - gcc_assert (npeel_tmp <= vf); vect_peeling_hash_insert (loop_vinfo, dr, npeel_tmp); npeel_tmp += nelements; } Index: gcc/testsuite/gcc.dg/torture/pr49518.c === --- gcc/testsuite/gcc.dg/torture/pr49518.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr49518.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ + +int a, b; +struct S { unsigned int s, t, u; } c, d = { 0, 1, 0 }; + +void +test (unsigned char z) +{ + char e[] = {0, 0, 0, 0, 1}; + for (c.s = 1; c.s; c.s++) +{ + b = e[c.s]; + if (a) + break; + b = z >= c.u; + if (d.t) + break; +} +}
Re: [PATCH, ARM] Unaligned accesses for packed structures [1/2]
Julian Brown wrote:

> The most awkward change in the patch is to generic code (expmed.c,
> {store,extract}_bit_field_1): in big-endian mode, the existing behaviour
> (when inserting/extracting a bitfield to a memory location) is
> definitely bogus: "unit" is set to BITS_PER_UNIT for memory locations,
> and if bitsize (the size of the field to insert/extract) is greater than
> BITS_PER_UNIT (which isn't unusual at all), xbitpos becomes negative.
> That can't possibly be intentional; I can only assume that this code
> path is not exercised for machines which have memory alternatives for
> bitfield insert/extract, and BITS_BIG_ENDIAN of 0 in BYTES_BIG_ENDIAN
> mode.

[snip]

> @@ -648,7 +648,7 @@ store_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
>        /* On big-endian machines, we count bits from the most significant.
>           If the bit field insn does not, we must invert.  */
>
> -      if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
> +      if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN && !MEM_P (xop0))
>          xbitpos = unit - bitsize - xbitpos;

I agree that the current code cannot possibly be correct.  However, just
disabling the BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN renumbering *completely*
seems wrong to me as well.

According to the docs, the meaning of the bit position passed to the
extv/insv expanders is determined by BITS_BIG_ENDIAN, both in the cases of
register and memory operands.  Therefore, if BITS_BIG_ENDIAN differs from
BYTES_BIG_ENDIAN, we should need a correction for memory operands as well.
However, this correction needs to be relative to the size of the access
(i.e. the operand to the extv/insv), not just BITS_PER_UNIT.

From looking at the sources, the simplest way to implement that might be
to swap the order of the two corrections, that is, change this:

      /* On big-endian machines, we count bits from the most significant.
         If the bit field insn does not, we must invert.  */
      if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
        xbitpos = unit - bitsize - xbitpos;

      /* We have been counting XBITPOS within UNIT.
         Count instead within the size of the register.  */
      if (BITS_BIG_ENDIAN && !MEM_P (xop0))
        xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

      unit = GET_MODE_BITSIZE (op_mode);

to look instead like:

      /* We have been counting XBITPOS within UNIT.
         Count instead within the size of the register.  */
      if (BYTES_BIG_ENDIAN && !MEM_P (xop0))
        xbitpos += GET_MODE_BITSIZE (op_mode) - unit;

      unit = GET_MODE_BITSIZE (op_mode);

      /* On big-endian machines, we count bits from the most significant.
         If the bit field insn does not, we must invert.  */
      if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
        xbitpos = unit - bitsize - xbitpos;

(Note that the condition in the first if must then check BYTES_BIG_ENDIAN
instead of BITS_BIG_ENDIAN.)  This change results in unchanged behaviour
for register operands in all cases, and for memory operands if
BITS_BIG_ENDIAN == BYTES_BIG_ENDIAN.  For the problematic case of memory
operands with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN it should result in the
appropriate correction.

Note that with that change, the new code your patch introduces to the
ARM back-end will also need to change.  You currently handle bitpos
like this:

      base_addr = adjust_address (operands[1], HImode,
                                  bitpos / BITS_PER_UNIT);

This implicitly assumes that bitpos counts according to BYTES_BIG_ENDIAN,
not BITS_BIG_ENDIAN -- which exactly cancels out the common code behaviour
introduced by your patch ...

Thoughts?  Am I overlooking something here?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
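To make the effect of swapping the two corrections easier to check by hand, here is a small stand-alone model of both orderings.  It is a sketch only: `bitpos_config`, `xbitpos_current` and `xbitpos_proposed` are hypothetical names, the target macros are modelled as plain booleans, and `op_bits` stands in for GET_MODE_BITSIZE (op_mode).  It is not the expmed.c code itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target macros and the operand kind.  */
struct bitpos_config
{
  bool bits_big_endian;   /* models BITS_BIG_ENDIAN */
  bool bytes_big_endian;  /* models BYTES_BIG_ENDIAN */
  bool is_mem;            /* models MEM_P (xop0) */
};

/* Existing order: invert within UNIT first, then widen to the operand
   mode (the widening step is keyed on BITS_BIG_ENDIAN).  */
static int
xbitpos_current (struct bitpos_config c, int unit, int op_bits,
                 int bitsize, int xbitpos)
{
  if (c.bits_big_endian != c.bytes_big_endian)
    xbitpos = unit - bitsize - xbitpos;
  if (c.bits_big_endian && !c.is_mem)
    xbitpos += op_bits - unit;
  return xbitpos;
}

/* Suggested order: widen first (keyed on BYTES_BIG_ENDIAN), then invert
   within the full operand size.  */
static int
xbitpos_proposed (struct bitpos_config c, int unit, int op_bits,
                  int bitsize, int xbitpos)
{
  if (c.bytes_big_endian && !c.is_mem)
    xbitpos += op_bits - unit;
  unit = op_bits;
  if (c.bits_big_endian != c.bytes_big_endian)
    xbitpos = unit - bitsize - xbitpos;
  return xbitpos;
}
```

For a memory operand with BITS_BIG_ENDIAN = 0, BYTES_BIG_ENDIAN = 1, unit = 8, a 16-bit field at position 0 and a 32-bit operand mode, the existing order yields 8 - 16 - 0 = -8, while the suggested order yields 32 - 16 - 0 = 16.  For register operands the two functions agree in the cases tried here, which matches the claim of unchanged behaviour.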
[Patch, AVR]: Implement __builtin_avr_fmul* if no hardware multiplier
The current implementation of __builtin_avr_fmul/fmuls/fmulsu has a
gap if no hardware multiplier is available.

This patch closes that gap by providing libgcc implementations named
__fmul, __fmuls resp. __fmulsu.

The implementations yield the same result as the respective FMUL*
instructions and have been tested against these instructions for all
possible combinations of input values on an atmega88 device.

Johann

        * doc/extend.texi (AVR Built-in Functions): Update documentation
        of __builtin_avr_fmul*.
        * config/avr/avr.c (avr_init_builtins): Don't depend on
        AVR_HAVE_MUL.
        * config/avr/avr-c.c (avr_cpu_cpp_builtins): Ditto.
        * config/avr/avr.md (fmul): Rename to fmul_insn.
        (fmuls): Rename to fmuls_insn.
        (fmulsu): Rename to fmulsu_insn.
        (fmul,fmuls,fmulsu): New expander.
        (*fmul.call,*fmuls.call,*fmulsu.call): New Insn.
        * config/avr/t-avr (LIB1ASMFUNCS): Add _fmul, _fmuls, _fmulsu.
        * config/avr/libgcc.S (__fmul): New function.
        (__fmuls): New function.
        (__fmulsu,__fmulsu_exit): New function.

Index: doc/extend.texi
===
--- doc/extend.texi (revision 175800)
+++ doc/extend.texi (working copy)
@@ -8226,8 +8226,8 @@ or if not a specific built-in is impleme
 The following built-in functions map to the respective machine
 instruction, i.e. @code{nop}, @code{sei}, @code{cli}, @code{sleep},
 @code{wdr}, @code{swap}, @code{fmul}, @code{fmuls}
-resp. @code{fmulsu}. The latter three are only available if the AVR
-device actually supports multiplication.
+resp. @code{fmulsu}. The three @code{fmul*} built-ins are implemented
+as library call if no hardware multiplier is available.
@smallexample void __builtin_avr_nop (void) Index: config/avr/libgcc.S === --- config/avr/libgcc.S (revision 175628) +++ config/avr/libgcc.S (working copy) @@ -1417,3 +1417,91 @@ DEFUN __ashldi3 ret ENDF __ashldi3 #endif /* defined (L_ashldi3) */ + + +/***/ +;;; Softmul versions of FMUL, FMULS and FMULSU to implement +;;; __builtin_avr_fmul* if !AVR_HAVE_MUL +/***/ + +#define A1 24 +#define B1 25 +#define C0 22 +#define C1 23 +#define A0 __tmp_reg__ + +#ifdef L_fmuls +;;; r23:r22 = fmuls (r24, r25) like in FMULS instruction +;;; Clobbers: r24, r25, __tmp_reg__ +DEFUN __fmuls +;; A0.7 = negate result? +mov A0, A1 +eor A0, B1 +;; B1 = |B1| +sbrc B1, 7 +neg B1 +XJMP __fmulsu_exit +ENDF __fmuls +#endif /* L_fmuls */ + +#ifdef L_fmulsu +;;; r23:r22 = fmulsu (r24, r25) like in FMULSU instruction +;;; Clobbers: r24, r25, __tmp_reg__ +DEFUN __fmulsu +;; A0.7 = negate result? +mov A0, A1 +;; FALLTHRU +ENDF __fmulsu + +;; Helper for __fmuls and __fmulsu +DEFUN __fmulsu_exit +;; A1 = |A1| +sbrc A1, 7 +neg A1 +#ifdef __AVR_HAVE_JMP_CALL__ +;; Some cores have problem skipping 2-word instruction +tst A0 +brmi 1f +#else +sbrs A0, 7 +#endif /* __AVR_HAVE_JMP_CALL__ */ +XJMP __fmul +1: XCALL __fmul +;; C = -C iff A0.7 = 1 +com C1 +neg C0 +sbci C1, -1 +ret +ENDF __fmulsu_exit +#endif /* L_fmulsu */ + + +#ifdef L_fmul +;;; r22:r23 = fmul (r24, r25) like in FMUL instruction +;;; Clobbers: r24, r25, __tmp_reg__ +DEFUN __fmul +; clear result +clr C0 +clr C1 +clr A0 +1: tst B1 +;; 1.0 = 0x80, so test for bit 7 of B to see if A must to be added to C. 
+2: brpl 3f +;; C += A +add C0, A0 +adc C1, A1 +3: ;; A >>= 1 +lsr A1 +ror A0 +;; B <<= 1 +lsl B1 +brne 2b +ret +ENDF __fmul +#endif /* L_fmul */ + +#undef A0 +#undef A1 +#undef B1 +#undef C0 +#undef C1 Index: config/avr/avr.md === --- config/avr/avr.md (revision 175628) +++ config/avr/avr.md (working copy) @@ -3394,7 +3394,27 @@ (define_insn "wdr" (set_attr "cc" "none")]) ;; FMUL -(define_insn "fmul" +(define_expand "fmul" + [(set (reg:QI 24) +(match_operand:QI 1 "register_operand" "")) + (set (reg:QI 25) +(match_operand:QI 2 "register_operand" "")) + (parallel [(set (reg:HI 22) + (unspec:HI [(reg:QI 24) + (reg:QI 25)] UNSPEC_FMUL)) + (clobber (reg:HI 24))]) + (set (match_operand:HI 0 "register_operand" "") +(reg:HI 22))] + "" + { +if (AVR_HAVE_MUL) + { +emit_insn (gen_fmul_insn (operand0, operand1, operand2)); +DONE; + } + }) + +(define_insn "fmul_insn" [(set (match_operand:HI 0 "register_operand" "=r") (unspec:HI [(match_operand:QI 1 "register_operand" "a") (match_operand:QI 2 "register_operand" "a")] @@ -3406,8 +3426,38 @@ (define_insn "fmul" [(set_attr "len
[PATCH] Fix PR49615
This fixes an oversight in split_bbs_on_noreturn_calls. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied everywhere. Richard. 2011-07-04 Richard Guenther PR tree-optimization/49615 * tree-cfgcleanup.c (split_bbs_on_noreturn_calls): Fix basic-block index check. * g++.dg/torture/pr49615.C: New testcase. Index: gcc/tree-cfgcleanup.c === *** gcc/tree-cfgcleanup.c (revision 175752) --- gcc/tree-cfgcleanup.c (working copy) *** split_bbs_on_noreturn_calls (void) *** 599,605 BB is present in the cfg. */ if (bb == NULL || bb->index < NUM_FIXED_BLOCKS ! || bb->index >= n_basic_blocks || BASIC_BLOCK (bb->index) != bb || !gimple_call_noreturn_p (stmt)) continue; --- 599,605 BB is present in the cfg. */ if (bb == NULL || bb->index < NUM_FIXED_BLOCKS ! || bb->index >= last_basic_block || BASIC_BLOCK (bb->index) != bb || !gimple_call_noreturn_p (stmt)) continue; Index: gcc/testsuite/g++.dg/torture/pr49615.C === *** gcc/testsuite/g++.dg/torture/pr49615.C (revision 0) --- gcc/testsuite/g++.dg/torture/pr49615.C (revision 0) *** *** 0 --- 1,29 + /* { dg-do compile } */ + /* { dg-options "-g" } */ + + template + static inline bool Dispatch (T* obj, void (T::*func) ()) + { + (obj->*func) (); + } + class C + { + bool f (int); + void g (); + }; + bool C::f (int n) + { + bool b; + switch (n) + { + case 0: + b = Dispatch (this, &C::g); + case 1: + b = Dispatch (this, &C::g); + } + } + void C::g () + { + for (;;) { } + } +
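The ChangeLog only says the index check was fixed, so the following is a hedged illustration of why the bound matters, not the real CFG code.  It assumes the usual meaning of the two variables: n_basic_blocks counts the blocks currently in the function, while last_basic_block is one past the highest block index in use, so after blocks have been removed a live block can have an index that is not below n_basic_blocks.  `toy_block`, `toy_cfg` and `toy_block_in_cfg_p` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BLOCKS 8

/* Hypothetical miniature of a basic block and its lookup table.  */
struct toy_block
{
  int index;
};

struct toy_cfg
{
  struct toy_block *slot[TOY_MAX_BLOCKS]; /* NULL marks a removed block */
  int n_blocks;                           /* number of live blocks */
  int last_block;                         /* one past the highest index */
};

/* Return true if BB is present in CFG, using LIMIT as the exclusive
   upper bound for valid indices.  */
static bool
toy_block_in_cfg_p (const struct toy_cfg *cfg, const struct toy_block *bb,
                    int limit)
{
  return bb != NULL
         && bb->index >= 0
         && bb->index < limit
         && limit <= TOY_MAX_BLOCKS
         && cfg->slot[bb->index] == bb;
}
```

With live blocks at indices 0, 1 and 5, the count is 3 and the index bound is 6.  Checking block 5 against the count wrongly rejects it, while checking against the index bound accepts it.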
Re: Ping #1: [Patch, AVR, 4.6+trunk]: PR44643 addendum
2011/7/4 Georg-Johann Lay : > Georg-Johann Lay wrote: > > http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02318.html > >> avr_insert_attributes uses TREE_READONLY on get readonlyness of node. >> >> That does not work for C++ arrays: it gives false error >> "variable must be const in order to be put into read-only section by >> means of '__attribute__((progmem))'". >> >> This patch peels arrays and uses TYPE_READONLY. >> >> I did not open separate PR for this, tagged it as addendum to PR44643 >> instead. >> >> Lightly tested on own code. There is no 'progmem' in testsuite, so >> from testsuite's perspective that code is dead, anyway... >> >> Johann >> >> PR target/44643 >> * config/avr/avr.c (avr_insert_attributes): Use TYPE_READONLY >> instead of TREE_READONLY. Approved. Denis.
Re: [Patch, AVR]: Implement __builtin_avr_fmul* if no hardware multiplier
2011/7/4 Georg-Johann Lay :
> The current implementation of __builtin_avr_fmul/fmuls/fmulsu has a
> gap if no hardware multiplier is available.
>
> This patch closes that gap by providing libgcc implementations named
> __fmul, __fmuls resp. __fmulsu.
>
> The implementations yield the same result as the respective FMUL*
> instructions and have been tested against these instructions for all
> possible combinations of input values on an atmega88 device.
>
> Johann
>
>
>         * doc/extend.texi (AVR Built-in Functions): Update documentation
>         of __builtin_avr_fmul*.
>         * config/avr/avr.c (avr_init_builtins): Don't depend on
>         AVR_HAVE_MUL.
>         * config/avr/avr-c.c (avr_cpu_cpp_builtins): Ditto.
>         * config/avr/avr.md (fmul): Rename to fmul_insn.
>         (fmuls): Rename to fmuls_insn.
>         (fmulsu): Rename to fmulsu_insn.
>         (fmul,fmuls,fmulsu): New expander.
>         (*fmul.call,*fmuls.call,*fmulsu.call): New Insn.
>         * config/avr/t-avr (LIB1ASMFUNCS): Add _fmul, _fmuls, _fmulsu.
>         * config/avr/libgcc.S (__fmul): New function.
>         (__fmuls): New function.
>         (__fmulsu,__fmulsu_exit): New function.
>

Approved.

Denis.
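For readers who want to follow the shift-and-add loop of the unsigned __fmul routine without reading AVR assembly, here is a host-side C model.  It is a sketch under assumptions: `soft_fmul` is a hypothetical name, the model covers only the unsigned variant, and it takes FMUL to produce the low 16 bits of (a * b) << 1, which is how the 1.7 fixed-point product is usually described.

```c
#include <stdint.h>

/* Model of the shift-and-add loop: the addend starts as A in the high
   byte (A1:A0 with A0 = 0), is added whenever bit 7 of B is set, and is
   halved while B is shifted left until B becomes zero.  */
static uint16_t
soft_fmul (uint8_t a, uint8_t b)
{
  uint16_t acc = 0;                       /* C1:C0 in the assembly */
  uint16_t addend = (uint16_t) (a << 8);  /* A1:A0 in the assembly */

  while (b != 0)
    {
      if (b & 0x80)
        acc = (uint16_t) (acc + addend);  /* wraps modulo 2^16 */
      addend >>= 1;
      b = (uint8_t) (b << 1);
    }
  return acc;
}
```

Since 0x80 represents 1.0 in the 1.7 input format, soft_fmul (0x80, 0x80) gives 0x8000, that is 1.0 in the 1.15 result format, and soft_fmul (0x40, 0x40) gives 0x2000, that is 0.5 * 0.5 = 0.25.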
Re: [PATCH] Fix PR49518
Richard Guenther wrote on 04/07/2011 02:38:50 PM: > Handling of negative steps broke one of the many asserts in > the vectorizer. The following patch drops one that I can't > make sense of. I think all asserts need comments - especially > this one would, as I can't see why using vf is correct to > test against and not nelements (and why <= vf and not < vf). There is an explanation 10 rows above the assert. It doesn't make sense to peel more than vf iterations (and not nelements, since for the case of multiple types it may help to align more data-refs - see the comment in the code). IIRC <= is for the case of aligned access, but I am not sure about that, so maybe you are right. I don't see how it is related to negative steps. I think that the real reason for this failure is that the loads are actually irrelevant (hence, vf=4 that doesn't take char loads into account), but we don't check that when we analyze data-refs. So, in my opinion, the proper fix will add such check. Thanks, Ira > > Well, ok? > > Thanks, > Richard. > > 2011-07-04 Richard Guenther > >PR tree-optimization/49518 >* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): >Drop assert. > >* gcc.dg/torture/pr49518.c: New testcase. 
> > Index: gcc/tree-vect-data-refs.c > === > --- gcc/tree-vect-data-refs.c (revision 175800) > +++ gcc/tree-vect-data-refs.c (working copy) > @@ -1552,7 +1552,6 @@ vect_enhance_data_refs_alignment (loop_v > >for (j = 0; j < possible_npeel_number; j++) > { > - gcc_assert (npeel_tmp <= vf); >vect_peeling_hash_insert (loop_vinfo, dr, npeel_tmp); >npeel_tmp += nelements; > } > Index: gcc/testsuite/gcc.dg/torture/pr49518.c > === > --- gcc/testsuite/gcc.dg/torture/pr49518.c (revision 0) > +++ gcc/testsuite/gcc.dg/torture/pr49518.c (revision 0) > @@ -0,0 +1,19 @@ > +/* { dg-do compile } */ > + > +int a, b; > +struct S { unsigned int s, t, u; } c, d = { 0, 1, 0 }; > + > +void > +test (unsigned char z) > +{ > + char e[] = {0, 0, 0, 0, 1}; > + for (c.s = 1; c.s; c.s++) > +{ > + b = e[c.s]; > + if (a) > + break; > + b = z >= c.u; > + if (d.t) > + break; > +} > +}
Re: [PATCH] Fix PR49518
On Mon, 4 Jul 2011, Ira Rosen wrote: > > > Richard Guenther wrote on 04/07/2011 02:38:50 PM: > > > Handling of negative steps broke one of the many asserts in > > the vectorizer. The following patch drops one that I can't > > make sense of. I think all asserts need comments - especially > > this one would, as I can't see why using vf is correct to > > test against and not nelements (and why <= vf and not < vf). > > There is an explanation 10 rows above the assert. It doesn't make sense to > peel more than vf iterations (and not nelements, since for the case of > multiple types it may help to align more data-refs - see the comment in the > code). IIRC <= is for the case of aligned access, but I am not sure about > that, so maybe you are right. > > I don't see how it is related to negative steps. > > I think that the real reason for this failure is that the loads are > actually irrelevant (hence, vf=4 that doesn't take char loads into > account), but we don't check that when we analyze data-refs. So, in my > opinion, the proper fix will add such check. The following also works for me: Index: tree-vect-data-refs.c === --- tree-vect-data-refs.c (revision 175802) +++ tree-vect-data-refs.c (working copy) @@ -1495,6 +1495,9 @@ vect_enhance_data_refs_alignment (loop_v stmt = DR_STMT (dr); stmt_info = vinfo_for_stmt (stmt); + if (!STMT_VINFO_RELEVANT (stmt_info)) + continue; + /* For interleaving, only the alignment of the first access matters. */ if (STMT_VINFO_STRIDED_ACCESS (stmt_info) does that look better or do you propose to clean the datarefs vector from those references? Thanks, Richard.
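Editorial illustration of the assert under discussion: the sketch below is a toy model of the candidate loop in the first hunk (insert npeel_tmp, then advance it by nelements), not GCC code. The function name and all numbers are made up for illustration. It assumes, following Ira's description, that vf was derived without looking at the char loads, so nelements for such a load can exceed vf.

```c
#include <stdbool.h>

/* Toy model of the loop in vect_enhance_data_refs_alignment; hypothetical
   helper, not part of GCC.  Returns true if every candidate peel count
   stays <= vf, i.e. if the dropped gcc_assert would have held.  */
bool
peel_candidates_within_vf (unsigned npeel_start, unsigned nelements,
                           unsigned count, unsigned vf)
{
  unsigned npeel_tmp = npeel_start;
  for (unsigned j = 0; j < count; j++)
    {
      if (npeel_tmp > vf)       /* this is where the assert would fire */
        return false;
      npeel_tmp += nelements;   /* next candidate, as in the hunk */
    }
  return true;
}
```

With vf = 4 and a 4-element access, a single candidate stays within vf. With vf = 4 and a hypothetical 16-element char access, a second candidate is already 17 and the check fails, which is consistent with the suggestion to skip irrelevant statements rather than drop the assert.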
Re: [PATCH] Fix bootstrap on OpenBSD, PR48851
Hi Richard, On Mon, Jul 4, 2011 at 4:04 AM, Richard Guenther wrote: > > It happens that OpenBSD suffers from a bogus fixinclude that changes > its perfectly valid NULL define from (void *)0 to 0. The fix itself > appears to be very old and is completely bogus - it replaces > (void *)0 with 0 under the assumption the former is invalid for C++ - > which is true - but 0 is inappropriate for C which is much worse. > > Thus, I propose to remove the fix altogether. Platform maintainers > can arrange for a new fix if the platforms still need fixing (which > I seriously doubt after so many years and platform obsoletion). > > This restores bootstrap on OpenBSD. > > Ok for trunk and active branches? Sounds completely reasonable to me, but I think the platform maintainers do need to say, "okay". Cheers - Bruce
Re: [PATCH] Fix bootstrap on OpenBSD, PR48851
On Mon, 4 Jul 2011, Bruce Korb wrote: > Hi Richard, > > On Mon, Jul 4, 2011 at 4:04 AM, Richard Guenther wrote: > > > > It happens that OpenBSD suffers from a bogus fixinclude that changes > > its perfectly valid NULL define from (void *)0 to 0. The fix itself > > appears to be very old and is completely bogus - it replaces > > (void *)0 with 0 under the assumption the former is invalid for C++ - > > which is true - but 0 is inappropriate for C which is much worse. > > > > Thus, I propose to remove the fix altogether. Platform maintainers > > can arrange for a new fix if the platforms still need fixing (which > > I seriously doubt after so many years and platform obsoletion). > > > > This restores bootstrap on OpenBSD. > > > > Ok for trunk and active branches? > > Sounds completely reasonable to me, but I think the platform maintainers > do need to say, "okay". Cheers - Bruce We do not have an Interix maintainer listed, so that leaves David for AIX. David, is this ok? If not, can you please work on a better, more specific fixinclude that wraps the C++ variant inside __GNUG__? Thanks, Richard.
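Editorial illustration of why a plain 0 is a poor NULL for C: one common case is a variadic function whose argument list ends with a null pointer. The sketch below uses a hypothetical helper, count_strings, which is not taken from the thread or from any header under discussion.

```c
#include <stdarg.h>
#include <stddef.h>

/* Count the strings in an argument list terminated by a null pointer.
   Hypothetical helper for illustration only.  */
size_t
count_strings (const char *first, ...)
{
  size_t n = 0;
  va_list ap;

  va_start (ap, first);
  for (const char *s = first; s != NULL; s = va_arg (ap, const char *))
    n++;
  va_end (ap);
  return n;
}
```

A call such as count_strings ("a", "b", "c", (const char *) 0) passes a real pointer as the terminator. If NULL expands to a bare 0, count_strings ("a", NULL) passes an int instead, and reading it back with va_arg as a pointer is undefined; it tends to go wrong on targets where int is narrower than a pointer.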
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
On Fri, Jul 1, 2011 at 5:23 PM, Kai Tietz wrote: > So updated patch (bootstrapped and tested for all standard languages > plus Ada and Obj-C++) on x86_64-pc-linux-gnu host. > > Index: gcc-head/gcc/tree-ssa-forwprop.c > === > --- gcc-head.orig/gcc/tree-ssa-forwprop.c > +++ gcc-head/gcc/tree-ssa-forwprop.c > @@ -1602,6 +1602,156 @@ simplify_builtin_call (gimple_stmt_itera > return false; > } > > +/* Checks if expression has type of one-bit precision, or is a known > + truth-valued expression. */ > +static bool > +truth_valued_ssa_name (tree name) > +{ > + gimple def; > + tree type = TREE_TYPE (name); > + > + if (!INTEGRAL_TYPE_P (type)) > + return false; > + /* Don't check here for BOOLEAN_TYPE as the precision isn't > + necessarily one and so ~X is not equal to !X. */ > + if (TYPE_PRECISION (type) == 1) > + return true; > + def = SSA_NAME_DEF_STMT (name); > + if (is_gimple_assign (def)) > + return truth_value_p (gimple_assign_rhs_code (def)); > + return false; > +} > + > +/* Helper routine for simplify_bitwise_binary_1 function. > + Return for the SSA name NAME the expression X if it mets condition > + NAME = !X. Otherwise return NULL_TREE. > + Detected patterns for NAME = !X are: > + !X and X == 0 for X with integral type. > + X ^ 1, X != 1,or ~X for X with integral type with precision of one. */ > +static tree > +lookup_logical_inverted_value (tree name) > +{ > + tree op1, op2; > + enum tree_code code; > + gimple def; > + > + /* If name has none-intergal type, or isn't a SSA_NAME, then > + return. */ > + if (TREE_CODE (name) != SSA_NAME > + || !INTEGRAL_TYPE_P (TREE_TYPE (name))) > + return NULL_TREE; > + def = SSA_NAME_DEF_STMT (name); > + if (!is_gimple_assign (def)) > + return NULL_TREE; > + > + code = gimple_assign_rhs_code (def); > + op1 = gimple_assign_rhs1 (def); > + op2 = NULL_TREE; > + > + /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand. > + If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR, > + or BIT_NOT_EXPR, then return. 
*/ > + if (code == EQ_EXPR || code == NE_EXPR > + || code == BIT_XOR_EXPR) > + op2 = gimple_assign_rhs2 (def); > + > + switch (code) > + { > + case TRUTH_NOT_EXPR: > + return op1; > + case BIT_NOT_EXPR: > + if (truth_valued_ssa_name (name)) > + return op1; > + break; > + case EQ_EXPR: > + /* Check if we have X == 0 and X has an integral type. */ > + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) > + break; > + if (integer_zerop (op2)) > + return op1; > + break; > + case NE_EXPR: > + /* Check if we have X != 1 and X is a truth-valued. */ > + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) > + break; > + if (integer_onep (op2) && truth_valued_ssa_name (op1)) > + return op1; > + break; > + case BIT_XOR_EXPR: > + /* Check if we have X ^ 1 and X is truth valued. */ > + if (integer_onep (op2) && truth_valued_ssa_name (op1)) > + return op1; > + break; > + default: > + break; > + } > + > + return NULL_TREE; > +} > + > +/* Try to optimize patterns X & !X -> zero, X | !X -> one, and > + X ^ !X -> one, if type of X is valid for this. > + > + See for list of detected logical-not patterns the > + lookup_logical_inverted_value function. */ As usual - refer to actual arguments. I'd do /* Optimize ARG1 CODE ARG2 to a constant for bitwise binary operations CODE if one operand has the logically inverted value of the other. */ > +static tree > +simplify_bitwise_binary_1 (enum tree_code code, tree arg1, > + tree arg2) > +{ > + tree a1not, a2not; > + tree op = NULL_TREE; > + > + /* If CODE isn't a bitwise binary operation, return NULL_TREE. */ > + if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR > + && code != BIT_XOR_EXPR) > + return NULL_TREE; > + > + /* First check if operands ARG1 and ARG2 are equal. */ > + if (operand_equal_p (arg1, arg2, 0)) > + return NULL_TREE; That's an early out - use arg1 == arg2 instead and mention why we do not optimize it - it's done by fold_stmt. > + /* See if we have in arguments logical-not patterns. 
*/ > + a1not = lookup_logical_inverted_value (arg1); > + a2not = lookup_logical_inverted_value (arg2); You didn't re-organize the code to only call one of the lookups if that succeeded as I requested. > + /* If there are no logical-not in arguments, return NULL_TREE. */ > + if (!a1not && !a2not) > + return NULL_TREE; > + > + /* If both arguments are logical-not patterns, then try to fold > + them or return NULL_TREE. */ > + if (a1not && a2not) > + { > + /* If logical-not operands of ARG1 and ARG2 are equal, then fold > + them.. */ No double-full-stop please. Instead of "fold" say "simplify". > + if (operand_equal_p (a1not, a2not, 0)) The only case where a1not or a2not
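Editorial illustration of the identities the patch relies on, written as plain C rather than GIMPLE. The function names are made up. The first three functions show the folds for a truth-valued operand; the last one shows why the patch insists on one-bit precision before treating ~X as !X.

```c
#include <stdbool.h>

/* For a truth-valued x, X op !X is a constant.  */
int and_not (bool x) { return x & !x; }   /* always 0 */
int ior_not (bool x) { return x | !x; }   /* always 1 */
int xor_not (bool x) { return x ^ !x; }   /* always 1 */

/* For a wider type, ~x and !x generally disagree: with x == 1,
   ~x is -2 while !x is 0.  Returns nonzero if the two agree.  */
int bitnot_equals_lognot (int x) { return ~x == !x; }
```

Note that x | !x is only 1 when x is known to be 0 or 1; for an int holding 2 it evaluates to 2, which is why the lookup helper checks truth_valued_ssa_name for the X ^ 1, X != 1 and ~X forms.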
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Frank, this patch has remained unreviewed for a week. Could you please have a look? Thanks. Rainer Rainer Orth writes: > This is the first of two patches to get mudflap fully working on > Solaris 11, both with Sun ld and GNU ld. > > It addresses a couple of testsuite failures: > > * Several tests fail with 3 unexpected register violations: > > *** > mudflap violation 1 (register): time=1309356076.070433 ptr=21680 size=16 > pc=7fa07a64 > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflap.so.0.0.0'__mf_register+0x2c > [0x7fa07a64] > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflap.so.0.0.0'__wrap_main+0x194 > [0x7fa07c18] > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/testsuite/heap-scalestress.exe'_start+0x5c > [0x10afc] > Nearby object 1: checked region begins 0B into and ends 15B into > mudflap object a2aa8: name=`/usr/include/iso/stdio_iso.h:163:15 __iob' > bounds=[21680,217bf] size=320 area=static check=0r/0w liveness=0 > alloc time=1309356076.069900 pc=7fa07a64 > number of nearby objects: 1 > > All 3 are 0, 16, or 32 bytes __iob[]. The error goes away with > -no-heur-stdlib. 
> > If running the test with -trace-calls, I find: > > mf: register ptr=21680 size=16 type=4 name='stdin' > mf: violation pc=7fa07b94 location= type=3 ptr=21680 size=16 > *** > mudflap violation 1 (register): time=1309365780.121411 ptr=21680 size=16 > pc=7fa07b94 > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflap.so.0.0.0'__mf_register+0x2c > [0x7fa07b94] > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.l > ibs/libmudflap.so.0.0.0'__wrap_main+0x194 [0x7fa07d48] > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/te > stsuite/heap-scalestress.exe'_start+0x5c [0x10afc] > Nearby object 1: checked region begins 0B into and ends 15B into > mudflap object a2aa8: name=`/usr/include/iso/stdio_iso.h:163:15 __iob' > bounds=[21680,217bf] size=320 area=static check=0r/0w liveness=0 > alloc time=1309365780.077107 pc=7fa07b94 > number of nearby objects: 1 > > The conflict is between > > mf: register ptr=21680 size=320 type=4 > name='/usr/include/iso/stdio_iso.h:163:15 __iob' > > and > > mf: register ptr=21680 size=16 type=4 name='stdin' > mf: violation pc=7fa07260 location= type=3 ptr=21680 size=16 > > where the registration of __iob has been done automatically by the > compiler. I avoid this problem by not registering stdin, stdout, and > stderr separately on Solaris. > > * Some tests were failing while calling unregister in munmap. It turned > out that there had been no corresponding mmap registration before. > This occurs because Solaris has mmap64 for largefile-aware programs > instead. Fixed by wrapping mmap64, too. What I don't know is if > mmap64 needs to be added to MFWRAP_SPEC in gcc.c? If so, I'd rather > do it by adding some MFWRAP_OS_SPEC to avoid having to duplicate the > whole spec in the Solaris config headers. > > * As noted in the last patch, the getmntent signature differs in > Solaris. This patch implements a wrapper for the Solaris version. 
> > * libmudflap.cth/pass37-frag.c would fail like this: > > *** > mudflap violation 1 (unregister): time=1309444614.922185 ptr=7f9e90a4 size=4 > pc=7fa07e78 > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflapth.so.0.0.0'__mf_unregister+0xec > [0x7fa07e78] > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflapth.so.0.0.0'__mf_pthread_cleanup+0x4c > [0x7fa277dc] > > /var/gcc/regression/trunk/11-gcc/build/sparc-sun-solaris2.11/libmudflap/.libs/libmudflapth.so.0.0.0'__mf_pthread_spawner+0x14c > [0x7fa27938] > Nearby object 1: checked region begins 0B into and ends 3B into > mudflap object a43b8: name=`errno area' > bounds=[7f9e90a4,7f9e90a7] size=4 area=static check=0r/0w liveness=0 > alloc time=1309444614.909941 pc=7fa077e8 thread=1 > number of nearby objects: 1 > FAIL: libmudflap.cth/pass37-frag.c execution test > > Investigating with -trace-calls reveals that all registrations and > unregistrations of errno are for the same address, which is wrong for > multithreaded programs which access errno via an accessor function. > To enable that, needs to be included with _REENTRANT > defined. It turned out that it suffices to do this in mf-hooks3.c. > > * libmudflap.c/heap-scalestress.c always timed out on my SPARC test > system: on a 1.2 GHz UltraSPARC-T2, it takes > > real8:47.06 > user 43.12 > sys 8:03.77 > > which is way over the limit. On my laptop (1.6 GHz Core i7), it takes > > real 37.35 > user 5.06 > sys 32.23 > > I've divided SCALE by 10 to account for this. > > * I've replaced all the __FreeBSD__ && ...
Re: [PATCH] Handle vectorization of invariant loads (PR46787)
On Wed, Jun 29, 2011 at 4:19 AM, Richard Guenther wrote: > > The following patch makes us handle invariant loads during vectorization. > Dependence analysis currently isn't clever enough to disambiguate them > thus we insert versioning-for-alias checks. For the testcase hoisting > the load is still always possible though, and for a read-after-write > dependence it would be possible for the vectorized loop copy as the > may-aliasing write is varying by the scalar variable size. > > The existing code for vectorizing invariant accesses looks very > suspicious - it generates a vector load at the scalar address > to then just extract the first vector element. Huh. IMHO this > can be simplified as done, by just re-using the scalar load result. > But maybe this code was supposed to deal with something entirely > different? > > This patch gives a 33% speedup to the phoronix himeno testcase > if you bump the maximum alias versioning checks we want to insert. > > I'm currently re-bootstrapping & testing this but an earlier version > was ok on x86_64-unknown-linux-gnu. > > 2011-06-29 Richard Guenther > > PR tree-optimization/46787 > * tree-data-ref.c (dr_address_invariant_p): Remove. > (find_data_references_in_stmt): Invariant accesses are ok now. > * tree-vect-stmts.c (vectorizable_load): Handle invariant > loads. > * tree-vect-data-refs.c (vect_analyze_data_ref_access): Allow > invariant loads. > > * gcc.dg/vect/vect-121.c: New testcase. > This also caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49628 -- H.J.
Re: PATCH: PR target/49600: Bad SSE2 int->float split in i386.md
On Mon, Jul 4, 2011 at 3:18 AM, Uros Bizjak wrote: > On Mon, Jul 4, 2011 at 7:13 AM, H.J. Lu wrote: > > In one SSE2 int->float split, when TARGET_USE_VECTOR_CONVERTS is true, > TARGET_INTER_UNIT_MOVES is false and GENERAL_REG_P (op1) is true. we > will get gcc_unreachable. This patch removes TARGET_INTER_UNIT_MOVES > check. OK for trunk? This will result in register allocation failure. Operand 0 of >> >> That particular sse2_loadld insn matches: >> >> (insn 49 22 50 5 (set (reg:V4SI 21 xmm0 [83]) >> (vec_merge:V4SI (vec_duplicate:V4SI (reg/v:SI 1 dx [orig:64 >> test ] [64])) >> (const_vector:V4SI [ >> (const_int 0 [0]) >> (const_int 0 [0]) >> (const_int 0 [0]) >> (const_int 0 [0]) >> ]) >> (const_int 1 [0x1]))) x.i:11 1365 {vec_setv4si_0} >> (nil)) >> > > Yes, but it should not be generated for !TARGET_INTER_UNIT_MOVES. The > constraint should be Yi, but then we don't shadow other alternatives > correctly. > sse2_loadld pattern has conditional constraint Yi that depends on TARGET_INTER_UNIT_MOVES, so we can't blindly generate sse2_loadld after reload. I'm testing attached patch. BTW: Do you perhaps have a testcase for this problem? >>> >>> I have a testcase. But it needs a new x86 optimization we are working on it. >>> 2011-07-03 Uros Bizjak PR target/49600 * config/i386/i386.md (SSE2 int->float split): Push operand 1 in general register to memory for !TARGET_INTER_UNIT_MOVES. >>> >>> I will give it a try. >>> >> >> It doesn't work: I still got > > Yes, I later noticed that I have changed the wrong pattern (the one > with memory clobber) ;( . Attached is the correct patch. > This works. Can you check it in? Thanks. -- H.J.
Re: [PATCH] Address lowering [1/3] Main patch
On Thu, Jun 30, 2011 at 4:39 PM, William J. Schmidt wrote: > This is the first of three patches related to lowering addressing > expressions to MEM_REFs and TARGET_MEM_REFs in late gimple. This patch > contains the new pass together with supporting changes in existing > modules. The second patch contains an independent change to the RTL > forward propagator to keep it from undoing an optimization made in the > first patch. The third patch contains new test cases and changes to > existing test cases. > > Although I've broken it up into three patches to make the review easier, > it would be best to commit at least the first and third together to > avoid regressions. The second can stand alone. > > I've done regression tests on powerpc64 and x86_64, and have asked > Andreas Krebbel to test against the IBM z (390) platform. I've done > performance regression testing on powerpc64. The only performance > regression of note is the 2% degradation to 188.ammp due to loss of > field disambiguation information. As discussed in another thread, > fixing this introduces more complexity than it's worth. Are there also performance improvements? What about code size? I tried to get an understanding to what kind of optimizations this patch produces based on the test of testcases you added, but I have a hard time here. Can you outline some please? I still do not like the implementation of yet another CSE machinery given that we already have two. I think most of the need for CSE comes from the use of the affine combination framework and force_gimple_operand. In fact I'd be interested to see cases that are optimized that could not be handled by a combine-like pattern matcher? Thanks, Richard.
Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
On 01/07/11 13:25, Richard Guenther wrote: Well - some operations work the same on both signedness if you just care about the twos-complement result. This includes multiplication (but not for example division). For this special case I suggest to not bother trying to invent a generic predicate but do something local in tree-ssa-math-opts.c. OK, here's my updated patch. I've taken the view that we *know* what size and signedness the result of the multiplication is, and we know what size the input to the addition must be, so all the check has to do is make sure it does that same conversion, even if by a roundabout means. What I hadn't grasped before is that when extending a value it's the source type that is significant, not the destination, so the checks are not as complex as I had thought. So, this patch adds a test to ensure that: 1. the type is not truncated so far that we lose any information; and 2. the type is only ever extended in the proper signedness. Also, just to be absolutely sure, I've also added a little bit of logic to permit extends that are then undone by a truncate. I'm really not sure what guarantees there are about what sort of cast sequences can exist? Is this necessary? I haven't managed to coax it into generating any examples of extends followed by truncates myself, but in any case, it's hardly any code and it'll make sure it's future proofed. OK? Andrew 2011-06-28 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (valid_types_for_madd_p): New function. (convert_plusminus_to_widen): Use valid_types_for_madd_p to identify optimization candidates. gcc/testsuite/ * gcc.target/arm/wmul-5.c: New file. * gcc.target/arm/no-wmla-1.c: New file. 
--- .../gcc/testsuite/gcc.target/arm/no-wmla-1.c | 11 ++ .../gcc/testsuite/gcc.target/arm/wmul-5.c | 10 ++ src/gcc-mainline/gcc/tree-ssa-math-opts.c | 112 ++-- 3 files changed, 123 insertions(+), 10 deletions(-) create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c new file mode 100644 index 000..17f7427 --- /dev/null +++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +int +foo (int a, short b, short c) +{ + int bc = b * c; +return a + (short)bc; +} + +/* { dg-final { scan-assembler "mul" } } */ diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c new file mode 100644 index 000..65c43e3 --- /dev/null +++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +long long +foo (long long a, char *b, char *c) +{ + return a + *b * *c; +} + +/* { dg-final { scan-assembler "umlal" } } */ diff --git a/src/gcc-mainline/gcc/tree-ssa-math-opts.c b/src/gcc-mainline/gcc/tree-ssa-math-opts.c index d55ba57..5ef7bb4 100644 --- a/src/gcc-mainline/gcc/tree-ssa-math-opts.c +++ b/src/gcc-mainline/gcc/tree-ssa-math-opts.c @@ -2085,6 +2085,78 @@ convert_mult_to_widen (gimple stmt) return true; } +/* Check the input types, TYPE1 and TYPE2 to a widening multiply, + and then the convertions between the output of the multiply, and + the input to an addition EXPR, to ensure that they are compatible with + a widening multiply-and-accumulate. + + This function assumes that expr is a valid string of conversion expressions + terminated by a multiplication. 
+ + This function tries NOT to make any (fragile) assumptions about what + sequence of conversions can exist in the input. */ + +static bool +valid_types_for_madd_p (tree type1, tree type2, tree expr) +{ + gimple stmt, prev_stmt; + enum tree_code code, prev_code; + tree prev_expr, type, prev_type; + int bitsize, prev_bitsize, initial_bitsize, min_bitsize; + bool initial_unsigned; + + initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2); + initial_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2); + + stmt = SSA_NAME_DEF_STMT (expr); + code = gimple_assign_rhs_code (stmt); + type = TREE_TYPE (expr); + bitsize = TYPE_PRECISION (type); + min_bitsize = bitsize; + + if (code == MULT_EXPR || code == WIDEN_MULT_EXPR) +return true; + + if (!INTEGRAL_TYPE_P (type) + || TYPE_PRECISION (type) < initial_bitsize) +return false; + + /* Step through the conversions backwards. */ + while (true) +{ + prev_expr = gimple_assign_rhs1 (stmt); + prev_stmt = SSA_NAME_DEF_STMT (prev_expr); + prev_code = gimple_assign_rhs_code (prev_stmt); + prev_type = TREE_TYPE (prev_expr); + prev_bitsize = TYPE_PRECISION (prev_type); + + if (prev_code == MULT_EXPR || prev_code == WIDEN
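Editorial illustration of the two points in the message above, in plain C with made-up function names. The first pair shows that the source type decides how a value is extended. The second pair mirrors the no-wmla-1.c test: a truncation between the multiply and the add can lose information, so that form must not become a widening multiply-and-accumulate. The conversion to short of an out-of-range value is implementation-defined; GCC documents it as reduction modulo 2^16.

```c
#include <stdint.h>

/* Extension is governed by the source type, not the destination.  */
int32_t widen_signed (int8_t v)    { return v; }   /* sign-extends */
int32_t widen_unsigned (uint8_t v) { return v; }   /* zero-extends */

/* Shape of no-wmla-1.c: the product is truncated before the add.  */
int
truncating_madd (int a, short b, short c)
{
  int bc = b * c;
  return a + (short) bc;
}

/* No truncation: a candidate for a widening multiply-and-accumulate.  */
int
widening_madd (int a, short b, short c)
{
  return a + b * c;
}
```

With b = c = 256 the product is 65536, which does not fit in a 16-bit short, so the two functions return different sums.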
Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies
On 28/06/11 15:14, Andrew Stubbs wrote: On 28/06/11 13:33, Andrew Stubbs wrote: On 23/06/11 15:41, Andrew Stubbs wrote: If one or both of the inputs to a widening multiply are of unsigned type then the compiler will attempt to use usmul_widen_optab or umul_widen_optab, respectively. That works fine, but only if the target supports those operations directly. Otherwise, it just bombs out and reverts to the normal inefficient non-widening multiply. This patch attempts to catch these cases and use an alternative signed widening multiply instruction, if one of those is available. I believe this should be legal as long as the top bit of both inputs is guaranteed to be zero. The code achieves this guarantee by zero-extending the inputs to a wider mode (which must still be narrower than the output mode). OK? This update fixes the testsuite issue Janis pointed out. And this one fixes up the wmul-5.c testcase also. The patch has changed the correct result. Here's an update for the context changed by the update to patch 3. The content of the patch has not changed. Andrew 2011-07-04 Andrew Stubbs gcc/ * Makefile.in (tree-ssa-math-opts.o): Add langhooks.h dependency. * optabs.c (find_widening_optab_handler): Rename to ... (find_widening_optab_handler_and_mode): ... this, and add new argument 'found_mode'. * optabs.h (find_widening_optab_handler): Rename to ... (find_widening_optab_handler_and_mode): ... this. (find_widening_optab_handler): New macro. * tree-ssa-math-opts.c: Include langhooks.h (build_and_insert_cast): New function. (convert_mult_to_widen): Add new argument 'gsi'. Convert unsupported unsigned multiplies to signed. (convert_plusminus_to_widen): Likewise. (execute_optimize_widening_mul): Pass gsi to convert_mult_to_widen. gcc/testsuite/ * gcc.target/arm/wmul-5.c: Update expected result. * gcc.target/arm/wmul-6.c: New file. 
--- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -2672,7 +2672,8 @@ tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \ tree-ssa-math-opts.o : tree-ssa-math-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(FLAGS_H) $(TREE_H) $(TREE_FLOW_H) $(TIMEVAR_H) \ $(TREE_PASS_H) alloc-pool.h $(BASIC_BLOCK_H) $(TARGET_H) \ - $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h + $(DIAGNOSTIC_H) $(RTL_H) $(EXPR_H) $(OPTABS_H) gimple-pretty-print.h \ + langhooks.h tree-ssa-alias.o : tree-ssa-alias.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \ $(TREE_H) $(TM_P_H) $(EXPR_H) $(GGC_H) $(TREE_INLINE_H) $(FLAGS_H) \ $(FUNCTION_H) $(TIMEVAR_H) convert.h $(TM_H) coretypes.h langhooks.h \ --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -232,9 +232,10 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1) non-widening optabs also. */ enum insn_code -find_widening_optab_handler (optab op, enum machine_mode to_mode, - enum machine_mode from_mode, - int permit_non_widening) +find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode, + enum machine_mode from_mode, + int permit_non_widening, + enum machine_mode *found_mode) { for (; (permit_non_widening || from_mode != to_mode) && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode) @@ -245,7 +246,11 @@ find_widening_optab_handler (optab op, enum machine_mode to_mode, from_mode); if (handler != CODE_FOR_nothing) - return handler; + { + if (found_mode) + *found_mode = from_mode; + return handler; + } } return CODE_FOR_nothing; --- a/gcc/optabs.h +++ b/gcc/optabs.h @@ -808,8 +808,13 @@ extern void emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code); extern bool maybe_emit_unop_insn (enum insn_code, rtx, rtx, enum rtx_code); /* Find a widening optab even if it doesn't widen as much as we want. 
*/ -extern enum insn_code find_widening_optab_handler (optab, enum machine_mode, - enum machine_mode, int); +#define find_widening_optab_handler(A,B,C,D) \ + find_widening_optab_handler_and_mode (A, B, C, D, NULL) +extern enum insn_code find_widening_optab_handler_and_mode (optab, + enum machine_mode, + enum machine_mode, + int, + enum machine_mode *); /* An extra flag to control optab_for_tree_code's behavior. This is needed to distinguish between machines with a vector shift that takes a scalar for the --- a/gcc/testsuite/gcc.target/arm/wmul-5.c +++ b/gcc/testsuite/gcc.target/arm/wmul-5.c @@ -7,4 +7,4 @@ foo (long long a, char *b, char *c) return a + *b * *c; } -/* { dg-final { scan-assembler "umlal" } } */ +/* { dg-final { scan-assembler "smlalbb" } } */ --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/wmul-6.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +long long +foo (long long a, unsigned char *b, signed char *c) +{ + return a + (long long)*b * (long long)*c; +} + +/* { dg-final { scan-assembler "smlal" } } */ --- a/gcc/tree-ssa-mat
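Editorial illustration of the argument in the message above: once an unsigned input has been zero-extended into a wider mode, its top bit is zero, so a signed widening multiply of the widened values gives the same product. This is a C-level sketch with made-up function names, not the code the patch generates. The last function shows what goes wrong without the zero-extension; its narrowing cast to int8_t is implementation-defined (modulo 2^8 with GCC).

```c
#include <stdint.h>

/* Reference: unsigned 8x8 -> 32 multiply.  */
uint32_t
umul_direct (uint8_t a, uint8_t b)
{
  return (uint32_t) a * (uint32_t) b;
}

/* Zero-extend to 16 bits first, then use a signed 16x16 -> 32 multiply.
   Values 0..255 always fit in int16_t with the top bit clear.  */
uint32_t
umul_via_signed (uint8_t a, uint8_t b)
{
  int16_t wa = (int16_t) a;
  int16_t wb = (int16_t) b;
  int32_t product = (int32_t) wa * (int32_t) wb;
  return (uint32_t) product;
}

/* Wrong: reinterpreting the 8-bit inputs as signed without widening.  */
int32_t
smul_narrow (uint8_t a, uint8_t b)
{
  return (int32_t) (int8_t) a * (int32_t) (int8_t) b;
}
```

For a = 200, b = 3 the first two functions both return 600, while the narrow signed version does not, since 200 is not representable in int8_t.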
Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs
On 28/06/11 16:08, Andrew Stubbs wrote: On 23/06/11 15:41, Andrew Stubbs wrote: This patch removes the restriction that the inputs to a widening multiply must be of the same mode. It does this by extending the smaller of the two inputs to match the larger; therefore, it remains the case that subsequent code (in the expand pass, for example) can rely on the type of rhs1 being the input type of the operation, and the gimple verification code is still valid. OK? This update fixes the testcase issue Janis highlighted. And this one updates the context changed by my update to patch 3. The content of the patch has not changed. Andrew 2011-06-28 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME. Ensure the larger type is the first operand. (convert_mult_to_widen): Insert cast if type2 is smaller than type1. (convert_plusminus_to_widen): Likewise. gcc/testsuite/ * gcc.target/arm/wmul-7.c: New file. --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/wmul-7.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +unsigned long long +foo (unsigned long long a, unsigned char *b, unsigned short *c) +{ + return a + *b * *c; +} + +/* { dg-final { scan-assembler "umlal" } } */ --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -2051,9 +2051,17 @@ is_widening_mult_p (gimple stmt, *type2_out = *type1_out; } - /* FIXME: remove this restriction. */ - if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out)) -return false; + /* Ensure that the larger of the two operands comes first. 
*/ + if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out)) +{ + tree tmp; + tmp = *type1_out; + *type1_out = *type2_out; + *type2_out = tmp; + tmp = *rhs1_out; + *rhs1_out = *rhs2_out; + *rhs2_out = tmp; +} return true; } @@ -2069,6 +2077,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi) enum insn_code handler; enum machine_mode to_mode, from_mode; optab op; + int cast1 = false, cast2 = false; lhs = gimple_assign_lhs (stmt); type = TREE_TYPE (lhs); @@ -2107,16 +2116,26 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi) return false; type1 = type2 = lang_hooks.types.type_for_mode (from_mode, 0); - - rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), - create_tmp_var (type1, NULL), rhs1, type1); - rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), - create_tmp_var (type2, NULL), rhs2, type2); + cast1 = cast2 = true; } else return false; } + if (TYPE_MODE (type2) != from_mode) +{ + type2 = lang_hooks.types.type_for_mode (from_mode, + TYPE_UNSIGNED (type2)); + cast2 = true; +} + + if (cast1) +rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), rhs1, type1); + if (cast2) +rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), rhs2, type2); + gimple_assign_set_rhs1 (stmt, fold_convert (type1, rhs1)); gimple_assign_set_rhs2 (stmt, fold_convert (type2, rhs2)); gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR); @@ -2215,6 +2234,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, optab this_optab; enum tree_code wmult_code; enum insn_code handler; + int cast1 = false, cast2 = false; lhs = gimple_assign_lhs (stmt); type = TREE_TYPE (lhs); @@ -2302,17 +2322,28 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type))) { type1 = type2 = lang_hooks.types.type_for_mode (mode, 0); - mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), - 
create_tmp_var (type1, NULL), - mult_rhs1, type1); - mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), - create_tmp_var (type2, NULL), - mult_rhs2, type2); + cast1 = cast2 = true; } else return false; } + if (TYPE_MODE (type2) != TYPE_MODE (type1)) +{ + type2 = lang_hooks.types.type_for_mode (TYPE_MODE (type1), + TYPE_UNSIGNED (type2)); + cast2 = true; +} + + if (cast1) +mult_rhs1 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type1, NULL), + mult_rhs1, type1); + if (cast2) +mult_rhs2 = build_and_insert_cast (gsi, gimple_location (stmt), + create_tmp_var (type2, NULL), + mult_rhs2, type2); + /* Verify that the convertions between the mult and the add doesn't do anything unexpected. */ if (!valid_types_for_madd_p (type1, type2, mult_rhs))
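Editorial illustration of the transformation described in this message, modelled on the wmul-7.c test: the narrower input is first extended to the mode of the wider one, after which both operands of the widening multiply have the same type. This is a plain C sketch with a made-up function name; the explicit temporary stands in for the cast the patch inserts.

```c
#include <stdint.h>

/* 8-bit times 16-bit, accumulated into 64 bits.  */
uint64_t
madd_mixed (uint64_t acc, uint8_t b, uint16_t c)
{
  uint16_t b_wide = b;   /* extend the smaller input to match the larger */
  return acc + (uint32_t) b_wide * (uint32_t) c;
}
```

For example, madd_mixed (1, 255, 65535) is 1 + 255 * 65535 = 16711426.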
Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching
On 28/06/11 16:30, Andrew Stubbs wrote:
On 23/06/11 15:42, Andrew Stubbs wrote:

This patch fixes the case where a widening multiply-and-accumulate was not
recognised because the multiplication itself is not actually widening.

This can happen when you have "DI + SI * SI" - the multiplication will be
done in SImode as a non-widening multiply, and it's only the final
accumulate step that is widening.

This was not recognised for two reasons:

1. is_widening_mult_p inferred the output type from the multiply
   statement, which is not useful in this case.

2. The inputs to the multiply instruction may not have been converted at
   all (because they're not being widened), so the pattern match failed.

The patch fixes these issues by making the output type explicit, and by
permitting unconverted inputs (the types are still checked, so this is
safe).

OK?

This update fixes Janis' testsuite issue.

This updates the context changed by my update to patch 3.

The content of this patch has not changed.

Andrew

2011-07-04  Andrew Stubbs

        gcc/
        * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
        'type'.
        Use 'type' from caller, not inferred from 'rhs'.
        Don't reject non-conversion statements.  Do return lhs in this case.
        (is_widening_mult_p): Add new argument 'type'.
        Use 'type' from caller, not inferred from 'stmt'.
        Pass type to is_widening_mult_rhs_p.
        (convert_mult_to_widen): Pass type to is_widening_mult_p.
        (convert_plusminus_to_widen): Likewise.

        gcc/testsuite/
        * gcc.target/arm/wmul-8.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1963,7 +1963,8 @@ struct gimple_opt_pass pass_optimize_bswap =
  }
 };
 
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+   assuming a target type of TYPE.
    There are two cases:
 
      - RHS makes some value at least twice as wide.  Store that value
@@ -1973,32 +1974,32 @@ struct gimple_opt_pass pass_optimize_bswap =
      but leave *TYPE_OUT untouched.  */
 
 static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+                        tree *new_rhs_out)
 {
   gimple stmt;
-  tree type, type1, rhs1;
+  tree type1, rhs1;
   enum tree_code rhs_code;
 
   if (TREE_CODE (rhs) == SSA_NAME)
     {
-      type = TREE_TYPE (rhs);
       stmt = SSA_NAME_DEF_STMT (rhs);
       if (!is_gimple_assign (stmt))
         return false;
 
-      rhs_code = gimple_assign_rhs_code (stmt);
-      if (TREE_CODE (type) == INTEGER_TYPE
-          ? !CONVERT_EXPR_CODE_P (rhs_code)
-          : rhs_code != FIXED_CONVERT_EXPR)
-        return false;
-
       rhs1 = gimple_assign_rhs1 (stmt);
       type1 = TREE_TYPE (rhs1);
       if (TREE_CODE (type1) != TREE_CODE (type)
           || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
         return false;
 
-      *new_rhs_out = rhs1;
+      rhs_code = gimple_assign_rhs_code (stmt);
+      if (TREE_CODE (type) == INTEGER_TYPE
+          ? !CONVERT_EXPR_CODE_P (rhs_code)
+          : rhs_code != FIXED_CONVERT_EXPR)
+        *new_rhs_out = gimple_assign_lhs (stmt);
+      else
+        *new_rhs_out = rhs1;
       *type_out = type1;
       return true;
     }
@@ -2013,28 +2014,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
   return false;
 }
 
-/* Return true if STMT performs a widening multiplication.  If so,
-   store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
-   respectively.  Also fill *RHS1_OUT and *RHS2_OUT such that converting
-   those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
-   operands of the multiplication.  */
+/* Return true if STMT performs a widening multiplication, assuming the
+   output type is TYPE.  If so, store the unwidened types of the operands
+   in *TYPE1_OUT and *TYPE2_OUT respectively.  Also fill *RHS1_OUT and
+   *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+   and *TYPE2_OUT would give the operands of the multiplication.  */
 
 static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
                     tree *type1_out, tree *rhs1_out,
                     tree *type2_out, tree *rhs2_out)
 {
-  tree type;
-
-  type = TREE_TYPE (gimple_assign_lhs (stmt));
   if (TREE_CODE (type) != INTEGER_TYPE
       && TREE_CODE (type) != FIXED_POINT_TYPE)
     return false;
 
-  if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+                               rhs1_out))
     return false;
 
-  if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
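Editorial sketch of the "DI + SI * SI" case described in the message above. The function names are illustrative only and are not part of the patch. macc_narrow has the same shape as the new wmul-8.c test (32-bit multiply, 64-bit add), while macc_widened spells the widening out. As far as I understand it, the two agree whenever the product fits in 32 bits, and overflow in the narrow form is undefined in C, which is what leaves room for a single widening multiply-and-accumulate instruction.

```c
#include <stdint.h>

/* Same shape as the wmul-8.c test: the multiply is done in 32 bits
   (SImode) and only the addition is 64 bits wide (DImode).  */
int64_t
macc_narrow (int64_t acc, const int32_t *b, const int32_t *c)
{
  return acc + *b * *c;
}

/* Explicitly widened form: both inputs are converted to 64 bits
   before the multiply.  */
int64_t
macc_widened (int64_t acc, const int32_t *b, const int32_t *c)
{
  return acc + (int64_t) *b * (int64_t) *c;
}
```

For in-range inputs, a call such as macc_narrow (acc, &b, &c) returns the same value as macc_widened (acc, &b, &c).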
Re: [PATCH] Fix bootstrap on OpenBSD, PR48851
On Mon, Jul 4, 2011 at 8:51 AM, Richard Guenther wrote:
> On Mon, 4 Jul 2011, Bruce Korb wrote:
>
>> Hi Richard,
>>
>> On Mon, Jul 4, 2011 at 4:04 AM, Richard Guenther wrote:
>> >
>> > It happens that OpenBSD suffers from a bogus fixinclude that changes
>> > its perfectly valid NULL define from (void *)0 to 0.  The fix itself
>> > appears to be very old and is completely bogus - it replaces
>> > (void *)0 with 0 under the assumption the former is invalid for C++ -
>> > which is true - but 0 is inappropriate for C which is much worse.
>> >
>> > Thus, I propose to remove the fix altogether.  Platform maintainers
>> > can arrange for a new fix if the platforms still need fixing (which
>> > I seriously doubt after so many years and platform obsoletion).
>> >
>> > This restores bootstrap on OpenBSD.
>> >
>> > Ok for trunk and active branches?
>>
>> Sounds completely reasonable to me, but I think the platform maintainers
>> do need to say, "okay".  Cheers - Bruce
>
> We do not have an Interix maintainer listed, that leaves David for AIX.
> David, is this ok?  If not, can you please work on a better more
> specific fixinclude wrapping the C++ variant inside __GNUG__?

Okay with me.

Thanks, David
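Editorial sketch: the thread does not spell out why a plain 0 is worse for C. One commonly cited hazard, and this is an assumption about what was meant, is that the two spellings have different types and possibly different sizes, which matters when NULL is passed through a variadic argument list. The macro and function names below are illustrative and do not come from any system header.

```c
#include <stddef.h>

/* Illustrative names, not taken from any system header.  */
#define NULL_AS_POINTER ((void *) 0)   /* the definition OpenBSD ships */
#define NULL_AS_ZERO    0              /* what the old fixinclude produced */

/* Size of the object a variadic callee would receive for each form.  */
size_t
size_of_pointer_null (void)
{
  return sizeof (NULL_AS_POINTER);
}

size_t
size_of_zero_null (void)
{
  return sizeof (NULL_AS_ZERO);
}
```

On a target where pointers are wider than int, the two functions return different values, so a callee that reads a pointer-sized sentinel would see a wrongly sized argument for the plain 0 form.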
Re: [1/11] Use targetm.shift_truncation_mask more consistently
On 07/01/2011 10:27 AM, Bernd Schmidt wrote:
>       * simplify-rtx.c (simplify_const_binary_operation): Use the
>       shift_truncation_mask hook instead of performing modulo by
>       width.  Compare against mode precision, not bitsize.
>       * combine.c (combine_simplify_rtx, simplify_shift_const_1):
>       Use shift_truncation_mask instead of constructing the value
>       manually.

Ok.

r~
Re: [PATCH] Address lowering [1/3] Main patch
Hi,

On Mon, 4 Jul 2011, Richard Guenther wrote:

> I still do not like the implementation of yet another CSE machinery
> given that we already have two.

From reading it, it really seems to be a normal block-local CSE, without
anything fancy.  Hence, moving the pass just a little earlier (before
pass_vrp/pass_dominator) should already provide for all optimizations.  If
not, those should be improved.

I see that it is also used for getting rid of the zero-offset statements
in case non-zero-offsets follow.  I think that's generally worthwhile, so
it probably should be done in one of the above optimizers.

You handle NOP_EXPR differently from CONVERT_EXPR.  The middle-end doesn't
distinguish between them (yes, ignore the comment about one generating
code, the other not).

Your check for small types:

+      if (TYPE_MODE (TREE_TYPE (TREE_OPERAND (expr, 0))) == SImode)
+        ref_found = true;

You probably want != BLKmode.

+  if (changed && is_zero_offset_ref (gimple_assign_lhs (stmt)))
+    VEC_safe_push (gimple, heap, zero_offset_refs, stmt);
+
+  rhs1 = gimple_assign_rhs1_ptr (stmt);
+  rhs_changed = tree_ssa_lower_addr_tree (rhs1, gsi, speed, false);
+
+  /* Record zero-offset mem_refs on the RHS.  */
+  if (rhs_changed && is_zero_offset_ref (gimple_assign_rhs1 (stmt)))
+    VEC_safe_push (gimple, heap, zero_offset_refs, stmt);

This possibly adds stmt twice to zero_offset_refs.  Do you really want
this?

Ciao,
Michael.
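Editorial sketch of the double-push concern raised at the end of the review above. It assumes a single entry per statement is what is wanted, which the review leaves as an open question. The types and names here are toy stand-ins (a fixed array instead of GCC's VEC, integer ids instead of gimple statements) and are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 16

/* Toy stand-in for the zero_offset_refs vector; statements are
   represented by plain integer ids.  */
struct ref_list
{
  int stmts[MAX_REFS];
  size_t count;
};

/* Record STMT at most once, even if both its LHS and its RHS turned
   into zero-offset references.  Returns true if STMT was recorded.  */
bool
record_zero_offset_stmt (struct ref_list *refs, int stmt,
                         bool lhs_is_zero_offset, bool rhs_is_zero_offset)
{
  if (!lhs_is_zero_offset && !rhs_is_zero_offset)
    return false;
  if (refs->count >= MAX_REFS)
    return false;
  refs->stmts[refs->count++] = stmt;
  return true;
}
```

Combining the two conditions before the single push is one way to avoid the duplicate; whether duplicates are actually harmful depends on how the list is consumed later in the pass.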
Re: [pph] Fix global variable assembly ordering (issue4627087)
On Fri, Jul 1, 2011 at 21:35, Gabriel Charette wrote:

> As variables are discovered (while parsing the header) they are added to the
> varpool and their RTL is built.
>
> We do not stream, nor the varpool, nor the RTL (and I don't think we want to
> + that wouldn't
> work with multiple pph).

Right.  Additionally, saving RTL makes the PPH target-dependent.  We
don't want that.

>
> We want to rebuild the varpool when streaming the global variables of the pph
> in so as to
> redefine them in the varpool in the same order they would have been found in
> a regular
> #include style parse.

Right.

> I'm not sure whether "global variables, not externals" is specific enough or
> too broad (I can't reuse the caller
> of varpool_finalize_decl (rest_of_decl_compilation) to take care of this
> logic because it needs some parser
> state which we no longer have). I will create more tests next week with
> different orderings for functions,
> structs, etc. coming in from the pph.

Hm, I think we actually want to call rest_of_decl_compilation here.
This is also used from the LTO front end when reconstructing variables.
Your patch is in the right direction, though, so I've applied it for
now.

Diego.
[PATCH] Fix ICE with gfortran ... -L without argument (PR fortran/49623)
Hi!

If -L doesn't have an argument, find_spec_file ICEs on it, as the argument
is NULL.  As suggested by Joseph, this disregards in this loop all options
which don't have the required argument.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.6?

2011-07-04  Jakub Jelinek

        PR fortran/49623
        * gfortranspec.c (lang_specific_driver): Ignore options with
        CL_ERR_MISSING_ARG errors.

--- gcc/fortran/gfortranspec.c.jj 2011-07-04 14:58:56.0 +0200
+++ gcc/fortran/gfortranspec.c 2011-07-04 15:01:58.0 +0200
@@ -255,6 +255,9 @@ lang_specific_driver (struct cl_decoded_
 
   for (i = 1; i < argc; ++i)
     {
+      if (decoded_options[i].errors & CL_ERR_MISSING_ARG)
+        continue;
+
       switch (decoded_options[i].opt_index)
         {
         case OPT_SPECIAL_input_file:

        Jakub
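Editorial sketch of the guard this patch adds, reduced to a standalone toy. The struct, the flag value and the function name are hypothetical stand-ins for the driver's decoded-option record and its missing-argument error bit; only the shape of the loop (start at index 1, skip entries flagged as missing their argument) follows the patch.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the driver's decoded-option
   record and its "missing argument" error bit.  */
#define ERR_MISSING_ARG 0x1

struct decoded_option
{
  const char *arg;     /* NULL when the argument was omitted */
  unsigned int errors; /* bitmask of decoding errors */
};

/* Count the options that are safe to inspect, skipping any whose
   required argument is missing instead of dereferencing a NULL arg.  */
size_t
count_usable_options (const struct decoded_option *opts, size_t n)
{
  size_t usable = 0;
  for (size_t i = 1; i < n; ++i)   /* index 0 is the program name */
    {
      if (opts[i].errors & ERR_MISSING_ARG)
        continue;
      ++usable;
    }
  return usable;
}
```

Checking the error bit first means the code after the guard never has to consider a NULL argument, which is the property the patch relies on.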
[PATCH] Fix dead_debug_insert_before ICE (PR debug/49522)
Hi!

In dead_debug_* we don't immediately rescan insns, because that kills all
the df links we need to use, only queue their rescanning.  There are two
kinds of changes we do on the debug insns without immediate rescanning:

1) reset the debug insn
2) replace a reg use with DEBUG_EXPR of the same mode or subreg of a
   larger DEBUG_EXPR with the same outer mode as the reg

In the attached testcase on arm a debug insn is reset, because a multi-reg
register has been used there and as the debug insn location was that
multi-reg register before, it is now VOIDmode after the reset -
(clobber (const_int 0)).

Fixed by disregarding the reset debug insns.  Changes of kind 2) that
needed rescanning don't need this, as the mode doesn't change in that case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.6?

2011-07-04  Jakub Jelinek

        PR debug/49522
        * df-problems.c (dead_debug_insert_before): Ignore uses where the
        use debug insn has been reset.

        * gcc.dg/debug/pr49522.c: New test.

--- gcc/df-problems.c.jj 2011-06-17 11:02:19.0 +0200
+++ gcc/df-problems.c 2011-07-04 10:46:42.0 +0200
@@ -3148,6 +3148,7 @@ dead_debug_insert_before (struct dead_de
   struct dead_debug_use *cur;
   struct dead_debug_use *uses = NULL;
   struct dead_debug_use **usesp = &uses;
+  bool no_reg_ok = false;
   rtx reg = NULL;
   rtx dval;
   rtx bind;
@@ -3161,6 +3162,21 @@ dead_debug_insert_before (struct dead_de
     {
       if (DF_REF_REGNO (cur->use) == uregno)
         {
+          /* If cur->use insn has been meanwhile reset, but hasn't been
+             rescanned, just ignore that use.  */
+          if (DF_REF_REAL_LOC (cur->use)
+              == &INSN_VAR_LOCATION_LOC (DF_REF_INSN (cur->use))
+              && VAR_LOC_UNKNOWN_P (*DF_REF_REAL_LOC (cur->use)))
+            {
+              gcc_assert (debug->to_rescan != NULL
+                          && bitmap_bit_p (debug->to_rescan,
+                                           INSN_UID (DF_REF_INSN (cur->use))));
+              *tailp = cur->next;
+              XDELETE (cur);
+              if (!reg)
+                no_reg_ok = true;
+              continue;
+            }
           *usesp = cur;
           usesp = &cur->next;
           *tailp = cur->next;
@@ -3174,6 +3190,9 @@ dead_debug_insert_before (struct dead_de
         tailp = &(*tailp)->next;
     }
 
+  if (no_reg_ok && !reg)
+    return;
+
   gcc_assert (reg);
 
   /* Create DEBUG_EXPR (and DEBUG_EXPR_DECL).  */
--- gcc/testsuite/gcc.dg/debug/pr49522.c.jj 2011-07-04 10:54:23.0 +0200
+++ gcc/testsuite/gcc.dg/debug/pr49522.c 2011-07-04 10:54:02.0 +0200
@@ -0,0 +1,41 @@
+/* PR debug/49522 */
+/* { dg-do compile } */
+/* { dg-options "-fcompare-debug" } */
+
+int val1 = 0L;
+volatile int val2 = 7L;
+long long val3;
+int *ptr = &val1;
+
+static int
+func1 ()
+{
+  return 0;
+}
+
+static short int
+func2 (short int a, unsigned int b)
+{
+  return !b ? a : a >> b;
+}
+
+static unsigned long long
+func3 (unsigned long long a, unsigned long long b)
+{
+  return !b ? a : a % b;
+}
+
+void
+func4 (unsigned short arg1, int arg2)
+{
+  for (arg2 = 0; arg2 < 2; arg2++)
+    {
+      *ptr = func3 (func3 (10, func2 (val3, val2)), val3);
+      for (arg1 = -14; arg1 > 14; arg1 = func1 ())
+        {
+          *ptr = -1;
+          if (foo ())
+            ;
+        }
+    }
+}

        Jakub
[PATCH] Fix an endless recursion during simplification of MULT (PR rtl-optimization/49472)
Hi!

On the attached testcase simplify-rtx.c was endlessly oscillating when
trying to simplify a complex debug insn location.

The first hunk changes oscillation between 3 possible expressions into
oscillation between 2 possible expressions, by preferring to change
second argument instead of first, because swap_commutative_operands_p
prefers to put NEG to the second argument instead of first.

The second hunk fixes the oscillation by not trying to optimize if we
just move the NEG around.  Otherwise, on
(mult (mult (reg A) (reg B)) (neg (reg B)))
those hunks try to move the neg to the first argument to see if it would
simplify things.  That becomes then
(mult (mult (reg A) (neg (reg B))) (reg B))
and as MULT is associative and swap_commutative_operands_p prefers to put
NEG last, it optimizes it again into the original form and back endlessly.

The patch still tries to simplify the negation of the other argument, but
if the other argument is also MULT and it didn't really simplify it, just
moved the negation around, it will stop.

Bootstrapped/regtested on x86_64-linux and i686-linux.  Ok for trunk?
The bug is latent on 4.6 branch, ok for branch as well?

2011-07-04  Jakub Jelinek

        PR rtl-optimization/49472
        * simplify-rtx.c (simplify_unary_operation_1) : When
        negating MULT, negate the second operand instead of first.
        (simplify_binary_operation_1) : If one operand is
        a NEG and the other is MULT, don't attempt to optimize by
        negation of the MULT operand if it only moves the NEG operation
        around.

        * gfortran.dg/pr49472.f90: New test.

--- gcc/simplify-rtx.c.jj 2011-06-21 16:46:01.0 +0200
+++ gcc/simplify-rtx.c 2011-07-04 12:14:51.0 +0200
@@ -686,13 +686,13 @@ simplify_unary_operation_1 (enum rtx_cod
           return simplify_gen_binary (MINUS, mode, temp, XEXP (op, 1));
         }
 
-      /* (neg (mult A B)) becomes (mult (neg A) B).
+      /* (neg (mult A B)) becomes (mult A (neg B)).
          This works even for floating-point values.  */
       if (GET_CODE (op) == MULT
           && !HONOR_SIGN_DEPENDENT_ROUNDING (mode))
         {
-          temp = simplify_gen_unary (NEG, mode, XEXP (op, 0), mode);
-          return simplify_gen_binary (MULT, mode, temp, XEXP (op, 1));
+          temp = simplify_gen_unary (NEG, mode, XEXP (op, 1), mode);
+          return simplify_gen_binary (MULT, mode, XEXP (op, 0), temp);
         }
 
       /* NEG commutes with ASHIFT since it is multiplication.  Only do
@@ -2271,12 +2271,34 @@ simplify_binary_operation_1 (enum rtx_co
       if (GET_CODE (op0) == NEG)
         {
           rtx temp = simplify_unary_operation (NEG, mode, op1, mode);
+          /* If op1 is a MULT as well and simplify_unary_operation
+             just moved the NEG to the second operand, simplify_gen_binary
+             below could through simplify_associative_operation move
+             the NEG around again and recurse endlessly.  */
+          if (temp
+              && GET_CODE (op1) == MULT
+              && GET_CODE (temp) == MULT
+              && XEXP (op1, 0) == XEXP (temp, 0)
+              && GET_CODE (XEXP (temp, 1)) == NEG
+              && XEXP (op1, 1) == XEXP (XEXP (temp, 1), 0))
+            temp = NULL_RTX;
           if (temp)
             return simplify_gen_binary (MULT, mode, XEXP (op0, 0), temp);
         }
       if (GET_CODE (op1) == NEG)
         {
           rtx temp = simplify_unary_operation (NEG, mode, op0, mode);
+          /* If op0 is a MULT as well and simplify_unary_operation
+             just moved the NEG to the second operand, simplify_gen_binary
+             below could through simplify_associative_operation move
+             the NEG around again and recurse endlessly.  */
+          if (temp
+              && GET_CODE (op0) == MULT
+              && GET_CODE (temp) == MULT
+              && XEXP (op0, 0) == XEXP (temp, 0)
+              && GET_CODE (XEXP (temp, 1)) == NEG
+              && XEXP (op0, 1) == XEXP (XEXP (temp, 1), 0))
+            temp = NULL_RTX;
           if (temp)
             return simplify_gen_binary (MULT, mode, temp, XEXP (op1, 0));
         }
--- gcc/testsuite/gfortran.dg/pr49472.f90.jj 2011-07-04 12:23:12.0 +0200
+++ gcc/testsuite/gfortran.dg/pr49472.f90 2011-07-04 12:22:53.0 +0200
@@ -0,0 +1,15 @@
+! PR rtl-optimization/49472
+! { dg-do compile }
+! { dg-options "-O -fcompare-debug -ffast-math" }
+subroutine pr49472
+  integer, parameter :: n = 3
+  real(8) :: a, b, c, d, e (n+1)
+  integer :: i
+  do i=2, (n+1)
+    b = 1. / ((i - 1.5d0) * 1.)
+    c = b * a
+    d = -b * c / (1. + b * b) ** 1.5d0
+    e(i) = d
+  end do
+  call dummy (e)
+end subroutine

        Jakub
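Editorial sketch of the "NEG was only moved around" test that the second hunk adds, written against a toy expression type rather than GCC's rtx. The enum, struct and function name are hypothetical; the condition mirrors the one in the patch, including the comparison of operands by pointer identity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node; a stand-in for GCC's rtx, not the real type.  */
enum code { REG, NEG, MULT };

struct expr
{
  enum code code;
  const struct expr *op0;
  const struct expr *op1;   /* unused for REG and NEG */
};

/* Return true if NEGATED is just ORIG with a NEG pushed onto its second
   operand, i.e. ORIG is (mult A B) and NEGATED is (mult A (neg B)).
   Operands are compared by pointer identity, as in the patch.  */
bool
only_moved_neg (const struct expr *orig, const struct expr *negated)
{
  return negated != NULL
         && orig->code == MULT
         && negated->code == MULT
         && orig->op0 == negated->op0
         && negated->op1->code == NEG
         && orig->op1 == negated->op1->op0;
}
```

When this predicate holds, the patch discards the candidate (sets temp to NULL_RTX) instead of building a new MULT from it, which is what breaks the cycle described in the message.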
Re: [testsuite, AVR]: Add some progmem test cases
On Jun 30, 2011, at 10:38 AM, Georg-Johann Lay wrote:
> Is
> ./testsuite/gcc.target/avr/
> realm of avr port maintainers?

I'm fine with the avr people reviewing and approving all they think is
ready for the tree.  If they go out into the weeds, we can rein them in;
I'm sure that would never happen.  If a port is lacking in review
bandwidth, I might fire up, but I don't think avr fits that description.
Re: Ping #1: [testsuite, AVR]: Add some progmem test cases
On Jul 4, 2011, at 4:07 AM, Denis Chertykov wrote:
>>
>> testsuite/
>>       * gcc.target/avr/torture/progmem-1.cpp: New file.
>
> I don't know who must approve tests.
> If me then Approved

You!  If there are ugly details more related to the test suite framework,
feel free to kick it up.
CFT: Move unwinder to toplevel libgcc
"Joseph S. Myers" writes: > On Mon, 20 Jun 2011, Rainer Orth wrote: > >> * Move all remaining unwinder-only macros to libgcc: UNW_IVMS_MODE, >> MD_UNW_COMPATIBLE_PERSONALITY_P, MD_FROB_UPDATE_CONTEXT. > > I don't see any sign of macros being poisoned in system.h. For macros > used in target-independent unwinder code - at least MD_FROB_UPDATE_CONTEXT > - that used to be defined in the host tm.h but now no longer should be, I > think poisoning in system.h is appropriate. Done in the updated patch below. Given that the other two are ia64 only and not documented in md.texi, I don't think they need to be poisoned. Otherwise, the patch is unchanged from the original submission: [build] Move unwinder to toplevel libgcc http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01452.html Unfortunately, it hasn't seen much comment. I'm now looking for testers especially on platforms with more change and approval of those parts: * Several IA-64 targets: ia64*-*-linux* ia64*-*-hpux* ia64-hp-*vms* * AIX: rs6000-ibm-aix* Thanks. Rainer 2011-06-12 Rainer Orth gcc: * Makefile.in (UNWIND_H): Remove. (LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED): Move to ../libgcc/Makefile.in. (LIBUNWIND, SHLIBUNWIND_LINK, SHLIBUNWIND_INSTALL): Likewise. (LIBUNWINDDEP): Remove. (libgcc-support): Remove LIB2ADDEH, $(srcdir)/emutls.c dependencies. (libgcc.mvars): Remove LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED, LIBUNWIND, SHLIBUNWIND_LINK, SHLIBUNWIND_INSTALL. (stmp-int-hdrs): Remove $(UNWIND_H) dependency. Don't copy $(UNWIND_H). * config.gcc (ia64*-*-linux*): Remove with_system_libunwind handling. * configure.ac (GCC_CHECK_UNWIND_GETIPINFO): Remove. * aclocal.m4: Regenerate. * configure: Regenerate. * emutls.c, unwind-c.c, unwind-compat.c, unwind-compat.h, unwind-dw2-fde-compat.c, unwind-dw2-fde-darwin.c, unwind-dw2-fde-glibc.c, unwind-dw2-fde.c, unwind-dw2-fde.h, unwind-dw2.c, unwind-dw2.h, unwind-generic.h, unwind-pe.h, unwind-sjlj.c, unwind.inc: Move to ../libgcc. 
* config/arm/libunwind.S, config/arm/pr-support.c, config/arm/unwind-arm.c, config/arm/unwind-arm.h: Move to ../libgcc/config/arm. * config/arm/t-bpabi (UNWIND_H, LIB2ADDEH): Remove. * config/arm/t-symbian (UNWIND_H, LIB2ADDEH): Remove. * config/frv/t-frv ($(T)frvbegin$(objext)): Use $(srcdir)/../libgcc to refer to unwind-dw2-fde.h. ($(T)frvend$(objext)): Likewise. * config/ia64/t-glibc (LIB2ADDEH): Remove. * config/ia64/t-glibc-libunwind: Move to ../libgcc/config/ia64. * config/ia64/fde-glibc.c, config/ia64/fde-vms.c, config/ia64/unwind-ia64.c, config/ia64/unwind-ia64.h: Move to ../libgcc/config/ia64. * config/ia64/t-hpux (LIB2ADDEH): Remove. * config/ia64/t-ia64 (LIB2ADDEH): Remove. * config/ia64/t-vms (LIB2ADDEH): Remove. * config/ia64/vms.h (UNW_IVMS_MODE, MD_UNW_COMPATIBLE_PERSONALITY_P): Remove. * config/picochip/t-picochip (LIB2ADDEH): Remove. * config/rs6000/aix.h (R_LR, MD_FROB_UPDATE_CONTEXT): Remove. * config/rs6000/t-darwin (LIB2ADDEH): Remove. * config/rs6000/darwin-fallback.c: Move to ../libgcc/config/rs6000. * config/sh/t-sh ($(T)unwind-dw2-Os-4-200.o): Use $(srcdir)/../libgcc to refer to unwinder sources. * config/spu/t-spu-elf (LIB2ADDEH): Remove. * config/t-darwin (LIB2ADDEH): Remove. * config/t-freebsd (LIB2ADDEH): Remove. * config/t-libunwind (LIB2ADDEH, LIB2ADDEHSTATIC): Remove. * config/t-linux (LIB2ADDEH): Remove. * config/t-sol2 (LIB2ADDEH): Remove. * config/xtensa/t-xtensa (LIB2ADDEH): Remove. * system.h (MD_FROB_UPDATE_CONTEXT): Poison. gcc/po: * EXCLUDES (unwind-c.c, unwind-dw2-fde-darwin.c, unwind-dw2-fde-glibc.c, unwind-dw2-fde.c, unwind-dw2-fde.h, unwind-dw2.c, unwind-pe.h, unwind-sjlj.c, unwind.h): Remove. libgcc: * Makefile.in (LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED): New variables. (LIBUNWIND, SHLIBUNWIND_LINK, SHLIBUNWIND_INSTALL): New variables. (LIB2ADDEH, LIB2ADDEHSTATIC, LIB2ADDEHSHARED): Add $(srcdir)/emutls.c. (install-unwind_h): New target. (all): Depend on it. * config.host (unwind_header): New variable. 
(*-*-freebsd*): Set tmake_file to t-eh-dw2-dip. (*-*-linux*, frv-*-*linux*, *-*-kfreebsd*-gnu, *-*-knetbsd*-gnu, *-*-gnu*): Likewise, also for *-*-kopensolaris*-gnu. (*-*-solaris2*): Add t-eh-dw2-dip to tmake_file. (arm*-*-linux-*eabi, arm*-*-uclinux*eabi, arm*-*-eabi*): Add arm/t-bpabi to tmake_file. Set unwind_header. (arm*-*-symbianelf*): Add arm/t-symbian to tmake_file.
[PATCH] Fix tree_could_trap_p so that weak var accesses are considered trapping (PR tree-optimization/49618)
Hi!

Before http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168951
set_mem_attributes_minus_bitpos would set MEM_NOTRAP_P for decls based on
whether they are DECL_WEAK or not, but now it is set only from
!tree_could_trap_p.  These patches adjust tree_could_trap_p to say that
references to weak vars/functions may trap (for calls it was doing that
already).

The first version of the patch is intended for 4.7 and only handles that
way weak vars/functions that aren't known to be defined somewhere (either
in current CU, or in the CUs included in -flto build).
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

The second version is simplified one which always treats DECL_WEAK vars as
maybe trapping.  Ok for 4.6?

        Jakub

2011-07-04  Jakub Jelinek

        PR tree-optimization/49618
        * tree-eh.c (tree_could_trap_p) : For DECL_WEAK
        t recurse on the decl.
        : For DECL_WEAK decls
        return true if expr isn't known to be defined in current
        TU or some other LTO partition.

--- gcc/tree-eh.c.jj 2011-06-17 11:02:19.0 +0200
+++ gcc/tree-eh.c 2011-07-04 14:27:01.0 +0200
@@ -2449,8 +2449,42 @@ tree_could_trap_p (tree expr)
     case CALL_EXPR:
       t = get_callee_fndecl (expr);
       /* Assume that calls to weak functions may trap.  */
-      if (!t || !DECL_P (t) || DECL_WEAK (t))
+      if (!t || !DECL_P (t))
         return true;
+      if (DECL_WEAK (t))
+        return tree_could_trap_p (t);
+      return false;
+
+    case FUNCTION_DECL:
+      /* Assume that accesses to weak functions may trap, unless we know
+         they are certainly defined in current TU or in some other
+         LTO partition.  */
+      if (DECL_WEAK (expr))
+        {
+          struct cgraph_node *node;
+          if (!DECL_EXTERNAL (expr))
+            return false;
+          node = cgraph_function_node (cgraph_get_node (expr), NULL);
+          if (node && node->in_other_partition)
+            return false;
+          return true;
+        }
+      return false;
+
+    case VAR_DECL:
+      /* Assume that accesses to weak vars may trap, unless we know
+         they are certainly defined in current TU or in some other
+         LTO partition.  */
+      if (DECL_WEAK (expr))
+        {
+          struct varpool_node *node;
+          if (!DECL_EXTERNAL (expr))
+            return false;
+          node = varpool_variable_node (varpool_get_node (expr), NULL);
+          if (node && node->in_other_partition)
+            return false;
+          return true;
+        }
       return false;
 
     default:

2011-07-04  Jakub Jelinek

        PR tree-optimization/49618
        * tree-eh.c (tree_could_trap_p) : For DECL_WEAK
        decls return true.

--- gcc/tree-eh.c.jj 2011-05-11 17:01:05.0 +0200
+++ gcc/tree-eh.c 2011-07-04 14:32:54.0 +0200
@@ -2459,6 +2459,13 @@ tree_could_trap_p (tree expr)
         return true;
       return false;
 
+    case VAR_DECL:
+    case FUNCTION_DECL:
+      /* Assume that accesses to weak vars or functions may trap.  */
+      if (DECL_WEAK (expr))
+        return true;
+      return false;
+
     default:
       return false;
     }
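Editorial sketch of why an access to a weak variable has to be treated as possibly trapping. It assumes an ELF target and a compiler that supports the GNU weak attribute (GCC or Clang); the variable and function names are hypothetical. If nothing defines the weak symbol, its address is null, so the load is only safe behind the address check, and hoisting it above the check could fault.

```c
#include <stddef.h>

/* Hypothetical weak variable that nothing in this program defines.
   On ELF targets with GCC or Clang its address then resolves to NULL.  */
extern int optional_setting __attribute__ ((weak));

/* Read the weak variable if some object file defined it, otherwise
   return FALLBACK.  Loading from it unconditionally could fault, which
   is why such an access must not be treated as non-trapping.  */
int
read_optional_setting (int fallback)
{
  if (&optional_setting != NULL)
    return optional_setting;
  return fallback;
}
```

With no definition of optional_setting linked in, read_optional_setting returns its fallback argument; linking an object file that defines the variable makes it return the stored value instead.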
[testsuite, ada] Fix run_acats for shells without type -p
My last run_acats patch broke platforms where CONFIG_SHELL doesn't
support type -p (like Solaris < 11 /bin/ksh): when using awk to extract
the last output field, the exit code is from the last command in the
pipe (always 0), not type, so the which function returns an empty
string.

This patch fixes this by decoupling type/type -p from extracting the
last field.

Bootstrapped on i386-pc-solaris2.10 and i386-pc-solaris2.11.

Ok for mainline, 4.6 and 4.5 branches (where the offending patch has
been installed)?

Thanks.
        Rainer

2011-07-01  Rainer Orth

        * ada/acats/run_acats (which): Extract last field from type -p,
        type output only if command succeeded.

diff --git a/gcc/testsuite/ada/acats/run_acats b/gcc/testsuite/ada/acats/run_acats
--- a/gcc/testsuite/ada/acats/run_acats
+++ b/gcc/testsuite/ada/acats/run_acats
@@ -14,8 +14,8 @@ fi
 # Fall back to whence which ksh88 and ksh93 provide, but bash does not.
 
 which () {
-    path=`type -p $* 2>/dev/null | awk '{print $NF}'` && { echo $path; return 0; }
-    path=`type $* 2>/dev/null | awk '{print $NF}'` && { echo $path; return 0; }
+    path=`type -p $* 2>/dev/null` && { echo $path | awk '{print $NF}'; return 0; }
+    path=`type $* 2>/dev/null` && { echo $path | awk '{print $NF}'; return 0; }
     path=`whence $* 2>/dev/null` && { echo $path; return 0; }
     return 1
 }

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] Fix ICE during combine (PR rtl-optimization/49619)
Hi!

The following testcase ICEs, because simplify_gen_binary (IOR, HImode, ...)
simplifies into (subreg:HI (reg:SI ...) 0), but was still passing
mode (HImode) as second argument to recursive combine_simplify_rtx call.
The second argument is op0_mode, so is supposed to be the real mode which
should be assumed for its first operand.  Passing mode in that case is only
safe if simplify_gen_binary doesn't actually simplify it, but as
simplify_gen_binary would simplify constant arguments anyway into a
constant, it doesn't make any sense to hint combine_simplify_rtx about the
original op0_mode.  That is something only useful when called from subst,
which simplifies the operands (which may turn them from non-VOIDmode into
VOIDmode) and then calls combine_simplify_rtx to simplify the whole
operation.

The second part of the patch attempts to optimize more, as
simplify_gen_binary may already simplify the expression, so often
(including the testcase) combine_simplify_rtx doesn't simplify anything,
i.e. tor == temp, yet it is simplified over (ior plus_arg0 plus_arg1).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

For 4.6, I think safer would be just the first one liner change to pass
VOIDmode to combine_simplify_rtx.  Is that ok for 4.6?

2011-07-04  Jakub Jelinek

        PR rtl-optimization/49619
        * combine.c (combine_simplify_rtx): In PLUS -> IOR simplification
        pass VOIDmode as op0_mode to recursive call, and return temp even
        when different from tor, just if it is not IOR of the original
        PLUS arguments.

        * gcc.dg/pr49619.c: New test.

--- gcc/combine.c.jj 2011-06-21 16:46:01.0 +0200
+++ gcc/combine.c 2011-07-04 16:05:52.0 +0200
@@ -5681,12 +5681,17 @@ combine_simplify_rtx (rtx x, enum machin
         {
           /* Try to simplify the expression further.  */
           rtx tor = simplify_gen_binary (IOR, mode, XEXP (x, 0), XEXP (x, 1));
-          temp = combine_simplify_rtx (tor, mode, in_dest, 0);
+          temp = combine_simplify_rtx (tor, VOIDmode, in_dest, 0);
 
           /* If we could, great.  If not, do not go ahead with the IOR
              replacement, since PLUS appears in many special purpose
              address arithmetic instructions.  */
-          if (GET_CODE (temp) != CLOBBER && temp != tor)
+          if (GET_CODE (temp) != CLOBBER
+              && (GET_CODE (temp) != IOR
+                  || ((XEXP (temp, 0) != XEXP (x, 0)
+                       || XEXP (temp, 1) != XEXP (x, 1))
+                      && (XEXP (temp, 0) != XEXP (x, 1)
+                          || XEXP (temp, 1) != XEXP (x, 0)))))
             return temp;
         }
       break;
--- gcc/testsuite/gcc.dg/pr49619.c.jj 2011-07-04 16:04:21.0 +0200
+++ gcc/testsuite/gcc.dg/pr49619.c 2011-07-04 16:04:06.0 +0200
@@ -0,0 +1,13 @@
+/* PR rtl-optimization/49619 */
+/* { dg-do compile } */
+/* { dg-options "-O -fno-tree-fre" } */
+
+extern int a, b;
+
+void
+foo (int x)
+{
+  a = 2;
+  b = 0;
+  b = (a && ((a = 1, 0 >= b) || (short) (x + (b & x))));
+}

        Jakub
[wwwdocs] Buildstat update for 4.6
Latest results for 4.6.x -tgc Testresults for 4.6.1: hppa2.0w-hp-hpux11.00 hppa2.0w-hp-hpux11.11 hppa64-hp-hpux11.11 i386-pc-solaris2.10 i686-pc-linux-gnu (2) sparc-sun-solaris2.8 x86_64-unknown-linux-gnu Testresults for 4.6.0 sparc-sun-solaris2.10 x86_64-unknown-linux-gnu Index: buildstat.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/buildstat.html,v retrieving revision 1.4 diff -u -r1.4 buildstat.html --- buildstat.html 4 Jun 2011 20:14:24 - 1.4 +++ buildstat.html 4 Jul 2011 18:38:15 - @@ -42,9 +42,18 @@ +hppa2.0w-hp-hpux11.00 + +Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00197.html";>4.6.1 + + + + hppa2.0w-hp-hpux11.11 Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03306.html";>4.6.1, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02630.html";>4.6.0 @@ -53,6 +62,7 @@ hppa64-hp-hpux11.11 Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03440.html";>4.6.1, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02801.html";>4.6.0 @@ -61,6 +71,7 @@ i386-pc-solaris2.8 Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00139.html";>4.6.1, http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00175.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg03106.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02960.html";>4.6.0, @@ -84,6 +95,7 @@ i386-pc-solaris2.10 Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00327.html";>4.6.1, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02738.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02705.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02704.html";>4.6.0, @@ -106,6 +118,8 @@ i686-pc-linux-gnu Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03400.html";>4.6.1, +http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03128.html";>4.6.1, http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg03610.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00440.html";>4.6.0, 
http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00064.html";>4.6.0, @@ -154,6 +168,7 @@ sparc-sun-solaris2.8 Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-07/msg00138.html";>4.6.1, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02959.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02933.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02716.html";>4.6.0, @@ -175,6 +190,7 @@ sparc-sun-solaris2.10 Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg02863.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02835.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02725.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02714.html";>4.6.0 @@ -203,6 +219,8 @@ x86_64-unknown-linux-gnu Test results: +http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03135.html";>4.6.1, +http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg01380.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg03091.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-04/msg00445.html";>4.6.0, http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg03102.html";>4.6.0,
Re: [PATCH] Fix an endless recursion during simplification of MULT (PR rtl-optimization/49472)
> 2011-07-04 Jakub Jelinek > > PR rtl-optimization/49472 > * simplify-rtx.c (simplify_unary_operation_1) : When > negating MULT, negate the second operand instead of first. > (simplify_binary_operation_1) : If one operand is > a NEG and the other is MULT, don't attempt to optimize by > negation of the MULT operand if it only moves the NEG operation > around. > > * gfortran.dg/pr49472.f90: New test. OK for mainline and 4.6 branch. -- Eric Botcazou
Re: [testsuite, ada] Fix run_acats for shells without type -p
> This patch fixes this by decoupling type/type -p from extracting the > last field. > > Bootstrapped on i386-pc-solaris2.10 and i386-pc-solaris2.11. > > Ok for mainline, 4.6 and 4.5 branches (where the offending patch has > been installed)? OK, but if this new patch introduces new regressions, please revert this change and the previous one, thanks. Arno
Re: [testsuite, ada] Fix run_acats for shells without type -p
Arnaud Charlet writes: >> This patch fixes this by decoupling type/type -p from extracting the >> last field. >> >> Bootstrapped on i386-pc-solaris2.10 and i386-pc-solaris2.11. >> >> Ok for mainline, 4.6 and 4.5 branches (where the offending patch has >> been installed)? > > OK, but if this new patch introduces new regressions, please revert this > change and the previous one, thanks. I will. The fragility of this stuff suggests that I should revisit and finish my ACATS via DejaGnu patch ;-) Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch tree-optimization]: Do bitwise operator optimizations for X op !X patterns
Ok, reworked version. The folding of X op X and !X op !X seems indeed not being necessary. So function simplifies much. Bootstrapped and regression tested for all standard languages (plus Ada and Obj-C++). Ok for apply? Regards, Kai Index: gcc-head/gcc/tree-ssa-forwprop.c === --- gcc-head.orig/gcc/tree-ssa-forwprop.c +++ gcc-head/gcc/tree-ssa-forwprop.c @@ -1602,6 +1602,129 @@ simplify_builtin_call (gimple_stmt_itera return false; } +/* Checks if expression has type of one-bit precision, or is a known + truth-valued expression. */ +static bool +truth_valued_ssa_name (tree name) +{ + gimple def; + tree type = TREE_TYPE (name); + + if (!INTEGRAL_TYPE_P (type)) +return false; + /* Don't check here for BOOLEAN_TYPE as the precision isn't + necessarily one and so ~X is not equal to !X. */ + if (TYPE_PRECISION (type) == 1) +return true; + def = SSA_NAME_DEF_STMT (name); + if (is_gimple_assign (def)) +return truth_value_p (gimple_assign_rhs_code (def), type); + return false; +} + +/* Helper routine for simplify_bitwise_binary_1 function. + Return for the SSA name NAME the expression X if it mets condition + NAME = !X. Otherwise return NULL_TREE. + Detected patterns for NAME = !X are: + !X and X == 0 for X with integral type. + X ^ 1, X != 1,or ~X for X with integral type with precision of one. */ +static tree +lookup_logical_inverted_value (tree name) +{ + tree op1, op2; + enum tree_code code; + gimple def; + + /* If name has none-intergal type, or isn't a SSA_NAME, then + return. */ + if (TREE_CODE (name) != SSA_NAME + || !INTEGRAL_TYPE_P (TREE_TYPE (name))) +return NULL_TREE; + def = SSA_NAME_DEF_STMT (name); + if (!is_gimple_assign (def)) +return NULL_TREE; + + code = gimple_assign_rhs_code (def); + op1 = gimple_assign_rhs1 (def); + op2 = NULL_TREE; + + /* Get for EQ_EXPR or BIT_XOR_EXPR operation the second operand. + If CODE isn't an EQ_EXPR, BIT_XOR_EXPR, TRUTH_NOT_EXPR, + or BIT_NOT_EXPR, then return. 
*/ + if (code == EQ_EXPR || code == NE_EXPR + || code == BIT_XOR_EXPR) +op2 = gimple_assign_rhs2 (def); + + switch (code) +{ +case TRUTH_NOT_EXPR: + return op1; +case BIT_NOT_EXPR: + if (truth_valued_ssa_name (name)) + return op1; + break; +case EQ_EXPR: + /* Check if we have X == 0 and X has an integral type. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; + if (integer_zerop (op2)) + return op1; + break; +case NE_EXPR: + /* Check if we have X != 1 and X is a truth-valued. */ + if (!INTEGRAL_TYPE_P (TREE_TYPE (op1))) + break; + if (integer_onep (op2) && truth_valued_ssa_name (op1)) + return op1; + break; +case BIT_XOR_EXPR: + /* Check if we have X ^ 1 and X is truth valued. */ + if (integer_onep (op2) && truth_valued_ssa_name (op1)) + return op1; + break; +default: + break; +} + + return NULL_TREE; +} + +/* Optimize ARG1 CODE ARG2 to a constant for bitwise binary + operations CODE, if one operand has the logically inverted + value of the other. */ +static tree +simplify_bitwise_binary_1 (enum tree_code code, tree type, + tree arg1, tree arg2) +{ + tree anot; + + /* If CODE isn't a bitwise binary operation, return NULL_TREE. */ + if (code != BIT_AND_EXPR && code != BIT_IOR_EXPR + && code != BIT_XOR_EXPR) +return NULL_TREE; + + /* First check if operands ARG1 and ARG2 are equal. If so + return NULL_TREE as this optimization is handled fold_stmt. */ + if (arg1 == arg2) +return NULL_TREE; + /* See if we have in arguments logical-not patterns. */ + if (((anot = lookup_logical_inverted_value (arg1)) == NULL_TREE + || anot != arg2) + && ((anot = lookup_logical_inverted_value (arg2)) == NULL_TREE + || anot != arg1)) +return NULL_TREE; + + /* X & !X -> 0. */ + if (code == BIT_AND_EXPR) +return fold_convert (type, integer_zero_node); + /* X | !X -> 1 and X ^ !X -> 1, if X is truth-valued. */ + if (truth_valued_ssa_name (anot)) +return fold_convert (type, integer_one_node); + + /* ??? Otherwise result is (X != 0 ? X : 1). not handled. 
*/ + return NULL_TREE; +} + /* Simplify bitwise binary operations. Return true if a transformation applied, otherwise return false. */ @@ -1769,6 +1892,15 @@ simplify_bitwise_binary (gimple_stmt_ite return true; } + /* Try simple folding for X op !X, and X op X. */ + res = simplify_bitwise_binary_1 (code, TREE_TYPE (arg1), arg1, arg2); + if (res != NULL_TREE) +{ + gimple_assign_set_rhs_from_tree (gsi, res); + update_stmt (gsi_stmt (*gsi)); + return true; +} + return false; } Index: gcc-head/gcc/testsuite/gcc.dg/binop-notand1a.c === --- /dev/null +++ gcc-head/gcc/tes
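The identities this patch folds can be checked on plain integers. The following is only a sketch in ordinary C, not GCC internals; the helper names are made up for illustration. It shows that X & !X is always 0, while X | !X and X ^ !X are 1 only when X is truth-valued, and that ~X differs from !X once the precision is wider than one bit.

```c
#include <stdbool.h>

/* Hypothetical helper: a value is "truth-valued" when it is 0 or 1. */
static bool is_truth_valued(unsigned x) { return x <= 1u; }

/* X & !X is 0 for every X, truth-valued or not. */
static unsigned and_with_not(unsigned x) { return x & (unsigned)!x; }

/* X | !X and X ^ !X are 1 only for truth-valued X; for any other
   nonzero X the result is X itself, i.e. (X != 0 ? X : 1).  */
static unsigned ior_with_not(unsigned x) { return x | (unsigned)!x; }
static unsigned xor_with_not(unsigned x) { return x ^ (unsigned)!x; }

/* Bitwise NOT on an 8-bit value: differs from logical NOT unless the
   type has a precision of one bit.  */
static unsigned char bitnot8(unsigned char x) { return (unsigned char)~x; }
```

For X = 6 both ior_with_not and xor_with_not return 6, which is the case the patch leaves alone with its "not handled" comment.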
Re: [PATCH] Fix bootstrap on OpenBSD, PR48851
On Jul 4, 2011, at 4:04 AM, Richard Guenther wrote:
> It happens that OpenBSD suffers from a bogus fixinclude that changes
> its perfectly valid NULL define from (void *)0 to 0. The fix itself
> appears to be very old and is completely bogus

I don't agree with the completely bogus part. Why not replace it with:

#undef NULL
#ifdef __GNUG__
#define NULL __null
#else /* G++ */
#ifndef __cplusplus
#define NULL ((void *)0)
#else /* C++ */
#define NULL 0
#endif /* C++ */
#endif /* G++ */

? This is C++ friendly, C friendly and modern. It should be very safe and should work just about everywhere.

> - it replaces
> (void *)0 with 0 under the assumption the former is invalid for C++ -
> which is true - but 0 is inappropriate for C which is much worse.

A #define to 0 is, last I checked, valid for the C language. You may not like it, but welcome to C.

> Thus, I propose to remove the fix altogether.

Breaking all the systems that are currently broken without the fix isn't a good tradeoff. Now, looking at the PR, in this case one could add a __GNUG__ bypass to this fix and avoid the change on OpenBSD. This would also fix the problem. I do not think removing the fix is a good idea.
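One place where the spelling of NULL can matter in C is the sentinel of a variadic call: a bare 0 passed through `...` is an int, while the callee reads a pointer. The sketch below is an illustration only (count_strings is a made-up helper, not part of any header under discussion); it casts the sentinel explicitly so that it does not depend on how NULL happens to be defined.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts the string arguments that precede a
   null-pointer sentinel.  The sentinel must really be a pointer,
   because va_arg reads it back as one.  */
static int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        s = va_arg(ap, char *);
    }
    va_end(ap);
    return n;
}
```

A call such as count_strings("a", "b", (char *)0) is well defined whichever way NULL is defined; passing a plain 0 as the last argument would not be.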
Re: [PATCH] Fix ICE with gfortran ... -L without argument (PR fortran/49623)
Dear Jakub, Yes! OK for trunk and, if you will, for 4.6. Thanks Paul On Mon, Jul 4, 2011 at 7:22 PM, Jakub Jelinek wrote: > Hi! > > If -L doesn't have an argument, find_spec_file ICEs on it, as > the argument is NULL. As suggested by Joseph, this disregards in > this loop all options which don't have the required argument. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for > trunk/4.6? > > 2011-07-04 Jakub Jelinek > > PR fortran/49623 > * gfortranspec.c (lang_specific_driver): Ignore options with > CL_ERR_MISSING_ARG errors. > > --- gcc/fortran/gfortranspec.c.jj 2011-07-04 14:58:56.0 +0200 > +++ gcc/fortran/gfortranspec.c 2011-07-04 15:01:58.0 +0200 > @@ -255,6 +255,9 @@ lang_specific_driver (struct cl_decoded_ > > for (i = 1; i < argc; ++i) > { > + if (decoded_options[i].errors & CL_ERR_MISSING_ARG) > + continue; > + > switch (decoded_options[i].opt_index) > { > case OPT_SPECIAL_input_file: > > Jakub > -- The knack of flying is learning how to throw yourself at the ground and miss. --Hitchhikers Guide to the Galaxy
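The shape of the fix, skipping any entry flagged as missing its argument before the argument is looked at, can be sketched outside the driver. Everything below is hypothetical and far simpler than the real cl_decoded_option: the struct, the flag value and the function are stand-ins chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

#define TOY_ERR_MISSING_ARG 0x1u  /* stand-in for CL_ERR_MISSING_ARG */

struct toy_option {               /* hypothetical, not cl_decoded_option */
    const char *name;
    const char *arg;              /* NULL when the argument was omitted */
    unsigned errors;
};

/* Sum the lengths of all option arguments.  Entries flagged with
   TOY_ERR_MISSING_ARG are skipped, so a NULL arg is never passed to
   strlen.  Assumes every unflagged entry carries an argument.  */
static size_t total_arg_length(const struct toy_option *opts, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (opts[i].errors & TOY_ERR_MISSING_ARG)
            continue;
        total += strlen(opts[i].arg);
    }
    return total;
}
```

Without the early continue, the flagged entry would hand a null pointer to strlen, which is the same kind of failure the PR describes for a -L with no argument.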
[pph] Split c1eabi (issue4635089)
This test was exposing multiple failures. To isolate them better, I split it in two. I simplified c1eabi1.{cc,h} to test a single header file. This fails in assembly comparison because we do not emit static initializers properly out of the pph image. The original test fails because c2eabi1.h includes another pph image, which produces a bogus duplicate declaration error that throws the diagnostic routines into a mutually-recursive infinite call loop. I am currently working on the c1eabi1 failure. Tested on x86_64. Committed. Diego. * g++.dg/pph/c1eabi1.cc: Move main from c1eabi1.h Remove timeout. Add expected asm difference. * g++.dg/pph/c1eabi1.h: Do not include stdio.h, stdlib.h nor math.h. Declare abort, abs, exit, fabs and printf. * g++.dg/pph/c2eabi1.cc: New. * g++.dg/pph/c2eabi1.h: New. * g++.dg/pph/pph.map: Add c2eabi1.h. diff --git a/gcc/testsuite/g++.dg/pph/c1eabi1.cc b/gcc/testsuite/g++.dg/pph/c1eabi1.cc index 3f5038a..d676732 100644 --- a/gcc/testsuite/g++.dg/pph/c1eabi1.cc +++ b/gcc/testsuite/g++.dg/pph/c1eabi1.cc @@ -1,5 +1,191 @@ -// { dg-timeout 2 { target *-*-* } } -// { dg-xfail-if "INFINITE" { "*-*-*" } { "-fpph-map=pph.map" } } // { dg-options "-w -fpermissive" } +// pph asm xdiff #include "c1eabi1.h" + +int main () { + unsigned char bytes[256]; + int i, j, k, n; + int *result; + + /* Table 2. Double-precision floating-point arithmetic. */ + deq (__aeabi_dadd (dzero, done), done); + deq (__aeabi_dadd (done, done), dtwo); + deq (__aeabi_ddiv (dminus_four, dminus_two), dtwo); + deq (__aeabi_ddiv (dminus_two, dtwo), dminus_one); + deq (__aeabi_dmul (dtwo, dtwo), dfour); + deq (__aeabi_dmul (dminus_one, dminus_two), dtwo); + deq (__aeabi_dneg (dminus_one), done); + deq (__aeabi_dneg (dfour), dminus_four); + deq (__aeabi_drsub (done, dzero), dminus_one); + deq (__aeabi_drsub (dtwo, dminus_two), dminus_four); + deq (__aeabi_dsub (dzero, done), dminus_one); + deq (__aeabi_dsub (dminus_two, dtwo), dminus_four); + + /* Table 3. 
Double-precision floating-point comparisons. */ + ieq (__aeabi_dcmpeq (done, done), 1); + ieq (__aeabi_dcmpeq (done, dzero), 0); + ieq (__aeabi_dcmpeq (dNaN, dzero), 0); + ieq (__aeabi_dcmpeq (dNaN, dNaN), 0); + + ieq (__aeabi_dcmplt (dzero, done), 1); + ieq (__aeabi_dcmplt (done, dzero), 0); + ieq (__aeabi_dcmplt (dzero, dzero), 0); + ieq (__aeabi_dcmplt (dzero, dNaN), 0); + ieq (__aeabi_dcmplt (dNaN, dNaN), 0); + + ieq (__aeabi_dcmple (dzero, done), 1); + ieq (__aeabi_dcmple (done, dzero), 0); + ieq (__aeabi_dcmple (dzero, dzero), 1); + ieq (__aeabi_dcmple (dzero, dNaN), 0); + ieq (__aeabi_dcmple (dNaN, dNaN), 0); + + ieq (__aeabi_dcmpge (dzero, done), 0); + ieq (__aeabi_dcmpge (done, dzero), 1); + ieq (__aeabi_dcmpge (dzero, dzero), 1); + ieq (__aeabi_dcmpge (dzero, dNaN), 0); + ieq (__aeabi_dcmpge (dNaN, dNaN), 0); + + ieq (__aeabi_dcmpgt (dzero, done), 0); + ieq (__aeabi_dcmpgt (done, dzero), 1); + ieq (__aeabi_dcmplt (dzero, dzero), 0); + ieq (__aeabi_dcmpgt (dzero, dNaN), 0); + ieq (__aeabi_dcmpgt (dNaN, dNaN), 0); + + ieq (__aeabi_dcmpun (done, done), 0); + ieq (__aeabi_dcmpun (done, dzero), 0); + ieq (__aeabi_dcmpun (dNaN, dzero), 1); + ieq (__aeabi_dcmpun (dNaN, dNaN), 1); + + /* Table 4. Single-precision floating-point arithmetic. */ + feq (__aeabi_fadd (fzero, fone), fone); + feq (__aeabi_fadd (fone, fone), ftwo); + feq (__aeabi_fdiv (fminus_four, fminus_two), ftwo); + feq (__aeabi_fdiv (fminus_two, ftwo), fminus_one); + feq (__aeabi_fmul (ftwo, ftwo), ffour); + feq (__aeabi_fmul (fminus_one, fminus_two), ftwo); + feq (__aeabi_fneg (fminus_one), fone); + feq (__aeabi_fneg (ffour), fminus_four); + feq (__aeabi_frsub (fone, fzero), fminus_one); + feq (__aeabi_frsub (ftwo, fminus_two), fminus_four); + feq (__aeabi_fsub (fzero, fone), fminus_one); + feq (__aeabi_fsub (fminus_two, ftwo), fminus_four); + + /* Table 5. Single-precision floating-point comparisons. 
*/ + ieq (__aeabi_fcmpeq (fone, fone), 1); + ieq (__aeabi_fcmpeq (fone, fzero), 0); + ieq (__aeabi_fcmpeq (fNaN, fzero), 0); + ieq (__aeabi_fcmpeq (fNaN, fNaN), 0); + + ieq (__aeabi_fcmplt (fzero, fone), 1); + ieq (__aeabi_fcmplt (fone, fzero), 0); + ieq (__aeabi_fcmplt (fzero, fzero), 0); + ieq (__aeabi_fcmplt (fzero, fNaN), 0); + ieq (__aeabi_fcmplt (fNaN, fNaN), 0); + + ieq (__aeabi_fcmple (fzero, fone), 1); + ieq (__aeabi_fcmple (fone, fzero), 0); + ieq (__aeabi_fcmple (fzero, fzero), 1); + ieq (__aeabi_fcmple (fzero, fNaN), 0); + ieq (__aeabi_fcmple (fNaN, fNaN), 0); + + ieq (__aeabi_fcmpge (fzero, fone), 0); + ieq (__aeabi_fcmpge (fone, fzero), 1); + ieq (__aeabi_fcmpge (fzero, fzero), 1); + ieq (__aeabi_fcmpge (fzero, fNaN), 0); + ieq (__aeabi_fcmpge (fNaN, fNaN), 0); + + ieq (__aeabi_fcmpgt (fzero, fone), 0); + ieq (__aeabi_fcmpgt (fone, fzero), 1); + ieq (__aeabi_fcmplt (fzero, fzero), 0); + ieq (__aeabi_fcmpgt (fzero, fNaN), 0); + i
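The expected values in the comparison tables follow the usual IEEE rule that every ordered comparison involving a NaN is false, while the unordered test is true. The sketch below uses the host compiler's own double comparisons as stand-ins; the cmp_* names are made up and are not the __aeabi_* routines themselves.

```c
#include <math.h>

/* Hypothetical stand-ins that return 1 or 0 like the helpers tested
   above, implemented with ordinary C comparisons.  */
static int cmp_eq(double a, double b) { return a == b; }
static int cmp_lt(double a, double b) { return a < b; }
static int cmp_le(double a, double b) { return a <= b; }
static int cmp_ge(double a, double b) { return a >= b; }
static int cmp_gt(double a, double b) { return a > b; }

/* Unordered: true when at least one operand is a NaN.  */
static int cmp_un(double a, double b) { return isnan(a) || isnan(b); }
```

With a NaN on either side, cmp_eq, cmp_lt, cmp_le, cmp_ge and cmp_gt all yield 0 and cmp_un yields 1, matching the rows that use dNaN and fNaN.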
Re: C++ PATCH to improve 'aka's on type printing in diagnostics
Jason Merrill writes: | On 06/14/2011 01:38 PM, Jason Merrill wrote: | > While I was at it, I've also tweaked the compiler to also print the | > typedef-stripped version of a type when appropriate, which should help | > with understanding template error messages. | | I noticed that this was sometimes printing an aka that was exactly the | same, which looks a bit goofy. So this patch makes sure that the | typedef-stripped version actually prints out differently before | appending the {aka}. | | Tested x86_64-pc-linux-gnu. Gaby: I'm not entirely comfortable | messing directly with the obstack here, but the pp interface doesn't | seem to support multiple strings at once. Does this approach make | sense to you, or do you have a better idea? | Hi Jason, Please go ahead with your patch, and open a PR request for a better interface (assigned to me). The diagnostic machinery should support what you want to do without people having to deal directly with the lower-level storage management. Thanks! -- Gaby
Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area
"H.J. Lu" writes: > RTL-based forward propagation pass shouldn't propagate hard register. That's seems a bit draconian. Many fixed hard registers ought to be OK. E.g. there doesn't seem to be anything wrong with propagating uses of the stack or frame pointers, subject to the usual availability checks. To play devil's advocate, an alternative might be to (a) make local_ref_killed_between_p return true for non-fixed hard registers when a call or asm comes between the two instructions (b) make use_killed_between return true for non-fixed hard registers when the instructions are in different basic blocks Thoughts? Richard
Re: Ping: C-family stack check for threads
Richard Henderson wrote:
> On 07/03/2011 08:06 AM, Thomas Klein wrote:
>> +/*
>> + * Write prologue part of stack check into asm file.
>> + * For Thumb this may look like this:
>> + * push {rsym,ramn}
>> + * ldr rsym, .LSPCHK0
>> + * ldr rsym, [rsym]
>> + * ldr ramn, .LSPCHK0 + 4
>> + * add rsym, rsym, ramn
>> + * cmp sp, rsym
>> + * bhs .LSPCHK1
>> + * push {lr}
>> + * bl __thumb_stack_failure
>> + * .align 2
>> + * .LSPCHK0:
>> + * .word symbol_addr_of(stack_limit_rtx)
>> + * .word length_of(amount)
>> + * .LSPCHK1:
>> + * pop {rsym,ramn}
>> + */
>> +void
>> +stack_check_output_function (FILE *f, int reg0, int reg1, unsigned amount,
>> + unsigned numregs)
>> +{
>
> Is there an exceedingly good reason you're emitting this much code
> as text, rather than as rtl?

To me, the stack check is one coherent operation. It is placed after an initial push, which can't be eliminated, but before a major stack adjustment.

I have had some problems with rtl at the prologue stage. Is there a way to encapsulate an rtl sequence within the prologue? There is an emit_multi_reg_push, but is there something like emit_multi_reg_pop, too? Are the other operations (compare, branch, ...) still allowed?

> In particular, you adjust the stack but not the unwind info. So
> if one puts a breakpoint at your __thumb_stack_failure function,
> the unwind information will be incorrect.

Yes, if the failure function is taken, the info will be wrong. If this is a major problem, do I have to add this info after every push and pop operation? Will the rtl push/pop already do this for me?

Regards
Thomas Klein
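The comparison that the quoted assembly performs can be written down in C to make the intent explicit. This is a minimal sketch of the arithmetic only, with a made-up function name; it is not how the patch emits the check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the frame fits, i.e. sp >= limit + amount.
   The comparison is unsigned, like the `bhs` branch in the quoted
   sequence, and the addition wraps the same way the `add` would.  */
static bool stack_check_passes(uintptr_t sp, uintptr_t stack_limit,
                               uintptr_t amount)
{
    return sp >= stack_limit + amount;
}
```

When the function returns false, the assembly falls through to the call of __thumb_stack_failure; when it returns true, the branch to .LSPCHK1 is taken and the scratch registers are popped.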
Re: [PATCH] Fix ICE during combine (PR rtl-optimization/49619)
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > For 4.6, I think safer would be just the first one liner change to pass > VOIDmode to combine_simplify_rtx. Is that ok for 4.6? > > 2011-07-04 Jakub Jelinek > > PR rtl-optimization/49619 > * combine.c (combine_simplify_rtx): In PLUS -> IOR simplification > pass VOIDmode as op0_mode to recursive call, and return temp even > when different from tor, just if it is not IOR of the original > PLUS arguments. > > * gcc.dg/pr49619.c: New test. OK for mainline, and for 4.6/4.5 branch as far as the first part is concerned. -- Eric Botcazou
[pph] Tweak some tests (issue4668052)
This patch adds an assertion to x1ten-hellos to make sure that the loop counter is properly initialized and ends in 10. It also calls exit instead of return. In c1eabi1.h I forgot to surround the system function signatures in extern "C" {}. Tested on x86_64. Committed. Diego. * g++.dg/pph/c1eabi1.h: Surround system function prototypes with extern "C" {}. * g++.dg/pph/x1ten-hellos.cc (main): Tidy. Assert that i is 10 at the end of the loop. Call exit instead of 'return 0'. * g++.dg/pph/x1ten-hellos.h: Do not include stdio.h. diff --git a/gcc/testsuite/g++.dg/pph/c1eabi1.h b/gcc/testsuite/g++.dg/pph/c1eabi1.h index 77ebfa3..f43913f 100644 --- a/gcc/testsuite/g++.dg/pph/c1eabi1.h +++ b/gcc/testsuite/g++.dg/pph/c1eabi1.h @@ -33,11 +33,13 @@ /* Simplified version of c2eabi1.cc - Do not include other system headers here. Simply forward declare the library functions used by this header. */ -extern void abort(void); -extern int abs(int); -extern void exit(int); -extern double fabs(double); -extern int printf(const char *, ...); +extern "C" { + extern void abort(void); + extern int abs(int); + extern void exit(int); + extern double fabs(double); + extern int printf(const char *, ...); +} /* All these functions are defined to use the base ABI, so use the attribute to ensure the tests use the base ABI to call them even diff --git a/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc b/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc index 865b149..704b3fc 100644 --- a/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc +++ b/gcc/testsuite/g++.dg/pph/x1ten-hellos.cc @@ -1,10 +1,17 @@ // { dg-do run } + #include "x1ten-hellos.h" int main(void) { A a; - for (int i = 0; i < 10; i++) + int i; + + for (i = 0; i < 10; i++) a.hello(); - return 0; + + if (i != 10) +abort (); + + exit (0); } diff --git a/gcc/testsuite/g++.dg/pph/x1ten-hellos.h b/gcc/testsuite/g++.dg/pph/x1ten-hellos.h index 2a53b66..c165c01 100644 --- a/gcc/testsuite/g++.dg/pph/x1ten-hellos.h +++ b/gcc/testsuite/g++.dg/pph/x1ten-hellos.h @@ -1,6 
+1,10 @@ #ifndef A_H_ #define A_H_ -#include +extern "C" { + int printf(const char*, ...); + void abort(void); + void exit(int); +}; class A { -- This patch is available for review at http://codereview.appspot.com/4668052
Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area
On Mon, Jul 4, 2011 at 12:57 PM, Richard Sandiford wrote:
> "H.J. Lu" writes:
>> RTL-based forward propagation pass shouldn't propagate hard register.
>
> That's seems a bit draconian. Many fixed hard registers ought to be OK.
> E.g. there doesn't seem to be anything wrong with propagating uses of
> the stack or frame pointers, subject to the usual availability checks.
>
> To play devil's advocate, an alternative might be to
>
> (a) make local_ref_killed_between_p return true for non-fixed hard
> registers when a call or asm comes between the two instructions
>
> (b) make use_killed_between return true for non-fixed hard registers
> when the instructions are in different basic blocks
>
> Thoughts?
>

There are a few problems with this suggestion:

1. The comment says:

/* If USE is a subreg, see if it can be replaced by a pseudo. */

static bool
forward_propagate_subreg (df_ref use, rtx def_insn, rtx def_set)
{

It indicates that this function is intended to work on pseudo registers.

2. propagate_rtx avoids hard registers:

static rtx
propagate_rtx (rtx x, enum machine_mode mode, rtx old_rtx, rtx new_rtx,
bool speed)
{
rtx tem;
bool collapsed;
int flags;

if (REG_P (new_rtx) && REGNO (new_rtx) < FIRST_PSEUDO_REGISTER)
return NULL_RTX;

It seems that fwprop is intended to deal with pseudo registers. If we want to extend it to hard registers, that should be a separate project.

Thanks.

--
H.J.
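The guard quoted from propagate_rtx relies on register numbers below FIRST_PSEUDO_REGISTER being hard registers. A toy model of that test is sketched below; the struct, the constant's value and the function names are all hypothetical, since the real limit is defined per target and the real code works on rtx.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_FIRST_PSEUDO_REGISTER 53u  /* hypothetical; target-defined in GCC */

struct toy_reg { unsigned regno; };    /* hypothetical stand-in for a REG rtx */

/* Register numbers below the limit denote hard registers.  */
static bool toy_is_hard_register(const struct toy_reg *r)
{
    return r->regno < TOY_FIRST_PSEUDO_REGISTER;
}

/* Refuse to propagate a hard register, in the spirit of the early
   return quoted above; otherwise hand back the replacement.  */
static const struct toy_reg *toy_propagate(const struct toy_reg *new_reg)
{
    if (toy_is_hard_register(new_reg))
        return NULL;
    return new_reg;
}
```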
[pph] Split c1meteor-contest.cc (issue4654087)
The test c1meteor-contest.cc had similar issues as c1eabi1.cc. The inclusion of system headers that have been PPH'd confuse the compiler. I split the test so we have one version without additional includes and another with the standard includes. The version without additional includes works fine. Tested on x86_64. Committed. Diego. * g++.dg/pph/c1meteor-contest.cc: Make executable. Move function main() from c1meteor-contest.h. * g++.dg/pph/c1meteor-contest.h: Do not include stdlib.h nor stdio.h. Add prototype for system functions qsort, printf and atoi. * g++.dg/pph/c2meteor-contest.cc: New. * g++.dg/pph/c2meteor-contest.h: New. * g++.dg/pph/pph.map: Add c2meteor-contest.h diff --git a/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc b/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc index e745afe..ff1765c 100644 --- a/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc +++ b/gcc/testsuite/g++.dg/pph/c1meteor-contest.cc @@ -1,4 +1,17 @@ -/* { dg-timeout 2 { target *-*-* } } */ -// { dg-xfail-if "INFINITE" { "*-*-*" } { "-fpph-map=pph.map" } } /* { dg-options "-w" } */ +/* { dg-do run } */ + #include "c1meteor-contest.h" + +int main(int argc, char **argv) { + if(argc > 1) + max_solutions = atoi(argv[1]); + calc_pieces(); + calc_rows(); + solve(0, 0); + printf("%d solutions found\n\n", solution_count); + qsort(solutions, solution_count, 50 * sizeof(signed char), solution_sort); + pretty(solutions[0]); + pretty(solutions[solution_count-1]); + return 0; +} diff --git a/gcc/testsuite/g++.dg/pph/c1meteor-contest.h b/gcc/testsuite/g++.dg/pph/c1meteor-contest.h index 3c465ab..698ccf5 100644 --- a/gcc/testsuite/g++.dg/pph/c1meteor-contest.h +++ b/gcc/testsuite/g++.dg/pph/c1meteor-contest.h @@ -36,8 +36,16 @@ POSSIBILITY OF SUCH DAMAGE. * contributed by Christian Vosteen */ -#include -#include +/* Simplified version of c2meteor-contest.h - Do not include other system + headers here. Simply forward declare the library functions used + by this header. 
*/ +extern "C" { + typedef __SIZE_TYPE__ size_t; + void qsort(void *, size_t, size_t, int (*)(const void *, const void *)); + int printf(const char *, ...); + int atoi(const char *); +} + #define TRUE 1 #define FALSE 0 @@ -614,17 +622,4 @@ void pretty(signed char *b) { } printf("\n"); } - -int main(int argc, char **argv) { - if(argc > 1) - max_solutions = atoi(argv[1]); - calc_pieces(); - calc_rows(); - solve(0, 0); - printf("%d solutions found\n\n", solution_count); - qsort(solutions, solution_count, 50 * sizeof(signed char), solution_sort); - pretty(solutions[0]); - pretty(solutions[solution_count-1]); - return 0; -} #endif diff --git a/gcc/testsuite/g++.dg/pph/c2meteor-contest.cc b/gcc/testsuite/g++.dg/pph/c2meteor-contest.cc new file mode 100644 index 000..e35cca4 --- /dev/null +++ b/gcc/testsuite/g++.dg/pph/c2meteor-contest.cc @@ -0,0 +1,17 @@ +/* { dg-timeout 2 { target *-*-* } } */ +// { dg-xfail-if "INFINITE" { "*-*-*" } { "-fpph-map=pph.map" } } +/* { dg-options "-w" } */ +#include "c2meteor-contest.h" + +int main(int argc, char **argv) { + if(argc > 1) + max_solutions = atoi(argv[1]); + calc_pieces(); + calc_rows(); + solve(0, 0); + printf("%d solutions found\n\n", solution_count); + qsort(solutions, solution_count, 50 * sizeof(signed char), solution_sort); + pretty(solutions[0]); + pretty(solutions[solution_count-1]); + return 0; +} diff --git a/gcc/testsuite/g++.dg/pph/c2meteor-contest.h b/gcc/testsuite/g++.dg/pph/c2meteor-contest.h new file mode 100644 index 000..33a9907 --- /dev/null +++ b/gcc/testsuite/g++.dg/pph/c2meteor-contest.h @@ -0,0 +1,617 @@ +/* { dg-options "-w" } */ +#ifndef __PPH_GUARD_H +#define __PPH_GUARD_H +/* +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +* Neither the name of "The Computer Language Benchmarks Game" nor the +name of "The Computer Language Shootout Benchmarks" nor the names of +its contributors may be used to endorse or promote products derived +from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQU
Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area
On Mon, Jul 4, 2011 at 1:52 PM, H.J. Lu wrote:
> On Mon, Jul 4, 2011 at 12:57 PM, Richard Sandiford
> wrote:
>> "H.J. Lu" writes:
>>> RTL-based forward propagation pass shouldn't propagate hard register.
>>
>> That's seems a bit draconian. Many fixed hard registers ought to be OK.
>> E.g. there doesn't seem to be anything wrong with propagating uses of
>> the stack or frame pointers, subject to the usual availability checks.
>>
>> To play devil's advocate, an alternative might be to
>>
>> (a) make local_ref_killed_between_p return true for non-fixed hard
>> registers when a call or asm comes between the two instructions
>>
>> (b) make use_killed_between return true for non-fixed hard registers
>> when the instructions are in different basic blocks
>>
>> Thoughts?
>>
>
> There are a few problems with this suggestions:
>
> 1. The comments says:
>
> /* If USE is a subreg, see if it can be replaced by a pseudo. */
>
> static bool
> forward_propagate_subreg (df_ref use, rtx def_insn, rtx def_set)
> {
>
> It indicates this function is intended to work on pseudo registers.
>
> 2. propagate_rtx avoids hard registers:
>
> static rtx
> propagate_rtx (rtx x, enum machine_mode mode, rtx old_rtx, rtx new_rtx,
> bool speed)
> {
> rtx tem;
> bool collapsed;
> int flags;
>
> if (REG_P (new_rtx) && REGNO (new_rtx) < FIRST_PSEUDO_REGISTER)
> return NULL_RTX;
>
> It seems that fwprop is intended to deal with pseudo registers. If we
> want to extend it to hard registers, that should be a separate project.
>
> Thanks.

The forward_propagate_subreg issue was introduced by

http://gcc.gnu.org/ml/gcc-patches/2009-08/msg01203.html

Before that, fwprop never tried to work on hard registers. Alan, is your change to process hard registers intentional?

Thanks.

--
H.J.
Re: C++ PATCH to improve 'aka's on type printing in diagnostics
I thought of a different way to do it that would stay encapsulated in type_as_string, so this is the version I'm going to check in. Tested x86_64-pc-linux-gnu, applying to trunk. commit 689a3e58f4eebbcdafec81f06e8af699045fff3a Author: Jason Merrill Date: Fri Jul 1 00:16:46 2011 -0400 * error.c (type_to_string): Avoid redundant akas. diff --git a/gcc/cp/error.c b/gcc/cp/error.c index 7c90ec4..664b918 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -2634,14 +2634,28 @@ type_to_string (tree typ, int verbose) reinit_cxx_pp (); dump_type (typ, flags); + /* If we're printing a type that involves typedefs, also print the + stripped version. But sometimes the stripped version looks + exactly the same, so we don't want it after all. To avoid printing + it in that case, we play ugly obstack games. */ if (typ && TYPE_P (typ) && typ != TYPE_CANONICAL (typ) && !uses_template_parms (typ)) { + int aka_start; char *p; + struct obstack *ob = pp_base (cxx_pp)->buffer->obstack; + /* Remember the end of the initial dump. */ + int len = obstack_object_size (ob); tree aka = strip_typedefs (typ); pp_string (cxx_pp, " {aka"); pp_cxx_whitespace (cxx_pp); + /* And remember the start of the aka dump. */ + aka_start = obstack_object_size (ob); dump_type (aka, flags); pp_character (cxx_pp, '}'); + p = (char*)obstack_base (ob); + /* If they are identical, cut off the aka with a NUL. */ + if (memcmp (p, p+aka_start, len) == 0) + p[len] = '\0'; } return pp_formatted_text (cxx_pp); } diff --git a/gcc/testsuite/g++.dg/diagnostic/aka1.C b/gcc/testsuite/g++.dg/diagnostic/aka1.C new file mode 100644 index 000..37f8df9 --- /dev/null +++ b/gcc/testsuite/g++.dg/diagnostic/aka1.C @@ -0,0 +1,15 @@ +// Basic test for typedef stripping in diagnostics. + +struct A { + void f(); +}; + +void A::f() { + // We don't want an aka for the injected-class-name. + A a = 0; // { dg-error "type .A. requested" } +} + +typedef A B; + +// We do want an aka for a real typedef. +B b = 0; // { dg-error "B .aka A." }
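The technique in the type_to_string hunk (print the name, remember where it ends, print the {aka ...} part, then cut it off again if it spells the same thing) can be illustrated with a plain character buffer instead of the pretty-printer's obstack. This is a sketch under stated assumptions: format_with_aka is a made-up function, and the explicit length comparison is an addition of this sketch rather than something taken from the quoted patch.

```c
#include <stdio.h>
#include <string.h>

/* Writes "NAME {aka STRIPPED}" into buf, or just "NAME" when both
   spellings are identical.  On overflow the result is the empty
   string.  */
static const char *format_with_aka(char *buf, size_t size,
                                   const char *name, const char *stripped)
{
    size_t len = strlen(name);                 /* end of the initial dump */
    size_t aka_start = len + strlen(" {aka "); /* start of the aka dump */
    int written = snprintf(buf, size, "%s {aka %s}", name, stripped);

    if (written < 0 || (size_t)written >= size) {
        if (size > 0)
            buf[0] = '\0';
        return buf;
    }

    /* If the two spellings are identical, cut off the aka with a NUL.
       Comparing lengths first keeps "A" from matching a prefix of "AB".  */
    if (strlen(stripped) == len && memcmp(buf, buf + aka_start, len) == 0)
        buf[len] = '\0';
    return buf;
}
```

With a 64-byte buffer, the pair ("B", "A") is rendered as "B {aka A}", while ("A", "A") collapses to "A".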
Re: C++ PATCH to improve 'aka's on type printing in diagnostics
Jason Merrill writes: | I thought of a different way to do it that would stay encapsulated in | type_as_string, so this is the version I'm going to check in. OK, thanks. -- Gaby
C++ PATCH to improve pretty-printing of function calls
Before this patch, GCC described the candidate as

  template decltype (((TypeC*)this)->TypeC::b.template
  typename TypeA::type TypeB::fn [with int U = U, int N = 10,
  typename TypeA::type = TypeA::type]()) TypeC::fn()

after the patch, it's

  template decltype (((TypeC*)this)->TypeC::b.fn()) TypeC::fn()

it doesn't make any sense to have the template header or return type
in the middle of an expression, nor to have the [with ...] template
bindings.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 70816c82793a089f530a0df105c129aa9f6dfa65
Author: Jason Merrill
Date:   Sun Jul 3 17:25:40 2011 -0400

    * error.c (dump_template_bindings): Don't print typenames
    for a partial instantiation.
    (dump_function_decl): If we aren't printing function arguments,
    print template arguments as rather than [with ...].
    (dump_expr): Don't print return type or template header.
    [BASELINK]: Use BASELINK_FUNCTIONS rather than get_first_fn.
    * pt.c (dependent_template_arg_p): Handle null arg.

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 664b918..b16fce6 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -307,9 +307,12 @@ dump_template_bindings (tree parms, tree args, VEC(tree,gc)* typenames)
       parms = TREE_CHAIN (parms);
     }
 
+  /* Don't bother with typenames for a partial instantiation.  */
+  if (VEC_empty (tree, typenames) || uses_template_parms (args))
+    return;
+
   FOR_EACH_VEC_ELT (tree, typenames, i, t)
     {
-      bool dependent = uses_template_parms (args);
       if (need_comma)
         pp_separate_with_comma (cxx_pp);
       dump_type (t, TFF_PLAIN_IDENTIFIER);
@@ -317,11 +320,7 @@ dump_template_bindings (tree parms, tree args, VEC(tree,gc)* typenames)
       pp_equal (cxx_pp);
       pp_cxx_whitespace (cxx_pp);
       push_deferring_access_checks (dk_no_check);
-      if (dependent)
-        ++processing_template_decl;
       t = tsubst (t, args, tf_none, NULL_TREE);
-      if (dependent)
-        --processing_template_decl;
       pop_deferring_access_checks ();
       /* Strip typedefs.  We can't just use TFF_CHASE_TYPEDEF because
          pp_simple_type_specifier doesn't know about it.  */
@@ -1379,17 +1378,37 @@ dump_function_decl (tree t, int flags)
 
       if (show_return)
         dump_type_suffix (TREE_TYPE (fntype), flags);
-    }
 
-  /* If T is a template instantiation, dump the parameter binding.  */
-  if (template_parms != NULL_TREE && template_args != NULL_TREE)
+      /* If T is a template instantiation, dump the parameter binding.  */
+      if (template_parms != NULL_TREE && template_args != NULL_TREE)
+        {
+          pp_cxx_whitespace (cxx_pp);
+          pp_cxx_left_bracket (cxx_pp);
+          pp_cxx_ws_string (cxx_pp, M_("with"));
+          pp_cxx_whitespace (cxx_pp);
+          dump_template_bindings (template_parms, template_args, typenames);
+          pp_cxx_right_bracket (cxx_pp);
+        }
+    }
+  else if (template_args)
     {
-      pp_cxx_whitespace (cxx_pp);
-      pp_cxx_left_bracket (cxx_pp);
-      pp_cxx_ws_string (cxx_pp, M_("with"));
-      pp_cxx_whitespace (cxx_pp);
-      dump_template_bindings (template_parms, template_args, typenames);
-      pp_cxx_right_bracket (cxx_pp);
+      bool need_comma = false;
+      int i;
+      pp_cxx_begin_template_argument_list (cxx_pp);
+      template_args = INNERMOST_TEMPLATE_ARGS (template_args);
+      for (i = 0; i < TREE_VEC_LENGTH (template_args); ++i)
+        {
+          tree arg = TREE_VEC_ELT (template_args, i);
+          if (need_comma)
+            pp_separate_with_comma (cxx_pp);
+          if (ARGUMENT_PACK_P (arg))
+            pp_cxx_left_brace (cxx_pp);
+          dump_template_argument (arg, TFF_PLAIN_IDENTIFIER);
+          if (ARGUMENT_PACK_P (arg))
+            pp_cxx_right_brace (cxx_pp);
+          need_comma = true;
+        }
+      pp_cxx_end_template_argument_list (cxx_pp);
     }
 }
 
@@ -1724,7 +1743,9 @@ dump_expr (tree t, int flags)
     case OVERLOAD:
     case TYPE_DECL:
     case IDENTIFIER_NODE:
-      dump_decl (t, (flags & ~TFF_DECL_SPECIFIERS) | TFF_NO_FUNCTION_ARGUMENTS);
+      dump_decl (t, ((flags & ~(TFF_DECL_SPECIFIERS|TFF_RETURN_TYPE
+                                |TFF_TEMPLATE_HEADER))
+                     | TFF_NO_FUNCTION_ARGUMENTS));
       break;
 
     case INTEGER_CST:
@@ -2289,7 +2310,7 @@ dump_expr (tree t, int flags)
       break;
 
     case BASELINK:
-      dump_expr (get_first_fn (t), flags & ~TFF_EXPR_IN_PARENS);
+      dump_expr (BASELINK_FUNCTIONS (t), flags & ~TFF_EXPR_IN_PARENS);
       break;
 
     case EMPTY_CLASS_EXPR:
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7236e7e..e7be08b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18848,7 +18848,7 @@ dependent_template_arg_p (tree arg)
      is dependent. This is consistent with what
      any_dependent_template_arguments_p [that calls this function]
      does.  */
-  if (arg == error_mark_node)
+  if (!arg || arg == error_mark_node)
     return true;
 
   if (TREE_CODE (arg) == ARGUMENT_PACK_SELECT)
diff --git a/gcc/testsuite/g++.dg/cpp0x/diag1.C b/gcc/testsuite/g++.dg/cpp0x/diag1.C
new file mode 100644
index 0000000..b3f30bc
--- /dev/null
+++ b/gcc/testsui
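
The new branch in dump_function_decl prints the innermost template
arguments as a bracketed, comma-separated list, wrapping each argument
pack in braces. A minimal standalone sketch of that formatting loop
follows. It is not GCC code: struct template_arg and
format_template_args are hypothetical names, and each argument is
assumed to arrive already rendered as text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a template argument: its already-printed
   text, plus a flag saying whether it is an argument pack.  */
struct template_arg
{
  const char *text;
  int is_pack;
};

/* Write "<a, b, {pack}>" into OUT.  Returns 0 on success and -1 if OUT
   is too small.  Mirrors the shape of the loop in the patch: a comma
   before every argument but the first, braces around packs.  */
int
format_template_args (const struct template_arg *args, size_t n,
                      char *out, size_t outlen)
{
  size_t used;
  size_t i;
  int w;

  assert (out != NULL && outlen > 0);
  assert (n == 0 || args != NULL);

  w = snprintf (out, outlen, "<");
  if (w < 0 || (size_t) w >= outlen)
    return -1;
  used = (size_t) w;

  for (i = 0; i < n; ++i)
    {
      w = snprintf (out + used, outlen - used, "%s%s%s%s",
                    i > 0 ? ", " : "",
                    args[i].is_pack ? "{" : "",
                    args[i].text,
                    args[i].is_pack ? "}" : "");
      if (w < 0 || (size_t) w >= outlen - used)
        return -1;
      used += (size_t) w;
    }

  w = snprintf (out + used, outlen - used, ">");
  if (w < 0 || (size_t) w >= outlen - used)
    return -1;
  return 0;
}
```

With the two arguments "10" (not a pack) and "int, char" (a pack), the
function produces <10, {int, char}>.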
Re: C++ PATCH for c++/49003 (DR 1207, use of 'this' in trailing return type)
On 06/29/2011 05:15 PM, Jason Merrill wrote:
> This patch adds support for use of 'this' (implicitly or explicitly)
> in the trailing-return-type of a member function.

The above patch wasn't enough, though. The following patch fixes some
issues that arose with real uses, including mangling.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit ef43a979a3f46150f383b9deab70dd412d66f96b
Author: Jason Merrill
Date:   Sun Jul 3 17:14:56 2011 -0400

    DR 1207
    PR c++/49589
    * mangle.c (write_expression): Handle 'this'.
    * parser.c (cp_parser_postfix_dot_deref_expression): Allow
    incomplete *this.
    * semantics.c (potential_constant_expression_1): Check that
    DECL_CONTEXT is set on 'this'.

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 134c9ea..81b772f 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -2495,6 +2495,11 @@ write_expression (tree expr)
   else if (TREE_CODE_CLASS (code) == tcc_constant
            || (abi_version_at_least (2) && code == CONST_DECL))
     write_template_arg_literal (expr);
+  else if (code == PARM_DECL && DECL_ARTIFICIAL (expr))
+    {
+      gcc_assert (!strcmp ("this", IDENTIFIER_POINTER (DECL_NAME (expr))));
+      write_string ("fpT");
+    }
   else if (code == PARM_DECL)
     {
       /* A function parameter used in a late-specified return type.  */
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index d79326d..6bb15ed 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -5281,7 +5281,11 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
                                                        postfix_expression);
           scope = NULL_TREE;
         }
-      else
+      /* Unlike the object expression in other contexts, *this is not
+         required to be of complete type for purposes of class member
+         access (5.2.5) outside the member function body.  */
+      else if (scope != current_class_ref
+               && !(processing_template_decl && scope == current_class_type))
         scope = complete_type_or_else (scope, NULL_TREE);
       /* Let the name lookup machinery know that we are processing a
          class member access expression.  */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index e29705c..619c058 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7791,7 +7791,8 @@ potential_constant_expression_1 (tree t, bool want_rval, tsubst_flags_t flags)
         STRIP_NOPS (x);
         if (is_this_parameter (x))
           {
-            if (DECL_CONSTRUCTOR_P (DECL_CONTEXT (x)) && want_rval)
+            if (want_rval && DECL_CONTEXT (x)
+                && DECL_CONSTRUCTOR_P (DECL_CONTEXT (x)))
               {
                 if (flags & tf_error)
                   sorry ("use of the value of the object being constructed "
diff --git a/gcc/testsuite/g++.dg/abi/mangle48.C b/gcc/testsuite/g++.dg/abi/mangle48.C
new file mode 100644
index 0000000..dc9c492
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/mangle48.C
@@ -0,0 +1,23 @@
+// Testcase for 'this' mangling
+// { dg-options -std=c++0x }
+
+struct B
+{
+  template U f();
+};
+
+struct A
+{
+  B b;
+  // { dg-final { scan-assembler "_ZN1A1fIiEEDTcldtdtdefpT1b1fIT_EEEv" } }
+  template auto f() -> decltype (b.f());
+  // { dg-final { scan-assembler "_ZN1A1gIiEEDTcldtptfpT1b1fIT_EEEv" } }
+  template auto g() -> decltype (this->b.f());
+};
+
+int main()
+{
+  A a;
+  a.f();
+  a.g();
+}

commit acbc60694bf95f13f9088ed4d5b3d18780aaf754
Author: Jason Merrill
Date:   Mon Jul 4 10:44:29 2011 -0400

    * cp-demangle.c (d_expression): Handle 'this'.
    (d_print_comp) [DEMANGLE_COMPONENT_FUNCTION_PARAM]: Likewise.

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index f136322..29badbb 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -2738,10 +2738,18 @@ d_expression (struct d_info *di)
       /* Function parameter used in a late-specified return type.  */
       int index;
       d_advance (di, 2);
-      index = d_compact_number (di);
-      if (index < 0)
-        return NULL;
-
+      if (d_peek_char (di) == 'T')
+        {
+          /* 'this' parameter.  */
+          d_advance (di, 1);
+          index = 0;
+        }
+      else
+        {
+          index = d_compact_number (di) + 1;
+          if (index == 0)
+            return NULL;
+        }
       return d_make_function_param (di, index);
     }
   else if (IS_DIGIT (peek)
@@ -4400,9 +4408,17 @@ d_print_comp (struct d_print_info *dpi, int options,
       return;
 
     case DEMANGLE_COMPONENT_FUNCTION_PARAM:
-      d_append_string (dpi, "{parm#");
-      d_append_num (dpi, dc->u.s_number.number + 1);
-      d_append_char (dpi, '}');
+      {
+        long num = dc->u.s_number.number;
+        if (num == 0)
+          d_append_string (dpi, "this");
+        else
+          {
+            d_append_string (dpi, "{parm#");
+            d_append_num (dpi, num);
+            d_append_char (dpi, '}');
+          }
+      }
       return;
 
     case DEMANGLE_COMPONENT_GLOBAL_CONSTRUCTORS:
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index 4980cf1..2dc74be 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -3905,6 +3905,10 @@ decltype ({parm#1}+{parm#2}) add(int, doubl
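
The demangler change reserves index 0 for 'this' and shifts ordinary
parameters up by one, so "fpT" prints as this while "fp_" and "fp0_"
still print as {parm#1} and {parm#2}. A minimal standalone sketch of
that index convention follows. It is not the libiberty code:
parse_compact_number and describe_function_param are hypothetical
helpers, and the sketch ignores the qualifiers and other forms that a
real mangled parameter reference may carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified compact-number parser: "_" encodes 0 and "<n>_" encodes
   n + 1.  Advances *P past the number; returns -1 on malformed input.  */
static long
parse_compact_number (const char **p)
{
  long num;

  if (**p == '_')
    num = 0;
  else if (**p >= '0' && **p <= '9')
    {
      num = 0;
      while (**p >= '0' && **p <= '9')
        {
          num = num * 10 + (**p - '0');
          ++*p;
        }
      num += 1;
    }
  else
    return -1;

  if (**p != '_')
    return -1;
  ++*p;
  return num;
}

/* Render the parameter reference at the start of MANGLED into OUT.
   Index 0 means 'this'; index N > 0 means the Nth ordinary parameter.
   Returns 0 on success, -1 on malformed input.  */
int
describe_function_param (const char *mangled, char *out, size_t outlen)
{
  const char *p = mangled;
  long index;

  assert (mangled != NULL && out != NULL && outlen > 0);

  if (strncmp (p, "fp", 2) != 0)
    return -1;
  p += 2;

  if (*p == 'T')
    {
      /* 'this' parameter.  */
      ++p;
      index = 0;
    }
  else
    {
      /* A parse failure (-1) becomes 0 here, as in the patch.  */
      index = parse_compact_number (&p) + 1;
      if (index == 0)
        return -1;
    }

  if (index == 0)
    snprintf (out, outlen, "this");
  else
    snprintf (out, outlen, "{parm#%ld}", index);
  return 0;
}
```

Storing the shifted index means the printing side no longer adds 1,
which is why the patch changes d_append_num to print num directly.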
Re: C++ PATCH to improve pretty-printing of function calls
Jason Merrill writes:

| Before this patch, GCC described the candidate as
|
| template decltype (((TypeC*)this)->TypeC::b.template
| typename TypeA::type TypeB::fn [with int U = U, int N = 10,
| typename TypeA::type = TypeA::type]()) TypeC::fn()

ouch!

| after the patch, it's
|
| template decltype (((TypeC*)this)->TypeC::b.fn()) TypeC::fn()
|
| it doesn't make any sense to have the template header or return type
| in the middle of an expression, nor to have the [with ...] template
| bindings.

agreed. Thanks!

-- Gaby
Re: [Patch 2/3] ARM 64 bit atomic operations
On 1 July 2011 20:38, Joseph S. Myers wrote:

Hi Joseph,
  Thanks for your comments.

> On Fri, 1 Jul 2011, Dr. David Alan Gilbert wrote:
>
>> +/* For write */
>> +#include
>> +/* For abort */
>> +#include
>
> Please don't include system headers in libgcc without appropriate
> inhibit_libc checks for bootstrap purposes.  In this case, it would seem
> better just to declare the functions you need.

OK.

>> +/* Check that the kernel has a new enough version at load */
>> +void __check_for_sync8_kernelhelper (void)
>
> Shouldn't this function be static?

Yep.

>> +{
>> +  if (__kernel_helper_version < 5)
>> +    {
>> +      const char err[] = "A newer kernel is required to run this binary. (__kernel_cmpxchg64 helper)\n";
>> +      /* At this point we need a way to crash with some information
>> +         for the user - I'm not sure I can rely on much else being
>> +         available at this point, so do the same as generic-morestack.c
>> +         write() and abort().  */
>> +      write (2 /* stderr */, err, sizeof(err));
>
> "write" is in the user's namespace in ISO C so it's not ideal to have a
> call to it.  If there isn't a reserved-namespace version, using the
> syscall directly (hardcoding the syscall number) might be better.

OK, fair enough.

>> +void (*__sync8_kernelhelper_inithook[]) (void) __attribute__ ((section (".init_array"))) = {
>> +  &__check_for_sync8_kernelhelper
>> +};
>
> Shouldn't this also be static (marked "used" if needed)?  Though I'd have
> thought simply marking the function as a constructor would be simpler and
> better

OK, can do - I wasn't too sure if constructor would end up later in the
initialisation - I was worrying whether that might end up after a C++
constructor that might actually use; (although I'm not actually sure if
that's more or less likely to happen with constructor v init_array).

Dave
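
The review suggestions above (declare the needed libc functions by hand,
make the check static, and mark it as a constructor instead of placing a
pointer in .init_array by hand) could be combined roughly as follows.
This is only a sketch under stated assumptions, not the revised patch:
sketch_kernel_helper_version is a hypothetical stand-in for the real
__kernel_helper_version, sketch_check_ran exists only to make the effect
observable, and the diagnostic output is left out because the thread has
not settled on how to emit it.

```c
/* Declared by hand rather than via <stdlib.h>, as suggested for libgcc
   code that must build without libc headers.  */
extern void abort (void);

/* Hypothetical stand-in for the version word exported by the kernel
   helper page.  The real value is read from a fixed address that is
   not reproduced in this sketch.  */
static unsigned int sketch_kernel_helper_version = 5;

/* Set once the startup check has run (for illustration only).  */
static int sketch_check_ran;

/* Return nonzero if VERSION predates the 64-bit cmpxchg helper.  The
   threshold of 5 is taken from the quoted patch.  */
static int
helper_too_old (unsigned int version)
{
  return version < 5;
}

/* Runs before main.  The function can be static: the constructor
   attribute makes the compiler register it for startup, so it is not
   discarded as unused.  */
static void __attribute__ ((constructor))
check_for_sync8_kernelhelper (void)
{
  sketch_check_ran = 1;
  if (helper_too_old (sketch_kernel_helper_version))
    abort ();
}
```

How such a constructor is ordered relative to C++ static constructors in
other objects is the open question Dave raises, and the sketch does not
settle it.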
Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area
On Mon, Jul 04, 2011 at 01:57:34PM -0700, H.J. Lu wrote:
> forward_propagate_subreg issue was introduced by
>
> http://gcc.gnu.org/ml/gcc-patches/2009-08/msg01203.html
>
> Before that, fwprop never tries to work on hard registers.

I question this claim.  It seems to me that fwprop did look at
paradoxical subregs of hard regs before my change.

> Alan,
> is your change to process hard registers intentional?

I didn't set out to do anything special with hard regs one way or the
other, just extended what was already done for paradoxical subregs to
sign and zero extended subregs.

--
Alan Modra
Australia Development Lab, IBM
Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area
On Mon, Jul 4, 2011 at 4:54 PM, Alan Modra wrote:
> On Mon, Jul 04, 2011 at 01:57:34PM -0700, H.J. Lu wrote:
>> forward_propagate_subreg issue was introduced by
>>
>> http://gcc.gnu.org/ml/gcc-patches/2009-08/msg01203.html
>>
>> Before that, fwprop never tries to work on hard registers.
>
> I question this claim.  It seems to me that fwprop did look at
> paradoxical subregs of hard regs before my change.

I should have said "fwprop never tries to work on zero/sign-extended
hard registers."

>> Alan,
>> is your change to process hard registers intentional?
>
> I didn't set out to do anything special with hard regs one way or the
> other, just extended what was already done for paradoxical subregs to
> sign and zero extended subregs.
>

Does your change depend on processing zero/sign-extended hard registers?

--
H.J.
libjava patches for RTEMS
Hi,

GCJ is available on RTEMS/pc386. Here is the libjava testsuite result
on RTEMS/pc386:

                === libjava Summary ===

# of expected passes            2249
# of unexpected failures        94
# of untested testcases         66

As the testsuite result is good enough, I think it's time to get the
patch reviewed and merged into gcc. The patch is attached. :)

Best Regards,
Jie

libjava.patch
Description: Binary data
RE: [RFC] Add middle end hook for stack red zone size
PING...

I just merged with the latest code base and generated new patch as
attached.

Thanks,
-Jiangning

> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jiangning Liu
> Sent: June 28, 2011 4:38 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [RFC] Add middle end hook for stack red zone size
>
> This patch is to fix PR38644, which is a bug with long history about
> stack red zone access, and PR30282 is correlated.
>
> Originally red zone concept is not exposed to middle-end, and back-end
> uses special logic to add extra memory barrier RTL and help the correct
> dependence in middle-end. This way different back-ends must handle red
> zone problem by themselves. For example, X86 target introduced function
> ix86_using_red_zone() to judge red zone access, while POWER introduced
> offset_below_red_zone_p() to judge it. Note that they have different
> semantics, but the logic in caller sites of back-end uses them to
> decide whether adding memory barrier RTL or not. If back-end
> incorrectly handles this, bug would be introduced.
>
> Therefore, the correct method should be middle-end handles red zone
> related things to avoid the burden in different back-ends. To be
> specific for PR38644, this middle-end problem causes incorrect behavior
> for ARM target.
> This patch exposes red zone concept to middle-end by introducing a
> middle-end/back-end hook TARGET_STACK_RED_ZONE_SIZE defined in
> target.def, and by default its value is 0. Back-end may redefine this
> function to provide concrete red zone size according to specific ABI
> requirements.
>
> In middle end, scheduling dependence is modified by using this hook
> plus checking stack frame pointer adjustment instruction to decide
> whether memory references need to be all flushed out or not. In theory,
> if TARGET_STACK_RED_ZONE_SIZE is defined correctly, back-end would not
> be required to specially handle this scheduling dependence issue by
> introducing extra memory barrier RTL.
>
> In back-end, the following changes are made to define the hook,
> 1) For X86, TARGET_STACK_RED_ZONE_SIZE is redefined to be
> ix86_stack_red_zone_size() in i386.c, which is a newly introduced
> function.
> 2) For POWER, TARGET_STACK_RED_ZONE_SIZE is redefined to be
> rs6000_stack_red_zone_size() in rs6000.c, which is also a newly defined
> function.
> 3) For ARM and others, TARGET_STACK_RED_ZONE_SIZE is defined to be
> default_stack_red_zone_size in targhooks.c, and this function returns 0,
> which means ARM eabi and others don't support red zone access at all.
>
> In summary, the relationship between ABI and red zone access is like
> below,
>
> -------------------------------------------------------------------
> | ARCH          | ARM  |      X86      |      POWER      | others |
> |---------------|------|---------------|-----------------|--------|
> | ABI           | EABI | MS_64 | other |   AIX   |  V4   |        |
> |---------------|------|-------|-------|---------|-------|--------|
> | RED ZONE      |  No  |  YES  |  No   |   YES   |  No   |   No   |
> |---------------|------|-------|-------|---------|-------|--------|
> | RED ZONE SIZE |  0   |  128  |   0   | 220/288 |   0   |   0    |
> -------------------------------------------------------------------
>
> Thanks,
> -Jiangning

stack-red-zone-patch-38644-4.patch
Description: Binary data
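
The proposal above amounts to a per-target hook that reports the red
zone size in bytes, with a default of 0, which the middle end can query
instead of relying on back-end barriers. A minimal standalone sketch of
that shape follows. The attached patch is not shown in the message, so
everything here is an assumption made for illustration: struct
target_hooks, example_stack_red_zone_size, the 128-byte value and
offset_in_red_zone are hypothetical, and the check is much simpler than
real scheduler dependence analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a target hook table.  */
struct target_hooks
{
  /* Size in bytes of the red zone below the stack pointer.  */
  int (*stack_red_zone_size) (void);
};

/* Default hook: the target has no red zone.  */
static int
default_stack_red_zone_size (void)
{
  return 0;
}

/* Example override for an imaginary target with a 128-byte red zone.  */
static int
example_stack_red_zone_size (void)
{
  return 128;
}

static const struct target_hooks default_target
  = { default_stack_red_zone_size };
static const struct target_hooks example_target
  = { example_stack_red_zone_size };

/* Return nonzero if an access BYTES_BELOW_SP bytes below the stack
   pointer lies inside the red zone reported by T.  A caller could use
   the negative answer to decide that the access is unsafe to move
   across a stack pointer adjustment.  */
int
offset_in_red_zone (const struct target_hooks *t, long bytes_below_sp)
{
  assert (t != NULL && t->stack_red_zone_size != NULL);
  return bytes_below_sp > 0
         && bytes_below_sp <= t->stack_red_zone_size ();
}
```

With the default hook no offset is ever inside a red zone, which matches
the description of targets that keep default_stack_red_zone_size.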
Re: PATCH [6/n]: Prepare x32: PR rtl-optimization/47449: Don't propagate hard register non-local goto save area
On Mon, Jul 04, 2011 at 05:09:28PM -0700, H.J. Lu wrote:
> On Mon, Jul 4, 2011 at 4:54 PM, Alan Modra wrote:
> > I didn't set out to do anything special with hard regs one way or the
> > other, just extended what was already done for paradoxical subregs to
> > sign and zero extended subregs.
>
> Does your change depend on processing zero/sign-extended
> hard registers?

At the time I wrote the patch I was more interested in pseudos.  I
expect that powerpc64 won't be greatly affected if hard regs were
excluded from this fwprop optimization, but you need to discuss your
patch with maintainers of this code.  My opinion as a one-time
contributor to fwprop doesn't count for much.

--
Alan Modra
Australia Development Lab, IBM
Re: [PATCH] Fix PR49518
Richard Guenther wrote on 04/07/2011 03:30:59 PM:

> >
> > Richard Guenther wrote on 04/07/2011 02:38:50 PM:
> >
> > > Handling of negative steps broke one of the many asserts in
> > > the vectorizer.  The following patch drops one that I can't
> > > make sense of.  I think all asserts need comments - especially
> > > this one would, as I can't see why using vf is correct to
> > > test against and not nelements (and why <= vf and not < vf).
> >
> > There is an explanation 10 rows above the assert. It doesn't make sense to
> > peel more than vf iterations (and not nelements, since for the case of
> > multiple types it may help to align more data-refs - see the comment in the
> > code). IIRC <= is for the case of aligned access, but I am not sure about
> > that, so maybe you are right.
> >
> > I don't see how it is related to negative steps.
> >
> > I think that the real reason for this failure is that the loads are
> > actually irrelevant (hence, vf=4 that doesn't take char loads into
> > account), but we don't check that when we analyze data-refs. So, in my
> > opinion, the proper fix will add such check.
>
> The following also works for me:
>
> Index: tree-vect-data-refs.c
> ===================================================================
> --- tree-vect-data-refs.c (revision 175802)
> +++ tree-vect-data-refs.c (working copy)
> @@ -1495,6 +1495,9 @@ vect_enhance_data_refs_alignment (loop_v
>        stmt = DR_STMT (dr);
>        stmt_info = vinfo_for_stmt (stmt);
>
> +      if (!STMT_VINFO_RELEVANT (stmt_info))
> +        continue;
> +
>        /* For interleaving, only the alignment of the first access
>           matters.  */
>        if (STMT_VINFO_STRIDED_ACCESS (stmt_info)
>
> does that look better or do you propose to clean the datarefs
> vector from those references?

Well, this is certainly enough to fix the PR.
I am not sure if we can just remove these data-refs from the dependence
checks. After that all the alignment and access checks are at least
redundant.

Thanks,
Ira

>
> Thanks,
> Richard.