Re: [PATCH][PING] Vectorize conversions directly
gcc-patches-ow...@gcc.gnu.org wrote on 22/11/2011 03:31:22 PM:

> From: Ramana Radhakrishnan
> > gcc/testsuite/lib/
> >    * target-supports.exp (check_effective_target_vect_intfloat_cvt): True
> >    for ARM NEON.
> >    (check_effective_target_vect_uintfloat_cvt): Likewise.
> >    (check_effective_target_vect_intfloat_cvt): Likewise.
> >    (check_effective_target_vect_floatuint_cvt): Likewise.
> >    (check_effective_target_vect_floatint_cvt): Likewise.
> >    (check_effective_target_vect_extract_even_odd): Likewise.
>
> I'm not sure about enabling the vect_extract_even_odd case. If this
> assumes the presence of an extract-even-odd from registers type
> operation, then the Neon port doesn't really support vec_extract_even
> / vec_extract_odd forms - You do have them in one single instruction
> if you tried to load them from / or store them to memory which is the
> vld2 / vst2 instruction while the register form of vuzp which reads
> and writes to both source operands is not really supported directly
> from the backend.

Right. Dmitry, you can do this instead:

Index: fast-math-pr35982.c
===
--- fast-math-pr35982.c (revision 181150)
+++ fast-math-pr35982.c (working copy)
@@ -20,7 +20,7 @@
   return avg;
 }

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_extract_even_odd } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { xfail vect_extract_even_odd } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_extract_even_odd || vect_strided2 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { xfail { vect_extract_even_odd || vect_strided2 } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */

Ira

> The other testsuite changes look OK to me.
>
> cheers
> Ramana
Re: RFA: Fix PR middle-end/50074
On Sat, Nov 26, 2011 at 02:17:08AM -0500, Joern Rennecke wrote:
> 2011-11-19  Joern Rennecke
>
>     PR middle-end/50074
>     * calls.c (mem_overlaps_already_clobbered_arg_p):
>     Return false if no outgoing arguments have been stored so far.

Ok for trunk.

> Index: calls.c
> ===
> --- calls.c (revision 2195)
> +++ calls.c (working copy)
> @@ -1668,6 +1668,8 @@ mem_overlaps_already_clobbered_arg_p (rt
>  {
>    HOST_WIDE_INT i;
>
> +  if (sbitmap_empty_p (stored_args_map))
> +    return false;
>    if (addr == crtl->args.internal_arg_pointer)
>      i = 0;
>    else if (GET_CODE (addr) == PLUS

    Jakub
RFA: Fix PR tree-optimization/50802
With this rewrite of simplify_conversion_using_ranges we go back to the original problem of considering if a single conversion is sufficient considering the known input range. Bootstrapped and regtested on i686-pc-linux-gnu. 2011-11-18 Joern Rennecke PR tree-optimization/50802 * tree-vrp.c (simplify_conversion_using_ranges): Rewrite test considering what happens to ranges during sign changes and/or intermediate narrowing conversions. Index: tree-vrp.c === --- tree-vrp.c (revision 2195) +++ tree-vrp.c (working copy) @@ -7254,7 +7254,9 @@ simplify_conversion_using_ranges (gimple tree innerop, middleop, finaltype; gimple def_stmt; value_range_t *innervr; - double_int innermin, innermax, middlemin, middlemax; + bool inner_unsigned_p, middle_unsigned_p, final_unsigned_p; + unsigned inner_prec, middle_prec, final_prec; + double_int innermin, innermed, innermax, middlemin, middlemed, middlemax; finaltype = TREE_TYPE (gimple_assign_lhs (stmt)); if (!INTEGRAL_TYPE_P (finaltype)) @@ -7279,33 +7281,49 @@ simplify_conversion_using_ranges (gimple the middle conversion is removed. */ innermin = tree_to_double_int (innervr->min); innermax = tree_to_double_int (innervr->max); - middlemin = double_int_ext (innermin, TYPE_PRECISION (TREE_TYPE (middleop)), - TYPE_UNSIGNED (TREE_TYPE (middleop))); - middlemax = double_int_ext (innermax, TYPE_PRECISION (TREE_TYPE (middleop)), - TYPE_UNSIGNED (TREE_TYPE (middleop))); - /* If the middle values are not equal to the original values fail. - But only if the inner cast truncates (thus we ignore differences - in extension to handle the case going from a range to an anti-range - and back). 
*/ - if ((TYPE_PRECISION (TREE_TYPE (innerop)) - > TYPE_PRECISION (TREE_TYPE (middleop))) - && (!double_int_equal_p (innermin, middlemin) - || !double_int_equal_p (innermax, middlemax))) -return false; + + inner_prec = TYPE_PRECISION (TREE_TYPE (innerop)); + middle_prec = TYPE_PRECISION (TREE_TYPE (middleop)); + final_prec = TYPE_PRECISION (finaltype); + + /* If the first conversion is not injective, the second must not + be widening. */ + if (double_int_cmp (double_int_sub (innermax, innermin), + double_int_mask (middle_prec), true) > 0 + && middle_prec < final_prec) +return false; + /* We also want a medium value so that we can track the effect that + narrowing conversions with sign change have. */ + inner_unsigned_p = TYPE_UNSIGNED (TREE_TYPE (innerop)); + if (inner_unsigned_p) +innermed = double_int_rshift (double_int_mask (inner_prec), + 1, inner_prec, false); + else +innermed = double_int_zero; + if (double_int_cmp (innermin, innermed, inner_unsigned_p) >= 0 + || double_int_cmp (innermed, innermax, inner_unsigned_p) >= 0) +innermed = innermin; + + middle_unsigned_p = TYPE_UNSIGNED (TREE_TYPE (middleop)); + middlemin = double_int_ext (innermin, middle_prec, middle_unsigned_p); + middlemed = double_int_ext (innermed, middle_prec, middle_unsigned_p); + middlemax = double_int_ext (innermax, middle_prec, middle_unsigned_p); + /* Require that the final conversion applied to both the original and the intermediate range produces the same result. 
*/ + final_unsigned_p = TYPE_UNSIGNED (finaltype); if (!double_int_equal_p (double_int_ext (middlemin, - TYPE_PRECISION (finaltype), - TYPE_UNSIGNED (finaltype)), + final_prec, final_unsigned_p), double_int_ext (innermin, - TYPE_PRECISION (finaltype), - TYPE_UNSIGNED (finaltype))) + final_prec, final_unsigned_p)) + || !double_int_equal_p (double_int_ext (middlemed, + final_prec, final_unsigned_p), + double_int_ext (innermed, + final_prec, final_unsigned_p)) || !double_int_equal_p (double_int_ext (middlemax, - TYPE_PRECISION (finaltype), - TYPE_UNSIGNED (finaltype)), + final_prec, final_unsigned_p), double_int_ext (innermax, - TYPE_PRECISION (finaltype), - TYPE_UNSIGNED (finaltype + final_prec, final_unsigned_p))) return false; gimple_assign_set_rhs1 (stmt, innerop);
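
As an illustration of the property the range test has to establish, here is a brute-force sketch in plain C. It is not the patch's algorithm (which only inspects the minimum, a "medium" value and the maximum of the range); the function name and the choice of `unsigned char` as the middle type are assumptions made for the example, and an 8-bit `unsigned char` is assumed.

```c
#include <stdbool.h>

/* Illustrative only: returns true if, for every x in [lo, hi],
   converting x to unsigned char and then to long gives the same value
   as converting x to long directly.  When this holds, the middle
   conversion could be dropped for that input range.  */
static bool
middle_conversion_removable (int lo, int hi)
{
  for (long x = lo; x <= hi; x++)
    {
      long direct = (long) (int) x;
      long via_middle = (long) (unsigned char) (int) x;
      if (direct != via_middle)
        return false;
    }
  return true;
}
```

With these assumptions the range [0, 255] passes, while [0, 256] fails because 256 wraps to 0, and [-1, 1] fails because -1 becomes 255 after the sign change.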
[PATCH, i386]: Fix sync long long failures on 32bit x86
On Fri, Nov 25, 2011 at 8:31 PM, Uros Bizjak wrote:

> However, the patch uncovers certain problems with existing fild/fistpl
> implementation of atomic load/store. It fails in several of thread
> simulation tests, i.e.
>
> FAIL: gcc.dg/simulate-thread/atomic-load-longlong.c -O0 -g thread
> simulation test
>
> with:
>
> 1: x/i $pc
> => 0x8048582 : fild -0x8(%ebp)
> 0x08048585 104 __atomic_store_n (&result, ret, __ATOMIC_SEQ_CST);
> 1: x/i $pc
> => 0x8048585 : fistp 0x8049ac0
> 0x0804858b 104 __atomic_store_n (&result, ret, __ATOMIC_SEQ_CST);
> 1: x/i $pc
> => 0x804858b : lock orl $0x0,(%esp)
> FAIL: Invalid result returned from fetch

At the end of the day, the problem was trivial: the %Z suffix was missing from the fild and fistp instructions.

Attached patch fixes all sync long long failures, including thread simulation tests.

2011-11-26  Uros Bizjak

    * config/i386/sync.md (movdi_via_fpu): Add %Z insn suffixes.

Tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

Uros.

Index: sync.md
===
--- sync.md (revision 181736)
+++ sync.md (working copy)
@@ -123,7 +123,7 @@
   DONE;
 })

-;; ??? From volume 3 section 7.1.1 Guaranteed Atomic Operations,
+;; ??? From volume 3 section 8.1.1 Guaranteed Atomic Operations,
 ;; Only beginning at Pentium family processors do we get any guarantee of
 ;; atomicity in aligned 64-bit quantities.  Beginning at P6, we get a
 ;; guarantee for 64-bit accesses that do not cross a cacheline boundary.
@@ -281,7 +281,7 @@
        (unspec:DI [(match_operand:DI 1 "memory_operand" "m")] UNSPEC_MOVA))
    (clobber (match_operand:DF 2 "register_operand" "=f"))]
   "TARGET_80387"
-  "fild\t%1\;fistp\t%0"
+  "fild%Z1\t%1\;fistp%Z0\t%0"
   [(set_attr "type" "multi")
    ;; Worst case based on full sib+offset32 addressing modes
    (set_attr "length" "14")])
Re: Find more shrink-wrapping opportunities
Bernd Schmidt writes:
> +      CLEAR_HARD_REG_SET (set_regs);
> +      note_stores (PATTERN (scan), record_hard_reg_sets,
> +                   &set_regs);
> +      if (CALL_P (scan))
> +        IOR_HARD_REG_SET (set_regs, call_used_reg_set);
> +      for (link = REG_NOTES (scan); link; link = XEXP (link, 1))
> +        if (REG_NOTE_KIND (link) == REG_INC)
> +          record_hard_reg_sets (XEXP (link, 0), NULL, &set_regs);
> +
> +      if (TEST_HARD_REG_BIT (set_regs, srcreg)
> +          || reg_referenced_p (SET_DEST (set),
> +                               PATTERN (scan)))
> +        {
> +          scan = NULL_RTX;
> +          break;
> +        }
> +      if (CALL_P (scan))
> +        {
> +          rtx link = CALL_INSN_FUNCTION_USAGE (scan);
> +          while (link)
> +            {
> +              rtx tmp = XEXP (link, 0);
> +              if (GET_CODE (tmp) == USE
> +                  && reg_referenced_p (SET_DEST (set), tmp))
> +                break;
> +              link = XEXP (link, 1);
> +            }
> +          if (link)
> +            {
> +              scan = NULL_RTX;
> +              break;
> +            }
> +        }

Could we use DF_REF_USES/DEFS here instead? I'd like to get to a stage where new code should treat DF_REF_USES/DEFS as the default way of testing for register usage. I think this sort of ad-hoc liveness stuff should be a last resort, and should have a comment saying why DF_REF_USES/DEFS isn't suitable.

Rather than walk all the instructions of intermediate blocks, you could just test DF_LR_BB_INFO (bb)->use to see whether the block uses the register at all.

> +  FOR_BB_INSNS_SAFE (entry_block, insn, curr)

Is there any particular reason for doing a forward walk rather than a backward walk? A backward walk would allow chains to be moved, and would avoid the need for the quadraticness in the current:

    for each insn I in bb
      for each insn J after I in bb

since you could then keep an up-to-date record of what registers are used or set by the instructions that you aren't moving.

Also:

+/* Look for sets of call-saved registers in the first block of the
+   function, and move them down into successor blocks if the register
+   is used only on one path.  This exposes more opportunities for
+   shrink-wrapping.
+   These kinds of sets often occur when incoming argument registers are
+   moved to call-saved registers because their values are live across
+   one or more calls during the function.  */
+
+static void
+prepare_shrink_wrap (basic_block entry_block)

There's no check for call-savedness. Is that deliberate? I'm seeing a case where we have:

    (set (reg 2) (reg 15))
    (set (reg 15) (reg 2))

(yes, it's silly, but bear with me) for call-clobbered registers 2 and 15. We move the second instruction as far as possible, making 2 live for much longer. So if the prologue uses 2 as a temporary register, this would actually prevent shrink-wrapping.

The reason I'm suddenly "reviewing" the code now is that it doesn't prevent shrink-wrapping, because nothing adds register 2 to the liveness info of the affected blocks. The temporary prologue value of register 2 is then moved into register 15.

Testcase is gcc.c-torture/execute/920428-2.c on mips64-linux-gnu cc1 (although any MIPS should do) with:

    -O2 -mabi=32 -mips16 -mno-shared -mabicalls -march=mips32

Richard
[Patch, Fortran] MOVE_ALLOC fixes
Dear all,

(First, this is *not* for the 4.6/4.7 rejects-valid regression, which is related to intent(in) pointers with allocatable components.)

When debugging an issue with polymorphic arrays and MOVE_ALLOC, I got lost in the code generation of move_alloc - and didn't like the generated code. Thus, I have rewritten the trans*.c part of it. (It turned out that the issue we had was unrelated to move_alloc.)

Changes:
* Replace call to libgfortran by inline code (much faster and shorter code)
* For arrays: Deallocate "from" (deep freeing)
* For polymorphic arrays: set _vptr.

Actually, the required code is rather simple. For move_alloc(from, to), one just needs to do:

a) Deallocate "to", taking allocatable components and the polymorphic types into account (the latter is a to-do item, cf. PR 46174).

b) Do a simple assignment:
     to = from
   namely: If both are scalar variables, those are pointers and one does a pointer assignment. If they are polymorphic and/or an array, one does a (nonpointer) assignment to the class container or the array descriptor.

c) Set "from = NULL" (nonpolymorphic scalars) or "from.data = NULL" (nonpolymorphic arrays) or "from._data = NULL" (polymorphic scalars) or "from._data.data = NULL" (polymorphic arrays).

For (b), the current expr-ref-walking function for polymorphic arrays gives access either to class._data or to class._vptr. It is extremely difficult to access "class" itself. Thus, I now do two assignments: a nonpointer assignment to the array descriptor and a pointer assignment to the _vptr.

Built and regtested with the trunk with Paul's polymorphic array patch applied. (I will do a bootstrap and regtest with a clean trunk before committal.)

OK for the trunk?

Tobias

PS: I'll add _gfortran_move_alloc to the list of functions which can be removed after the ABI breakage.

2011-11-26  Tobias Burnus

    PR fortran/51306
    PR fortran/48700
    * check.c (gfc_check_move_alloc): Make sure that from/to are both polymorphic or neither.
* trans-intrinsic.c (conv_intrinsic_move_alloc): Cleanup, generate inline code. 2011-11-26 Tobias Burnus PR fortran/51306 PR fortran/48700 * gfortran.dg/move_alloc_5.f90: Add dg-error. * gfortran.dg/select_type_23.f03: Add dg-error. * gfortran.dg/move_alloc_6.f90: New. diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c index d9b9a9c..e2b0d66 100644 --- a/gcc/fortran/check.c +++ b/gcc/fortran/check.c @@ -2691,6 +2709,14 @@ gfc_check_move_alloc (gfc_expr *from, gfc_expr *to) if (same_type_check (to, 1, from, 0) == FAILURE) return FAILURE; + if (to->ts.type != from->ts.type) +{ + gfc_error ("The FROM and TO arguments in MOVE_ALLOC call at %L must be " + "either both polymorphic or both nonpolymorphic", + &from->where); + return FAILURE; +} + if (to->rank != from->rank) { gfc_error ("the '%s' and '%s' arguments of '%s' intrinsic at %L must " diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c index 4244570..37a1ba6 100644 --- a/gcc/fortran/trans-intrinsic.c +++ b/gcc/fortran/trans-intrinsic.c @@ -5892,7 +5892,7 @@ } -/* Generate code for SELECTED_REAL_KIND (P, R) intrinsic function. */ +/* Generate code for SELECTED_REAL_KIND (P, R, RADIX) intrinsic function. */ static void gfc_conv_intrinsic_sr_kind (gfc_se *se, gfc_expr *expr) @@ -7182,50 +7190,122 @@ conv_intrinsic_atomic_ref (gfc_code *code) static tree conv_intrinsic_move_alloc (gfc_code *code) { - if (code->ext.actual->expr->rank == 0) -{ - /* Scalar arguments: Generate pointer assignments. */ - gfc_expr *from, *to, *deal; - stmtblock_t block; - tree tmp; - gfc_se se; + stmtblock_t block; + gfc_expr *from_expr, *to_expr; + gfc_expr *to_expr2, *from_expr2; + gfc_se from_se, to_se; + gfc_ss *from_ss, *to_ss; + tree tmp; - from = code->ext.actual->expr; - to = code->ext.actual->next->expr; + gfc_start_block (&block); - gfc_start_block (&block); + from_expr = code->ext.actual->expr; + to_expr = code->ext.actual->next->expr; - /* Deallocate 'TO' argument. 
*/ - gfc_init_se (&se, NULL); - se.want_pointer = 1; - deal = gfc_copy_expr (to); - if (deal->ts.type == BT_CLASS) - gfc_add_data_component (deal); - gfc_conv_expr (&se, deal); - tmp = gfc_deallocate_scalar_with_status (se.expr, NULL, true, - deal, deal->ts); - gfc_add_expr_to_block (&block, tmp); - gfc_free_expr (deal); + gfc_init_se (&from_se, NULL); + gfc_init_se (&to_se, NULL); - if (to->ts.type == BT_CLASS) - tmp = gfc_trans_class_assign (to, from, EXEC_POINTER_ASSIGN); - else - tmp = gfc_trans_pointer_assignment (to, from); - gfc_add_expr_to_block (&block, tmp); + if (from_expr->rank == 0) +{ + /* Deallocate "to". */ - if (from->ts.type == BT_CLASS) - tmp = gfc_trans_class_assign (from, gfc_get_null_expr (NULL), - EXEC_POINTER_ASSIGN); + if (from_expr->ts
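
The three steps (a) to (c) that the message gives for move_alloc(from, to) can be sketched in C for the nonpolymorphic array case. This is only a model of the semantics, not the code gfortran generates: the `descriptor` struct and `move_alloc_sketch` are hypothetical, and deep freeing of allocatable components is left out.

```c
#include <stdlib.h>

/* Hypothetical, heavily simplified array descriptor.  */
typedef struct
{
  double *data;
  size_t len;
} descriptor;

static void
move_alloc_sketch (descriptor *from, descriptor *to)
{
  free (to->data);     /* (a) deallocate "to" (no-op if already null) */
  *to = *from;         /* (b) plain assignment of the descriptor */
  from->data = NULL;   /* (c) "from" becomes unallocated */
}
```

No element data is copied: after the call, "to" owns the storage that "from" used to own.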
[v3] partial fix for libstdc++/51296
This fixes the easy bits. PR libstdc++/51296 * testsuite/30_threads/thread/native_handle/typesizes.cc: Do not run on alpha*-*-osf*. * testsuite/30_threads/future/cons/constexpr.cc: Disable debug symbols. * testsuite/30_threads/shared_future/cons/constexpr.cc: Likewise. Tested x86_64-linux, committed to trunk. Index: testsuite/30_threads/thread/native_handle/typesizes.cc === --- testsuite/30_threads/thread/native_handle/typesizes.cc (revision 181460) +++ testsuite/30_threads/thread/native_handle/typesizes.cc (working copy) @@ -1,5 +1,5 @@ -// { dg-do run { target *-*-linux* *-*-solaris* *-*-cygwin alpha*-*-osf* mips-sgi-irix6* } } -// { dg-options " -std=gnu++0x -pthread" { target *-*-linux* alpha*-*-osf* mips-sgi-irix6* } } +// { dg-do run { target *-*-linux* *-*-solaris* *-*-cygwin mips-sgi-irix6* } } +// { dg-options " -std=gnu++0x -pthread" { target *-*-linux* mips-sgi-irix6* } } // { dg-options " -std=gnu++0x -pthreads" { target *-*-solaris* } } // { dg-options " -std=gnu++0x " { target *-*-cygwin } } // { dg-require-cstdint "" } Index: testsuite/30_threads/future/cons/constexpr.cc === --- testsuite/30_threads/future/cons/constexpr.cc (revision 181459) +++ testsuite/30_threads/future/cons/constexpr.cc (working copy) @@ -1,12 +1,12 @@ // { dg-do compile } -// { dg-options "-std=gnu++0x -fno-inline -save-temps" } +// { dg-options "-std=gnu++0x -fno-inline -save-temps -g0" } // { dg-require-cstdint "" } // { dg-require-gthreads "" } // { dg-require-atomic-builtins "" } // { dg-final { scan-assembler-not "_ZNSt6futureIvEC2Ev" } } // { dg-final { scan-assembler-not "_ZNSt6futureIiEC2Ev" } } -// Copyright (C) 2010 Free Software Foundation, Inc. +// Copyright (C) 2010, 2011 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. 
This library is free // software; you can redistribute it and/or modify it under the Index: testsuite/30_threads/shared_future/cons/constexpr.cc === --- testsuite/30_threads/shared_future/cons/constexpr.cc(revision 181459) +++ testsuite/30_threads/shared_future/cons/constexpr.cc(working copy) @@ -1,12 +1,12 @@ // { dg-do compile } -// { dg-options "-std=gnu++0x -fno-inline -save-temps" } +// { dg-options "-std=gnu++0x -fno-inline -save-temps -g0" } // { dg-require-cstdint "" } // { dg-require-gthreads "" } // { dg-require-atomic-builtins "" } // { dg-final { scan-assembler-not "_ZNSt13shared_futureIvEC2Ev" } } // { dg-final { scan-assembler-not "_ZNSt13shared_futureIiEC2Ev" } } -// Copyright (C) 2010 Free Software Foundation, Inc. +// Copyright (C) 2010, 2011 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the
[PATCH, testsuite]: Require vect_double for gcc.dg/vect/fast-math-vect-call-2.c
Hello!

2011-11-26  Uros Bizjak

    * gcc.dg/vect/fast-math-vect-call-2.c: Require vect_double
    effective target.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN.

Uros.

Index: gcc.dg/vect/fast-math-vect-call-2.c
===
--- gcc.dg/vect/fast-math-vect-call-2.c (revision 181739)
+++ gcc.dg/vect/fast-math-vect-call-2.c (working copy)
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target vect_double } */
+
 #include "tree-vect.h"

 extern long int lrint (double);
Re: [PATCH] Ignore EDGE_PRESERVE in flow info verification (PR rtl-optimization/49912)
On 11/25/2011 10:34 AM, Jakub Jelinek wrote:
>     PR rtl-optimization/49912
>     * cfgrtl.c (rtl_verify_flow_info_1): Ignore also EDGE_PRESERVE bit
>     when counting n_branch.
>
>     * g++.dg/other/pr49912.C: New test.

Ok.

r~
Re: [PATCH] PR50325 store_bit_field: Fix for big endian targets
> Being in stage 3 shouldn't stop us trying to fix bugs in the compiler:
> we're not in the final run-up to a release yet (that could still be five
> months away if history is anything to go by). It should just be a
> clamp-down on major new functionality.

But Richard's proposed change wasn't really a bugfix (he also proposed a bugfix which should be applied of course) but rather a cleanup. As the ongoing story shows, this is a delicate area so, if we are to clean things up here at this stage, the level of testing should at least be adjusted.

--
Eric Botcazou
[gcov] fix 51297
I've applied this patch to fix issue 51297. Solaris' bsearch blows up with a NULL array. I also noticed the possibility of passing NULL pointers to memcpy, so took the opportunity of fixing that too.

Thanks to Eric for verifying the fix is good.

nathan

2011-11-26  Nathan Sidwell

    PR gcov-profile/51297
    * gcov.c (main): Allocate initial names and sources arrays.
    (find_source): Don't check for null name or source arrays here.

Index: gcov.c
===
--- gcov.c (revision 181744)
+++ gcov.c (working copy)
@@ -406,6 +406,11 @@ main (int argc, char **argv)
   /* Handle response files.  */
   expandargv (&argc, &argv);

+  a_names = 10;
+  names = XNEWVEC (name_map_t, a_names);
+  a_sources = 10;
+  sources = XNEWVEC (source_t, a_sources);
+
   argno = process_args (argc, argv);
   if (optind == argc)
     print_usage (true);
@@ -874,8 +879,6 @@ find_source (const char *file_name)
        {
          /* Extend the name map array -- we'll be inserting one or two
             entries.  */
-         if (!a_names)
-           a_names = 10;
          a_names *= 2;
          name_map = XNEWVEC (name_map_t, a_names);
          memcpy (name_map, names, n_names * sizeof (*names));
@@ -894,8 +897,6 @@ find_source (const char *file_name)

   if (n_sources == a_sources)
     {
-      if (!a_sources)
-        a_sources = 10;
       a_sources *= 2;
       src = XNEWVEC (source_t, a_sources);
       memcpy (src, sources, n_sources * sizeof (*sources));
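
The approach taken by the patch (allocate the arrays once, up front, so that later bsearch and memcpy calls never receive a null pointer) can be sketched on a generic growable array. The `int_vec` type and the `vec_*` helpers below are invented for this example and do not appear in gcov.c; plain malloc stands in for XNEWVEC.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of ints.  */
typedef struct
{
  int *items;
  size_t used, alloc;
} int_vec;

/* Allocate the initial storage eagerly, so later bsearch and memcpy
   calls always get a valid pointer, even while the array is empty.  */
static void
vec_init (int_vec *v)
{
  v->alloc = 10;
  v->used = 0;
  v->items = malloc (v->alloc * sizeof *v->items);
  if (!v->items)
    abort ();
}

static int
cmp_int (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Look up KEY; the array must be sorted.  Never passes a null base.  */
static int *
vec_find (const int_vec *v, int key)
{
  return bsearch (&key, v->items, v->used, sizeof *v->items, cmp_int);
}

/* Append VALUE, doubling the storage when it is full.  No "is the
   capacity still zero?" check is needed because vec_init ran first.  */
static void
vec_push (int_vec *v, int value)
{
  if (v->used == v->alloc)
    {
      int *bigger;
      v->alloc *= 2;
      bigger = malloc (v->alloc * sizeof *bigger);
      if (!bigger)
        abort ();
      memcpy (bigger, v->items, v->used * sizeof *bigger);
      free (v->items);
      v->items = bigger;
    }
  v->items[v->used++] = value;
}
```

The trade-off is a small unconditional allocation at start-up in exchange for removing the special cases from the growth path.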
Re: [PATCH 0/5] Convert Sparc to atomic optabs
> The first four patches simply do the conversion, a piece at a time,
> assuming the RMO for all cpus.

The new form of the membar_v8 insn looks a bit strange since operand 1 is disregarded. If this is as intended, a comment would be in order.

> The fifth patch adds the ability to explicitly set the memory model
> for the program, and to adjust the barriers emitted based on that
> memory model. If we agree on the spelling of that option (3 m's in
> a row seems fairly harsh) I'll write some documentation for it.

"memory model" is used consistently in the manual so, if the triple m is deemed too disturbing, -mmemory-model would be a little better. In any case, I don't expect it to be used at all in practice. And memory_order should be renamed into sparc_memory_model (or sparc_memmodel?) and the enum constants prefixed with MM (which is the name of the PSTATE register holding the value).

> An unwritten sixth patch would allow the default memory model to be
> set by the operating system, as Dave tells me that Linux always uses
> TSO. And I guess when generating code specifically for Ultra-III,
> which doesn't implement the more relaxed models.

We should probably default to TSO for V8/V9 at this point.

This builds fine on Solaris, 32-bit and 64-bit, but I guess I'm seeing the ICE Dave reported. Do you have a patchlet for it?

--
Eric Botcazou
Re: [PATCH 0/5] Convert Sparc to atomic optabs
On 11/26/2011 01:58 PM, Eric Botcazou wrote:
>> The first four patches simply do the conversion, a piece at a time,
>> assuming the RMO for all cpus.
>
> The new form of the membar_v8 insn looks a bit strange since operand 1 is
> disregarded. If this is as intended, a comment would be in order.

Yeah, that's intended, as the fallback after we've matched the other two cases we can handle separately for TSO on v8.

> "memory model" is used consistently in the manual so, if the triple m is
> deemed too disturbing, -mmemory-model would be a little better. In any
> case, I don't expect it to be used at all in practice. And memory_order
> should be renamed into sparc_memory_model (or sparc_memmodel?) and the
> enum constants prefixed with MM (which is the name of the PSTATE register
> holding the value).

Ok.

> This builds fine on Solaris, 32-bit and 64-bit, but I guess I'm seeing
> the ICE Dave reported. Do you have a patchlet for it?

Try top-of-branch

    git://repo.or.cz/gcc/rth.git rth/atomic/sparc

That's the last iteration I went through with Dave.

r~
Adjust omp-low test for alignment
The m68k-linux failure for the various omp atomic tests is due to the fact that BIGGEST_ALIGNMENT is 16 bits on that platform. I think it's pretty reasonable to assume that if something is aligned to BIGGEST_ALIGNMENT, then it can be considered "aligned".

Tested on x86_64-linux and m68k-linux cross.

r~

    * omp-low.c (expand_omp_atomic): Assume anything aligned to
    BIGGEST_ALIGNMENT is aligned.

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index a4bfb84..4e1c2ba 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -5501,7 +5501,9 @@ expand_omp_atomic (struct omp_region *region)
       unsigned int align = TYPE_ALIGN_UNIT (type);

       /* __sync builtins require strict data alignment.  */
-      if (exact_log2 (align) >= index)
+      /* ??? Assume BIGGEST_ALIGNMENT *is* aligned.  */
+      if (exact_log2 (align) >= index
+          || align * BITS_PER_UNIT >= BIGGEST_ALIGNMENT)
        {
          /* Atomic load.  */
          if (loaded_val == stored_val
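
To make the effect of the extra condition concrete, here is a standalone sketch of the old and new tests. It is not GCC code: `exact_log2_sketch`, `old_test` and `new_test` are stand-ins written for this example, BITS_PER_UNIT is assumed to be 8, and `index` is assumed to be log2 of the access size in bytes.

```c
#include <stdbool.h>

/* Simplified stand-in for GCC's exact_log2: returns log2 of X if X is
   a power of two, otherwise -1.  */
static int
exact_log2_sketch (unsigned x)
{
  int n = 0;
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while (x > 1)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Old condition: the type's alignment in bytes must be at least the
   access size (INDEX is assumed to be log2 of that size).  */
static bool
old_test (unsigned align_bytes, int index)
{
  return exact_log2_sketch (align_bytes) >= index;
}

/* New condition: additionally accept anything that already has the
   target's biggest alignment (given in bits; 8 bits per byte assumed).  */
static bool
new_test (unsigned align_bytes, int index, unsigned biggest_alignment_bits)
{
  return old_test (align_bytes, index)
         || align_bytes * 8 >= biggest_alignment_bits;
}
```

For a 4-byte object aligned to 2 bytes on a target whose biggest alignment is 16 bits (the m68k-linux situation described above), `old_test (2, 2)` is false while `new_test (2, 2, 16)` is true.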
Fix init_sync_optabs iteration
Testing on m68k-linux -mcpu=5206 (aka coldfire, aka no cas insn) revealed that we hadn't properly registered __sync_val_compare_and_swap_4. Oops.

r~

    * optabs.c (init_sync_libfuncs_1): Include max in iteration.

diff --git a/gcc/optabs.c b/gcc/optabs.c
index 1aafd28..0ce21e9 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6606,7 +6606,7 @@ init_sync_libfuncs_1 (optab tab, const char *base, int max)
   buf[len + 2] = '\0';

   mode = QImode;
-  for (i = 1; i < max; i *= 2)
+  for (i = 1; i <= max; i *= 2)
     {
       buf[len + 1] = '0' + i;
       set_optab_libfunc (tab, mode, buf);
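
The off-by-one is easy to reproduce outside the compiler. The sketch below is not the optabs.c code: `generate_names` is a hypothetical stand-in that builds the suffixed names with the same doubling loop, and the `inclusive` flag switches between the old bound (`i < max`) and the fixed one (`i <= max`).

```c
#include <stdio.h>

/* Hypothetical stand-in for init_sync_libfuncs_1: writes the generated
   names into OUT and returns how many were produced.  INCLUSIVE selects
   the fixed loop bound (i <= max) over the old one (i < max).  */
static int
generate_names (const char *base, int max, int inclusive,
                char out[][64])
{
  int n = 0;
  for (int i = 1; inclusive ? i <= max : i < max; i *= 2)
    snprintf (out[n++], 64, "%s_%d", base, i);
  return n;
}
```

With `max` equal to 4, the old bound yields only the `_1` and `_2` names, so the `_4` variant is never registered; the inclusive bound yields `_1`, `_2` and `_4`.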
Fix expand_atomic_fetch_op wrt unused_result
Testing on coldfire revealed a double call to library functions for some omp atomic calls. This was due to this function trying to return "target" for success for unused_result. Except that for unused_result, target is null, which indicates expansion failure. Oops.

r~

    * optabs.c (expand_atomic_fetch_op): Always return result.

diff --git a/gcc/optabs.c b/gcc/optabs.c
index 0ce21e9..a1917cc 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -8068,7 +8068,7 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
        {
          /* If the result isn't used, no need to do compensation code.  */
          if (unused_result)
-           return target;
+           return result;

          /* Issue compensation code.  Fetch_after == fetch_before OP val.
             Fetch_before == after REVERSE_OP val.  */
@@ -8110,9 +8110,7 @@ expand_atomic_fetch_op (rtx target, rtx mem, rtx val, enum rtx_code code,
          result = emit_library_call_value (libfunc, NULL, LCT_NORMAL, mode,
                                            2, addr, ptr_mode, val, mode);

-         if (unused_result)
-           return target;
-         if (fixup)
+         if (!unused_result && fixup)
            result = expand_simple_binop (mode, code, result, val, target,
                                          true, OPTAB_LIB_WIDEN);
          return result;
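
The failure mode (a null return doubling as the "expansion failed" signal) can be shown with a toy model. Nothing below is GCC code: the function names, the call counter and the caller's fallback behaviour are assumptions chosen to mirror the description above.

```c
#include <stddef.h>

/* Toy model, not GCC code: counts how many library calls get emitted.  */
static int calls_emitted;
static int call_value;   /* stands in for the rtx holding the call result */

/* Models the libcall path of expand_atomic_fetch_op.  A null return
   means "expansion failed".  FIXED selects the patched behaviour.  */
static int *
expand_fetch_op_model (int *target, int unused_result, int fixed)
{
  int *result = &call_value;
  calls_emitted++;                 /* the library call is emitted here */
  if (unused_result && !fixed)
    return target;                 /* old code: target may be null */
  return result;                   /* new code: always non-null */
}

/* Models a caller (assumed) that falls back to another expansion,
   emitting a second call, when it sees a null return.  Returns the
   total number of calls emitted for one statement.  */
static int
expand_statement_model (int fixed)
{
  calls_emitted = 0;
  if (expand_fetch_op_model (NULL, 1, fixed) == NULL)
    calls_emitted++;               /* fallback emits the call again */
  return calls_emitted;
}
```

In this model the old behaviour emits two calls for a statement whose result is unused, and the patched behaviour emits one.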
[libstdc++] doc/xml/manual/abi.xml -- fix references to GCC as well as GNU/Linux
The last remaining instance of Linux vs GNU/Linux in the libstdc++ documentation is fixed thusly. Applied. On the way I spotted an odd reference to GCC. Looking at the overall document, it occurs to me that - newer versions of GCC are not covered, and - references to GCC generally are of the form gcc-X.Y instead of GCC X.Y. Is this something one of you guys (libstdc++) could have a look at? Thanks, Gerald 2011-11-26 Gerald Pfeifer * doc/xml/manual/abi.xml (Prerequisites): Refer to GNU/Linux. Fix reference to GCC. Index: doc/xml/manual/abi.xml === --- doc/xml/manual/abi.xml (revision 181742) +++ doc/xml/manual/abi.xml (working copy) @@ -596,8 +596,8 @@ - Most modern Linux and BSD versions, particularly ones using - gcc-3.1.x tools and more recent vintages, will meet the + Most modern GNU/Linux and BSD versions, particularly ones using + GCC 3.1 and later, will meet the requirements above, as does Solaris 2.5 and up.
[libstdc++] scripts/run_doxygen comment tweak
And this completes the exercise for libstdc++. Fun, fun, fun. Gerald 2011-11-27 Gerald Pfeifer * scripts/run_doxygen (problematic): Change Linux reference to GNU/Linux. Index: scripts/run_doxygen === --- scripts/run_doxygen (revision 181742) +++ scripts/run_doxygen (working copy) @@ -267,8 +267,8 @@ rm stdheader # Some of the pages for generated modules have text that confuses certain -# implementations of man(1), e.g., Linux's. We need to have another top-level -# *roff tag to /stop/ the .SH NAME entry. +# implementations of man(1), e.g. on GNU/Linux. We need to have another +# top-level *roff tag to /stop/ the .SH NAME entry. problematic=`egrep --files-without-match '^\.SH SYNOPSIS' [A-Z]*.3` #problematic='Containers.3 Sequences.3 Assoc_containers.3 Iterator_types.3'
Re: [libstdc++] Reference GNU/Linux in doc/xml/manual/using.xml
On Sun, 13 Nov 2011, Jonathan Wakely wrote: > Would that be improved by replacing i386 with x86? > > libstdc++ omits several features when built for 80386 due to the lack > of certain atomic operations, so I think it might be useful if the > manual used x86 to refer to the generic architecture, as opposed to > i386 which makes me think of -march=i386. Sure, your logic makes a lot of sense. Happy to make this change for you. The patch below just went in. It is the last I have for libstdc++ for now. Would you mind regenerating the HTML files from their XML sources as we had discussed a bit ago? Thanks, Gerald 2011-11-27 Gerald Pfeifer * doc/xml/manual/using.xml (Prerequisites): Refer to x86 instead of i386. Index: doc/xml/manual/using.xml === --- doc/xml/manual/using.xml(revision 181742) +++ doc/xml/manual/using.xml(working copy) @@ -1269,7 +1269,7 @@ to display how ad hoc this is: On Solaris, both -pthreads and -threads (with subtly different meanings) are honored. On OSF, -pthread and -threads (with subtly different meanings) are - honored. On GNU/Linux i386, -pthread is honored. On FreeBSD, + honored. On GNU/Linux x86, -pthread is honored. On FreeBSD, -pthread is honored. Some other ports use other switches. AFAIK, none of this is properly documented anywhere other than in ``gcc -dumpspecs'' (look at lib and cpp entries).
Re: [Patch, wwwdocs, committed] Update Fortran section in gcc-4.7/changes.html
On Wed, 16 Nov 2011, Tobias Burnus wrote: I have committed the following patch for http://gcc.gnu.org/gcc-4.7/changes.html#fortran How about the following minor editorial change on top? Gerald Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v retrieving revision 1.63 diff -u -r1.63 changes.html --- changes.html23 Nov 2011 23:27:28 - 1.63 +++ changes.html27 Nov 2011 02:47:25 - @@ -432,7 +432,7 @@ only the addresses are printed. http://gcc.gnu.org/wiki/Fortran2003Status";>Fortran 2003: - Generic interface name which have the same name as derived types + Generic interface names which have the same name as derived types are now supported, which allows to write constructor functions. Note that Fortran does not support static constructor functions; only default initialization or an explicit structure-constructor
Re: libtool update
On Mon, 21 Nov 2011, Andi Kleen wrote:
> It would be good if that could be done for 4.7. Then slim LTO
> bootstrap has a chance to work. Are there any reasons left not to
> update?

That's one reason. The FreeBSD changes (which we now get separately) another. The GNU/Linux vs Linux fix that RMS reported another.

Please. :-)

Gerald
Re: [PATCH 3/4] hppa: Install __sync libfuncs for linux.
On Sat, 12 Nov 2011, Dave Anglin wrote:
>> John, Richard, while you are at it, mind making this GNU/Linux per
>> guidance from RMS? (That'll save us work later on.)
> Yes. I don't want to participate in this controversy.

That was more a rhetorical question. :-\ Addressing this is not really optional, so I bit the bullet and applied the patch below myself.

Gerald

2011-11-27  Gerald Pfeifer

    * config/pa/pa-linux.h (TARGET_GAS): Remove comment.

Index: config/pa/pa-linux.h
===
--- config/pa/pa-linux.h (revision 181742)
+++ config/pa/pa-linux.h (working copy)
@@ -133,7 +133,6 @@
     } \
   while (0)

-/* Linux always uses gas.  */
 #undef TARGET_GAS
 #define TARGET_GAS 1
[wwwdocs] Document annotalysis branch in svn.html
Document the annotalysis branch, move the thread-annotations branch to
the list of inactive branches and refer to annotalysis.

As discussed with Delesley; applied.

Gerald

Index: svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.165
diff -u -r1.165 svn.html
--- svn.html	17 Aug 2011 01:24:27 - 1.165
+++ svn.html	27 Nov 2011 06:46:31 -
@@ -156,6 +156,14 @@
+ annotalysis
+ This branch contains the implementation of thread safety annotations
+ and analysis (http://gcc.gnu.org/wiki/ThreadSafetyAnnotation";>http://gcc.gnu.org/wiki/ThreadSafetyAnnotation).
+ The branch is maintained by
+ mailto:deles...@google.com";>Delesley Hutchins.
+ Patches and discussion on this branch should be marked with the tag
+ [annotalysis] in the subject line.
+
 struct-reorg-branch
 This branch is for the development of structure reorganization
 optimizations, including field reordering, structure splitting for
@@ -278,13 +286,6 @@
 Patches should be marked with the tag
 [stack] in the subject line.
 
- thread-annotations
- This branch contains the implementation of thread safety annotations
- and analysis (http://gcc.gnu.org/wiki/ThreadSafetyAnnotation";>http://gcc.gnu.org/wiki/ThreadSafetyAnnotation).
- The branch is maintained by mailto:l...@google.com";>Le-Chun Wu.
- Patches and discussion on this branch should be marked with the tag
- [thread-annotations] in the subject line.
-
 rtl-fud-branch
 This branch is for the development of factored use-def chains as
 an SSA form for RTL. Patches should be marked with the tag
@@ -657,6 +658,11 @@
 coordinate work with others. This branch was maintained by the folks
 at Apple. It has been superseded by apple-local-200502-branch.
 
+ thread-annotations
+ This branch contained the implementation of thread safety annotations
+ and analysis (http://gcc.gnu.org/wiki/ThreadSafetyAnnotation";>http://gcc.gnu.org/wiki/ThreadSafetyAnnotation).
+ It was superseded by the annotalysis branch.
+
 stree-branch
 This branch was for improving compilation speed and reducing
 memory use by representing declarations as small flat data structures whenever
[PR testsuite/47013] Fix SMS testsuite failures (re-submission)
Hello,

Attached is a new version of the patch.
Thanks to Dominique Dhumieres for testing on powerpc-apple-darwin9.
Tested on ppc64-redhat-linux with both -m32 and -m64, and on SPU.

OK for mainline?

Thanks,
Revital

testsuite/Changelog

	PR rtl-optimization/47013
	* gcc.dg/sms-2.c: Change scan-tree-dump-times and the code itself
	to preserve the function.
	* gcc.dg/sms-6.c: Add --param sms-min-sc=1. Add dg-options for
	powerpc*-*-*. Avoid superfluous spaces in dg-final.
	* gcc.dg/sms-3.c: Add --param sms-min-sc=1 and
	-fmodulo-sched-allow-regmoves flags.
	* gcc.dg/sms-7.c: Likewise. Remove dg-final for powerpc*-*-*
	and avoid superfluous spaces in dg-final for spu-*-*.
	* gcc.dg/sms-4.c: Add dg-options for powerpc*-*-*.
	* gcc.dg/sms-8.c: Add --param sms-min-sc=1. Add dg-options and
	change scan-rtl-dump-times for powerpc*-*-*.
	* gcc.dg/sms-5.c: Add --param sms-min-sc=1 flag, remove
	powerpc*-*-* from dg-final and avoid superfluous spaces in dg-final.
	* gcc.dg/sms-9.c: Remove -fno-auto-inc-dec.

Index: testsuite/gcc.dg/sms-2.c
===
--- testsuite/gcc.dg/sms-2.c	(revision 181698)
+++ testsuite/gcc.dg/sms-2.c	(working copy)
@@ -4,12 +4,11 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms" } */
 
-
+int th, h, em, nlwm, nlwS, nlw, sy;
 void
 fun (nb)
 int nb;
 {
- int th, h, em, nlwm, nlwS, nlw, sy;
 
   while (nb--)
     while (h--)
@@ -33,5 +32,5 @@ fun (nb)
 }
 }
 
-/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" { target spu-*-* powerpc*-*-* } } } */
+/* { dg-final { scan-rtl-dump-times "SMS loop many exits" 1 "sms" { target spu-*-* powerpc*-*-* } } } */
 /* { dg-final { cleanup-rtl-dump "sms" } } */
Index: testsuite/gcc.dg/sms-6.c
===
--- testsuite/gcc.dg/sms-6.c	(revision 181698)
+++ testsuite/gcc.dg/sms-6.c	(working copy)
@@ -1,5 +1,6 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms" } */
+/* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms --param sms-min-sc=1" } */
+/* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves" { target powerpc*-*-* } } */
 
 extern void abort (void);
 
@@ -43,7 +44,7 @@ int main()
 return 0;
 }
 
-/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" { target spu-*-* } } } */
-/* { dg-final { scan-rtl-dump-times "SMS succeeded" 3 "sms" { target powerpc*-*-* } } } */
+/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" { target spu-*-* } } } */
+/* { dg-final { scan-rtl-dump-times "SMS succeeded" 3 "sms" { target powerpc*-*-* } } } */
 /* { dg-final { cleanup-rtl-dump "sms" } } */
Index: testsuite/gcc.dg/sms-3.c
===
--- testsuite/gcc.dg/sms-3.c	(revision 181698)
+++ testsuite/gcc.dg/sms-3.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fmodulo-sched -funroll-loops -fdump-rtl-sms" } */
+/* { dg-options "-O2 -fmodulo-sched -funroll-loops -fdump-rtl-sms --param sms-min-sc=1 -fmodulo-sched-allow-regmoves" } */
 
 extern void abort (void);
 
Index: testsuite/gcc.dg/sms-7.c
===
--- testsuite/gcc.dg/sms-7.c	(revision 181698)
+++ testsuite/gcc.dg/sms-7.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fmodulo-sched -fstrict-aliasing -fdump-rtl-sms" } */
+/* { dg-options "-O3 -fmodulo-sched -fstrict-aliasing -fdump-rtl-sms -fmodulo-sched-allow-regmoves --param sms-min-sc=1" } */
 
 extern void abort (void);
 
@@ -44,7 +44,6 @@ int main()
 return 0;
 }
 
-/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" { target spu-*-* } } } */
-/* { dg-final { scan-rtl-dump-times "SMS succeeded" 3 "sms" { target powerpc*-*-* } } } */
+/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" { target spu-*-* } } } */
 /* { dg-final { cleanup-rtl-dump "sms" } } */
Index: testsuite/gcc.dg/sms-4.c
===
--- testsuite/gcc.dg/sms-4.c	(revision 181698)
+++ testsuite/gcc.dg/sms-4.c	(working copy)
@@ -1,6 +1,7 @@
 /* Inspired from sbitmap_a_or_b_and_c_cg function in sbitmap.c. */
 /* { dg-do run } */
 /* { dg-options "-O2 -fmodulo-sched -fmodulo-sched-allow-regmoves -fdump-rtl-sms" } */
+/* { dg-options "-O2 -fmodulo-sched -fmodulo-sched-allow-regmoves -fdump-rtl-sms --param sms-min-sc=1" { target powerpc*-*-* } } */
 
 extern void abort (void);
 
Index: testsuite/gcc.dg/sms-8.c
===
--- testsuite/gcc.dg/sms-8.c	(revision 181698)
+++ testsuite/gcc.dg/sms-8.c	(working copy)
@@ -3,7 +3,8 @@
 that was not fixed by reg-moves. */
 /* { dg-