[Bug other/54182] -fvisibility=hidden shouldn't be disabled with -fPIE -pie

2012-08-11 Thread wbrana at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54182

--- Comment #18 from wbrana  2012-08-11 07:01:18 UTC 
---
I can use it, but other people don't have to know about this bug.


[Bug target/54226] New: Executables compiled with -pie do not work on NetBSD/sparc or sparc

2012-08-11 Thread martin at netbsd dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54226

 Bug #: 54226
   Summary: Executables compiled with -pie do not work on
NetBSD/sparc or sparc
Classification: Unclassified
   Product: gcc
   Version: 4.5.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: mar...@netbsd.org


Due to a missing -fPIC when compiling libgcc crtstuff, the binaries are not
actually position independent.

On NetBSD, a "hello world" compiled with gcc -fpie -pie hello.c links to a
binary like this:

a.out: file format elf64-sparc

Program Header:
    PHDR off 0x0040 vaddr 0x0040 paddr 0x0040 align 2**3
         filesz 0x0150 memsz 0x0150 flags r-x
  INTERP off 0x0190 vaddr 0x0190 paddr 0x0190 align 2**0
         filesz 0x0017 memsz 0x0017 flags r--
    LOAD off 0x vaddr 0x paddr 0x align 2**20
         filesz 0x0b34 memsz 0x0b34 flags r-x
...

but of course the first section is not mapped at 0.

Fix is simple: set TARGET_LIBGCC2_CFLAGS in libgcc/config.host (via a new
additional ${tmake_file}) to -fPIC.

I can provide a simple patch doing that, however, I fail to see why this would
be a NetBSD speciality, i.e. why it works without that flag on other systems -
or maybe it just does not work there as well and we need a broader fix (same
solution, different match in config.host).
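
A quick way to see what the linker produced is to look at the e_type field of
the ELF header. The sketch below is only an illustration; the function names
and the SKETCH_* constants are made up for this example, while the offsets and
values come from the ELF specification. Note that ET_DYN only says the linker
emitted a PIE or shared object. It does not prove that every object, such as
crtstuff, was compiled with -fPIC.

```c
#include <stddef.h>

/* Offsets and values taken from the ELF specification.  */
enum { SKETCH_EI_DATA = 5, SKETCH_ELFDATA2LSB = 1, SKETCH_ELFDATA2MSB = 2 };
enum { SKETCH_ET_EXEC = 2, SKETCH_ET_DYN = 3 };

/* Return the e_type field of an ELF header, or -1 if HDR is not ELF.  */
int elf_file_type(const unsigned char *hdr, size_t len)
{
    if (len < 18 || hdr[0] != 0x7f || hdr[1] != 'E'
        || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    /* e_type is a 16-bit field at offset 16, in the file's byte order.  */
    if (hdr[SKETCH_EI_DATA] == SKETCH_ELFDATA2MSB)
        return (hdr[16] << 8) | hdr[17];
    if (hdr[SKETCH_EI_DATA] == SKETCH_ELFDATA2LSB)
        return (hdr[17] << 8) | hdr[16];
    return -1;
}

/* Nonzero if the header describes a shared object or PIE (ET_DYN).  */
int elf_is_pie_or_shared(const unsigned char *hdr, size_t len)
{
    return elf_file_type(hdr, len) == SKETCH_ET_DYN;
}
```

Feeding it the first 18 bytes of a.out would be enough; a big-endian sparc
binary takes the ELFDATA2MSB branch.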


[Bug other/54182] -fvisibility=hidden shouldn't be disabled with -fPIE -pie

2012-08-11 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54182

Jakub Jelinek  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution||INVALID

--- Comment #19 from Jakub Jelinek  2012-08-11 
07:32:52 UTC ---
It is not a bug and not everyone wants the same as you, e.g. you can't refer to
symbols in the PIE from plugins if you do.  When compiling PIE executables, you
shouldn't be using the -fPIC/-fpic flag, but -fPIE/-fpie instead, which works
similarly to -fvisibility=hidden in that references to those symbols are
cheaper (can use GOT/IP relative addressing on many architectures), but the
symbols are still exported.  Please stop reopening.
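
For readers who want hidden visibility only for selected symbols, GCC's
visibility attribute can be applied per declaration instead of changing the
global default. The following is a minimal sketch; all function names are
hypothetical and not taken from this report.

```c
/* Default visibility: the symbol stays eligible for export, so a plugin
   could still refer to it if the executable exports its symbols.  */
__attribute__((visibility("default")))
int host_api_version(void)
{
    return 1;
}

/* Hidden visibility: usable only from within this executable or
   shared object.  */
__attribute__((visibility("hidden")))
int internal_helper(int x)
{
    return 2 * x;
}

/* Both kinds of symbol are called the same way from inside the module.  */
int host_compute(int x)
{
    return internal_helper(x) + host_api_version();
}
```

Inside the module nothing changes for callers: host_compute(3) evaluates to 7
regardless of the visibility settings.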


[Bug other/54182] -fvisibility=hidden shouldn't be disabled with -fPIE -pie

2012-08-11 Thread wbrana at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54182

--- Comment #20 from wbrana  2012-08-11 07:39:37 UTC 
---
Why is -fvisibility=hidden enabled by default without -fPIE, but disabled with
-fPIE?


[Bug other/54182] -fvisibility=hidden shouldn't be disabled with -fPIE -pie

2012-08-11 Thread wbrana at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54182

wbrana  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|INVALID |

--- Comment #21 from wbrana  2012-08-11 07:51:34 UTC 
---
People who don't want -fvisibility=hidden should use -fvisibility=normal and
-fvisibility=hidden should be enabled by default.


[Bug target/54226] Executables compiled with -pie do not work on NetBSD/sparc or sparc

2012-08-11 Thread sch...@linux-m68k.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54226

Andreas Schwab  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2012-08-11
 Ever Confirmed|0   |1

--- Comment #1 from Andreas Schwab  2012-08-11 08:25:43 
UTC ---
gcc 4.5 is no longer maintained, please try out 4.6 at least.

*-*-netbsd* (in 4.5/4.6) already uses t-netbsd and t-libgcc-pic (in gcc/config)
which causes both libgcc and crtstuff to be compiled with -fPIC.  In 4.7 and
later this is set by t-crtstuff-pic and t-libgcc-pic in libgcc/config.


[Bug middle-end/54224] Bogus -Wunused-function warning with static function

2012-08-11 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54224

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org
Summary|[4.8 Regression] Bogus -Wunused-function warning with static function|Bogus -Wunused-function warning with static function

--- Comment #1 from Tobias Burnus  2012-08-11 
09:03:06 UTC ---
I just realized that I misremembered the TREE_PUBLIC patch: It has been
committed after 4.7 was branched (cf. PR40973 and PR40973 / PR52916). Thus, it
is not a regression.


Another, possibly related question is: Why is the function "hello_integer" not
inlined into its only user? The compiler does realize that the function is
local, as shown by the "t" in the "nm" output and by the ".constprop.0" suffix
with -O2/-O3/-Ofast:

 t __mod_say_hello_MOD_hello_integer.constprop.0
0080 T __mod_say_hello_MOD_say_hello


[Bug fortran/54221] Explicit private access specifier signals "unexpected defined but not used [-Wunused-function]" warning

2012-08-11 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54221

--- Comment #2 from Tobias Burnus  2012-08-11 
09:06:21 UTC ---
(In reply to comment #1)
> a) There is a bogus warning. I think that's a middle-end bug
See PR 54224.

> Patch for issue (b):
Seems to work. One might have to do likewise for module variables
(gfc_finish_var_decl), cf. PR52751. [The original code has been added for
PR40973 and fixed for PR52916.]


[Bug target/54222] [avr] Implement fixed-point support

2012-08-11 Thread gjl at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54222

Georg-Johann Lay  changed:

   What|Removed |Added

Attachment #27984 is obsolete|0   |1

--- Comment #2 from Georg-Johann Lay  2012-08-11 
10:21:08 UTC ---
Created attachment 27988
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27988
Tentative patch against 4.8

Discussed here:

http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00632.html


[Bug target/54222] [avr] Implement fixed-point support

2012-08-11 Thread gjl at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54222

Georg-Johann Lay  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #3 from Georg-Johann Lay  2012-08-11 
10:26:19 UTC ---
...and here are the respective ChangeLogs for attachment 27988 from comment #2:

libgcc/
PR target/54222
* config/avr/lib1funcs-fixed.S: New file.
* config/avr/lib1funcs.S: Include it.  Undefine some divmodsi
after they are used.
(neg2, neg4): New macros.
* config/avr/avr-lib.h (TA, UTA): Adjust according to gcc's
avr-modes.def.
* config/avr/t-avr (LIB1ASMFUNCS): Add: _fractqqsf, _fractuqqsf,
_fracthqsf, _fractuhqsf, _fracthasf, _fractuhasf, _fractsasf,
_fractusasf, _fractsfqq, _fractsfuqq, _fractsfhq, _fractsfuhq,
_fractsfha, _fractsfsa, _mulqq3, _muluqq3, _mulhq3, _muluhq3,
_mulha3, _muluha3, _mulsa3, _mulusa3, _divqq3, _udivuqq3, _divhq3,
_udivuhq3, _divha3, _udivuha3, _divsa3, _udivusa3.

gcc/
PR target/54222
* avr-modes.def (HA, SA, DA, TA, UTA): Adjust modes.
* avr/avr-fixed.md: New file.
* avr/avr.md: Include it.
(cc): Add: minus.
(adjust_len): Add: minus, minus64, ufract, sfract.
(ALL1, ALL2, ALL4, ORDERED234): New mode iterators.
(MOVMODE): Add: QQ, UQQ, HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
(MPUSH): Add: HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
(pushqi1, xload8_A, xload_8, movqi_insn, *reload_inqi, addqi3,
subqi3, ashlqi3, *ashlqi3, ashrqi3, lshrqi3, *lshrqi3, *cmpqi,
cbranchqi4, *cpse.eq): Generalize to handle all 8-bit modes in ALL1.
(*movhi, reload_inhi, addhi3, *addhi3, addhi3_clobber, subhi3,
ashlhi3, *ashlhi3_const, ashrhi3, *ashirhi3_const, lshrhi3,
*lshrhi3_const, *cmphi, cbranchhi4): Generalize to handle all
16-bit modes in ALL2.
(subhi3, casesi, strlenhi): Add clobber when expanding minus:HI.
(*movsi, *reload_insi, addsi3, subsi3, ashlsi3, *ashlsi3_const,
ashrsi3, *ashrhi3_const, *ashrsi3_const, lshrsi3, *lshrsi3_const,
*reversed_tstsi, *cmpsi, cbranchsi4): Generalize to handle all
32-bit modes in ALL4.
* avr-dimode.md (ALL8): New mode iterator.
(adddi3, adddi3_insn, adddi3_const_insn, subdi3, subdi3_insn,
subdi3_const_insn, cbranchdi4, compare_di2,
compare_const_di2, ashrdi3, lshrdi3, rotldi3, ashldi3_insn,
ashrdi3_insn, lshrdi3_insn, rotldi3_insn): Generalize to handle
all 64-bit modes in ALL8.
* config/avr/avr-protos.h (avr_to_int_mode): New prototype.
(avr_out_fract, avr_out_minus, avr_out_minus64): New prototypes.
* config/avr/avr.c (TARGET_FIXED_POINT_SUPPORTED_P): Return true.
(avr_scalar_mode_supported_p): Allow if ALL_FIXED_POINT_MODE_P.
(avr_builtin_setjmp_frame_value): Use gen_subhi3 and return new
pseudo instead of gen_rtx_MINUS.
(avr_print_operand, avr_operand_rtx_cost): Handle: CONST_FIXED.
(notice_update_cc): Handle: CC_MINUS.
(output_movqi): Generalize to handle respective fixed-point modes.
(output_movhi, output_movsisf, avr_2word_insn_p): Ditto.
(avr_out_compare, avr_out_plus_1): Also handle fixed-point modes.
(avr_assemble_integer): Ditto.
(output_reload_in_const, output_reload_insisf): Ditto.
(avr_out_fract, avr_out_minus, avr_out_minus64): New functions.
(avr_to_int_mode): New function.
(adjust_insn_length): Handle: ADJUST_LEN_SFRACT,
ADJUST_LEN_UFRACT, ADJUST_LEN_MINUS, ADJUST_LEN_MINUS64.
* config/avr/predicates.md (const0_operand): Allow const_fixed.
(const_operand, const_or_immediate_operand): New.
(nonmemory_or_const_operand): New.
* config/avr/constraints.md (Ynn, Y00, Y01, Y02, Ym1, Ym2, YIJ):
New constraints.


[Bug target/54226] Executables compiled with -pie do not work on NetBSD/sparc or sparc

2012-08-11 Thread martin at netbsd dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54226

Martin Husemann  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution||FIXED

--- Comment #2 from Martin Husemann  2012-08-11 
10:39:35 UTC ---
I see - that is fine, and sorry I did not check newer branches before.


[Bug fortran/48636] Enable more inlining with -O2 and higher

2012-08-11 Thread jamborm at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48636

--- Comment #16 from Martin Jambor  2012-08-11 
10:50:29 UTC ---
Author: jamborm
Date: Sat Aug 11 10:50:24 2012
New Revision: 190313

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190313
Log:
2012-08-11  Martin Jambor  

PR fortran/48636
* ipa-inline.h (condition): New fields offset, agg_contents and by_ref.
* ipa-inline-analysis.c (agg_position_info): New type.
(add_condition): New parameter aggpos, also store agg_contents, by_ref
and offset.
(dump_condition): Also dump aggregate conditions.
(evaluate_conditions_for_known_args): Also handle aggregate
conditions.  New parameter known_aggs.
(evaluate_properties_for_edge): Gather known aggregate contents.
(inline_node_duplication_hook): Pass NULL known_aggs to
evaluate_conditions_for_known_args.
(unmodified_parm): Split into unmodified_parm and unmodified_parm_1.
(unmodified_parm_or_parm_agg_item): New function.
(set_cond_stmt_execution_predicate): Handle values passed in
aggregates.
(set_switch_stmt_execution_predicate): Likewise.
(will_be_nonconstant_predicate): Likewise.
(estimate_edge_devirt_benefit): Pass new parameter known_aggs to
ipa_get_indirect_edge_target.
(estimate_calls_size_and_time): New parameter known_aggs, pass it
recursively to itself and to estimate_edge_devirt_benefit.
(estimate_node_size_and_time): New vector known_aggs, pass it to
functions which need it.
(remap_predicate): New parameter offset_map, use it to remap aggregate
conditions.
(remap_edge_summaries): New parameter offset_map, pass it recursively
to itself and to remap_predicate.
(inline_merge_summary): Also create and populate vector offset_map.
(do_estimate_edge_time): New vector of known aggregate contents,
passed to functions which need it.
(inline_read_section): Stream new fields of condition.
(inline_write_summary): Likewise.
* ipa-cp.c (ipa_get_indirect_edge_target): Also examine the aggregate
contents.  Let all local callers pass NULL for known_aggs.

* testsuite/gfortran.dg/pr48636.f90: New test.


Added:
trunk/gcc/testsuite/gfortran.dg/pr48636.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-cp.c
trunk/gcc/ipa-inline-analysis.c
trunk/gcc/ipa-inline.h
trunk/gcc/ipa-prop.h
trunk/gcc/testsuite/ChangeLog


[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2012-08-11 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033

Marc Glisse  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||glisse at gcc dot gnu.org
 Resolution||FIXED

--- Comment #29 from Marc Glisse  2012-08-11 
11:26:40 UTC ---
I guess both subscript and shuffle are in, so I can close. There is already PR
53094 to track the constexpr issues.


[Bug tree-optimization/54227] New: [4.8 Regression]: [alpha] Variable arguments handling broken by r190229

2012-08-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54227

 Bug #: 54227
   Summary: [4.8 Regression]: [alpha] Variable arguments handling
broken by r190229
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ubiz...@gmail.com
Target: alphaev68-pc-linux-gnu


r190229 [1] totally broke varargs on alpha, resulting in some 400 unexpected
failures. In c-torture/execute, they are:

Running target unix
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/20020412-1.c execution,  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/20041113-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/20041113-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/20041113-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/20041113-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/20041113-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/20041113-1.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/20041113-1.c execution,  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/920501-8.c execution,  -O1 
FAIL: gcc.c-torture/execute/920501-8.c execution,  -O2 
FAIL: gcc.c-torture/execute/920501-8.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/920501-8.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/920501-8.c execution,  -Os 
FAIL: gcc.c-torture/execute/920501-8.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/920501-8.c execution,  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/920625-1.c execution,  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/920726-1.c execution,  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/920908-1.c execution,  -O1 
FAIL: gcc.c-torture/execute/920908-1.c execution,  -O2 
FAIL: gcc.c-torture/execute/920908-1.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/920908-1.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/920908-1.c execution,  -Os 
FAIL: gcc.c-torture/execute/920908-1.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none 
FAIL: gcc.c-torture/execute/920908-1.c execution,  -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -O1 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -O2 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -O3 -fomit-frame-pointer 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -O3 -g 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -Os 
FAIL: gcc.c-torture/execute/931004-10.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-p

[Bug tree-optimization/54227] [4.8 Regression]: [alpha] Variable arguments handling broken by r190229

2012-08-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54227

Uros Bizjak  changed:

   What|Removed |Added

   Keywords||wrong-code
 CC||rguenth at gcc dot gnu.org
   Target Milestone|--- |4.8.0

--- Comment #1 from Uros Bizjak  2012-08-11 11:45:38 
UTC ---
Adding CC.


[Bug tree-optimization/54227] [4.8 Regression]: [alpha] Variable arguments handling broken by r190229

2012-08-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54227

--- Comment #2 from Uros Bizjak  2012-08-11 12:23:43 
UTC ---
Created attachment 27989
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27989
Preprocessed source of the failing test

The difference between r190228 and r190229 when compiled with -O2 starts in:

--- r190228.i.076t.stdarg   2012-08-11 14:22:44.0 +0200
+++ r190229.i.076t.stdarg   2012-08-11 14:22:33.0 +0200
@@ -5,7 +5,7 @@
 bb4 will be executed at most once for each va_start in bb2
 bb6 will be executed at most once for each va_start in bb2
 bb8 will be executed at most once for each va_start in bb2
-test: va_list escapes 0, needs to save 32 GPR units and 1 FPR units.
+test: va_list escapes 0, needs to save 32 GPR units and 0 FPR units.
 test (int x)
 {
   struct va_list ap;
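
For context, a function of the following shape relies on the prologue saving
the incoming argument registers, and the counts printed by the stdarg pass
control how much is saved. This is a minimal sketch, not the failing test
itself; sum_ints is a hypothetical name.

```c
#include <stdarg.h>

/* Sum COUNT int arguments passed after COUNT.  */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* reads from the register save area */
    va_end(ap);
    return total;
}
```

With a correct save area, sum_ints(3, 1, 2, 3) evaluates to 6. If the saves
are dropped, va_arg reads whatever happens to be in memory.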


[Bug tree-optimization/54227] [4.8 Regression]: [alpha] Variable arguments handling broken by r190229

2012-08-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54227

--- Comment #3 from Uros Bizjak  2012-08-11 13:02:18 
UTC ---
Strange, there is no difference between
r190228.i.150t.optimized and r190229.i.150t.optimized,

but:

--- r190228.i.152r.vregs   2012-08-11 14:53:20.0 +0200
+++ r190229.i.152r.vregs   2012-08-11 14:53:26.0 +0200
@@ -1,88 +1,72 @@

 ;; Function test (test, funcdef_no=0, decl_uid=1378, cgraph_uid=0)

-(note 1 0 8 NOTE_INSN_DELETED)
+(note 1 0 4 NOTE_INSN_DELETED)
 ;; basic block 2, loop depth 0, count 0, freq 1, maybe hot
 ;;  prev block 0, next block 4, flags: (NEW, REACHABLE, RTL, MODIFIED)
 ;;  pred:   ENTRY [100.0%]  (FALLTHRU)
-(note 8 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
-(insn 2 8 3 2 (set (mem/c:DI (plus:DI (reg/f:DI 31 AP)
-(const_int 56 [0x38])) [0 S8 A8])
-(reg:DI 17 $17)) 20041113-1.c:4 236 {*movdi}
- (nil))
-(insn 3 2 4 2 (set (mem/c:DI (plus:DI (reg/f:DI 31 AP)
-(const_int 64 [0x40])) [0 S8 A8])
-(reg:DI 18 $18)) 20041113-1.c:4 236 {*movdi}
- (nil))
-(insn 4 3 5 2 (set (mem/c:DI (plus:DI (reg/f:DI 31 AP)
-(const_int 72 [0x48])) [0 S8 A8])
-(reg:DI 19 $19)) 20041113-1.c:4 236 {*movdi}
- (nil))
-(insn 5 4 6 2 (set (mem/c:DI (plus:DI (reg/f:DI 31 AP)
-(const_int 80 [0x50])) [0 S8 A8])
-(reg:DI 20 $20)) 20041113-1.c:4 236 {*movdi}
- (nil))
-(insn 6 5 7 2 (set (reg/v:DI 99 [ x ])
+(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
+(insn 2 4 3 2 (set (reg/v:DI 99 [ x ])
 (reg:DI 16 $16 [ x ])) 20041113-1.c:4 236 {*movdi}
  (nil))

A whole pack of moves to memory is missing.


[Bug fortran/46897] [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign

2012-08-11 Thread pault at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46897

Paul Thomas  changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu.org |pault at gcc dot gnu.org

--- Comment #8 from Paul Thomas  2012-08-11 13:18:01 
UTC ---
Created attachment 27990
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27990
An almost fix for the PR

This takes the last wrinkles out of the version that Alessandro and I last
worked on.

The final step is to add the code for pointer components and to prepare the
testcases.

Cheers

Paul


[Bug tree-optimization/54227] [4.8 Regression]: [alpha] Variable arguments handling broken by r190229

2012-08-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54227

--- Comment #4 from Uros Bizjak  2012-08-11 13:19:51 
UTC ---
(In reply to comment #3)

> A whole pack of moves to memory is missing.

Ah, this is a consequence of the following difference in the stdarg dumps:

-test: va_list escapes 0, needs to save 32 GPR units and 1 FPR units.
+test: va_list escapes 0, needs to save 32 GPR units and 0 FPR units.

In alpha.c, around line 6100, we have:

  if (cfun->va_list_fpr_size & 1)
    {
      tmp = gen_rtx_MEM (BLKmode,
                         plus_constant (Pmode, virtual_incoming_args_rtx,
                                        (cum + 6) * UNITS_PER_WORD));
      MEM_NOTRAP_P (tmp) = 1;
      set_mem_alias_set (tmp, set);
      move_block_from_reg (16 + cum, tmp, count);
    }


[Bug middle-end/54228] New: [4.6 Regression] 22_locale/num_put/put/char/9780-2.cc

2012-08-11 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54228

 Bug #: 54228
   Summary: [4.6 Regression] 22_locale/num_put/put/char/9780-2.cc
Classification: Unclassified
   Product: gcc
   Version: 4.6.4
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: hjl.to...@gmail.com


On Linux/x86, revision 190306 gave

FAIL: 22_locale/num_put/put/char/9780-2.cc execution test

Revision 189763 is OK.


[Bug lto/54229] New: [4.8 Regression] LTO is broken

2012-08-11 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54229

 Bug #: 54229
   Summary: [4.8 Regression] LTO is broken
Classification: Unclassified
   Product: gcc
   Version: 4.6.4
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: hjl.to...@gmail.com
CC: hubi...@gcc.gnu.org


On Linux/x86, revision 190312:

http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00289.html

failed most of LTO tests in GCC testsuite:

/usr/local/x86_64-unknown-linux-gnu/bin/ld: invalid symbol kind found
collect2: error: ld returned 1 exit status
compiler exited with status 1
output is:
/usr/local/x86_64-unknown-linux-gnu/bin/ld: invalid symbol kind found
collect2: error: ld returned 1 exit status

FAIL: gcc.c-torture/execute/builtins/abs-1.c compilation,  -O2 -flto
-flto-partition=none

Revision 190310 is OK.


[Bug tree-optimization/54227] [4.8 Regression]: [alpha] Variable arguments handling broken by r190229

2012-08-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54227

--- Comment #5 from Uros Bizjak  2012-08-11 14:18:23 
UTC ---
Patch in testing:

--cut here--
Index: config/alpha/alpha.c
===
--- config/alpha/alpha.c    (revision 190311)
+++ config/alpha/alpha.c    (working copy)
@@ -5942,7 +5942,7 @@

   base = get_base_address (base);
   if (TREE_CODE (base) != VAR_DECL
-  || !bitmap_bit_p (si->va_list_vars, DECL_UID (base)))
+  || !bitmap_bit_p (si->va_list_vars, DECL_UID (base) + num_ssa_names))
 return false;

   offset = gimple_op (stmt, 1 + offset_arg);
--cut here--
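
Reading the diff, it looks as if the va_list_vars bitmap is shared between SSA
name versions and declarations, with declaration UIDs shifted past the SSA
names. That is an inference from the patch, not something stated in this
thread. The toy sketch below uses made-up names throughout and only
illustrates why looking up an unshifted index misses the bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy bitmap standing in for GCC's bitmap type; indices must be < 64.  */
typedef struct { uint64_t bits; } toy_bitmap;

void toy_set_bit(toy_bitmap *b, unsigned idx)
{
    b->bits |= UINT64_C(1) << idx;
}

bool toy_bit_p(const toy_bitmap *b, unsigned idx)
{
    return (b->bits >> idx) & 1;
}

/* Assumed layout: SSA name versions occupy [0, num_ssa_names) and
   declaration UIDs are stored after them.  */
unsigned decl_index(unsigned decl_uid, unsigned num_ssa_names)
{
    return decl_uid + num_ssa_names;
}
```

If a declaration with UID 5 is recorded while there are 10 SSA names, bit 15
is set and a query for bit 5 fails, which matches the shape of the fix above.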


[Bug target/20020] x86_64 - 128 bit structs not targeted to TImode

2012-08-11 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20020

--- Comment #15 from H.J. Lu  2012-08-11 14:37:30 
UTC ---
Do we have a run-time testcase?


[Bug target/54227] [4.8 Regression]: [alpha] Variable arguments handling broken by r190229

2012-08-11 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54227

Uros Bizjak  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
URL||http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00681.html
  Component|tree-optimization   |target
 Resolution||FIXED
 AssignedTo|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

--- Comment #6 from Uros Bizjak  2012-08-11 14:47:47 
UTC ---
Fixed.


[Bug other/54182] -fvisibility=hidden shouldn't be disabled with -fPIE -pie

2012-08-11 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54182

Andrew Pinski  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution||WONTFIX

--- Comment #22 from Andrew Pinski  2012-08-11 
15:05:39 UTC ---
Again, the defaults are this way because it has been this way since the
beginning of time.  It would be hard to change the default to hidden
visibility without changing a lot of programs' makefiles.


[Bug other/54182] -fvisibility=hidden shouldn't be disabled with -fPIE -pie

2012-08-11 Thread wbrana at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54182

--- Comment #23 from wbrana  2012-08-11 15:17:04 UTC 
---
Why do a lot of programs' makefiles have to be changed?
If this change breaks some program, developers of that program will fix it.
You don't have to.
New versions of GCC always break many programs;
see https://bugs.gentoo.org/show_bug.cgi?id=390247
This change will just be another one of many changes.


[Bug other/54182] -fvisibility=hidden shouldn't be disabled with -fPIE -pie

2012-08-11 Thread wbrana at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54182

wbrana  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|WONTFIX |

--- Comment #24 from wbrana  2012-08-11 15:37:47 UTC 
---
I compiled many programs with -fPIE -pie -fvisibility=hidden and almost all
work fine. Very few broken ones could be easily fixed by adding
-fvisibility=normal.


[Bug lto/54229] [4.8 Regression] LTO is broken

2012-08-11 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54229

Jan Hubicka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #1 from Jan Hubicka  2012-08-11 
15:58:42 UTC ---
Fixed by my patch.


[Bug libstdc++/54228] [4.6 Regression] 22_locale/num_put/put/char/9780-2.cc

2012-08-11 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54228

H.J. Lu  changed:

   What|Removed |Added

  Component|middle-end  |libstdc++

--- Comment #1 from H.J. Lu  2012-08-11 16:21:07 
UTC ---
It is caused by a newer glibc.  We need to backport
revision 182385 from trunk.


[Bug debug/54230] New: g++.dg/debug/dwarf2/pubnames-2.C failures on darwin12

2012-08-11 Thread howarth at nitro dot med.uc.edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54230

 Bug #: 54230
   Summary: g++.dg/debug/dwarf2/pubnames-2.C failures on darwin12
Classification: Unclassified
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: howa...@nitro.med.uc.edu


Created attachment 27991
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27991
assembly from failing m32 test of g++.dg/debug/dwarf2/pubnames-2.C on darwin12

The new g++.dg/debug/dwarf2/pubnames-2.C testcase produces the failures...

Native configuration is x86_64-apple-darwin12.0.0

=== g++ tests ===

Schedule of variations:
unix/-m32
unix/-m64

Running target unix/-m32
Using /sw/share/dejagnu/baseboards/unix.exp as board description file for
target.
Using /sw/share/dejagnu/config/unix.exp as generic interface file for target.
Using /Users/howarth/gcc-4.8-20120810/gcc/testsuite/config/default.exp as
tool-and-target-specific interface file.
Running
/Users/howarth/gcc-4.8-20120810/gcc/testsuite/g++.dg/debug/dwarf2/dwarf2.exp
...
FAIL: g++.dg/debug/dwarf2/pubnames-2.C scan-assembler .section\t.debug_pubnames
FAIL: g++.dg/debug/dwarf2/pubnames-2.C scan-assembler
"_GLOBAL__sub_I__ZN3one3c1vE0"+[ \t]+[#;]+[ \t]+external name
FAIL: g++.dg/debug/dwarf2/pubnames-2.C scan-assembler .section\t.debug_pubtypes

=== g++ Summary for unix/-m32 ===

# of expected passes            1250
# of unexpected failures        3
# of unsupported tests          6
Running target unix/-m64
Using /sw/share/dejagnu/baseboards/unix.exp as board description file for
target.
Using /sw/share/dejagnu/config/unix.exp as generic interface file for target.
Using /Users/howarth/gcc-4.8-20120810/gcc/testsuite/config/default.exp as
tool-and-target-specific interface file.
Running
/Users/howarth/gcc-4.8-20120810/gcc/testsuite/g++.dg/debug/dwarf2/dwarf2.exp
...
FAIL: g++.dg/debug/dwarf2/pubnames-2.C scan-assembler .section\t.debug_pubnames
FAIL: g++.dg/debug/dwarf2/pubnames-2.C scan-assembler
"_GLOBAL__sub_I__ZN3one3c1vE0"+[ \t]+[#;]+[ \t]+external name
FAIL: g++.dg/debug/dwarf2/pubnames-2.C scan-assembler .section\t.debug_pubtypes

=== g++ Summary for unix/-m64 ===

# of expected passes            1250
# of unexpected failures        3
# of unsupported tests          6

=== g++ Summary ===

# of expected passes            2500
# of unexpected failures        6
# of unsupported tests          12
/Users/howarth/work/gcc/testsuite/g++/../../g++  version 4.8.0 20120709
(experimental) (GCC) 

at both r189392 when the testcase was introduced as well as in current gcc
trunk.


[Bug target/54089] [SH] Refactor shift patterns

2012-08-11 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089

--- Comment #12 from Oleg Endo  2012-08-11 
20:25:45 UTC ---
(In reply to comment #9)
> (In reply to comment #8)
> > #define SH_DYNAMIC_SHIFT_COST (TARGET_DYNSHIFT ? 1 : 20)
> 
> Sounds reasonable.  Perhaps some historical reason for the original
> one, though I don't know why.  Could you run SCiBE tests with it?

I've checked result-size for -m2a-single -mb -O2 ...
In total there's a decrease of 1728 bytes, with a few cases where there are
increases.  I've briefly checked out why code gets bigger in those cases.  As
far as I can see at the moment, one reason is because right shifts are
dynamicalized (converted to dynamic shift insns) although it's not really
beneficial.

I think I'll try to brush up the right shifts patterns first, then try again.


[Bug c/54231] New: LTO generates code for the wrong CPU if different options used

2012-08-11 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

 Bug #: 54231
   Summary: LTO generates code for the wrong CPU if different
options used
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: thi...@kde.org


Created attachment 27992
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27992
Makefile

Summary:

Given the following code:

=
#include <immintrin.h>

void BZERO(char *ptr, size_t count)
{
    __m128i zero = _mm_set1_epi8(0);
    while (count--) {
        _mm_stream_si128((__m128i*)ptr, zero);
        ptr += 16;
    }
}
=

When compiled twice, once for SSE2 and once for AVX (so we get VEX-prefixed
code), under LTO gcc will generate both cases using VEX. See the attached
Makefile.

Long description:

A library or program that attempts to determine at runtime whether certain CPU
features, like AVX support, are available may need to compile different
compilation units with different compiler flags. The example I am providing is
a simple function that zeroes out a segment of memory aligned to 16 bytes. Both
variants come from the same source file, which is compiled twice, but that does
not seem to be relevant.

The idea is that each of these two functions would be called by a dispatcher
function, after verifying the result of CPUID.

However, if you compile the code with LTO (e.g., by make CFLAGS=-flto with the
attached Makefile), GCC will apply the highest CPU setting to all compilation
units. This defeats the runtime detection technique: in this example, both
functions will contain AVX code, which would end up being run on computers
without AVX support.

This might be intentional. If so, please close this bug report.

However, I would recommend that the behaviour be fixed: the ability to use LTO
with different CPU settings would allow for better inlining of the functions
and suppressing unnecessary function calls. The bzero example is a good one.


[Bug c/54231] LTO generates code for the wrong CPU if different options used

2012-08-11 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #1 from Thiago Macieira  2012-08-11 22:30:50 
UTC ---
Created attachment 27993
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27993
main.c


[Bug c/54231] LTO generates code for the wrong CPU if different options used

2012-08-11 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #2 from Thiago Macieira  2012-08-11 22:33:31 
UTC ---
When adding the following source file to the library build:

#include <stddef.h>
void bzero_sse2(char *, size_t);
void bzero_avx(char *, size_t);

extern int avx_supported;

void my_bzero(char *ptr, size_t n)
{
    if (avx_supported)
        bzero_avx(ptr, n);
    else
        bzero_sse2(ptr, n);
}


and compiling everything with -O2 -flto, GCC produces the following function:

02e0 :
 2e0:   mov    0x200171(%rip),%rax        # 200458
 2e7:   mov    (%rax),%eax
 2e9:   test   %eax,%eax
 2eb:   jne    310
 2ed:   test   %rsi,%rsi
 2f0:   vpxor  %xmm0,%xmm0,%xmm0
 2f4:   je     30e
 2f6:   nopw   %cs:0x0(%rax,%rax,1)
 300:   vmovntdq %xmm0,(%rdi)
 304:   add    $0x10,%rdi
 308:   sub    $0x1,%rsi
 30c:   jne    300
 30e:   repz retq
 310:   test   %rsi,%rsi
 313:   je     30e
 315:   vpxor  %xmm0,%xmm0,%xmm0
 319:   nopl   0x0(%rax)
 320:   vmovntdq %xmm0,(%rdi)
 324:   add    $0x10,%rdi
 328:   sub    $0x1,%rsi
 32c:   jne    320
 32e:   repz retq

As can be seen, VEX-prefixed instructions were used in both cases.


[Bug c/54231] LTO generates code for the wrong CPU if different options used

2012-08-11 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #3 from Thiago Macieira  2012-08-11 22:36:20 
UTC ---
Another note: it appears the Intel compiler has the same bug. It produces the
following code when compiling with -O2 -ipo:


0340 :
 340:   dec    %rsi
 343:   mov    0x2001ae(%rip),%rax        # 2004f8 <_DYNAMIC+0xe0>
 34a:   vpxor  %xmm0,%xmm0,%xmm0
 34e:   cmpl   $0x0,(%rax)
 351:   je     36c
 353:   cmp    $0x,%rsi
 357:   je     383
 359:   dec    %rsi
 35c:   vmovntdq %xmm0,(%rdi)
 360:   add    $0x10,%rdi
 364:   cmp    $0x,%rsi
 368:   jne    359
 36a:   jmp    383
 36c:   cmp    $0x,%rsi
 370:   je     383
 372:   dec    %rsi
 375:   vmovntdq %xmm0,(%rdi)
 379:   add    $0x10,%rdi
 37d:   cmp    $0x,%rsi
 381:   jne    372
 383:   retq
 384:   nopl   0x0(%rax,%rax,1)
 389:   nopl   0x0(%rax)

Note, additionally, that there's an instruction-scheduling issue: a VPXOR
instruction was scheduled before the test of the CPU features.


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-11 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |lto
           Severity|normal                      |enhancement

--- Comment #4 from Andrew Pinski  2012-08-11 
22:39:48 UTC ---
Basically the target attribute should come into play but that is currently not
really supported even without LTO.
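
For readers unfamiliar with the target attribute mentioned above, here is a
minimal sketch of what it looks like at the source level. It only illustrates
the syntax: per the comment above it was not reliably supported at the time,
and nothing here shows that it avoids the LTO problem. The names zero_blocks,
zero_blocks_avx, zero_blocks_baseline and cpu_has_avx are made up for this
sketch, and the bodies use memset instead of the intrinsics from the report so
that the sketch also builds on non-x86 hosts. __builtin_cpu_supports is a
builtin that only appeared in later GCC releases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Only apply the attribute where it is understood (GCC-compatible, x86). */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_TARGET_ATTR 1
#define TARGET_AVX __attribute__((target("avx")))
#else
#define HAVE_X86_TARGET_ATTR 0
#define TARGET_AVX
#endif

/* Baseline variant, built with the command-line ISA settings. */
static void zero_blocks_baseline(char *ptr, size_t count)
{
    memset(ptr, 0, count * 16);
}

/* Variant the compiler may build with AVX enabled, in the same file. */
TARGET_AVX
static void zero_blocks_avx(char *ptr, size_t count)
{
    memset(ptr, 0, count * 16);
}

/* Hypothetical helper: returns 1 if the running CPU reports AVX. */
static int cpu_has_avx(void)
{
#if HAVE_X86_TARGET_ATTR
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#else
    return 0;
#endif
}

/* Dispatcher: count is the number of 16-byte blocks to clear. */
void zero_blocks(char *ptr, size_t count)
{
    assert(ptr != NULL);
    if (cpu_has_avx())
        zero_blocks_avx(ptr, count);
    else
        zero_blocks_baseline(ptr, count);
}
```

The AVX variant is only ever reached through the dispatcher, so it is not
executed on a CPU that lacks AVX.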


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-11 Thread steven at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #5 from Steven Bosscher  2012-08-11 
22:46:31 UTC ---
"Fixing" this in the compiler isn't straight-forward. The _mm_stream functions
are just wrappers around builtin functions. It may work correctly if you put
the bzero functions in two separate files or call the builtins directly (a
variant of __builtin_ia32_movntdq in this case), but the way your BZERO is
defined, I don't think it will ever work.

Have you considered using ifunc?


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-11 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #6 from Thiago Macieira  2012-08-11 23:23:39 
UTC ---
(In reply to comment #5)
> "Fixing" this in the compiler isn't straight-forward. The _mm_stream functions
> are just wrappers around builtin functions. It may work correctly if you put
> the bzero functions in two separate files or call the builtins directly (a
> variant of __builtin_ia32_movntdq in this case), but the way your BZERO is
> defined, I don't think it will ever work.

They *are* in separate files already. Calling the builtin directly instead of
the intrinsic wrapper might work, but I did not test it because it's not
acceptable, as the code would be GCC-specific.

> Have you considered using ifunc?

IFUNC is also irrelevant: in order to use it, I need to have two separate
source files which are compiled with different compiler settings, so we end up
where we started: the bzero_sse2() function will have AVX code.
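
To make the point about dispatch concrete, here is a rough sketch of the
"resolve once, then call directly" idea behind IFUNC. It is deliberately NOT
IFUNC itself: it uses a plain function pointer resolved on first call, as a
portable stand-in. All names (zero_fn, zero_baseline, zero_wide,
wide_supported, my_zero) and the call counters are invented for this sketch.
As noted above, this kind of dispatch does nothing about the LTO problem; the
two variants still have to be compiled with different settings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*zero_fn)(char *, size_t);

/* Call counters, only here to make the dispatch observable. */
static int calls_baseline, calls_wide;

/* Stand-ins for the two variants built with different CPU settings. */
static void zero_baseline(char *ptr, size_t n)
{
    calls_baseline++;
    memset(ptr, 0, n * 16);
}

static void zero_wide(char *ptr, size_t n)
{
    calls_wide++;
    memset(ptr, 0, n * 16);
}

/* Hypothetical feature flag; a real program would fill it in from CPUID. */
int wide_supported = 0;

static void zero_resolve(char *ptr, size_t n);

/* Starts out pointing at the resolver, which replaces itself on first use. */
static zero_fn zero_impl = zero_resolve;

static void zero_resolve(char *ptr, size_t n)
{
    zero_impl = wide_supported ? zero_wide : zero_baseline;
    zero_impl(ptr, n);
}

/* Public entry point: n is the number of 16-byte blocks to clear. */
void my_zero(char *ptr, size_t n)
{
    assert(ptr != NULL);
    zero_impl(ptr, n);
}
```

The choice is made on the first call and then sticks, so later changes to the
flag have no effect. The sketch is not thread-safe as written.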


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-11 Thread steven at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

Steven Bosscher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-08-12
                 CC|                            |uros at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #7 from Steven Bosscher  2012-08-12 
00:27:46 UTC ---
Actually, using the builtins also doesn't work. The instruction patterns are
the same and GCC's recog picks the "best" available one. E.g.:

#(insn:TI 14 12 27 3 (set (reg:V2DI 21 xmm0 [66])
#        (const_vector:V2DI [
#                (const_int 0 [0])
#                (const_int 0 [0])
#            ])) /home/stevenb/devel/build-test/gcc/include/emmintrin.h:1424
#     {*avx_movv2di_internal}
#     (expr_list:REG_EQUIV (const_vector:V2DI [
#                (const_int 0 [0])
#                (const_int 0 [0])
#            ])
#        (nil)))
        vpxor   %xmm0, %xmm0, %xmm0     # 14 *avx_movv2di_internal/1 [length = 4]

vs.

#(insn:TI 14 12 27 3 (set (reg:V2DI 21 xmm0 [66])
#        (const_vector:V2DI [
#                (const_int 0 [0])
#                (const_int 0 [0])
#            ])) /home/stevenb/devel/build-test/gcc/include/emmintrin.h:1424
#     1124 {*movv2di_internal}
#     (expr_list:REG_EQUIV (const_vector:V2DI [
#                (const_int 0 [0])
#                (const_int 0 [0])
#            ])
#        (nil)))
        pxor    %xmm0, %xmm0            # 14 *movv2di_internal/1 [length = 4]

These insns just look the same to GCC, so even if the sse2 builtin expander is
used, the AVX instruction is selected.

Thus a bug, confirmed. Adding i386 guy to CC.


[Bug target/54232] New: For x86 PIC code, ebx should be spillable

2012-08-11 Thread bugdal at aerifal dot cx
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54232

 Bug #: 54232
   Summary: For x86 PIC code, ebx should be spillable
Classification: Unclassified
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bug...@aerifal.cx


When generating x86 position-independent code, GCC permanently reserves EBX as
the GOT register. Even in functions that make no use of global data, EBX cannot
be used as a general-purpose register. This both slows down code that's under
register pressure and forces inline asm that needs an argument in EBX (e.g.
syscalls) to use ugly temp register shuffling to make gcc happy.

My proposal, and I understand this may be difficult but I still think it's
worth stating, is that the GOT register EBX should be considered spillable like
any other register. In particular, the following consequences should result:

- If a function is not using the GOT (not accessing global or file-local static
symbols or making non-hidden function calls), all GP registers can be used just
like in non-PIC code. A pure function with no

- If a function is only using a "GOT register" for PC-relative data access, it
should not go to the trouble of actually adjusting the PC obtained to point to
the GOT. Instead it should generate addressing relative to the PC address that
gets loaded into the register.

- In a function that's not making calls through the PLT (i.e. a leaf function
or a function that only calls hidden/protected functions), the "GOT register"
need not be EBX. Any register could be used, and in fact in some trivial
functions, using a call-clobbered register would avoid having to save/restore
EBX on the stack.

- In any function where EBX or any other register is being used to store the
GOT address, it should be spillable (either pushed to stack, or simply
discarded and reloaded with the standard load sequence when it's needed again
later) just like a register caching any other data, so that under register
pressure or inline asm constraints, the register becomes temporarily available
for another use.

It seems like all of these very positive consequences would fall out of just
treating GOT and GOT-relative addressing as address expressions based on the
GOT address, which could be cached in registers just like any other expression,
instead of hard-coding the GOT register as a special reserved register. The
only remaining special-case/hard-coding would be treating the need for EBX to
contain the GOT address when making calls through the PLT as an extra
constraint of the function call ABI.
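
As an illustration of the "ugly temp register shuffling" mentioned at the top
of this report, here is a sketch of the common workaround for the CPUID
instruction, which writes EBX. The wrapper name cpuid_query and its return
convention are invented for this sketch; it is not taken from any particular
library. On 32-bit x86 PIC builds the asm swaps EBX through a scratch register
so the compiler never sees EBX as an operand; elsewhere on x86 it can name EBX
directly, and on other architectures it simply reports failure.

```c
#include <assert.h>
#include <stddef.h>

/* Fills regs[0..3] with EAX, EBX, ECX, EDX for the given leaf/subleaf.
 * Returns 1 if CPUID was executed, 0 on non-x86 builds. */
int cpuid_query(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
    assert(regs != NULL);
#if defined(__GNUC__) && defined(__i386__) && defined(__PIC__)
    /* EBX may be reserved as the GOT register, so it cannot be listed as
     * an output. Swap it with a scratch register around the instruction. */
    __asm__ volatile ("xchgl %%ebx, %1\n\t"
                      "cpuid\n\t"
                      "xchgl %%ebx, %1"
                      : "=a"(regs[0]), "=&r"(regs[1]),
                        "=c"(regs[2]), "=d"(regs[3])
                      : "0"(leaf), "2"(subleaf));
    return 1;
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* No reserved EBX here: the "b" constraint can be used directly. */
    __asm__ volatile ("cpuid"
                      : "=a"(regs[0]), "=b"(regs[1]),
                        "=c"(regs[2]), "=d"(regs[3])
                      : "0"(leaf), "2"(subleaf));
    return 1;
#else
    (void)leaf;
    (void)subleaf;
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}
```

If EBX were spillable as proposed, the first branch would be unnecessary and
the second form could be used in PIC code as well.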


[Bug target/54232] For x86 PIC code, ebx should be spillable

2012-08-11 Thread bugdal at aerifal dot cx
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54232

--- Comment #1 from Rich Felker  2012-08-12 04:57:07 
UTC ---
By the way, the code that inspired this report is crypt_blowfish.c and the
corresponding asm by Solar Designer. We've been experimenting with performance
characteristics while integrating it into musl libc, and I found that the C
code is just as fast as the hand-optimized asm on the machine I was testing it
on when using static libraries without -fPIC, but takes over 30% more runtime
when built with -fPIC due to running out of registers.