Re: PC-relative TLS support

2019-11-10 Thread Alan Modra
On Wed, Aug 21, 2019 at 09:55:28PM +0930, Alan Modra wrote:
> On Mon, Aug 19, 2019 at 07:45:19AM -0500, Segher Boessenkool wrote:
> > But if you think we can remove the !TARGET_TLS_MARKERS everywhere it
> > is relevant at all, now is the time, patches very welcome, it would be
> > a nice cleanup :-)  Needs testing everywhere of course, but now is
> > stage 1 :-)
> 
> This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
> -mno-tls-markers) disappear as valid options too, because I figure
> they haven't been used too much except by people testing the
> compiler.  Bootstrapped and regression tested powerpc64le-linux and
> powerpc-ibm-aix7.1.3.0 (on gcc111).  I believe powerpc*-darwin doesn't
> support TLS.
> 
> Requiring an 8 year old binutils-2.20 shouldn't be that onerous.
> 
> Note that this patch doesn't remove the configure test to set
> HAVE_AS_TLS_MARKERS.  I was wondering whether I ought to hook that
> into a "sorry, your assembler is too old" error?

https://gcc.gnu.org/ml/gcc-patches/2019-08/msg01487.html

I should have pinged this before now, and really I think the following
additional patch makes more sense than any sort of sorry message.
Mostly people will be running the assembler anyway so will discover
quickly that their assembler is too old.

* configure.ac (HAVE_AS_TLS_MARKERS): Delete test.
* configure: Regenerate.
* config.in: Regenerate.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index 5f32fd4d5e4..44d816630e9 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4811,12 +4811,6 @@ LCF0:
   [AC_DEFINE(HAVE_AS_GNU_ATTRIBUTE, 1,
  [Define if your assembler supports .gnu_attribute.])])
 
-gcc_GAS_CHECK_FEATURE([tls marker support],
-  gcc_cv_as_powerpc_tls_markers, [2,20,0],,
-  [ bl __tls_get_addr(x@tlsgd)],,
-  [AC_DEFINE(HAVE_AS_TLS_MARKERS, 1,
- [Define if your assembler supports arg info for __tls_get_addr.])])
-
 gcc_GAS_CHECK_FEATURE([prologue entry point marker support],
   gcc_cv_as_powerpc_entry_markers, [2,26,0],-a64 --fatal-warnings,
   [ .reloc .,R_PPC64_ENTRY; nop],,

>   * config/rs6000/rs6000-protos.h (rs6000_output_tlsargs): Delete.
>   * config/rs6000/rs6000.c (rs6000_output_tlsargs): Delete.
>   (rs6000_legitimize_tls_address): Remove !TARGET_TLS_MARKERS code.
>   (rs6000_call_template_1): Delete TARGET_TLS_MARKERS test and
>   allow other UNSPECs besides UNSPEC_TLSGD and UNSPEC_TLSLD.
>   (rs6000_indirect_call_template_1): Likewise.
>   (rs6000_pltseq_template): Likewise.
>   (rs6000_opt_vars): Remove "tls-markers" entry.
>   * config/rs6000/rs6000.h (TARGET_TLS_MARKERS): Don't define.
>   (IS_NOMARK_TLSGETADDR): Likewise.
>   * config/rs6000/rs6000.md (tls_gd): Replace TARGET_TLS_MARKERS
>   with !TARGET_XCOFF.
>   (tls_gd_high, tls_gd_low): Likewise.
>   (tls_ld, tls_ld_high, tls_ld_low): Likewise.
>   (pltseq_plt_pcrel): Likewise.
>   (call_value_local32): Remove IS_NOMARK_TLSGETADDR predicate test.
>   (call_value_local64): Likewise.
>   (call_value_indirect_nonlocal_sysv): Remove IS_NOMARK_TLSGETADDR
>   output and length attribute sub-expression.
>   (call_value_nonlocal_sysv),
>   (call_value_nonlocal_sysv_secure),
>   (call_value_local_aix, call_value_nonlocal_aix),
>   (call_value_indirect_aix, call_value_indirect_elfv2),
>   (call_value_indirect_pcrel): Likewise.
>   * config/rs6000/rs6000.opt (mtls-markers): Delete.
>   * doc/install.texi (powerpc-*-*): Require binutils-2.20.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PC-relative TLS support

2019-11-11 Thread Alan Modra
On Mon, Nov 11, 2019 at 05:56:47AM -0600, Segher Boessenkool wrote:
> On Wed, Aug 21, 2019 at 09:55:28PM +0930, Alan Modra wrote:
> > This patch removes !TARGET_TLS_MARKERS support.  -mtls-markers (and
> > -mno-tls-markers) disappear as valid options too, because I figure
> > they haven't been used too much except by people testing the
> > compiler.
> 
> Okay.
> 
> > (rs6000_call_template_1): Delete TARGET_TLS_MARKERS test and
> > allow other UNSPECs besides UNSPEC_TLSGD and UNSPEC_TLSLD.
> 
> Why is that?  Should we allow the other code that can happen and keep
> the gcc_unreachable?  Or do we know that no other code can happen here
> ever, and the extra documentation isn't useful?

The code in question is just printing the @tlsgd or @tlsld arg.  I
don't see any point in asserting that no other UNSPEC could ever be
used in a call operand.  Other places dealing with UNSPEC_TLSGD
and UNSPEC_TLSLD don't check, and if another UNSPEC is invented for
some fancy future call insn it's quite unlikely to want to output
anything here.

(I don't think I found such an UNSPEC already extant..)
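
As a rough illustration of the point above, the marker printing can be pictured as a lookup that yields a suffix for the two TLS UNSPECs and nothing for any other code. This is only a sketch: the enum values and both function names are hypothetical stand-ins, not the rs6000.c code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the UNSPEC codes a call operand might carry.  */
enum call_unspec
{
  CALL_UNSPEC_NONE,
  CALL_UNSPEC_TLSGD,
  CALL_UNSPEC_TLSLD,
  CALL_UNSPEC_OTHER
};

/* Return the marker to print after the TLS symbol, or "" if there is
   nothing to print.  Codes other than the two TLS ones fall through to
   "" instead of asserting.  */
const char *
tls_marker_suffix (enum call_unspec u)
{
  switch (u)
    {
    case CALL_UNSPEC_TLSGD:
      return "@tlsgd";
    case CALL_UNSPEC_TLSLD:
      return "@tlsld";
    default:
      return "";
    }
}

/* Format "bl callee" or "bl callee(sym@marker)" into BUF.  Returns the
   snprintf result.  */
int
format_call (char *buf, size_t len, const char *callee,
             const char *sym, enum call_unspec u)
{
  const char *suffix = tls_marker_suffix (u);

  if (suffix[0] == '\0')
    return snprintf (buf, len, "bl %s", callee);
  return snprintf (buf, len, "bl %s(%s%s)", callee, sym, suffix);
}
```

With CALL_UNSPEC_TLSGD, callee __tls_get_addr and symbol x, this formats the same text the deleted configure test fed to the assembler; with any other code it prints a plain call.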

> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -9413,7 +9413,7 @@ (define_insn_and_split "*tls_gd"
> > (unspec:P [(match_operand:P 1 "rs6000_tls_symbol_ref" "")
> >(match_operand:P 2 "gpc_reg_operand" "b")]
> >   UNSPEC_TLSGD))]
> > -  "HAVE_AS_TLS && TARGET_TLS_MARKERS"
> > +  "HAVE_AS_TLS && !TARGET_XCOFF"
> 
> Should that be TARGET_ELF instead?

Either should work.  So, yes, probably better with TARGET_ELF.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC V9 patches, Add the PCREL_OPT optimization

2019-12-04 Thread Alan Modra
On Mon, Dec 02, 2019 at 06:07:23PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Nov 15, 2019 at 07:17:34PM -0500, Michael Meissner wrote:
> > This series of patches adds the PCREL_OPT optimization for the PC-relative
> > support in the PowerPC compiler.
> > 
> > This optimization converts a single load or store of an external variable to use
> > the R_PPC64_PCREL_OPT relocation.
> > 
> > For example, a normal load of an external int variable (with -mcpu=future)
> > would generate:
> > 
> > PLD 9,ext_symbol@got@pcrel(0),1
> > LWA 10,0(9)
> > 
> > That is, load the address of 'ext_symbol' into register 9.  If 'ext_symbol' is

If you want to show a load of an int ext_symbol, the example should be
 pld 9,ext_symbol@got@pcrel
 lwz 10,0(9)

Add "(0),1" to the end of the pld line to show the optional operands.

> > defined in another module in the main program, and the current module is also
> > in the main program, the linker will optimize this to:
> 
> What does "module" mean?  Translation unit?  Object file?  And "main
> program" is what ELF calls "executable", right?

"relocatable object file" and "executable or shared library"
respectively.  If the linker is creating a shared library, and
ext_symbol is local to that library by virtue of non-default symbol
visibility or symbol versioning, then the optimisation will be done
for shared libraries too.

> You don't need to say that here, it is not something the compiler can do
> anything about.  You could just say "if possible, the linker will..." etc.
> 
> > PADDI 9,ext_symbol(0),1
> > LWZ 10,0(9)
> 
> I don't think it will change an lwa insn to an lwz?  Probably it should
> be lwz throughout?

Yes, see above.  Changing got indirect to pc-relative (or toc-relative
for that matter) is something the linker does even in the absence of
PCREL_OPT relocs.  That's what Mike was trying to show with the above
transformation, modulo typos.

> Is that "paddi" syntax correct?  I think you might mean
> "paddi 9,0,ext_symbol,1", aka "pla 9,ext_symbol"?

No, it's not correct but your corrections aren't correct either.  :)

 pla 9,ext_symbol@pcrel  # add (0),1 for optional operands
or
 paddi 9,0,ext_symbol@pcrel,1

You'll get the wrong reloc without @pcrel.

> > If either the definition is not in the main program or we are linking for a
> > shared library, the linker will create an address in the .got section and do a
> > PLD of it:
> > 
> > .section .got
> > .got.ext_symbol:
> > .quad ext_symbol
> > 
> > .section .text
> > PLD 9,.got.ext_symbol(0),1
> > LWZ 10,0(9)
> 
> Like what the user wrote, sure -- the linker does not optimise it, does
> not change it?  Or am I missing something?
> 
> > If the only use of the GOT address is a single load and store, we can optimize
> > this further:
> 
> A single load *or* store.
> 
> > PLD 9,ext_symbol@got@pcrel(0),1
> > .Lpcrel1:
> > .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
> > LWZ 10,0(9)
> > 
> > In this case, if the variable is defined in another module for the main
> > program, and we are linking for the main program, the linker will transform
> > this to:
> > 
> > PLWZ 10,ext_symbol@pcrel(0),1
> > NOP
> > 
> > There can be arbitrary instructions between the PLD and the LWA (or STW).
> 
> ... because that is what that relocation means.  The compiler still has
> to make sure that any such insns should not prevent this transform.

Right, and that's the hard part of this transformation.

> > For either loads or stores, register 9 must only be used in the load or store,
> > and must die at that point.
> > 
> > For loads, there must be no reference to register 10 between the PLD and the
> > LWZ.  For a store, register 10 must be live at the PLD instruction, and must
> > not be modified between the PLD and the STW instructions.
> 
> "No reference"...  Nothing indirect either (like from a function call,
> or simply some insn that does not name the register directly).  Or code
> like
> 
>   pld 9,ext_symbol@got@pcrel(0),1 ; .Lpcrel1:
>   .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
>   b 2f
> 
> here: # some code that does not explicitly reference r10 here,
>   # but r10 is live here nevertheless, and is used later
>   b somewhere_else
> 
> 2:lwz 10,0(9)
> 
> complicates your analysis, too.  So something DF is needed here, or
> there are lots and lots and lots of cases to look out for.
> 
> 
> Segher
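
The register constraints quoted above can be sketched for the easy case only: a straight-line sequence with no branches or calls. The model below is an assumption made for illustration (each instruction is reduced to bitmasks of the GPRs it reads and writes, and the struct, macro and function names are hypothetical); it is not GCC RTL and it does not attempt the dataflow analysis that the branch example calls for.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One instruction in a straight-line sequence: bit N set means GPR N is
   read (uses) or written (defs).  Hypothetical model, not GCC RTL.  */
struct insn
{
  uint32_t uses;
  uint32_t defs;
};

#define REG(n) (UINT32_C (1) << (n))

/* Check the load case: SEQ[PLD] sets ADDR, SEQ[LOAD] loads VAL through
   ADDR.  Registers past the end of SEQ are assumed dead.  */
bool
pcrel_opt_load_ok (const struct insn *seq, size_t n,
                   size_t pld, size_t load, int addr, int val)
{
  if (pld >= load || load >= n)
    return false;

  /* Between the pld and the load, neither register may be touched.  */
  for (size_t i = pld + 1; i < load; i++)
    if ((seq[i].uses | seq[i].defs) & (REG (addr) | REG (val)))
      return false;

  /* The address register must die at the load: no later read before it
     is overwritten.  */
  for (size_t i = load + 1; i < n; i++)
    {
      if (seq[i].uses & REG (addr))
        return false;
      if (seq[i].defs & REG (addr))
        break;
    }
  return true;
}
```

A sequence that branches around the load, as in the example with label 2, is outside what this check can see, which is the reason something based on DF is suggested.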

-- 
Alan Modra
Australia Development Lab, IBM


Re: PowerPC V9 patches, Add the PCREL_OPT optimization

2019-12-04 Thread Alan Modra
On Wed, Dec 04, 2019 at 05:16:05PM -0600, Segher Boessenkool wrote:
> >  pla 9,ext_symbol@pcrel  # add (0),1 for optional operands
> 
> pla does not have optional operands like that?

It does, just like load/store insns.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 0/2] mmap: Avoid the sanitizer configure check failure

2024-04-09 Thread Alan Modra
On Tue, Apr 09, 2024 at 07:24:33AM -0700, H.J. Lu wrote:
> Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid
> the sanitizer configure check failure.

OK for binutils.  (I just fixed my local copy of autoconf so I
wouldn't run into this again.)  The proper fix of course is to update
autotools to something more recent.

-- 
Alan Modra
Australia Development Lab, IBM


powerpc64le multilibs and multiarch dir

2013-08-21 Thread Alan Modra
ion 0)
+++ gcc/config/rs6000/t-linux64bele (revision 0)
@@ -0,0 +1,7 @@
+#rs6000/t-linux64end
+
+MULTILIB_OPTIONS+= mlittle
+MULTILIB_DIRNAMES   += le
+MULTILIB_OSDIRNAMES += $(subst =,.mlittle=,$(subst lible32,lib32le,$(subst lible64,lib64le,$(subst lib,lible,$(subst -linux,le-linux,$(MULTILIB_OSDIRNAMES))))))
+MULTILIB_OSDIRNAMES += $(subst $(if $(findstring 64,$(target)),m64,m32).,,$(filter $(if $(findstring 64,$(target)),m64,m32).mlittle%,$(MULTILIB_OSDIRNAMES)))
+MULTILIB_MATCHES:= ${MULTILIB_MATCHES_ENDIAN}
Index: gcc/config/rs6000/t-linux64lebe
===
--- gcc/config/rs6000/t-linux64lebe (revision 0)
+++ gcc/config/rs6000/t-linux64lebe (revision 0)
@@ -0,0 +1,7 @@
+#rs6000/t-linux64leend
+
+MULTILIB_OPTIONS+= mbig
+MULTILIB_DIRNAMES   += be
+MULTILIB_OSDIRNAMES += $(subst =,.mbig=,$(subst libbe32,lib32be,$(subst libbe64,lib64be,$(subst lib,libbe,$(subst le-linux,-linux,$(MULTILIB_OSDIRNAMES))))))
+MULTILIB_OSDIRNAMES += $(subst $(if $(findstring 64,$(target)),m64,m32).,,$(filter $(if $(findstring 64,$(target)),m64,m32).mbig%,$(MULTILIB_OSDIRNAMES)))
+MULTILIB_MATCHES:= ${MULTILIB_MATCHES_ENDIAN}

-- 
Alan Modra
Australia Development Lab, IBM


Re: powerpc64le multilibs and multiarch dir

2013-08-22 Thread Alan Modra
On Thu, Aug 22, 2013 at 10:06:48AM -0400, David Edelsohn wrote:
> On Wed, Aug 21, 2013 at 11:57 PM, Alan Modra wrote:
> 
> > Index: gcc/config/rs6000/t-linux64
> > ===
> > --- gcc/config/rs6000/t-linux64 (revision 201834)
> > +++ gcc/config/rs6000/t-linux64 (working copy)
> > @@ -25,8 +25,8 @@
> >  # it doesn't tell anything about the 32bit libraries on those systems.  Set
> >  # MULTILIB_OSDIRNAMES according to what is found on the target.
> >
> > -MULTILIB_OPTIONS= m64/m32
> > -MULTILIB_DIRNAMES   = 64 32
> > -MULTILIB_EXTRA_OPTS = fPIC
> > -MULTILIB_OSDIRNAMES= ../lib64$(call if_multiarch,:powerpc64-linux-gnu)
> > -MULTILIB_OSDIRNAMES+= $(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call if_multiarch,:powerpc-linux-gnu)
> > +MULTILIB_OPTIONS:= m64/m32
> > +MULTILIB_DIRNAMES   := 64 32
> > +MULTILIB_EXTRA_OPTS :=
> > +MULTILIB_OSDIRNAMES := m64=../lib64$(call if_multiarch,:powerpc64-linux-gnu)
> > +MULTILIB_OSDIRNAMES += m32=$(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call if_multiarch,:powerpc-linux-gnu)
> 
> What is the purpose of the change to MULTILIB_OSDIRNAMES? Why the
> addition of m64= and m32=? A secondary tmake file is not always set to
> post-process those macros, AFAICT.

That m64= is the newer syntax that specifies a mapping from a
MULTILIB_OPTIONS selection.  And, yes, without another tmake file this
gives us exactly the same result as before.

I needed to use the new syntax to specify the correct os dirs when
adding cross-endian multilibs.

-- 
Alan Modra
Australia Development Lab, IBM


Re: powerpc64le multilibs and multiarch dir

2013-08-22 Thread Alan Modra
On Fri, Aug 23, 2013 at 01:45:14AM +0930, Alan Modra wrote:
> On Thu, Aug 22, 2013 at 10:06:48AM -0400, David Edelsohn wrote:
> > What is the purpose of the change to MULTILIB_OSDIRNAMES? Why the
> > addition of m64= and m32=? A secondary tmake file is not always set to
> > post-process those macros, AFAICT.
> 
> That m64= is the newer syntax that specifies a mapping from a
> MULTILIB_OPTIONS selection.  And, yes, without another tmake file this
> gives us exactly the same result as before.
> 
> I needed to use the new syntax to specify the correct os dirs when
> adding cross-endian multilibs.

Without another tmake file it really is exactly the same as before.

old, after applying "make" string gunk
MULTILIB_OPTIONS = m64/m32
MULTILIB_OSDIRNAMES = ../lib64 ../lib

new
MULTILIB_OPTIONS = m64/m32
MULTILIB_OSDIRNAMES = m64=../lib64 m32=../lib

Either way, -m64 objects use ../lib64 and -m32 ../lib.


new with le multilib
MULTILIB_OPTIONS = m64/m32 mlittle
MULTILIB_OSDIRNAMES = m64=../lib64 m32=../lib m64.mlittle=../lib64le m32.mlittle=../lible

Trying to do the same with the old syntax isn't possible, because you
must specify 3 elements in MULTILIB_OSDIRNAMES to match the
combinations given in MULTILIB_OPTIONS, then the multilib machinery
mashes them together.  For instance

MULTILIB_OPTIONS = m64/m32 le
MULTILIB_OSDIRNAMES = ../lib64 ../lib le

results in
-m64 code to ../lib64
-m32 code to ../lib
-m64 -mlittle code to le
-m32 -mlittle code to ../lible

The reason you don't get ../lib64le for -m64 -mlittle is that -m64 is
the default.  It is therefore omitted when compiling, and just the
"le" string is used for the os multilib dir.
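
To make the difference concrete, here is a small sketch of what the keyed syntax buys: each option combination names its own directory, so nothing has to be pasted together from per-option fragments. The function name is hypothetical and this is not how the multilib machinery is implemented; it only looks a key up in a list shaped like the "new with le multilib" value above.

```c
#include <stddef.h>
#include <string.h>

/* Look up KEY (e.g. "m64.mlittle") in a space-separated list of
   "options=dir" entries, copying the directory into OUT.  Returns 1 on
   a match, 0 if KEY is absent or OUT is too small.  */
int
osdir_lookup (const char *entries, const char *key, char *out, size_t outlen)
{
  size_t keylen = strlen (key);
  const char *p = entries;

  while (*p != '\0')
    {
      /* Skip separators, then measure this entry.  */
      p += strspn (p, " ");
      size_t len = strcspn (p, " ");

      /* The key must be followed directly by '=', so "m64" does not
         match the "m64.mlittle=..." entry.  */
      if (len > keylen && strncmp (p, key, keylen) == 0 && p[keylen] == '=')
        {
          size_t dirlen = len - keylen - 1;
          if (dirlen >= outlen)
            return 0;
          memcpy (out, p + keylen + 1, dirlen);
          out[dirlen] = '\0';
          return 1;
        }
      p += len;
    }
  return 0;
}
```

Looking up "m64.mlittle" in the list above gives ../lib64le directly, which is the result the positional syntax could not produce.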

-- 
Alan Modra
Australia Development Lab, IBM


Re: libtool update for powerpc64le-linux

2013-08-22 Thread Alan Modra
On Fri, Aug 16, 2013 at 06:18:05PM +0930, Alan Modra wrote:
> I'd like to apply the following patch to the gcc repository (well,
> excluding the libgo part which I'm hoping someone will apply for me to
> the master go repository).  I know the normal procedure for autotools
> is to submit upstream then update when the patch is in the upstream
> autotools repository, but this simple libtool patch has been awaiting
> review for over two months.

The libtool.m4 patch has finally been reviewed and accepted upstream.

I'd like to import upstream libtool into gcc to support powerpc64le,
and please, can someone do the same for upstream libgo?  The reason we
need this patch is that on a powerpc64le linux host where the compiler
defaulted to producing 64-bit objects (which is how we generally build
compilers nowadays) libtool added -m elf64ppc to $LD.  Being the
option for 64-bit big-endian, that caused complete failure for
64-bit little-endian.
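
The selection that libtool needs to make can be summarised in a few lines. The sketch below uses only the four emulation names that appear in the acinclude.m4 change further down; the function name and its two parameters are hypothetical, and real libtool decides this in shell from the host triplet and the output of the compiler.

```c
#include <stddef.h>

/* Return the GNU ld -m emulation name for a powerpc linux target, or
   NULL for an unsupported width.  */
const char *
powerpc_ld_emulation (int little_endian, int bits)
{
  switch (bits)
    {
    case 32:
      return little_endian ? "elf32lppclinux" : "elf32ppclinux";
    case 64:
      return little_endian ? "elf64lppc" : "elf64ppc";
    default:
      return NULL;
    }
}
```

The failure described above corresponds to always taking the big-endian branch for 64 bits.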

libjava is using a rather old version of libtool.  Importing doesn't
seem an option there, so for libjava I've patched acinclude.m4 instead.
Please don't ask me to modify libjava configure and makefiles to use
current libtool.  I tried, and failed.

* libtool.m4: Import upstream version.
libgo/
* config/libtool.m4: Import upstream version.
* libgo/configure: Regenerate.
libjava/libltdl/
* acinclude.m4 (_LT_ENABLE_LOCK): Remove non-canonical
ppc host match.  Support little-endian powerpc linux hosts.
* configure: Regenerate.
gcc/
* configure: Regenerate.
libobjc/
* configure: Regenerate.
libgfortran/
* configure: Regenerate.
libffi/
* configure: Regenerate.
libssp/
* configure: Regenerate.
libitm/
* configure: Regenerate.
libgomp/
* configure: Regenerate.
libquadmath/
* configure: Regenerate.
libsanitizer/
* configure: Regenerate.
zlib/
* configure: Regenerate.
libstdc++-v3/
* configure: Regenerate.
libmudflap/
* configure: Regenerate.
boehm-gc/
* configure: Regenerate.
lto-plugin/
* configure: Regenerate.
libatomic/
* configure: Regenerate.
libbacktrace/
* configure: Regenerate.
libjava/
* configure: Regenerate.
libjava/classpath/
* configure: Regenerate.

Index: libjava/libltdl/acinclude.m4
===
--- libjava/libltdl/acinclude.m4(revision 200501)
+++ libjava/libltdl/acinclude.m4(working copy)
@@ -519,7 +519,7 @@
   rm -rf conftest*
   ;;
 
-x86_64-*linux*|ppc*-*linux*|powerpc*-*linux*|s390*-*linux*|sparc*-*linux*)
+x86_64-*linux*|powerpc*-*linux*|s390*-*linux*|sparc*-*linux*)
   # Find out which ABI we are using.
   echo 'int i;' > conftest.$ac_ext
   if AC_TRY_EVAL(ac_compile); then
@@ -529,7 +529,10 @@
 x86_64-*linux*)
   LD="${LD-ld} -m elf_i386"
   ;;
-ppc64-*linux*|powerpc64-*linux*)
+powerpc64le-*linux*)
+  LD="${LD-ld} -m elf32lppclinux"
+  ;;
+powerpc64-*linux*)
   LD="${LD-ld} -m elf32ppclinux"
   ;;
 s390x-*linux*)
@@ -545,7 +548,10 @@
 x86_64-*linux*)
   LD="${LD-ld} -m elf_x86_64"
   ;;
-ppc*-*linux*|powerpc*-*linux*)
+powerpcle-*linux*)
+  LD="${LD-ld} -m elf64lppc"
+  ;;
+powerpc-*linux*)
   LD="${LD-ld} -m elf64ppc"
   ;;
 s390*-*linux*)

-- 
Alan Modra
Australia Development Lab, IBM


Re: [go-nuts] Re: libtool update for powerpc64le-linux

2013-08-23 Thread Alan Modra
On Thu, Aug 22, 2013 at 06:09:40PM -0700, Ian Lance Taylor wrote:
> On Thu, Aug 22, 2013 at 5:35 PM, Alan Modra wrote:
> > On Fri, Aug 16, 2013 at 06:18:05PM +0930, Alan Modra wrote:
> >> I'd like to apply the following patch to the gcc repository (well,
> >> excluding the libgo part which I'm hoping someone will apply for me to
> >> the master go repository).  I know the normal procedure for autotools
> >> is to submit upstream then update when the patch is in the upstream
> >> autotools repository, but this simple libtool patch has been awaiting
> >> review for over two months.
> >
> > The libtool.m4 patch has finally been reviewed and accepted upstream.
> >
> > I'd like to import upstream libtool into gcc to support powerpc64le,
> > and please, can someone do the same for upstream libgo?  The reason we
> > need this patch is that on a powerpc64le linux host where the compiler
> > defaulted to producing 64-bit objects (which is how we generally build
> > compilers nowadays) libtool added -m elf64ppc to $LD.  Being the
> > option for 64-bit big-endian, that caused complete failure for
> > 64-bit little-endian.
> 
> I've updated libgo/config/libtool.m4 and libgo/configure on mainline.

Thanks Ian, and I'm glad to see you patched libtool.m4 rather than
importing upstream.  Testing showed that importing libtool.m4 from
upstream isn't such a great idea.  For starters, we need to also import
a new ltmain.sh, and then I ran into build problems in libsanitizer.

libtool: compile: unable to infer tagged configuration
libtool:   error: specify a tag with '--tag'
make[4]: *** [tsan_rtl_amd64.lo] Error 1

After rather a lot of time digging into libtool, I think this is due
to a bad IFS init.  The function that emits the error, func_infer_tag,
tries to see whether libtool is using $CC and fails because $CC has
trailing spaces and a calculated $CC_expanded has tabs between the
words of $CC.
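
For illustration only, the kind of comparison that would tolerate this difference is one that treats any run of blanks or tabs as a single separator. The helper below is a hypothetical C sketch of that idea; it is not libtool code (func_infer_tag is shell) and it is not the fix that was applied.

```c
#include <ctype.h>
#include <stdbool.h>

/* Compare two command strings word by word, treating any run of
   whitespace as one separator and ignoring leading and trailing
   whitespace.  */
bool
same_command_words (const char *a, const char *b)
{
  for (;;)
    {
      /* Skip the separators in front of the next word on both sides.  */
      while (isspace ((unsigned char) *a))
        a++;
      while (isspace ((unsigned char) *b))
        b++;
      if (*a == '\0' || *b == '\0')
        return *a == *b;

      /* Walk the word in A, requiring B to match character by character.  */
      while (*a != '\0' && !isspace ((unsigned char) *a))
        {
          if (*a != *b)
            return false;
          a++;
          b++;
        }

      /* A's word has ended, so B's word must end here as well.  */
      if (*b != '\0' && !isspace ((unsigned char) *b))
        return false;
    }
}
```

Under this comparison a $CC with trailing spaces and a $CC_expanded with tabs between the same words would be considered equal.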

-- 
Alan Modra
Australia Development Lab, IBM


Re: powerpc64le multilibs and multiarch dir

2013-08-25 Thread Alan Modra
On Fri, Aug 23, 2013 at 09:41:28PM +, Joseph S. Myers wrote:
> On Thu, 22 Aug 2013, Alan Modra wrote:
> 
> > For multiarch, powerpc64le-linux now will use powerpc64le-linux-gnu.
> > Given a typical big-endian native toolchain with os dirs /lib and
> > /lib64, we'll use /lible and /lib64le if supporting little-endian as
> > well.  If you happen to use /lib and /lib32, then the little-endian
> > variants are /lible and /lib32le.  For completeness I also support
> > building big-endian multilibs on a little-endian host.
> 
> Given those directory names, what are the defined dynamic linker paths for 
> 32-bit and 64-bit little-endian (which can't depend on which of the 
> various directory arrangements may be in use)?

We haven't defined the little-endian ld.so variants yet.

> Does the Power Architecture support, in principle, a single system running 
> both big-endian and little-endian processes at the same time

It does.  

>, or is it a 
> matter of hardware configuration or boot-time setup?  Unless both can run 
> at once, it doesn't seem particularly useful to define separate 
> directories for big and little endian since a particular system would be 
> just one or the other.

We (IBM) don't intend to support running both big and little-endian
processes on the same system in the near future.  Perhaps I'm jumping
the gun in defining the multi-os dirs like /lible and /lib64le.  I did
that to make it easier for people ideologically opposed to multiarch
to set up a native powerpc64 compiler that supports both big and
little-endian compilation.  I know the multi-os dirs aren't strictly
needed to do that..  Should I not be defining them yet?

-- 
Alan Modra
Australia Development Lab, IBM


Re: libtool update for powerpc64le-linux

2013-08-25 Thread Alan Modra
On Fri, Aug 23, 2013 at 10:08:29PM +, Joseph S. Myers wrote:
> On Fri, 23 Aug 2013, Alan Modra wrote:
> 
> > I'd like to import upstream libtool into gcc to support powerpc64le,
> 
> Has the sysroot semantics issue been resolved in upstream libtool, or do 
> you mean "import with 3334f7ed5851ef1e96b052f2984c4acdbf39e20c reverted"?

As far as I can tell, upstream libtool hasn't changed its sysroot
support since that patch went in.  I wasn't even aware of the issue..

How did the gcc project get to the place where we aren't following our
own rules http://gcc.gnu.org/codingconventions.html regarding libtool?
We're supposed to get a patch reviewed upstream, applied, then import
the whole lot.  From the top-level ChangeLog, that hasn't happened
since 2009-12-05!  It must be a little disheartening to be a libtool
maintainer, when a major GNU project like gcc treats your work like
this.

-- 
Alan Modra
Australia Development Lab, IBM


Re: powerpc64le multilibs and multiarch dir

2013-08-26 Thread Alan Modra
On Sun, Aug 25, 2013 at 10:40:30PM -0700, Mike Stump wrote:
> On Aug 25, 2013, at 8:32 PM, Alan Modra wrote:
> > We (IBM) don't intend to support running both big and little-endian
> > processes on the same system in the near future.  Perhaps I'm jumping
> > the gun in defining the multi-os dirs like /lible and /lib64le.
> 
> I'd recommend against multilibs, unless you have a need for them…

These multilibs are only added if you ask for them via --enable-targets.

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] powerpc64 -mcmodel=medium large symbol offsets

2013-09-06 Thread Alan Modra
The following testcase taken from the linux kernel is miscompiled on
powerpc64-linux.

/* -m64 -mcmodel=medium -O -S -fno-section-anchors */
static int x;

unsigned long
foo (void)
{
  return ((unsigned long) &x) - 0xc000000000000000;
}

generates
addis 3,2,x+4611686018427387904@toc@ha
addi 3,3,x+4611686018427387904@toc@l
blr

losing the top 32 bits of the offset.  Sadly, the assembler and linker
do not complain, which is a hole in the ABI.  (@ha and _HA relocs as
per the ABI won't complain about overflow since they might be used in
a @highesta, @highera sequence loading a 64-bit value.)
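
The loss is easy to reproduce with plain integer arithmetic. The sketch below models the two 16-bit fields as the usual "high adjusted" and "low" halves of the value and then recombines them the way an addis/addi pair does; the function names are hypothetical and the code stands in for the linker and the hardware rather than quoting either.

```c
#include <stdint.h>

/* Sign-extend a 16-bit field to 64 bits.  */
static int64_t
sext16 (uint16_t field)
{
  return (int64_t) (field ^ 0x8000) - 0x8000;
}

/* Field stored for @ha: bits 16..31 of VALUE, adjusted so that adding
   the sign-extended low half gives the low 32 bits back.  Everything
   above bit 31 is silently discarded.  */
uint16_t
ha16 (uint64_t value)
{
  return (uint16_t) ((value + 0x8000) >> 16);
}

/* Field stored for @l: the low 16 bits of VALUE.  */
uint16_t
lo16 (uint64_t value)
{
  return (uint16_t) value;
}

/* What addis followed by addi computes from BASE and the two fields.  */
uint64_t
addis_addi (uint64_t base, uint16_t ha, uint16_t lo)
{
  int64_t hi_part = sext16 (ha) * 65536;
  int64_t lo_part = sext16 (lo);

  return base + (uint64_t) hi_part + (uint64_t) lo_part;
}
```

Feeding in an offset of 4611686018427387904 (0x4000000000000000) yields zero for both fields, so the pair adds nothing to the base, which is the miscompilation shown above.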

This patch stops combine merging large offsets into a symbol addend
by copying code from reg_or_add_cint_operand to a new predicate,
add_cint_operand, and using that to restrict the range of offsets.
Bootstrapped and regression tested powerpc64-linux.  OK to apply?

* config/rs6000/predicates.md (add_cint_operand): New.
* config/rs6000/rs6000.md (largetoc_high_plus): Restrict offset
using add_cint_operand.
(largetoc_high_plus_aix): Likewise.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 202264)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -376,6 +376,12 @@
   (ior (match_code "const_int")
(match_operand 0 "gpc_reg_operand")))
 
+;; Return 1 if op is a constant integer valid for addition with addis, addi.
+(define_predicate "add_cint_operand"
+  (and (match_code "const_int")
+   (match_test "(unsigned HOST_WIDE_INT) (INTVAL (op) + 0x80008000)
+   < (unsigned HOST_WIDE_INT) 0x100000000ll")))
+
 ;; Return 1 if op is a constant integer valid for addition
 ;; or non-special register.
 (define_predicate "reg_or_add_cint_operand"
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 202264)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -12207,7 +12209,7 @@
(unspec [(match_operand:DI 1 "" "")
 (match_operand:DI 2 "gpc_reg_operand" "b")]
UNSPEC_TOCREL)
-   (match_operand 3 "const_int_operand" "n"]
+   (match_operand 3 "add_cint_operand" "n"]
"TARGET_ELF && TARGET_CMODEL != CMODEL_SMALL"
"addis %0,%2,%1+%3@toc@ha")
 
@@ -12218,7 +12220,7 @@
(unspec [(match_operand:P 1 "" "")
 (match_operand:P 2 "gpc_reg_operand" "b")]
    UNSPEC_TOCREL)
-   (match_operand 3 "const_int_operand" "n"]
+   (match_operand 3 "add_cint_operand" "n"]
"TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
"addis %0,%1+%3@u(%2)")
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] powerpc64 -mcmodel=medium large symbol offsets

2013-09-06 Thread Alan Modra
On Fri, Sep 06, 2013 at 02:18:49PM -0400, David Edelsohn wrote:
> On Fri, Sep 6, 2013 at 3:13 AM, Alan Modra wrote:
> > The following testcase taken from the linux kernel is miscompiled on
> > powerpc64-linux.
> >
> > /* -m64 -mcmodel=medium -O -S -fno-section-anchors */
> > static int x;
> >
> > unsigned long
> > foo (void)
> > {
> >   return ((unsigned long) &x) - 0xc000000000000000;
> > }
> >
> > generates
> > addis 3,2,x+4611686018427387904@toc@ha
> > addi 3,3,x+4611686018427387904@toc@l
> > blr
> >
> > losing the top 32 bits of the offset.  Sadly, the assembler and linker
> > do not complain, which is a hole in the ABI.  (@ha and _HA relocs as
> > per the ABI won't complain about overflow since they might be used in
> > a @highesta, @highera sequence loading a 64-bit value.)
> >
> > This patch stops combine merging large offsets into a symbol addend
> > by copying code from reg_or_add_cint_operand to a new predicate,
> > add_cint_operand, and using that to restrict the range of offsets.
> > Bootstrapped and regression tested powerpc64-linux.  OK to apply?
> >
> > * config/rs6000/predicates.md (add_cint_operand): New.
> > * config/rs6000/rs6000.md (largetoc_high_plus): Restrict offset
> > using add_cint_operand.
> > (largetoc_high_plus_aix): Likewise.
> 
> This patch should include a testcase.
> 
> But what user feedback are you expecting if the offset is too large,
> such as your example? In my test with the patch, it produces an
> unrecognizable insn error, which seems less than friendly.

The testcase gives me

.L.foo:
lis 9,0x4000
sldi 9,9,32
addis 3,2,x@toc@ha
addi 3,3,x@toc@l
add 3,3,9
blr

How did you manage to get an unrecognizable insn?  I can't see how we
generate the pattern except in combine.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] powerpc64 -mcmodel=medium large symbol offsets

2013-09-06 Thread Alan Modra
On Sat, Sep 07, 2013 at 09:06:08AM +0930, Alan Modra wrote:
> The testcase gives me
> 
> .L.foo:
>   lis 9,0x4000
>   sldi 9,9,32
>   addis 3,2,x@toc@ha
>   addi 3,3,x@toc@l
>   add 3,3,9
>   blr
> 
> How did you manage to get an unrecognizable insn?  I can't see how we
> generate the pattern except in combine.

Never mind.  I updated and rebuilt from a clean tree and now see the
failure too.  "tocrefdi" is where combine is still munging together
the large offset.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] powerpc64 -mcmodel=medium large symbol offsets

2013-09-09 Thread Alan Modra
Revised patch with testcase.  This one also fixes a small problem with
reg_or_add_cint_operand in that any 32-bit value is valid for SImode.
Compare with reg_or_sub_cint_operand.
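
The range test can be written out as a stand-alone function. This is a sketch under a stated assumption: the bound is taken to be 2^32, with a DImode bias of 0x80008000 (the span an addis/addi pair can reach, from -0x80008000 to 0x7fff7fff) and an SImode bias of 0x80000000 (any 32-bit value). The function name and the boolean parameter are hypothetical; the real predicate lives in predicates.md.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if VALUE can be added with an addis/addi pair.  The
   unsigned compare folds the lower and upper bound into one test.  */
bool
add_cint_ok (int64_t value, bool simode)
{
  uint64_t bias = simode ? UINT64_C (0x80000000) : UINT64_C (0x80008000);

  return (uint64_t) value + bias < UINT64_C (0x100000000);
}
```

With this check the 0x4000000000000000 offset from the testcase is rejected, so it has to be loaded separately instead of being folded into the symbol addend.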

Bootstrapped and regression tested powerpc64-linux.  OK to apply?

gcc/
* config/rs6000/predicates.md (add_cint_operand): New.
(reg_or_add_cint_operand): Use add_cint_operand.
* config/rs6000/rs6000.md (largetoc_high_plus): Restrict offset
using add_cint_operand.
(largetoc_high_plus_aix, small_toc_ref): Likewise.
gcc/testsuite/
* gcc.target/powerpc/medium_offset.c: New.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 202351)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -376,12 +376,18 @@
   (ior (match_code "const_int")
(match_operand 0 "gpc_reg_operand")))
 
+;; Return 1 if op is a constant integer valid for addition with addis, addi.
+(define_predicate "add_cint_operand"
+  (and (match_code "const_int")
+   (match_test "(unsigned HOST_WIDE_INT)
+ (INTVAL (op) + (mode == SImode ? 0x80000000 : 0x80008000))
+   < (unsigned HOST_WIDE_INT) 0x100000000ll")))
+
 ;; Return 1 if op is a constant integer valid for addition
 ;; or non-special register.
 (define_predicate "reg_or_add_cint_operand"
   (if_then_else (match_code "const_int")
-(match_test "(unsigned HOST_WIDE_INT) (INTVAL (op) + 0x80008000)
-< (unsigned HOST_WIDE_INT) 0x100000000ll")
+(match_operand 0 "add_cint_operand")
 (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if op is a constant integer valid for subtraction
@@ -1697,7 +1703,7 @@
 (define_predicate "small_toc_ref"
   (match_code "unspec,plus")
 {
-  if (GET_CODE (op) == PLUS && CONST_INT_P (XEXP (op, 1)))
+  if (GET_CODE (op) == PLUS && add_cint_operand (XEXP (op, 1), mode))
 op = XEXP (op, 0);
 
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 202351)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -12207,7 +12209,7 @@
(unspec [(match_operand:DI 1 "" "")
 (match_operand:DI 2 "gpc_reg_operand" "b")]
UNSPEC_TOCREL)
-   (match_operand 3 "const_int_operand" "n"]
+   (match_operand:DI 3 "add_cint_operand" "n"]
"TARGET_ELF && TARGET_CMODEL != CMODEL_SMALL"
"addis %0,%2,%1+%3@toc@ha")
 
@@ -12218,7 +12220,7 @@
(unspec [(match_operand:P 1 "" "")
 (match_operand:P 2 "gpc_reg_operand" "b")]
UNSPEC_TOCREL)
-   (match_operand 3 "const_int_operand" "n"]
+   (match_operand:P 3 "add_cint_operand" "n"]
"TARGET_XCOFF && TARGET_CMODEL != CMODEL_SMALL"
"addis %0,%1+%3@u(%2)")
 
Index: gcc/testsuite/gcc.target/powerpc/medium_offset.c
===
--- gcc/testsuite/gcc.target/powerpc/medium_offset.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/medium_offset.c(revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O" } */
+/* { dg-final { scan-assembler-not "\\+4611686018427387904" } } */
+
+static int x;
+
+unsigned long
+foo (void)
+{
+  return ((unsigned long) &x) - 0xc000000000000000;
+}

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Fix PR58330 powerpc64 atomic store split in two

2013-09-09 Thread Alan Modra
This patch prevents the powerpc backend from combining a 64-bit
volatile load or store with a bswap insn when the resulting combined
insn will be implemented as two lwbrx or stwbrx machine insns.
Bootstrapped and regression tested powerpc64-linux.
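
To illustrate why the split matters, here is a minimal sketch in plain
C, not the backend code.  It models the 64-bit destination as two
32-bit words in big-endian order (word 0 is the high half) and performs
the byte-reversed store as two separate word writes, the way a pair of
stwbrx insns would.  All function names are made up for illustration.

```c
#include <stdint.h>

/* Byte-reverse one 32-bit word; stands in for the data transformation
   done by a single stwbrx.  */
static uint32_t
reverse_word (uint32_t w)
{
  return ((w & 0x000000ffu) << 24) | ((w & 0x0000ff00u) << 8)
	 | ((w & 0x00ff0000u) >> 8) | ((w & 0xff000000u) >> 24);
}

/* Read the modelled doubleword: word 0 is the high half.  */
static uint64_t
read_doubleword (const uint32_t mem[2])
{
  return ((uint64_t) mem[0] << 32) | mem[1];
}

/* Hypothetical model of the split store.  STEPS is how many of the two
   word writes have completed (0, 1 or 2).  */
static void
store_reversed_split (uint32_t mem[2], uint64_t val, int steps)
{
  if (steps >= 1)
    mem[0] = reverse_word ((uint32_t) val);
  if (steps >= 2)
    mem[1] = reverse_word ((uint32_t) (val >> 32));
}

/* Value an observer sees after STEPS word writes, starting from OLD.  */
static uint64_t
observed_after (uint64_t old, uint64_t val, int steps)
{
  uint32_t mem[2];

  mem[0] = (uint32_t) (old >> 32);
  mem[1] = (uint32_t) old;
  store_reversed_split (mem, val, steps);
  return read_doubleword (mem);
}
```

Starting from an old value of 0 and storing 0x0102030405060708, an
observer that looks after only one word write sees 0x0807060500000000,
which is neither the old nor the new value.  That torn state is what a
volatile or atomic access must not expose.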

PR target/58330
* config/rs6000/rs6000.md (bswapdi2_64bit): Disable for volatile mems.
gcc/testsuite/
* gcc.target/powerpc/pr58330.c: New.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 202351)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -2376,7 +2376,9 @@
(clobber (match_scratch:DI 3 "=&r,&r,&r"))
(clobber (match_scratch:DI 4 "=&r,X,&r"))]
   "TARGET_POWERPC64 && !TARGET_LDBRX
-   && (REG_P (operands[0]) || REG_P (operands[1]))"
+   && (REG_P (operands[0]) || REG_P (operands[1]))
+   && !(MEM_P (operands[0]) && MEM_VOLATILE_P (operands[0]))
+   && !(MEM_P (operands[1]) && MEM_VOLATILE_P (operands[1]))"
   "#"
   [(set_attr "length" "16,12,36")])
 
Index: gcc/testsuite/gcc.target/powerpc/pr58330.c
===
--- gcc/testsuite/gcc.target/powerpc/pr58330.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr58330.c  (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O -mno-popcntb" } */
+/* { dg-final { scan-assembler-not "stwbrx" } } */
+
+void
+write_reverse (unsigned long *addr, unsigned long val)
+{
+  unsigned long reverse = __builtin_bswap64 (val);
+  __atomic_store_n (addr, reverse, __ATOMIC_RELAXED);
+}

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] powerpc64 -mcmodel=medium large symbol offsets

2013-09-11 Thread Alan Modra
On Mon, Sep 09, 2013 at 06:37:03PM +0930, Alan Modra wrote:
> gcc/
>   * config/rs6000/predicates.md (add_cint_operand): New.
>   (reg_or_add_cint_operand): Use add_cint_operand.
>   * config/rs6000/rs6000.md (largetoc_high_plus): Restrict offset
>   using add_cint_operand.
>   (largetoc_high_plus_aix, small_toc_ref): Likewise.
> gcc/testsuite/
>   * gcc.target/powerpc/medium_offset.c: New.

I missed seeing one testcase regression caused by this patch.  :-(
gcc.c-torture/compile/pr41634.c at -O3 gets an "insn does not satisfy
its constraints".  Fixed with the following.  OK to apply?
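
For anyone following along, the range check behind add_cint_operand can
be sketched in plain C as below.  This is only a model: it uses the
0x80008000 bias from the predicate and assumes the upper bound is
0x100000000 (2^32), i.e. the offsets an addis/addi pair can reach.  The
function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if OFFSET lies in [-0x80008000, 0x7fff7fff].  Adding the
   bias in unsigned arithmetic maps that range onto [0, 2^32 - 1], so a
   single unsigned comparison covers both ends.  */
static bool
fits_addis_addi (int64_t offset)
{
  return (uint64_t) offset + UINT64_C (0x80008000) < UINT64_C (0x100000000);
}
```

Under this model the -0xc000 offset from medium_offset.c is accepted,
while 4611686018427387904, the value the testcase scans for, is
rejected.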

* config/rs6000/rs6000.c (toc_relative_expr_p): Use add_cint_operand.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 202428)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -5926,7 +5906,7 @@ toc_relative_expr_p (const_rtx op, bool strict)
 
   tocrel_base = op;
   tocrel_offset = const0_rtx;
-  if (GET_CODE (op) == PLUS && CONST_INT_P (XEXP (op, 1)))
+  if (GET_CODE (op) == PLUS && add_cint_operand (XEXP (op, 1), GET_MODE (op)))
 {
   tocrel_base = XEXP (op, 0);
   tocrel_offset = XEXP (op, 1);

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, PowerPC] Fix PR57949 (ABI alignment issue)

2013-09-11 Thread Alan Modra
On Wed, Aug 14, 2013 at 10:32:01AM -0500, Bill Schmidt wrote:
> This fixes a long-standing problem with GCC's implementation of the
> PPC64 ELF ABI.  If a structure contains a member requiring 128-bit
> alignment, and that structure is passed as a parameter, the parameter
> currently receives only 64-bit alignment.  This is an error, and is
> incompatible with correct code generated by the IBM XL compilers.

This caused multiple failures in the libffi testsuite:
libffi.call/cls_align_longdouble.c
libffi.call/cls_align_longdouble_split.c
libffi.call/cls_align_longdouble_split2.c
libffi.call/nested_struct5.c

Fixed by making the same alignment adjustment in libffi to structures
passed by value.  Bill, I think your patch needs to go on all active
gcc branches as otherwise we'll need different versions of libffi for
the next gcc releases.

The following was bootstrapped and regression checked powerpc64-linux.
OK for mainline, and the 4.7 and 4.8 branches when/if Bill's patch
goes in there?

* src/powerpc/ffi.c (ffi_prep_args64): Align FFI_TYPE_STRUCT.
(ffi_closure_helper_LINUX64): Likewise.

Index: libffi/src/powerpc/ffi.c
===
--- libffi/src/powerpc/ffi.c(revision 202428)
+++ libffi/src/powerpc/ffi.c(working copy)
@@ -462,6 +462,7 @@ ffi_prep_args64 (extended_cif *ecif, unsigned long
 double **d;
   } p_argv;
   unsigned long gprvalue;
+  unsigned long align;
 
   stacktop.c = (char *) stack + bytes;
   gpr_base.ul = stacktop.ul - ASM_NEEDS_REGISTERS64 - NUM_GPR_ARG_REGISTERS64;
@@ -532,6 +533,10 @@ ffi_prep_args64 (extended_cif *ecif, unsigned long
 #endif
 
case FFI_TYPE_STRUCT:
+ align = (*ptr)->alignment;
+ if (align > 16)
+   align = 16;
+ next_arg.ul = ALIGN (next_arg.ul, align);
  words = ((*ptr)->size + 7) / 8;
  if (next_arg.ul >= gpr_base.ul && next_arg.ul + words > gpr_end.ul)
{
@@ -1349,6 +1354,7 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure,
   long i, avn;
   ffi_cif *cif;
   ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64;
+  unsigned long align;
 
   cif = closure->cif;
   avalue = alloca (cif->nargs * sizeof (void *));
@@ -1399,6 +1405,10 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure,
  break;
 
case FFI_TYPE_STRUCT:
+ align = arg_types[i]->alignment;
+ if (align > 16)
+   align = 16;
+ pst = ALIGN (pst, align);
 #ifndef __LITTLE_ENDIAN__
  /* Structures with size less than eight bytes are passed
 left-padded.  */


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, PowerPC] Fix PR57949 (ABI alignment issue)

2013-09-11 Thread Alan Modra
On Wed, Sep 11, 2013 at 07:55:43AM -0500, Bill Schmidt wrote:
> On Wed, 2013-09-11 at 21:08 +0930, Alan Modra wrote:
> > On Wed, Aug 14, 2013 at 10:32:01AM -0500, Bill Schmidt wrote:
> > > This fixes a long-standing problem with GCC's implementation of the
> > > PPC64 ELF ABI.  If a structure contains a member requiring 128-bit
> > > alignment, and that structure is passed as a parameter, the parameter
> > > currently receives only 64-bit alignment.  This is an error, and is
> > > incompatible with correct code generated by the IBM XL compilers.
> > 
> > This caused multiple failures in the libffi testsuite:
> > libffi.call/cls_align_longdouble.c
> > libffi.call/cls_align_longdouble_split.c
> > libffi.call/cls_align_longdouble_split2.c
> > libffi.call/nested_struct5.c
> > 
> > Fixed by making the same alignment adjustment in libffi to structures
> > passed by value.  Bill, I think your patch needs to go on all active
> > gcc branches as otherwise we'll need different versions of libffi for
> > the next gcc releases.
> 
> Hm, the libffi case is unfortunate. :(
> 
> The alternative is to leave libffi alone, and require code that calls
> these interfaces with "bad" structs passed by value to be built using
> -mcompat-align-parm, which was provided for such compatibility issues.
> Hopefully there is a small number of cases where this can happen, and
> this could be documented with libffi and gcc.  What do you think?

We have precedent for compiling libffi based on gcc preprocessor
defines, eg. __NO_FPRS__, so here's a way of making upstream libffi
compatible with the various versions of gcc out there.  I've taken the
condition under which we align aggregates from
rs6000_function_arg_boundary, and defined a macro with a value of the
maximum alignment.
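
As a rough standalone sketch of the adjustment (not the libffi code
itself), the clamp-then-align step looks like this.
STRUCT_PARM_ALIGN_MAX is a hypothetical stand-in for the proposed
__STRUCT_PARM_ALIGN__ macro, and align_up and place_struct_arg are
illustrative helpers, not libffi's ALIGN macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __STRUCT_PARM_ALIGN__.  */
#ifndef STRUCT_PARM_ALIGN_MAX
#define STRUCT_PARM_ALIGN_MAX 16
#endif

/* Round VALUE up to the next multiple of ALIGN, a power of two.  */
static size_t
align_up (size_t value, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & ~(align - 1);
}

/* Offset at which a by-value struct argument is placed: its own
   alignment, capped at the ABI maximum.  */
static size_t
place_struct_arg (size_t next_arg, size_t struct_align)
{
  if (struct_align > STRUCT_PARM_ALIGN_MAX)
    struct_align = STRUCT_PARM_ALIGN_MAX;
  return align_up (next_arg, struct_align);
}
```

So a struct with 32-byte alignment arriving at offset 40 is placed at
48, the same as a 16-byte aligned one, while an 8-byte aligned struct
stays at 40.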

Bootstrapped and regression tested powerpc64-linux.  OK for mainline?

gcc/
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Define
__STRUCT_PARM_ALIGN__.
libffi/
* src/powerpc/ffi.c (ffi_prep_args64): Align FFI_TYPE_STRUCT.
(ffi_closure_helper_LINUX64): Likewise.

Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 202428)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -473,6 +473,12 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
   if (TARGET_SOFT_FLOAT || !TARGET_FPRS)
 builtin_define ("__NO_FPRS__");
 
+  /* Whether aggregates passed by value are aligned to a 16 byte boundary
+ if their alignment is 16 bytes or larger.  */
+  if ((TARGET_MACHO && rs6000_darwin64_abi)
+  || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
+builtin_define ("__STRUCT_PARM_ALIGN__=16");
+
   /* Generate defines for Xilinx FPU. */
   if (rs6000_xilinx_fpu) 
 {
Index: libffi/src/powerpc/ffi.c
===
--- libffi/src/powerpc/ffi.c(revision 202428)
+++ libffi/src/powerpc/ffi.c(working copy)
@@ -462,6 +462,9 @@ ffi_prep_args64 (extended_cif *ecif, unsigned long
 double **d;
   } p_argv;
   unsigned long gprvalue;
+#ifdef __STRUCT_PARM_ALIGN__
+  unsigned long align;
+#endif
 
   stacktop.c = (char *) stack + bytes;
   gpr_base.ul = stacktop.ul - ASM_NEEDS_REGISTERS64 - NUM_GPR_ARG_REGISTERS64;
@@ -532,6 +535,12 @@ ffi_prep_args64 (extended_cif *ecif, unsigned long
 #endif
 
case FFI_TYPE_STRUCT:
+#ifdef __STRUCT_PARM_ALIGN__
+ align = (*ptr)->alignment;
+ if (align > __STRUCT_PARM_ALIGN__)
+   align = __STRUCT_PARM_ALIGN__;
+ next_arg.ul = ALIGN (next_arg.ul, align);
+#endif
  words = ((*ptr)->size + 7) / 8;
  if (next_arg.ul >= gpr_base.ul && next_arg.ul + words > gpr_end.ul)
{
@@ -1349,6 +1358,9 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure,
   long i, avn;
   ffi_cif *cif;
   ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64;
+#ifdef __STRUCT_PARM_ALIGN__
+  unsigned long align;
+#endif
 
   cif = closure->cif;
   avalue = alloca (cif->nargs * sizeof (void *));
@@ -1399,6 +1411,12 @@ ffi_closure_helper_LINUX64 (ffi_closure *closure,
  break;
 
case FFI_TYPE_STRUCT:
+#ifdef __STRUCT_PARM_ALIGN__
+ align = arg_types[i]->alignment;
+ if (align > __STRUCT_PARM_ALIGN__)
+   align = __STRUCT_PARM_ALIGN__;
+ pst = ALIGN (pst, align);
+#endif
 #ifndef __LITTLE_ENDIAN__
  /* Structures with size less than eight bytes are passed
 left-padded.  */

-- 
Alan Modra
Australia Development Lab, IBM


Re: expand_expr tweaks to fix PR57134

2013-09-12 Thread Alan Modra
This is a followup to
http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00837.html
which is still lacking an OK.  Apologies for dropping this patch on
the floor.

PR middle-end/57586
* stmt.c (expand_asm_operands): Call expand_expr with
EXPAND_MEMORY for output operands that disallow regs.  Don't
use EXPAND_WRITE on inout operands.

Index: gcc/stmt.c
===
--- gcc/stmt.c  (revision 202428)
+++ gcc/stmt.c  (working copy)
@@ -806,7 +806,10 @@ expand_asm_operands (tree string, tree outputs, tr
  || ! allows_reg
  || is_inout)
{
- op = expand_expr (val, NULL_RTX, VOIDmode, EXPAND_WRITE);
+ op = expand_expr (val, NULL_RTX, VOIDmode,
+   !allows_reg ? EXPAND_MEMORY
+   : !is_inout ? EXPAND_WRITE
+   : EXPAND_NORMAL);
  if (MEM_P (op))
op = validize_mem (op);
 

On Fri, Jun 14, 2013 at 11:38:58AM +0200, Richard Biener wrote:
> On Fri, Jun 14, 2013 at 10:38 AM, Alan Modra wrote:
> > On Thu, Jun 13, 2013 at 10:45:38AM +0200, Richard Biener wrote:
> >> On Wed, Jun 12, 2013 at 4:48 AM, Alan Modra wrote:
> >> > The following patch fixes PR57134 by
> >> > a) excluding bitfield expansion when EXPAND_MEMORY, and
> >> > b) passing down the EXPAND_MEMORY modifier in a couple of places where
> >> > this does not currently happen on recursive calls to expand_expr().
> >>
> >> I suppose it also fixes PR57586 which looks similar?
> >
> > Not completely.  It cures the ICE, but you still get "output number 0
> > not directly addressable".  The reason being that expand_asm_operands
> > isn't asking for a mem.  It should I guess, and also not specify
> > EXPAND_WRITE on an inout parameter.  Bootstrapped and regression
> > tested powerpc64-linux.
> 
> It looks reasonable to me, but I'm not too familiar with EXPAND_MEMORY
> vs. EXPAND_WRITE.

From the expr.h comment:
"EXPAND_WRITE means we are only going to write to the resulting rtx."
So fairly obviously we shouldn't use that with inout asm args.

> Btw, I wonder if for strict-alignment targets asm()s can expect "aligned"
> memory if they request an asm input with "m"?  Thus, do we eventually
> have to copy a known unaligned mem to aligned scratch memory before
> passing it to a "m" input?  Do we maybe have to do the same even for
> "m" outputs?  Or is this all simply undefined and asm()s have to handle
> arbitrary alignment of memory operands (well, those that appear
> at runtime, of course).

I'm sure the kernel people would rather *not* have copies to scratch
memory.  The testcase in pr57586 was derived from kernel code that
munges pointers.  A testcase that better shows what is going on,
probably from the same kernel code, is here
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57134#c3
(The pointer munging is what causes gcc to lose track of alignment.)

-- 
Alan Modra
Australia Development Lab, IBM


Re: libtool update for powerpc64le-linux

2013-09-16 Thread Alan Modra
I guess I can't really expect to gain an approval to import the
upstream libtool into gcc.  Even *I* don't really trust me, although
having looked at it a little I think I could even update
libjava/libltdl.  So how about just continuing the status quo and
applying a libtool patch that is already upstream?  Bootstrapped
powerpc64le-linux and powerpc64-linux.  OK to apply?

* libtool.m4 (_LT_ENABLE_LOCK ): Remove non-canonical
ppc host match.  Support little-endian powerpc linux hosts.
libjava/libltdl/
* acinclude.m4 (_LT_ENABLE_LOCK ): Remove non-canonical
ppc host match.  Support little-endian powerpc linux hosts.
* configure: Regenerate.
boehm-gc/
* configure: Regenerate.
gcc/
* configure: Regenerate.
* aclocal.m4: Regenerate.
fixincludes/
* configure: Regenerate.
libatomic/
* configure: Regenerate.
libbacktrace/
* configure: Regenerate.
libffi/
* configure: Regenerate.
libgfortran/
* configure: Regenerate.
libgomp/
* configure: Regenerate.
libitm/
* configure: Regenerate.
libjava/
* configure: Regenerate.
libjava/classpath/
* configure: Regenerate.
libmudflap/
* configure: Regenerate.
libobjc/
* configure: Regenerate.
libquadmath/
* configure: Regenerate.
libsanitizer/
* configure: Regenerate.
libssp/
* configure: Regenerate.
libstdc++-v3/
* configure: Regenerate.
libvtv/
* configure: Regenerate.
lto-plugin/
* configure: Regenerate.
zlib/
* configure: Regenerate.

Index: libtool.m4
===
--- libtool.m4  (revision 202428)
+++ libtool.m4  (working copy)
@@ -1220,7 +1220,7 @@ ia64-*-hpux*)
   rm -rf conftest*
   ;;
 
-x86_64-*kfreebsd*-gnu|x86_64-*linux*|ppc*-*linux*|powerpc*-*linux*| \
+x86_64-*kfreebsd*-gnu|x86_64-*linux*|powerpc*-*linux*| \
 s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
   # Find out which ABI we are using.
   echo 'int i;' > conftest.$ac_ext
@@ -1241,7 +1241,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
;;
esac
;;
- ppc64-*linux*|powerpc64-*linux*)
+ powerpc64le-*linux*)
+   LD="${LD-ld} -m elf32lppclinux"
+   ;;
+ powerpc64-*linux*)
LD="${LD-ld} -m elf32ppclinux"
;;
  s390x-*linux*)
@@ -1260,7 +1263,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
  x86_64-*linux*)
LD="${LD-ld} -m elf_x86_64"
;;
- ppc*-*linux*|powerpc*-*linux*)
+ powerpcle-*linux*)
+   LD="${LD-ld} -m elf64lppc"
+   ;;
+ powerpc-*linux*)
LD="${LD-ld} -m elf64ppc"
;;
  s390*-*linux*|s390*-*tpf*)
Index: libjava/libltdl/acinclude.m4
===
--- libjava/libltdl/acinclude.m4(revision 202428)
+++ libjava/libltdl/acinclude.m4(working copy)
@@ -519,7 +519,7 @@ ia64-*-hpux*)
   rm -rf conftest*
   ;;
 
-x86_64-*linux*|ppc*-*linux*|powerpc*-*linux*|s390*-*linux*|sparc*-*linux*)
+x86_64-*linux*|powerpc*-*linux*|s390*-*linux*|sparc*-*linux*)
   # Find out which ABI we are using.
   echo 'int i;' > conftest.$ac_ext
   if AC_TRY_EVAL(ac_compile); then
@@ -529,7 +529,10 @@ ia64-*-hpux*)
 x86_64-*linux*)
   LD="${LD-ld} -m elf_i386"
   ;;
-ppc64-*linux*|powerpc64-*linux*)
+powerpc64le-*linux*)
+  LD="${LD-ld} -m elf32lppclinux"
+  ;;
+powerpc64-*linux*)
   LD="${LD-ld} -m elf32ppclinux"
   ;;
 s390x-*linux*)
@@ -545,7 +548,10 @@ ia64-*-hpux*)
 x86_64-*linux*)
   LD="${LD-ld} -m elf_x86_64"
   ;;
-ppc*-*linux*|powerpc*-*linux*)
+powerpcle-*linux*)
+  LD="${LD-ld} -m elf64lppc"
+  ;;
+powerpc-*linux*)
   LD="${LD-ld} -m elf64ppc"
   ;;
 s390*-*linux*)
 
-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, PowerPC] Fix PR57949 (ABI alignment issue)

2013-09-16 Thread Alan Modra
On Thu, Sep 12, 2013 at 09:33:05AM +0100, Andrew Haley wrote:
> On 09/12/2013 03:11 AM, Alan Modra wrote:
> > We have precedent for compiling libffi based on gcc preprocessor
> > defines, eg. __NO_FPRS__, so here's a way of making upstream libffi
> > compatible with the various versions of gcc out there.  I've taken the
> > condition under which we align aggregates from
> > rs6000_function_arg_boundary, and defined a macro with a value of the
> > maximum alignment.
> 
> I would have thought that a runtime or configure-time test in libffi
> would be better.

I don't see how a runtime check can be done, and a configure test just
adds another layer of indirection.  How is testing a cpp macro at
configure time better than testing the same at compile time?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, committed] PR 57589: Fix logic error in driver-rs6000.c

2013-09-16 Thread Alan Modra
On Tue, Jun 11, 2013 at 07:43:05PM -0400, Michael Meissner wrote:
> Ah, you are correct.  I misread the code, thinking it was returning a string,
> and not a pointer to the string in memory.
http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00640.html

Patch reverted.

-- 
Alan Modra
Australia Development Lab, IBM


Re: libtool update for powerpc64le-linux

2013-09-18 Thread Alan Modra
On Wed, Sep 18, 2013 at 10:14:16AM -0400, David Edelsohn wrote:
> Please ensure that you use the correct version of autoconf to
> regenerate the files. I don't think that the same version of autoconf
> is used in all directories.

gcc is OK in this respect.  autoconf-2.64 is used throughout.  It's
src that is a hodge-podge of different versions.

-- 
Alan Modra
Australia Development Lab, IBM


Re: expand_expr tweaks to fix PR57134

2013-09-24 Thread Alan Modra
On Fri, Sep 13, 2013 at 12:37:20PM +0930, Alan Modra wrote:
>   PR middle-end/57586
>   * stmt.c (expand_asm_operands): Call expand_expr with
>   EXPAND_MEMORY for output operands that disallow regs.  Don't
>   use EXPAND_WRITE on inout operands.

Ping?

-- 
Alan Modra
Australia Development Lab, IBM


Re: expand_expr tweaks to fix PR57134

2013-10-01 Thread Alan Modra
I'm committing this cleanup patch to my PR 57134,57586 changes as
obvious.  That it is obvious can be seen from an assert in
tree-ssa-operands.c get_asm_expr_operands().

  /* This should have been split in gimplify_asm_expr.  */
  gcc_assert (!allows_reg || !is_inout);

Bootstrapped, etc. powerpc64-linux.
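
A small model of why the cleanup is safe, using made-up enum and
function names rather than the real expand_modifier values: whenever
the quoted assert holds, the three-way selection from the earlier patch
and the two-way selection below pick the same modifier.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the modifiers passed to expand_expr.  */
enum model_modifier { MODEL_NORMAL, MODEL_WRITE, MODEL_MEMORY };

/* Selection as in the earlier patch.  */
static enum model_modifier
select_three_way (bool allows_reg, bool is_inout)
{
  return (!allows_reg ? MODEL_MEMORY
	  : !is_inout ? MODEL_WRITE
	  : MODEL_NORMAL);
}

/* Selection after the cleanup.  */
static enum model_modifier
select_two_way (bool allows_reg, bool is_inout)
{
  (void) is_inout;
  return !allows_reg ? MODEL_MEMORY : MODEL_WRITE;
}

/* The condition asserted in get_asm_expr_operands.  */
static bool
assert_holds (bool allows_reg, bool is_inout)
{
  return !allows_reg || !is_inout;
}

/* Check all four input combinations; only those allowed by the assert
   need to agree.  */
static bool
selections_agree_when_assert_holds (void)
{
  for (int r = 0; r < 2; r++)
    for (int io = 0; io < 2; io++)
      if (assert_holds (r, io)
	  && select_three_way (r, io) != select_two_way (r, io))
	return false;
  return true;
}
```

The only combination where the two differ is allows_reg && is_inout,
which is exactly the case the assert rules out.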

* stmt.c (expand_asm_operands): Revert part of 2013-09-24 special
casing inout operands.

Index: gcc/stmt.c
===
--- gcc/stmt.c  (revision 203053)
+++ gcc/stmt.c  (working copy)
@@ -807,9 +807,7 @@ expand_asm_operands (tree string, tree outputs, tr
  || is_inout)
{
  op = expand_expr (val, NULL_RTX, VOIDmode,
-   !allows_reg ? EXPAND_MEMORY
-   : !is_inout ? EXPAND_WRITE
-   : EXPAND_NORMAL);
+   !allows_reg ? EXPAND_MEMORY : EXPAND_WRITE);
  if (MEM_P (op))
op = validize_mem (op);
 

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] VSX splat fix

2012-10-09 Thread Alan Modra
This fixes a problem with my PR45844 fix.  PR45844 was due to rs6000.c
reg_offset_addressing_ok_p testing the operand mode to determine
whether an insn supports reg+offset addressing, but the VSX splat insn
uses a DF/DI mode input operand.  So the memory form of this insn was
wrongly seen to support reg+offset addressing.  I hacked around this
by adjusting the mode in the insn predicate, which happened to work
for the PR45844 testcase, but actually causes the predicate to reject
all MEMs since general_operand checks that the mode matches.  (Oddly,
this does not stop reload using the memory form of the insn!
const_double passes the predicate, reload forces to mem which matches
one of the constraints, and the predicate is not checked again.)

This avoids the general_operand mode check by expanding code from
there relevant to MEMs.  Bootstrapped and regression tested
powerpc64-linux.  OK for mainline and 4.6/4.7?

* config/rs6000/predicates.md (splat_input_operand): Don't call
input_operand for MEMs.  Instead check for volatile and call
memory_address_addr_space_p with modified mode.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 192236)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -941,12 +941,16 @@
 {
   if (MEM_P (op))
 {
+  if (! volatile_ok && MEM_VOLATILE_P (op))
+   return 0;
   if (mode == DFmode)
mode = V2DFmode;
   else if (mode == DImode)
mode = V2DImode;
   else
-   gcc_unreachable ();
+   gcc_unreachable ();
+  return memory_address_addr_space_p (mode, XEXP (op, 0),
+ MEM_ADDR_SPACE (op));
 }
   return input_operand (op, mode);
 })

-- 
Alan Modra
Australia Development Lab, IBM


building gcc with powerpc gold

2012-10-17 Thread Alan Modra
These two tests currently fail if using gold, in the first instance
because powerpc64 gold doesn't support mixing old dot-sym objects
with new objects, and in the second instance because gold doesn't have
a --no-toc-sort option.  Both macros ought to be defined for gold.
Tested etc.  OK to apply everywhere?

* configure.ac (HAVE_LD_NO_DOT_SYMS): Set if using gold.
(HAVE_LD_LARGE_TOC): Likewise.
* configure: Regenerate.

Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 192236)
+++ gcc/configure.ac(working copy)
@@ -4379,7 +4379,9 @@
 AC_CACHE_CHECK(linker support for omitting dot symbols,
 gcc_cv_ld_no_dot_syms,
 [gcc_cv_ld_no_dot_syms=no
-if test $in_tree_ld = yes ; then
+if test x"$ld_is_gold" = xyes; then
+  gcc_cv_ld_no_dot_syms=yes
+elif test $in_tree_ld = yes ; then
   if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 16 -o "$gcc_cv_gld_major_version" -gt 2; then
 gcc_cv_ld_no_dot_syms=yes
   fi
@@ -4416,7 +4418,9 @@
 AC_CACHE_CHECK(linker large toc support,
 gcc_cv_ld_large_toc,
 [gcc_cv_ld_large_toc=no
-if test $in_tree_ld = yes ; then
+if test x"$ld_is_gold" = xyes; then
+  gcc_cv_ld_large_toc=yes
+elif test $in_tree_ld = yes ; then
   if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 21 -o "$gcc_cv_gld_major_version" -gt 2; then
 gcc_cv_ld_large_toc=yes
   fi

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] libffi ppc64 assembly

2012-10-23 Thread Alan Modra
Gold on powerpc64 doesn't support old ABI objects, but libffi contains
old ABI assembly.  This patch modifies those files to support both old
and new ABI, and adds a builtin define to powerpc64 gcc that can be
used to select between the ABIs in assembly.  I figure a define is
generally useful, and more robust than trying to duplicate gcc
configury in libffi (which could be overridden by CFLAGS anyway).
Bootstrapped and regression tested powerpc64-linux.  OK to apply
mainline?

gcc/
* config/rs6000/linux64.h (TARGET_OS_CPP_BUILTINS): Define _CALL_LINUX.
libffi/
* src/powerpc/linux64_closure.S: Add new ABI support.
* src/powerpc/linux64.S: Likewise.

Index: gcc/config/rs6000/linux64.h
===
--- gcc/config/rs6000/linux64.h (revision 192660)
+++ gcc/config/rs6000/linux64.h (working copy)
@@ -318,6 +318,8 @@
  builtin_define ("__PPC64__"); \
  builtin_define ("__powerpc__");   \
  builtin_define ("__powerpc64__"); \
+ if (!DOT_SYMBOLS) \
+   builtin_define ("_CALL_LINUX"); \
  builtin_assert ("cpu=powerpc64"); \
  builtin_assert ("machine=powerpc64"); \
}   \
Index: libffi/src/powerpc/linux64_closure.S
===
--- libffi/src/powerpc/linux64_closure.S(revision 192660)
+++ libffi/src/powerpc/linux64_closure.S(working copy)
@@ -32,16 +32,24 @@
 
 #ifdef __powerpc64__
FFI_HIDDEN (ffi_closure_LINUX64)
-   FFI_HIDDEN (.ffi_closure_LINUX64)
-   .globl  ffi_closure_LINUX64, .ffi_closure_LINUX64
+   .globl  ffi_closure_LINUX64
.section".opd","aw"
.align  3
 ffi_closure_LINUX64:
+#ifdef _CALL_LINUX
+   .quad   .L.ffi_closure_LINUX64,.TOC.@tocbase,0
+   .type   ffi_closure_LINUX64,@function
+   .text
+.L.ffi_closure_LINUX64:
+#else
+   FFI_HIDDEN (.ffi_closure_LINUX64)
+   .globl  .ffi_closure_LINUX64
.quad   .ffi_closure_LINUX64,.TOC.@tocbase,0
.size   ffi_closure_LINUX64,24
.type   .ffi_closure_LINUX64,@function
.text
 .ffi_closure_LINUX64:
+#endif
 .LFB1:
# save general regs into parm save area
std %r3, 48(%r1)
@@ -91,7 +99,11 @@
addi %r6, %r1, 128
 
# make the call
+#ifdef _CALL_LINUX
+   bl ffi_closure_helper_LINUX64
+#else
bl .ffi_closure_helper_LINUX64
+#endif
 .Lret:
 
# now r3 contains the return type
@@ -194,7 +206,11 @@
 .LFE1:
.long   0
.byte   0,12,0,1,128,0,0,0
+#ifdef _CALL_LINUX
+   .size   ffi_closure_LINUX64,.-.L.ffi_closure_LINUX64
+#else
.size   .ffi_closure_LINUX64,.-.ffi_closure_LINUX64
+#endif
 
.section.eh_frame,EH_FRAME_FLAGS,@progbits
 .Lframe1:
Index: libffi/src/powerpc/linux64.S
===
--- libffi/src/powerpc/linux64.S(revision 192660)
+++ libffi/src/powerpc/linux64.S(working copy)
@@ -30,16 +30,25 @@
 #include 
 
 #ifdef __powerpc64__
-   .hidden ffi_call_LINUX64, .ffi_call_LINUX64
-   .globl  ffi_call_LINUX64, .ffi_call_LINUX64
+   .hidden ffi_call_LINUX64
+   .globl  ffi_call_LINUX64
.section".opd","aw"
.align  3
 ffi_call_LINUX64:
+#ifdef _CALL_LINUX
+   .quad   .L.ffi_call_LINUX64,.TOC.@tocbase,0
+   .type   ffi_call_LINUX64,@function
+   .text
+.L.ffi_call_LINUX64:
+#else
+   .hidden .ffi_call_LINUX64
+   .globl  .ffi_call_LINUX64
.quad   .ffi_call_LINUX64,.TOC.@tocbase,0
.size   ffi_call_LINUX64,24
.type   .ffi_call_LINUX64,@function
.text
 .ffi_call_LINUX64:
+#endif
 .LFB1:
mflr%r0
std %r28, -32(%r1)
@@ -58,7 +67,11 @@
 
/* Call ffi_prep_args64.  */
mr  %r4, %r1
+#ifdef _CALL_LINUX
+   bl  ffi_prep_args64
+#else
bl  .ffi_prep_args64
+#endif
 
ld  %r0, 0(%r29)
ld  %r2, 8(%r29)
@@ -137,7 +150,11 @@
 .LFE1:
.long   0
.byte   0,12,0,1,128,4,0,0
+#ifdef _CALL_LINUX
+   .size   ffi_call_LINUX64,.-.L.ffi_call_LINUX64
+#else
.size   .ffi_call_LINUX64,.-.ffi_call_LINUX64
+#endif
 
.section.eh_frame,EH_FRAME_FLAGS,@progbits
 .Lframe1:

-- 
Alan Modra
Australia Development Lab, IBM


Re: [Patch] Potential fix for PR55033

2012-10-24 Thread Alan Modra
On Tue, Oct 23, 2012 at 06:25:43PM +0200, Sebastian Huber wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55033
> 
> This patch fixes my problem, but I am absolutely not sure if this is the
> right way.
[snip]

This is http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9571 all over again.

IMHO your patch is not wrong, but a better idea is that if we've used
categorize_decl_for_section to choose a section name, then we ought to
also use categorize_decl_for_section to choose section flags.

My original fix for pr9571 was to pass the decl down to
get_named_section.
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02487.html
That hit the assert in get_named_section when building libgfortran for
ia64, and my knee-jerk patch to fix that was to simply satisfy the
assert.  After looking at all the target section_type_flags functions,
I believe the following is what I should have done way back then.
Bootstrapped and regression tested powerpc64-linux.

* varasm.c (default_elf_select_section): Move !DECL_P check..
(get_named_section): ..to here before calling get_section_name.
Adjust assertion.
(default_section_type_flags): Add DECL_P check.
* config/i386/winnt.c (i386_pe_section_type_flags): Likewise.
* config/rs6000/rs6000.c (rs6000_xcoff_section_type_flags): Likewise.

Index: gcc/varasm.c
===
--- gcc/varasm.c(revision 192660)
+++ gcc/varasm.c(working copy)
@@ -403,12 +403,16 @@ get_named_section (tree decl, const char *name, in
 {
   unsigned int flags;
 
-  gcc_assert (!decl || DECL_P (decl));
   if (name == NULL)
-name = TREE_STRING_POINTER (DECL_SECTION_NAME (decl));
+{
+  gcc_assert (decl && DECL_P (decl) && DECL_SECTION_NAME (decl));
+  name = TREE_STRING_POINTER (DECL_SECTION_NAME (decl));
+}
 
   flags = targetm.section_type_flags (decl, name, reloc);
 
+  if (decl && !DECL_P (decl))
+decl = NULL_TREE;
   return get_section (name, flags, decl);
 }
 
@@ -5943,7 +5947,7 @@ default_section_type_flags (tree decl, const char
flags |= SECTION_RELRO;
 }
 
-  if (decl && DECL_ONE_ONLY (decl))
+  if (decl && DECL_P (decl) && DECL_ONE_ONLY (decl))
 flags |= SECTION_LINKONCE;
 
   if (decl && TREE_CODE (decl) == VAR_DECL && DECL_THREAD_LOCAL_P (decl))
@@ -6299,8 +6303,6 @@ default_elf_select_section (tree decl, int reloc,
   gcc_unreachable ();
 }
 
-  if (!DECL_P (decl))
-decl = NULL_TREE;
   return get_named_section (decl, sname, reloc);
 }
 
Index: gcc/config/i386/winnt.c
===
--- gcc/config/i386/winnt.c (revision 192660)
+++ gcc/config/i386/winnt.c (working copy)
@@ -476,7 +476,7 @@ i386_pe_section_type_flags (tree decl, const char
flags |= SECTION_PE_SHARED;
 }
 
-  if (decl && DECL_ONE_ONLY (decl))
+  if (decl && DECL_P (decl) && DECL_ONE_ONLY (decl))
 flags |= SECTION_LINKONCE;
 
   /* See if we already have an entry for this section.  */
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 192660)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -25689,7 +25689,7 @@ rs6000_xcoff_section_type_flags (tree decl, const
   unsigned int flags = default_section_type_flags (decl, name, reloc);
 
   /* Align to at least UNIT size.  */
-  if (flags & SECTION_CODE || !decl)
+  if ((flags & SECTION_CODE) != 0 || !decl || !DECL_P (decl))
 align = MIN_UNITS_PER_WORD;
   else
 /* Increase alignment of large objects if not already stricter.  */

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] rs6000: Disable generation of lwa in 32-bit mode

2012-10-25 Thread Alan Modra
On Thu, Oct 25, 2012 at 03:57:38PM -0700, Segher Boessenkool wrote:
> for most others.  This patch disables all lwa insns in 32-bit mode.
> We can later re-enable it if the assembler used handles it properly,

Well, you can now do that.  Mainline gas and ld are now fixed.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [Patch] Potential fix for PR55033

2012-10-30 Thread Alan Modra
On Tue, Oct 30, 2012 at 02:45:40PM +0100, Sebastian Huber wrote:
> On 10/26/2012 02:22 PM, Sebastian Huber wrote:
> >Hello,
> >
> >here is a test case for PR55033.
> >
> 
> Is there something wrong with this test case?  It compiles well with Alan's 
> patch.

It looks OK to me if you replace your "dg-do compile" line with the
following two lines to avoid failures on powerpc targets that don't
support -meabi -msdata.

/* { dg-do compile { target powerpc*-*-eabi* powerpc*-*-elf* powerpc*-*-linux* } } */
/* { dg-require-effective-target ilp32 } */


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH][i386]Fix PR 57756

2013-10-15 Thread Alan Modra
On Tue, Oct 15, 2013 at 02:45:23PM -0700, Sriraman Tallam wrote:
> I committed this patch after making the above change.

/src/gcc-virgin/gcc/config/rs6000/rs6000.c: At global scope:
/src/gcc-virgin/gcc/config/rs6000/rs6000.c:31122:29: error: invalid conversion from ‘void (*)(cl_target_option*)’ to ‘void (*)(cl_target_option*, gcc_options*)’ [-fpermissive]
/src/gcc-virgin/gcc/config/rs6000/rs6000.c:31122:29: error: invalid conversion from ‘void (*)(cl_target_option*)’ to ‘void (*)(gcc_options*, cl_target_option*)’ [-fpermissive]


-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] ABI_V4 init of toc section

2016-01-29 Thread Alan Modra
, "\t.long ");
   assemble_name (file, toc_label_name);
+  need_toc_init = 1;
   putc ('-', file);
   ASM_GENERATE_INTERNAL_LABEL (buf, "LCF", rs6000_pic_labelno);
   assemble_name (file, buf);
@@ -32137,6 +32148,7 @@ rs6000_xcoff_file_start (void)
   fputc ('\n', asm_out_file);
   if (write_symbols != NO_DEBUG)
 switch_to_section (private_data_section);
+  switch_to_section (toc_section);
   switch_to_section (text_section);
   if (profile_flag)
 fprintf (asm_out_file, "\t.extern %s\n", RS6000_MCOUNT);
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6e22c52..fe19853 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -9475,6 +9475,8 @@
   "*
 {
   char buf[30];
+  extern int need_toc_init;
+  need_toc_init = 1;
   ASM_GENERATE_INTERNAL_LABEL (buf, \"LCTOC\", 1);
   operands[1] = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (buf));
   operands[2] = gen_rtx_REG (Pmode, 2);
@@ -9492,6 +9494,8 @@
   "*
 {
   char buf[30];
+  extern int need_toc_init;
+  need_toc_init = 1;
 #ifdef TARGET_RELOCATABLE
   ASM_GENERATE_INTERNAL_LABEL (buf, \"LCTOC\",
   !TARGET_MINIMAL_TOC || TARGET_RELOCATABLE);

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] ABI_V4 init of toc section

2016-01-30 Thread Alan Modra
On Fri, Jan 29, 2016 at 01:20:08PM -0500, David Edelsohn wrote:
> On Fri, Jan 29, 2016 at 11:38 AM, Alan Modra  wrote:
> > PR target/68662
> > * config/rs6000/rs6000.c (need_toc_init): New var, set it
> > whenever toc_label_name used.
> > (rs6000_file_start): Don't set up toc section here,
> > (rs6000_output_function_epilogue): do so here instead,
> > (rs6000_xcoff_file_start): and here.
> > * config/rs6000/rs6000.md (load_toc_aix_si): Set need_toc_init.
> > (load_toc_aix_di): Likewise.
> 
> I'm worried about how this is going to interact with AIX.  AIX
> assembler is single pass and this patch moves the initialization from
> the beginning of the file to the end of the file, which means there
> will be references to a label whose definition is delayed until the
> end.

AIX toc init is still done at the start of the file.  The code to emit
.toc or set .LCTOC..1 has moved from rs6000_file_start to
rs6000_xcoff_file_start.

-- 
Alan Modra
Australia Development Lab, IBM


Combine simplify_set WORD_REGISTER_OPERATIONS

2016-01-31 Thread Alan Modra
The comment says this test is supposed to prevent "a narrower
operation than requested", but it actually only allows a larger
subreg, not one the same size.  Fix that.
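
A minimal sketch of what the one-character change does, assuming the two
sizes are plain byte counts.  The helper names are hypothetical and not
part of combine.c; only the two comparison operators come from the patch.

```c
#include <stdbool.h>

/* outer_size stands in for GET_MODE_SIZE (GET_MODE (src)),
   inner_size for GET_MODE_SIZE (GET_MODE (SUBREG_REG (src))).
   Both helpers are illustrative only.  */

/* Comparison before the patch: equal sizes are rejected.  */
static bool
old_size_test (unsigned outer_size, unsigned inner_size)
{
  return outer_size < inner_size;
}

/* Comparison after the patch: equal sizes are accepted too.  */
static bool
new_size_test (unsigned outer_size, unsigned inner_size)
{
  return outer_size <= inner_size;
}
```

The only inputs on which the two helpers disagree are those where both
sizes are equal, e.g. a DF subreg of a DI operation.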

Bootstrapped and regression tested powerpc64-linux.  OK for stage1?

Note that this bug was found when investigating why gcc-6 does not
suffer from pr69548, ie. this bug was masking a powerpc backend bug.

* combine.c (simplify_set): Correct WORD_REGISTER_OPERATIONS test.

diff --git a/gcc/combine.c b/gcc/combine.c
index 858552d..9f284a7 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6736,7 +6736,7 @@ simplify_set (rtx x)
   + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD))
   && (WORD_REGISTER_OPERATIONS
  || (GET_MODE_SIZE (GET_MODE (src))
- < GET_MODE_SIZE (GET_MODE (SUBREG_REG (src)
+ <= GET_MODE_SIZE (GET_MODE (SUBREG_REG (src)
 #ifdef CANNOT_CHANGE_MODE_CLASS
   && ! (REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER
&& REG_CANNOT_CHANGE_MODE_P (REGNO (dest),

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] lqarx and stqcx. registers

2016-01-31 Thread Alan Modra
lqarx RT and stqcx. RS are valid only with even numbered gprs.  The
predicate to enforce this happens to allow a loophole, closed by this
patch.

This pattern created by combine:
Trying 8 -> 9:
Successfully matched this instruction:
(set (subreg:PTI (reg:TI 155 [ D.2357 ]) 0)
(unspec_volatile:PTI [
(mem/v:TI (reg/v/f:DI 157 [ mptr ]) [-1  S16 A128])
] UNSPECV_LL))

is seen by reload as needing to reload pseudo 155 in TI mode, which
has no requirement that the reg be even.  Apparently, nothing checks
the predicate again after reload.
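
A rough sketch of the register-number rule the predicate is meant to
enforce.  The pseudo-register boundary and the gpr range 0..31 are
assumptions made for illustration, and the final even-number test is
assumed too, since the hunk below stops before that part of the
predicate.

```c
#include <stdbool.h>

/* Hypothetical stand-in for FIRST_PSEUDO_REGISTER.  */
enum { SKETCH_FIRST_PSEUDO = 1000 };

/* Return true if REGNO could hold a quad load/store operand:
   pseudos are accepted (register allocation picks the hard reg later),
   hard regs must be gprs with an even number.  */
static bool
sketch_quad_reg_ok (unsigned regno)
{
  if (regno >= SKETCH_FIRST_PSEUDO)
    return true;
  return regno <= 31 && (regno & 1) == 0;
}
```

The loophole described above is that a pseudo accepted here through a
subreg can later be given an odd hard reg in TImode, and nothing runs
this check again.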

We only see this problem on gcc-5 and gcc-4.9, because on gcc-6 we
don't define WORD_REGISTER_OPERATIONS and combine happens to have a
bug in simplify_set that prevents it creating the problem subregs.
See https://gcc.gnu.org/ml/gcc-patches/2016-01/msg02377.html

Bootstrapped and regression tested powerpc64-linux biarch on master
both with and without the combine bug, and on gcc-5.  OK for master
and active branches?

gcc/
PR target/69548
* config/rs6000/predicates.md (quad_int_reg_operand): Don't
allow subregs.
gcc/testsuite/
* gcc.target/powerpc/pr69548.c: New test.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b8f14fd..302303c 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -375,20 +375,19 @@
 
 ;; Return 1 if op is a general purpose register that is an even register
 ;; which suitable for a load/store quad operation
+;; Subregs are not allowed here because when they are combine can
+;; create (subreg:PTI (reg:TI pseudo)) which will cause reload to
+;; think the innermost reg needs reloading, in TImode instead of
+;; PTImode.  So reload will choose a reg in TImode which has no
+;; requirement that the reg be even.
 (define_predicate "quad_int_reg_operand"
-  (match_operand 0 "register_operand")
+  (match_code "reg")
 {
   HOST_WIDE_INT r;
 
   if (!TARGET_QUAD_MEMORY && !TARGET_QUAD_MEMORY_ATOMIC)
 return 0;
 
-  if (GET_CODE (op) == SUBREG)
-op = SUBREG_REG (op);
-
-  if (!REG_P (op))
-return 0;
-
   r = REGNO (op);
   if (r >= FIRST_PSEUDO_REGISTER)
 return 1;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr69548.c b/gcc/testsuite/gcc.target/powerpc/pr69548.c
new file mode 100644
index 000..439f588
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr69548.c
@@ -0,0 +1,11 @@
+/* { dg-do assemble { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -Os -mbig" } */
+
+__int128
+quad_exchange (__int128 *ptr, __int128 newval)
+{
+  return __atomic_exchange_n (ptr, newval, __ATOMIC_RELAXED);
+}

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] ABI_V4 init of toc section

2016-01-31 Thread Alan Modra
   \
-  flag_pic = 2;\
+  if (TARGET_RELOCATABLE)  \
+{  \
+  if (!flag_pic)   \
+   flag_pic = 2;   \
+  TARGET_NO_FP_IN_TOC = 1; \
+}  \
 } while (0)
 
 #ifndef RS6000_BI_ARCH
@@ -317,8 +326,7 @@ do { \
 
 /* Put PC relative got entries in .got2.  */
 #define MINIMAL_TOC_SECTION_ASM_OP \
-  (TARGET_RELOCATABLE || (flag_pic && DEFAULT_ABI == ABI_V4)   \
-   ? "\t.section\t\".got2\",\"aw\"" : "\t.section\t\".got1\",\"aw\"")
+  (flag_pic ? "\t.section\t\".got2\",\"aw\"" : "\t.section\t\".got1\",\"aw\"")
 
 #define SDATA_SECTION_ASM_OP "\t.section\t\".sdata\",\"aw\""
 #define SDATA2_SECTION_ASM_OP "\t.section\t\".sdata2\",\"a\""
@@ -352,7 +360,6 @@ do { \
|| (GET_CODE (X) == CONST_INT   \
   && GET_MODE_BITSIZE (MODE) <= GET_MODE_BITSIZE (Pmode))  \
|| (!TARGET_NO_FP_IN_TOC \
-  && !TARGET_RELOCATABLE   \
   && GET_CODE (X) == CONST_DOUBLE  \
   && SCALAR_FLOAT_MODE_P (GET_MODE (X))\
   && BITS_PER_WORD == HOST_BITS_PER_INT)))
@@ -941,9 +948,10 @@ ncrtn.o%s"
 /* Select a format to encode pointers in exception handling data.  CODE
is 0 for data, 1 for code labels, 2 for function pointers.  GLOBAL is
true if the symbol may be affected by dynamic relocations.  */
-#define ASM_PREFERRED_EH_DATA_FORMAT(CODE,GLOBAL)   \
-  ((flag_pic || TARGET_RELOCATABLE) \
-   ? (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | DW_EH_PE_sdata4) \
+#define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
+  (flag_pic\
+   ? (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel  \
+  | DW_EH_PE_sdata4)   \
: DW_EH_PE_absptr)
 
 #define DOUBLE_INT_ASM_OP "\t.quad\t"

-- 
Alan Modra
Australia Development Lab, IBM


Re: Combine simplify_set WORD_REGISTER_OPERATIONS

2016-01-31 Thread Alan Modra
On Sun, Jan 31, 2016 at 06:02:35PM -0600, Segher Boessenkool wrote:
> On Mon, Feb 01, 2016 at 08:46:42AM +1030, Alan Modra wrote:
> > The comment says this test is supposed to prevent "a narrower
> > operation than requested", but it actually only allows a larger
> > subreg, not one the same size.  Fix that.
> > 
> > Bootstrapped and regression tested powerpc64-linux.  OK for stage1?
> > 
> > Note that this bug was found when investigating why gcc-6 does not
> > suffer from pr69548, ie. this bug was masking a powerpc backend bug.
> 
> It sounds like you have a testcase, can we see it please?

The testcase in pr69548 will show changes in rtl.

> And, just a missed optimisation, not a bug, right?

Yes, not a bug, and only presumed a missed optimisation.  I don't
actually have a testcase that shows worse code.  All I have is a
comment that makes sense to me, that doesn't agree exactly with the
code, and some understanding how the code may have been accidentally
written the way it is.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix -mcpu=power8 atomic expansion (PR target/69644)

2016-02-04 Thread Alan Modra
On Wed, Feb 03, 2016 at 05:34:17PM -0500, David Edelsohn wrote:
> On Wed, Feb 3, 2016 at 5:28 PM, Jakub Jelinek  wrote:
> > Hi!
> >
> > rs6000_expand_atomic_compare_and_swap uses oldval directly in
> > a comparison instruction, but oldval might be a CONST_INT not suitable
> > for the instruction (such as in the testcase below in SImode comparison
> > 0x8000 constant).  We need to force those into register if they don't
> > satisfy the predicate.
> >
> > Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?
> >
> > 2016-02-03  Jakub Jelinek  
> >
> > PR target/69644
> > * config/rs6000/rs6000.c (rs6000_expand_atomic_compare_and_swap):
> > Force oldval into register if it does not satisfy 
> > reg_or_short_operand
> > predicate.  Fix up formatting.
> >
> > * gcc.dg/pr69644.c: New test.
> 
> Okay.

This needs to go on gcc-5 and gcc-4.9 branches too, where it fixes
pr69146.  pr69146 and pr69644 are dups.  OK to apply to the branches?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix -mcpu=power8 atomic expansion (PR target/69644)

2016-02-04 Thread Alan Modra
On Thu, Feb 04, 2016 at 02:42:38PM +0100, Jakub Jelinek wrote:
> On Thu, Feb 04, 2016 at 08:40:22AM -0500, David Edelsohn wrote:
> > On Thu, Feb 4, 2016 at 6:33 AM, Alan Modra  wrote:
> > > On Wed, Feb 03, 2016 at 05:34:17PM -0500, David Edelsohn wrote:
> > >> On Wed, Feb 3, 2016 at 5:28 PM, Jakub Jelinek  wrote:
> > >> > Hi!
> > >> >
> > >> > rs6000_expand_atomic_compare_and_swap uses oldval directly in
> > >> > a comparison instruction, but oldval might be a CONST_INT not suitable
> > >> > for the instruction (such as in the testcase below in SImode comparison
> > >> > 0x8000 constant).  We need to force those into register if they don't
> > >> > satisfy the predicate.
> > >> >
> > >> > Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?
> > >> >
> > >> > 2016-02-03  Jakub Jelinek  
> > >> >
> > >> > PR target/69644
> > >> > * config/rs6000/rs6000.c 
> > >> > (rs6000_expand_atomic_compare_and_swap):
> > >> > Force oldval into register if it does not satisfy 
> > >> > reg_or_short_operand
> > >> > predicate.  Fix up formatting.
> > >> >
> > >> > * gcc.dg/pr69644.c: New test.
> > >>
> > >> Okay.
> > >
> > > This needs to go on gcc-5 and gcc-4.9 branches too, where it fixes
> > > pr69146.  pr69146 and pr69644 are dups.  OK to apply to the branches?
> > 
> > Okay with me, but coordinate with Jakub.
> 
> > Ok with me too, just don't have spare cycles now to bootstrap/regtest it.
> So, Alan, if you could do that, it would be greatly appreciated.

I've regression tested powerpc64le-linux gcc-5 and gcc-4.9, all langs.

-- 
Alan Modra
Australia Development Lab, IBM


Correct c-torture stkalign test

2016-02-07 Thread Alan Modra
This test was added by git commit 7c5f55675 (svn 231569)

Here's the log message from that commit:
avoid alignment of static variables affecting stack's

Function (or more narrow) scope static variables (as well as others not
placed on the stack) should also not have any effect on the stack
alignment. I noticed the issue first with Linux'es dynamic_pr_debug()
construct using an 8-byte aligned sub-file-scope local variable.

However, the test assumes that a local var will normally not be 64-bit
aligned, causing it to fail on many targets.  So the test needs to
pass if the local var *is* normally 64-bit aligned.  Done as follows.
test2() is a duplicate of test() without the alignment on the static
vars.  Fails on x86_64 -m64 and -m32 if 7c5f55675 is reverted and
passes now for powerpc64-linux.  I expect sparc will pass too, so have
reverted Eric's change.
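
The pass/fail logic of the revised main() can be sketched as below.  The
helper names are hypothetical, and the alignment is passed as a parameter
rather than taken from the test's ALIGNMENT macro.  x and y are the OR of
the local variable addresses returned by test() and test2() respectively.

```c
#include <stdbool.h>

/* True if every address OR-ed into ADDR_BITS is a multiple of
   ALIGNMENT, which must be a power of two.  */
static bool
all_aligned (unsigned addr_bits, unsigned alignment)
{
  return (addr_bits & (alignment - 1)) == 0;
}

/* Mirror of the revised return expression: exit status 1 (failure)
   only when the locals in test() were all aligned while those in
   test2(), which has no aligned static, were not.  */
static int
sketch_exit_status (unsigned x, unsigned y, unsigned alignment)
{
  return all_aligned (x, alignment) && !all_aligned (y, alignment) ? 1 : 0;
}
```

On a target where locals are normally that well aligned, both x and y
come out aligned and the status is 0, which is the case the original
test got wrong.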

OK to apply?

PR testsuite/68886
* gcc.c-torture/execute/stkalign.c: Revise test.

diff --git a/gcc/testsuite/gcc.c-torture/execute/stkalign.c b/gcc/testsuite/gcc.c-torture/execute/stkalign.c
index 2f8d041..e10a1d2 100644
--- a/gcc/testsuite/gcc.c-torture/execute/stkalign.c
+++ b/gcc/testsuite/gcc.c-torture/execute/stkalign.c
@@ -1,5 +1,6 @@
-/* { dg-xfail-run-if "invalid assumption" { sparc*-*-* && lp64 } "*" "" } */
 /* { dg-options "-fno-inline" } */
+/* Check that stack alignment is not affected by variables not placed
+   on the stack.  */
 
 #include <assert.h>
 
@@ -16,12 +17,28 @@ unsigned test(unsigned n, unsigned p)
   return n ? test(n - 1, x) : (x ^ p);
 }
 
+unsigned test2(unsigned n, unsigned p)
+{
+  static struct { char c; } s;
+  unsigned x;
+
+  assert(__alignof__(s) != ALIGNMENT);
+  asm ("" : "=g" (x), "+m" (s) : "0" (&x));
+
+  return n ? test2(n - 1, x) : (x ^ p);
+}
+
 int main (int argc, char *argv[] __attribute__((unused)))
 {
-  unsigned int x = test(argc, 0);
+  unsigned int x, y;
 
+  x = test(argc, 0);
   x |= test(argc + 1, 0);
   x |= test(argc + 2, 0);
 
-  return !(x & (ALIGNMENT - 1));
+  y = test2(argc, 0);
+  y |= test2(argc + 1, 0);
+  y |= test2(argc + 2, 0);
+
+  return (x & (ALIGNMENT - 1)) == 0 && (y & (ALIGNMENT - 1)) != 0 ? 1 : 0;
 }

-- 
Alan Modra
Australia Development Lab, IBM


Re: Combine simplify_set WORD_REGISTER_OPERATIONS

2016-02-09 Thread Alan Modra
On Mon, Feb 08, 2016 at 09:27:36AM -0700, Jeff Law wrote:
> On 01/31/2016 03:16 PM, Alan Modra wrote:
> >The comment says this test is supposed to prevent "a narrower
> >operation than requested", but it actually only allows a larger
> >subreg, not one the same size.  Fix that.
> >
> >Bootstrapped and regression tested powerpc64-linux.  OK for stage1?
> >
> >Note that this bug was found when investigating why gcc-6 does not
> >suffer from pr69548, ie. this bug was masking a powerpc backend bug.
> >
> > * combine.c (simplify_set): Correct WORD_REGISTER_OPERATIONS test.
> 
> Is there a strong need to apply this to gcc6?

No, better to wait for gcc-7, I think.

>  Can we construct a testcase
> where this makes a difference in the code we generate?

I instrumented the combine.c code in question with this

  if (!WORD_REGISTER_OPERATIONS
  && (GET_MODE_SIZE (GET_MODE (src))
  == GET_MODE_SIZE (GET_MODE (SUBREG_REG (src)
{
  FILE *f = fopen ("/tmp/alan", "a");
  fprintf (f, "%s\n", main_input_filename);
  print_inline_rtx (f, src, 0);
  fprintf (f, "\n");
  fclose (f);
}

to see what popped out when bootstrapping gcc on x86_64.  There were
quite a lot of hits, DI -> DF, SI -> SF, V4SF -> TI etc, especially in
the testsuite (I should have dumped options too).

Here's the first one:
/src/gcc.git/libgcc/config/libbid/bid64_div.c
(subreg:DF (plus:DI (subreg:DI (reg:DF 841) 0)
(const_int 1 [0x1])) 0)
This one resulted in using lea vs. add, so slightly better code.

One from the testsuite:
/src/gcc.git/gcc/testsuite/gcc.dg/sso/p4.c
(subreg:SF (bswap:SI (reg:SI 99 [ Local_R2.F ])) 0)
When compiling with -Og, this showed

before  after
.loc 3 49 0 .loc 3 49 0
movl-32(%ebp), %eax movl-32(%ebp), %eax
bswap   %eaxbswap   %eax
movl%eax, -44(%ebp) movl%eax, -28(%ebp)
flds-44(%ebp)
fstps   -28(%ebp)

Quite an improvement, if you care about -Og code.

I didn't see any worse code, except some cases that I think were
caused by register allocation differences.

> My inclination would be to approve for gcc-7 as-is, but I'm more hesitant
> for gcc-6.
> 
> jeff

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH, reload] PRE_INC with invalid hard reg

2016-02-11 Thread Alan Modra

This is PR68973 part 1, the fix for the regression of g++.dg/pr67211.C
on powerpc64-linux -mcpu=power7, which turns out to be a reload
problem.

Due to uses elsewhere in vsx instructions, reload chooses to put
pseudo 185 in fr31, which can't be used as a base register in the
following:
(set (reg/f:DI 178)
 (mem/f:DI (pre_inc:DI (reg/f:DI 185
So find_reloads_address decides that the (pre_inc:DI (reg:DI 63))
needs reloading, and pushes a reload on the entire address
expression.  This is unfortunate because the rs6000 backend reload
hooks don't look inside a pre_inc and thus don't arrange a secondary
reload.  Nor does the generic SECONDARY_MEMORY_NEEDED handling, which
is only prepared to handle regs or subregs.  So reload doesn't
allocate the stack temp needed to copy between fprs and gprs on
power7.

Now it turns out that if find_reloads_address instead pushed a reload
of just the (reg:DI 63), then SECONDARY_MEMORY_NEEDED would get a
chance to do something.  Furthermore, there is existing code in
find_reloads_address to do just that for pre_inc, but only enabled for
pseudos that don't get a hard reg.  The following patch extends this
to invalid hard regs.  I could have implemented a fix in the rs6000
backend, but it seems likely that other targets may run into this
problem.
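
A simplified sketch of how the condition changes, reduced to three
booleans.  The struct and helper names are hypothetical, and the insn,
cc0 and reloaded_inner_of_autoinc checks from the real condition are
left out.

```c
#include <stdbool.h>

/* Illustrative summary of the facts the condition looks at.  */
struct sketch_autoinc
{
  bool is_hard_reg;    /* regno < FIRST_PSEUDO_REGISTER */
  bool has_mem_equiv;  /* pseudo has a valid memory equivalent */
  bool add_takes_mem;  /* the add insn accepts that memory operand */
};

/* Before the patch: reload only the inner reg just for pseudos
   living in memory that the add insn cannot handle directly.  */
static bool
old_reload_inner_only (const struct sketch_autoinc *a)
{
  return a->has_mem_equiv && !a->add_takes_mem;
}

/* After the patch: also do so for a hard reg that reached this code,
   i.e. one that is not valid in the address.  */
static bool
new_reload_inner_only (const struct sketch_autoinc *a)
{
  return a->is_hard_reg || (a->has_mem_equiv && !a->add_takes_mem);
}
```

For the fr31 case above, is_hard_reg is true and there is no memory
equivalent, so only the new condition selects the inner-reg reload.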

Bootstrapped and regression tested powerpc64le-linux and
x86_64-linux.  OK to apply?

PR target/68973
* reload.c (find_reloads_address_1): For pre/post-inc/dec
with an invalid hard reg, reload just the reg not the entire
pre/post-inc/dec address expression.

diff --git a/gcc/reload.c b/gcc/reload.c
index 6196e63..06426d9 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -5834,14 +5834,16 @@ find_reloads_address_1 (machine_mode mode, addr_space_t as,
   ? XEXP (x, 0)
   : reg_equiv_mem (regno));
  enum insn_code icode = optab_handler (add_optab, GET_MODE (x));
- if (insn && NONJUMP_INSN_P (insn) && equiv
- && memory_operand (equiv, GET_MODE (equiv))
+ if (insn && NONJUMP_INSN_P (insn)
 #if HAVE_cc0
  && ! sets_cc0_p (PATTERN (insn))
 #endif
- && ! (icode != CODE_FOR_nothing
-   && insn_operand_matches (icode, 0, equiv)
-   && insn_operand_matches (icode, 1, equiv))
+ && (regno < FIRST_PSEUDO_REGISTER
+ || (equiv
+ && memory_operand (equiv, GET_MODE (equiv))
+ && ! (icode != CODE_FOR_nothing
+   && insn_operand_matches (icode, 0, equiv)
+   && insn_operand_matches (icode, 1, equiv
  /* Using RELOAD_OTHER means we emit this and the reload we
 made earlier in the wrong order.  */
  && !reloaded_inner_of_autoinc)

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] reload_vsx_from_gprsf splitter

2016-02-11 Thread Alan Modra
tr "length" "8")
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 997ff31..2d2f137 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1518,15 +1518,6 @@
   "xscvdpspn %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Used by direct move to move a SFmode value from GPR to VSX register
-(define_insn "vsx_xscvspdpn_directmove"
-  [(set (match_operand:SF 0 "vsx_register_operand" "=wa")
-   (unspec:SF [(match_operand:DI 1 "vsx_register_operand" "wa")]
-  UNSPEC_VSX_CVSPDPN))]
-  "TARGET_XSCVSPDPN"
-  "xscvspdpn %x0,%x1"
-  [(set_attr "type" "fp")])
-
 ;; Convert and scale (used by vec_ctf, vec_cts, vec_ctu for double/long long)
 
 (define_expand "vsx_xvcvsxddp_scale"

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] reload_vsx_from_gprsf splitter

2016-02-11 Thread Alan Modra
On Thu, Feb 11, 2016 at 04:55:58PM -0500, Michael Meissner wrote:
> This is one of the cases I wished the reload support had the ability to
> allocate 2 scratch temporaries instead of 1.  As I said in my other message,
> TFmode was a hack to get two registers to use.

Another concern I had about this, besides using %L in asm output (what
forces TFmode to use just fprs?), is what happens when we're using
IEEE 128-bit floats?  In that case it looks like we'd get just one reg.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, reload] PRE_INC with invalid hard reg

2016-02-11 Thread Alan Modra
On Thu, Feb 11, 2016 at 03:29:05PM +0100, Bernd Schmidt wrote:
> On 02/11/2016 10:45 AM, Alan Modra wrote:
> 
> >Due to uses elsewhere in vsx instructions, reload chooses to put
> >pseudo 185 in fr31, which can't be used as a base register in the
> >following:
> 
> What code exactly makes the choice of fr31? I assume this is in
> reg_renumber, so it's IRA and not reload that's making that decision?

Yes, sorry, I shouldn't have said reload chooses the reg.

> > PR target/68973
> > * reload.c (find_reloads_address_1): For pre/post-inc/dec
> > with an invalid hard reg, reload just the reg not the entire
> > pre/post-inc/dec address expression.
> 
> Hmm, you're patching tricky code.

Thanks for being willing to review!  :)

> I'm not sure yet whether this is right or
> not. More reload dumps might help if you have them; I'll Cc myself on the
> PR.

I've attached rtl dumps to the PR.

> My gut feeling is that we want to reload the inner reg before entering this
> block of code,

Yes, my first quick hack did just that, then I noticed there was
existing code to reload the inner reg..

> with a new test for SECONDARY_MEMORY_NEEDED alongside the
> existing block that already sets reloaded_inner_of_autoinc.

I don't understand this comment.  If we're pushing a reload of the
inner reg, then the SECONDARY_MEMORY_NEEDED code in push_reload will
fire.  Why then should there be any need to do anything special in
find_reloads_address_1 regarding secondary memory?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] reload_vsx_from_gprsf splitter

2016-02-15 Thread Alan Modra
On Fri, Feb 12, 2016 at 02:57:22PM +0100, Ulrich Weigand wrote:
> > On Fri, Feb 12, 2016 at 08:54:19AM +1030, Alan Modra wrote:
> > > Another concern I had about this, besides using %L in asm output (what
> > > forces TFmode to use just fprs?), is what happens when we're using
> > > IEEE 128-bit floats?  In that case it looks like we'd get just one reg.
> > 
> > Good point that it breaks if the default long double (TFmode) type is IEEE
> > 128-bit floating point.  We would need to have two patterns, one that uses
> > TFmode and one that uses IFmode.  I wrote the power8 direct move stuff 
> > before
> > going down the road of IEEE 128-bit floating point.
> 
> Right.  It's a bit unfortunate that we can't just use IFmode unconditionally,
> but it seems rs6000_scalar_mode_supported_p (IFmode) may return false, and
> then we probably shouldn't be using it.

Actually, we can use IFmode unconditionally.  scalar_mode_supported_p
is relevant only up to and including expand.  Nothing prevents the
backend from using IFmode.

> Another option might be to use TDmode to allocate a scratch register pair.

That won't work, at least if we want to extract the two component regs
with simplify_gen_subreg, due to rs6000_cannot_change_mode_class.  In
my original patch I just extracted the regs by using gen_rtx_REG but I
changed that, based on your criticism of using gen_rtx_REG in
reload_vsx_from_gprsf, and because rs6000.md avoids gen_rtx_REG using
operand regnos in other places.  That particular change is of course
entirely cosmetic.  I also changed reload_vsx_from_gprsf to avoid mode
punning regs, instead duplicating insn patterns as done elsewhere in
the vsx support.  I don't believe we will see subregs of vsx or fp
regs after reload, but I'm quite willing to concede the point for a
stage4 fix.

Here's the revised patch.  To recap, the main bug fixes here are:
- stop reload_vsx_from_gprsf splitter from emitting a move not
handled by movdi_internal64
- don't use TFmode, which cannot now be assumed to be IBM
double-double.
Secondary to that, not using or passing around TFmode means the %L
restriction no longer matters, and constraints on the reload temp reg
can be relaxed.

Bootstrapped and regression tested powerpc64-linux biarch and
powerpc64le-linux.  OK David?

PR target/68973
* config/rs6000/rs6000.md (reload_vsx_from_gprsf): Use p8_mtvsrd_sf
rather than attempting to use movdi_internal64.  Remove op0_di.
(p8_mtvsrd_df, p8_mtvsrd_sf): New.
(p8_mtvsrd_1, p8_mtvsrd_2): Delete.
(p8_mtvsrwz): New.
(p8_mtvsrwz_1, p8_mtvsrwz_2): Delete.
(p8_xxpermdi_<mode>): Take two DF inputs rather than one TF.
(p8_fmrgow_<mode>): Likewise.
(reload_vsx_from_gpr): Make clobber IF.  Adjust for above
changes.
(reload_fpr_from_gpr): Similarly. Use "d" for op0 constraint.
* config/rs6000/vsx.md (vsx_xscvspdpn_directmove): Make op1 SFmode.

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index cdbf873..ec356cb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7488,41 +7488,31 @@
 ;; value, since it is allocated in reload and not all of the flow information
 ;; is setup for it.  We have two patterns to do the two moves between gprs and
 ;; fprs.  There isn't a dependancy between the two, but we could potentially
-;; schedule other instructions between the two instructions.  TFmode is
-;; currently limited to traditional FPR registers.  If/when this is changed, we
-;; will need to revist %L to make sure it works with VSX registers, or add an
-;; %x version of %L.
+;; schedule other instructions between the two instructions.
 
 (define_insn "p8_fmrgow_<mode>"
   [(set (match_operand:FMOVE64X 0 "register_operand" "=d")
-   (unspec:FMOVE64X [(match_operand:TF 1 "register_operand" "d")]
+   (unspec:FMOVE64X [
+   (match_operand:DF 1 "register_operand" "d")
+   (match_operand:DF 2 "register_operand" "d")]
 UNSPEC_P8V_FMRGOW))]
   "!TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
-  "fmrgow %0,%1,%L1"
+  "fmrgow %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
-(define_insn "p8_mtvsrwz_1"
-  [(set (match_operand:TF 0 "register_operand" "=d")
-   (unspec:TF [(match_operand:SI 1 "register_operand" "r")]
+(define_insn "p8_mtvsrwz"
+  [(set (match_operand:DF 0 "register_operand" "=d")
+   (unspec:DF [(match_operand:SI 1 "register_operand" "r")]
   UNSPEC_P8V_MTVSRWZ))]
   "!TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "mtvsrwz 

Re: [RS6000] reload_vsx_from_gprsf splitter

2016-02-15 Thread Alan Modra
On Mon, Feb 15, 2016 at 06:42:35AM -0800, David Edelsohn wrote:
> Is there still an issue with the constraints used for movdi_internal64?

Yes and no.  No because we shouldn't be attempting DI moves between vsx
regs and gprs.  Yes because we ought to allow DImode in vsx regs, but
fixing that is likely not trivial.

Do we want to backport the PR68973 fixes to gcc-5 and gcc-4.9?  We are
exposed to the reload_vsx_from_gprsf bug there, I think, but TFmode
won't be IEEE.

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFC: [Patch, PR Bug 60818] - ICE in validate_condition_mode on powerpc*-linux-gnu* ]

2016-02-16 Thread Alan Modra
))
+return false;
 
-  gcc_assert (mode == CCFPmode
- || (code != ORDERED && code != UNORDERED
- && code != UNEQ && code != LTGT
- && code != UNGT && code != UNLT
- && code != UNGE && code != UNLE));
+  if (mode != CCFPmode
+  && (code == ORDERED || code == UNORDERED
+ || code == UNEQ || code == LTGT
+ || code == UNGT || code == UNLT
+ || code == UNGE || code == UNLE))
+return false;
 
   /* These should never be generated except for
  flag_finite_math_only.  */
-  gcc_assert (mode != CCFPmode
- || flag_finite_math_only
- || (code != LE && code != GE
- && code != UNEQ && code != LTGT
- && code != UNGT && code != UNLT));
+  if (mode == CCFPmode
+  && !flag_finite_math_only
+  && (code == LE || code == GE
+ || code == UNEQ || code == LTGT
+ || code == UNGT || code == UNLT))
+return false;
 
   /* These are invalid; the information is not there.  */
-  gcc_assert (mode != CCEQmode || code == EQ || code == NE);
+  if (mode == CCEQmode
+  && code != EQ && code != NE)
+return false;
+
+  return true;
 }
 
 
@@ -19583,7 +19591,7 @@ ccr_bit (rtx op, int scc_p)
   cc_regnum = REGNO (reg);
   base_bit = 4 * (cc_regnum - CR0_REGNO);
 
-  validate_condition_mode (code, cc_mode);
+  gcc_assert (validate_condition_mode (code, cc_mode));
 
   /* When generating a sCOND operation, only positive conditions are
  allowed.  */
@@ -20847,8 +20855,8 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
case UNLT: or1 = UNORDERED;  or2 = LT;  break;
default:  gcc_unreachable ();
}
-  validate_condition_mode (or1, comp_mode);
-  validate_condition_mode (or2, comp_mode);
+  gcc_assert (validate_condition_mode (or1, comp_mode));
+  gcc_assert (validate_condition_mode (or2, comp_mode));
   or1_rtx = gen_rtx_fmt_ee (or1, SImode, compare_result, const0_rtx);
   or2_rtx = gen_rtx_fmt_ee (or2, SImode, compare_result, const0_rtx);
   compare2_rtx = gen_rtx_COMPARE (CCEQmode,
@@ -20860,7 +20868,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
   code = EQ;
 }
 
-  validate_condition_mode (code, GET_MODE (compare_result));
+  gcc_assert (validate_condition_mode (code, GET_MODE (compare_result)));
 
   return gen_rtx_fmt_ee (code, VOIDmode, compare_result, const0_rtx);
 }
@@ -21368,7 +21376,7 @@ output_cbranch (rtx op, const char *label, int reversed, rtx_insn *insn)
   const char *pred;
   rtx note;
 
-  validate_condition_mode (code, mode);
+  gcc_assert (validate_condition_mode (code, mode));
 
   /* Work out which way this really branches.  We could use
  reverse_condition_maybe_unordered here always but this
@@ -25688,7 +25696,7 @@ rs6000_emit_prologue (void)
   hi = gen_int_mode (toc_restore_insn & ~0x, SImode);
   emit_insn (gen_xorsi3 (tmp_reg_si, tmp_reg_si, hi));
   compare_result = gen_rtx_REG (CCUNSmode, CR0_REGNO);
-  validate_condition_mode (EQ, CCUNSmode);
+  gcc_assert (validate_condition_mode (EQ, CCUNSmode));
   lo = gen_int_mode (toc_restore_insn & 0x, SImode);
   emit_insn (gen_rtx_SET (compare_result,
  gen_rtx_COMPARE (CCUNSmode, tmp_reg_si, lo)));

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFC: [Patch, PR Bug 60818] - ICE in validate_condition_mode on powerpc*-linux-gnu* ]

2016-02-16 Thread Alan Modra
On Tue, Feb 16, 2016 at 07:00:58PM +1030, Alan Modra wrote:
> What's wrong is the rs6000 backend asserting that (gtu (reg:CC)) can't
> happen, because obviously it does.  Rather than trying to fix combine,
> (where the ICE happens on attempting to validate the insn!), I think
> the rs6000 backend should change.  Like so.  Not yet bootstrapped,
> but I'm about to fire one off.
> 
>   PR target/60818
>   * config/rs6000/rs6000.c (validate_condition_mode): Return a
>   bool rather than asserting modes as expected.  Update all uses
>   to assert.
>   * config/rs6000/rs6000-protos.h (validate_condition_mode):
>   Update prototype.
>   * config/rs6000/predicates.md (branch_comparison_operator):
>   Use result of validate_condition_mode.

Now bootstrapped and regression tested powerpc64le-linux and
powerpc64-linux biarch, mainline, gcc-5 and gcc-4.9.  With this
testsuite addition.  OK to apply?

* gcc.target/powerpc/pr60818.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/pr60818.c b/gcc/testsuite/gcc.target/powerpc/pr60818.c
new file mode 100644
index 000..773480b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr60818.c
@@ -0,0 +1,62 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O1 -mno-mfcrf -misel" } */
+
+int d7;
+
+static int
+ca(int l3)
+{
+  for (d7 = 0; d7 < 1; ++d7)
+;
+  return l3;
+}
+
+int
+c9(void)
+{
+  int yj;
+  return ca(((yj != 1) & 65535U) > d7);
+}
+
+
+int
+kf(int a2, unsigned int dc)
+{
+  int t3;
+  int b1[2];
+  for (t3 = 0; t3 < 2; ++t3)
+b1[t3] = 2;
+  return ((t3 > a2) >= b1[0]) < dc;
+}
+
+
+void
+ds(void)
+{
+  unsigned int t5;
+  unsigned int re;
+  int yn;
+  int *o2;
+  int *s0 = &yn;
+  for (re = 0; re < 2; ++re)
+if (0 != t5)
+  *o2 = (*s0 ^= 1) | (re = ((t5 < yn) >= (t5 > yn)));
+}
+
+
+unsigned int ou;
+int jv (void)
+{
+  unsigned int rg;
+  return rg < ou;
+}
+
+
+unsigned int vz, tr, c, fr;
+
+void
+gi(void)
+{
+  if (vz < 1)
+vz = ((fr < tr) >= (fr > tr));
+}

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFC: [Patch, PR Bug 60818] - ICE in validate_condition_mode on powerpc*-linux-gnu* ]

2016-02-17 Thread Alan Modra
On Wed, Feb 17, 2016 at 06:31:45AM -0600, Segher Boessenkool wrote:
> > Corresponding content of "op" which causes the ICE:
> > (gdb) p debug_rtx (op)
> > (gtu:SI (reg:CC 166)  -- (operator and mode doesn't match)
> > (const_int 0 [0]))
> 
> That is invalid RTL for this target (should be CCUNS).  Invalid RTL
> should not be passed to recog.

Really??  combine does that all the time, when it asks "is this
instruction valid"!

> > (gdb) p debug_rtx (other_insn)
> > (insn 11 10 16 2 (set (reg:SI 165 [ D.2339+-3 ])
> > (if_then_else:SI (ne (reg:CC 166)
> > (const_int 0 [0]))
> > (reg:SI 168)
> > (reg:SI 167))) test.c:7 317 {isel_unsigned_si}
> >  (expr_list:REG_DEAD (reg:SI 168)
> > (expr_list:REG_DEAD (reg:SI 167)
> > (expr_list:REG_DEAD (reg:CC 166)
> > (expr_list:REG_EQUAL (gtu:SI (reg:CC 166)
> > (const_int 0 [0]))
> > (nil))
> 
> The REG_EQUAL there is bad already.  Where does that come from?

Rohit explained that quite well already, I thought.  It's there due to
combine transforming a GTU to NE in another insn, which means the reg
mode changes to CCmode via rs6000.h:SELECT_CC_MODE.
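
As a rough illustration of the mismatch, here is a toy model in plain C.
It is not the rs6000 code: the enum names and both functions are
invented for this sketch, and the real SELECT_CC_MODE also handles
floating-point and CCEQ cases that are left out here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for rtx comparison codes and CC modes.  */
enum cmp_code { CMP_EQ, CMP_NE, CMP_GT, CMP_LT,
                CMP_GTU, CMP_LTU, CMP_GEU, CMP_LEU };
enum cc_mode { MODE_CC, MODE_CCUNS };

static bool
is_unsigned_cmp (enum cmp_code code)
{
  return (code == CMP_GTU || code == CMP_LTU
          || code == CMP_GEU || code == CMP_LEU);
}

/* Simplified mode choice: unsigned orderings want CCUNS, the rest CC.  */
enum cc_mode
toy_select_cc_mode (enum cmp_code code)
{
  return is_unsigned_cmp (code) ? MODE_CCUNS : MODE_CC;
}

/* Report whether CODE is consistent with MODE.  A mismatch returns
   false instead of tripping an assert.  */
bool
toy_condition_mode_ok (enum cmp_code code, enum cc_mode mode)
{
  if (is_unsigned_cmp (code))
    return mode == MODE_CCUNS;
  if (code == CMP_GT || code == CMP_LT)
    return mode == MODE_CC;
  /* EQ and NE are accepted in either mode.  */
  return true;
}
```

In this model a GTU rewritten to NE picks MODE_CC, so a note that still
says GTU on the same register no longer passes the check.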

You might argue that combine shouldn't create such a note, but whether
the note is valid or not depends on the target, doesn't it?  And the
usual way for combine to check validity of rtl is to form up an
instruction and pass that to recog.  Which is exactly what happens
later when combine tries to use the note and runs into the rs6000
backend assert.

It seems quite plain to me that this is primarily an rs6000 backend
problem, solved by the blindingly obvious patch I posted.  Whether you
want to do something in combine as well is a secondary problem.  The
rs6000 backend shouldn't assert on this rtl.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PPC libgcc IEEE128 soft-fp exception/rounding fixes

2016-02-17 Thread Alan Modra
On Wed, Feb 17, 2016 at 05:40:01PM -0600, Paul E. Murphy wrote:
> - FP_INIT_ROUNDMODE writes junk to the fpscr. I assume this should be
>   reading the fpscr and initializing the local rounding mode variable
>   declared via _FP_DECL_EX.

Yeah, looks that way.

> - FP_TRAPPING_EXCEPTIONS evaluates to zero where used. It seems like it
>   should return a bit field of FP_EX_* bits indicating which trap is
>   enabled. Likewise, when these bits are set in the fpscr, the trap is
>   enabled.

Yes, but

> +/* A set bit indicates an exception is trapping.  */
> +# define FP_TRAPPING_EXCEPTIONS ((_fpscr.i << 22) & FP_EX_ALL)

why then a shift here, since FP_EX_* are defined as the actual
register bits?  Oh, I see.  FP_EX_* are the status bits, and you want
the enable bits.  ie. bit 56 rather than bit 34, bit 57 rather than
bit 35 and so on (bits numbered from 0 as msb).  A comment to that
effect might reduce head scratching.
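
The bit arithmetic behind that shift can be checked with a small C
sketch.  MSB0_BIT and enable_to_status are names made up for this
example; only the bit numbers (56 and 34, 57 and 35) and the shift of 22
come from the discussion above.

```c
#include <stdint.h>

/* Mask for bit N of a 64-bit register, numbering bit 0 as the msb.  */
#define MSB0_BIT(n) (UINT64_C (1) << (63 - (n)))

/* Move the enable bits (bit 56 onwards) onto the positions of the
   corresponding status bits (bit 34 onwards).  */
uint64_t
enable_to_status (uint64_t fpscr)
{
  return fpscr << 22;
}
```

With msb-0 numbering, bit 56 has value 1 << 7 and bit 34 has value
1 << 29, so a left shift by 22 lines the two up.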

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFC: [Patch, PR Bug 60818] - ICE in validate_condition_mode on powerpc*-linux-gnu* ]

2016-02-18 Thread Alan Modra
On Thu, Feb 18, 2016 at 03:43:07AM -0600, Segher Boessenkool wrote:
> Either combine should delete the note (my current patch), or it can

Works for me.  I'm not sure I'd want to promise that combine won't
ever create what you call "invalid RTL", in notes.

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH] decl alignment not respected

2016-03-01 Thread Alan Modra
This patch cures a problem with ICF of read-only variables at the
intersection of -fsection-anchors, -ftree-loop-vectorize, and targets
with alignment restrictions.  The testcase results in
/usr/local/powerpc64le-linux/bin/ld: pack.o: In function `main':
pack.c:(.text.startup+0xc): error: R_PPC64_TOC16_LO_DS not a multiple of 4
/usr/local/powerpc64le-linux/bin/ld: final link failed: Bad value
on powerpc64le-linux.

What happens is:
- "c" is referenced in a constructor, thus make_decl_rtl for "c",
- make_decl_rtl puts "c" in an anchor block (-fsection-anchors),
- anchor block contents can't move, so "c" alignment can't change by
  ipa_increase_alignment (-ftree-loop-vectorize),
- however "a" alignment can be increased,
- ICF aliases "a" to "c".
So we have a decl for "a" saying it is aligned to 128 bits, using mem
for "c" which is only 16 bit aligned.  The supposed increased
alignment causes "a" to be accessed as a 64-bit word, and the
powerpc64 backend to use a ds-form addressing mode.

Somewhere this chain of events needs to be broken.  It isn't possible
to stop ipa_increase_alignment changing "a", because at that stage
ICF aliases don't exist.  So it seemed to me that ICF needed to be
taught not to create a problematic alias.  Not being very familiar
with this code, I don't know if the following is the best place to
change, but sem_variable::merge does throw away aliases for quite a
lot of other reasons.  Another possibility is
sem_variable::equals_wpa, where there's a comment about DECL_ALIGN
being safe to merge "because we will always choose the largest
alignment out of all aliases".  Except with this testcase we don't
choose the largest alignment, and indeed can't (I think) due to "c"
being used in a constructor.
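
A minimal C sketch of the two conditions involved, assuming alignments
are given in bits as with DECL_ALIGN.  Both helper names are
hypothetical and only model the idea; they are not part of ipa-icf.c or
the linker.

```c
#include <stdbool.h>

/* The alias ends up using the original's storage, so that storage must
   be at least as aligned as the alias claims to be.  */
bool
alias_alignment_ok (unsigned int original_align, unsigned int alias_align)
{
  return original_align >= alias_align;
}

/* The linker error above requires the displacement used with
   R_PPC64_TOC16_LO_DS to be a multiple of 4.  */
bool
ds_form_offset_ok (long offset)
{
  return offset % 4 == 0;
}
```

For the testcase, alias_alignment_ok (16, 128) is false: "c" is only
16-bit aligned while "a" claims 128 bits, so the merge should be
refused.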

Bootstrapped and regression tested powerpc64le-linux.  OK to apply?

gcc/
PR ipa/69990
* ipa-icf.c (sem_variable::merge): Do not merge an alias with
larger alignment.
gcc/testsuite/
* gcc.dg/pr69990.c: New.

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index ef04c55..d82eb87 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -2209,6 +2209,16 @@ sem_variable::merge (sem_item *alias_item)
 "adress of original and alias may be compared.\n\n");
   return false;
 }
+
+  if (DECL_ALIGN (original->decl) < DECL_ALIGN (alias->decl))
+{
+  if (dump_file)
+   fprintf (dump_file, "Not unifying; "
+"original and alias have incompatible alignments\n\n");
+
+  return false;
+}
+
   if (DECL_COMDAT_GROUP (original->decl) != DECL_COMDAT_GROUP (alias->decl))
 {
   if (dump_file)
diff --git a/gcc/testsuite/gcc.dg/pr69990.c b/gcc/testsuite/gcc.dg/pr69990.c
new file mode 100644
index 000..efb835e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69990.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-require-effective-target section_anchors } */
+/* { dg-options "-O2 -fsection-anchors -ftree-loop-vectorize" } */
+
+#pragma pack(1)
+struct S0 {
+  volatile int f0:12;
+} static a[] = {{15}}, c[] = {{15}};
+
+struct S0 b[] = {{7}};
+
+int __attribute__ ((noinline, noclone))
+ok (int a, int b, int c)
+{
+  return a == 15 && b == 7 && c == 15 ? 0 : 1;
+}
+
+int
+main (void)
+{
+  struct S0 *f[] = { c, b };
+
+  return ok (a[0].f0, b[0].f0, f[0]->f0);
+}


[RFC] PR69195, Reload confused by invalid reg equivs

2016-03-04 Thread Alan Modra
ded_label_ref;
 }
 
@@ -3882,11 +3914,12 @@ setup_reg_equiv (void)
   {
next_elem = elem->next ();
insn = elem->insn ();
-   set = single_set (insn);

/* Init insns can set up equivalence when the reg is a destination or
   a source (in this case the destination is memory).  */
-   if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
+   if (! insn->deleted ()
+   && (set = single_set (insn)) != 0
+   && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
  {
if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL)
  {
@@ -5090,7 +5123,6 @@ ira (FILE *f)
 {
   bool loops_p;
   int ira_max_point_before_emit;
-  int rebuild_p;
   bool saved_flag_caller_saves = flag_caller_saves;
   enum ira_region saved_flag_ira_region = flag_ira_region;
 
@@ -5184,10 +5216,9 @@ ira (FILE *f)
   if (resize_reg_info () && flag_ira_loop_pressure)
 ira_set_pseudo_classes (true, ira_dump_file);
 
-  rebuild_p = update_equiv_regs ();
-  setup_reg_equiv ();
-  setup_reg_equiv_init ();
-
+  pdx_subregs = XCNEWVEC (bool, max_regno);
+  reg_equiv = XCNEWVEC (struct equivalence, max_regno);
+  bool rebuild_p = update_equiv_regs ();
   bool update_regstat = false;
 
   if (optimize && rebuild_p)
@@ -5210,6 +5241,14 @@ ira (FILE *f)
   update_regstat = true;
 }
 
+  if (update_regstat)
+validate_equiv_regs ();
+  free (reg_equiv);
+  free (pdx_subregs);
+
+  setup_reg_equiv ();
+  setup_reg_equiv_init ();
+
   /* It is not worth to do such improvement when we use a simple
  allocation because of -O0 usage or because the function is too
  big.  */
diff --git a/gcc/testsuite/gcc.dg/pr69195.c b/gcc/testsuite/gcc.dg/pr69195.c
new file mode 100644
index 000..af373a1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69195.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-dce -fno-forward-propagate" } */
+
+void __attribute__ ((noinline, noclone))
+foo (int *a, int n)
+{
+  int *lasta = a + n;
+  for (; a != lasta; a++)
+{
+  *a *= 2;
+  a[1] = a[-1] + a[-2];
+}
+}
+
+int
+main ()
+{
+  int a[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
+  int r[16] = { 1, 2, 6, 6, 16, 24, 44, 80,
+   136, 248, 432, 768, 1360, 2400, 4256, 3760 };
+  unsigned i;
+  foo (&a[2], 13);
+  for (i = 0; i < 8; ++i)
+if (a[i] != r[i])
+  __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr69238.c b/gcc/testsuite/gcc.dg/pr69238.c
new file mode 100644
index 000..3538e63
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69238.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-forward-propagate -fno-rerun-cse-after-loop -funroll-loops" } */
+
+
+#define N 32
+
+short sa[N];
+short sb[N];
+int ia[N];
+int ib[N];
+
+int __attribute__ ((noinline, noclone))
+main1 (int n)
+{
+  int i;
+  for (i = 0; i < n; i++)
+{
+  sa[i+7] = sb[i];
+  ia[i+3] = ib[i+1];
+}
+  return 0;
+}
+
+int
+main (void)
+{ 
+  return main1 (N-7);
+}

-- 
Alan Modra
Australia Development Lab, IBM


Re: gcc's obvious patch policy

2013-11-26 Thread Alan Modra
On Tue, Nov 26, 2013 at 04:30:50PM -0500, David Edelsohn wrote:
> >> Sorry to pick on you here Steven, but this doesn't meet gcc's
> >> definition of an obvious patch.  Don't believe me?
> 
> > No.  I don't, let me quote from the policy:
> 
> I find this whole thread a rather sad and pathetic bikeshed
> discussion. Regardless of the formal policy, the basic concept is to
> use common sense.  Common sense about the context of the code being
> changed, common sense about the patch itself, and common sense about
> the maintenance area and the maintainers.
> 
> Anything more than that is people trying to create / change rules as a
> stick to hit each other over the head or a straight jacket to tie each
> other up.

I find this a bit rich coming from you, David.  On the weekend I
committed a patch as obvious, for which you "hit me over the head",
stating in no uncertain terms that I should not bypass you and commit
patches like that as "obvious".  I still think the substance of the
patch was obvious for anyone who has worked on the powerpc backend for
as long as I have, but after some discussion I backed down because
technically, you were within your rights and I had transgressed the
rules.

You have the stick *now*.  And wield it.  I'm trying to take it away
from you..

-- 
Alan Modra
Australia Development Lab, IBM


Re: gcc's obvious patch policy

2013-11-26 Thread Alan Modra
On Tue, Nov 26, 2013 at 04:56:26PM -0500, Robert Dewar wrote:
> To me the issue is not what is written down about
> the policy, but whether the policy works in practice,
> and it seems like it does, so what's the problem?
> 
> This just seems to be making a problem where
> none exists.

I gave some background in my email to David over why I'm stirring the
pot here.

The thing about written policy is that it sets the tone for a project.
A restrictive policy tends to authoritarian rule by maintainers, it
seems to me.  A less restrictive policy ought to ease some of the
nonsense that goes on currently, for instance, port maintainers
thinking they need to get global maintainer permission for trivial
patches outside their area of responsibility.  As an example, for
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02793.html I was told
I'd need global maintainer approval..  Well, maybe I do in the current
climate.

I hope I haven't offended the review gods too much here.  I'm sure
other people have noticed the issues I'm raising but have, more wisely
than I, kept quiet.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PING][PATCH] LRA: check_rtl modifies RTL instruction stream

2013-11-28 Thread Alan Modra
On Wed, Nov 20, 2013 at 11:18:49AM -0700, Jeff Law wrote:
> >2013-11-13  Robert Suchanek  
> >
> > * lra.c (lra): Set lra_in_progress before check_rtl call.
> > * recog.c (insn_invalid_p): Add !lra_in_progress to prevent
> > adding clobber regs when LRA is running

Trying to run the testsuite with -mlra and the default -mcmodel=medium
on powerpc64 now results in enormous numbers of failures like the
following.

/home/alanm/src/gcc-virgin/libatomic/testsuite/libatomic.c/atomic-exchange-1.c:67:1:
 error: insn does not satisfy its constraints:
 }
 ^
(insn 5 2 6 2 (set (reg/f:DI 212)
(mem/u/c:DI (unspec:DI [
(symbol_ref/u:DI ("*.LC0") [flags 0x2])
(reg:DI 2 2)
] UNSPEC_TOCREL) [0 S8 A8])) 
/home/alanm/src/gcc-virgin/libatomic/testsuite/libatomic.c/atomic-exchange-1.c:14
 505 {*movdi_internal64}
 (expr_list:REG_EQUAL (symbol_ref:DI ("v") )
(nil)))

This is due to that innocuous seeming change of setting
lra_in_progress before calling check_rtl(), in combination with
previous changes Vlad made to the rs6000 backend here:
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html
In particular the "Call legitimate_constant_pool_address_p in strict
mode for LRA" change, that sets "strict" when lra_in_progress.

I'm not at all familiar with lra so why Vlad made those changes to
rs6000.c is totally opaque to me.  If this were a reload problem I
could dive in and fix it, but not lra, sorry..

What I can say is that the rtl shown above is a toc reference of the
form that is valid for -mcmodel=small both before and after reload,
and generates "ld offset(r2)" machine instructions.  The form is valid
for -mcmodel=medium/large only before reload.  After reload it is
supposed to be split into high/lo_sum variants that generate
"addis rtmp,offset@ha(r2); ld offset@l(rtmp)".

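For reference, the split into @ha and @l parts can be sketched in C as
below.  The function names are invented for the example; the point is
only that the low part is a sign-extended 16-bit value, so the high
part has to be adjusted to compensate.

```c
#include <stdint.h>

/* Low 16 bits of OFFSET, interpreted as a signed 16-bit quantity,
   which is how the ld displacement is treated.  */
int32_t
toc_lo (int32_t offset)
{
  int32_t low = (int32_t) ((uint32_t) offset & 0xffffu);
  return low >= 0x8000 ? low - 0x10000 : low;
}

/* High-adjusted part: chosen so that
   toc_ha (x) * 65536 + toc_lo (x) == x.  */
int32_t
toc_ha (int32_t offset)
{
  return (int32_t) (((int64_t) offset - toc_lo (offset)) / 65536);
}
```

For an offset of 0x12348000 the low part is -32768, so the high part is
0x1235 rather than 0x1234.
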
-- 
Alan Modra
Australia Development Lab, IBM


Re: LRA vs reload on powerpc: 2 extra FAILs that are actually improvements?

2013-11-30 Thread Alan Modra
> On Sat, Nov 2, 2013 at 6:48 PM, Steven Bosscher  wrote:
> > The failure of pr53199.c is because of different instruction selection
> > for bswap. Test case is reduced to just one function:
[snip]
> > Is this an improvement or a regression? If it's an improvement then
> > these two test cases should be adjusted :-)

As David said, going through memory is bad, we get a load-hit-store
flush.  Definitely a regression on power7.  Does anyone know why the
bswapdi2_64bit r,r alternative is disparaged?  Seems like it has been
that way since the original mainline commit.

int main (void)
{
  int i;
  long ret = 0;
  long tmp1, tmp2, tmp3;

  for (i = 0; i < 10; i++)
#if MEM == 1
/* From pr53199.c reg_reverse, -mlra -mcpu=power6 -mtune=power7.  */
__asm__ __volatile__ ("\
addi %1,1,-16\n\
srdi %3,%0,32\n\
li %2,4\n\
stwbrx %0,0,%1\n\
stwbrx %3,%2,%1\n\
ld %0,-16(1)" : "+r" (ret), "=&b" (tmp1), "=&r" (tmp2), "=&r" (tmp3));
#elif MEM == 2
/* From pr53199.c reg_reverse, -mlra -mcpu=power6.  */
__asm__ __volatile__ ("\
addi %1,1,-16\n\
srdi %3,%0,32\n\
addi %2,%1,4\n\
stwbrx %0,0,%1\n\
stwbrx %3,0,%2\n\
ld %0,-16(1)" : "+r" (ret), "=&b" (tmp1), "=&b" (tmp2), "=&r" (tmp3));
#elif MEM == 3
/* From pr53199.c reg_reverse, -mlra -mcpu=power7.  */
__asm__ __volatile__ ("\
std %0,-16(1)\n\
addi %1,1,-16\n\
ldbrx %0,0,%1\n" : "+r" (ret), "=&b" (tmp1));
#else
__asm__ __volatile__ ("\
srdi %1,%0,32\n\
rlwinm %2,%0,8,0x\n\
rlwinm %3,%1,8,0x\n\
rlwimi %2,%0,24,0,7\n\
rlwimi %2,%0,24,16,23\n\
rlwimi %3,%1,24,0,7\n\
rlwimi %3,%1,24,16,23\n\
sldi %2,%2,32\n\
or %2,%2,%3\n\
mr %0,%2" : "+r" (ret), "=&r" (tmp1), "=&r" (tmp2), "=&r" (tmp3));
#endif
  return ret;
}

/*
amodra@bns:~> gcc -O2 bswap_mem.c
amodra@bns:~> time ./a.out

real    0m3.096s
user    0m3.089s
sys     0m0.001s
amodra@bns:~> time ./a.out

real    0m3.096s
user    0m3.094s
sys     0m0.002s
amodra@bns:~> gcc -O2 -DMEM=1 bswap_mem.c
amodra@bns:~> time ./a.out

real    0m12.661s
user    0m12.657s
sys     0m0.003s
amodra@bns:~> time ./a.out

real    0m12.660s
user    0m12.657s
sys     0m0.003s
amodra@bns:~> gcc -O2 -DMEM=2 bswap_mem.c
amodra@bns:~> time ./a.out

real    0m12.660s
user    0m12.657s
sys     0m0.003s
amodra@bns:~> time ./a.out

real    0m12.660s
user    0m12.657s
sys     0m0.004s
amodra@bns:~> gcc -O2 -DMEM=3 bswap_mem.c
amodra@bns:~> time ./a.out

real    0m10.279s
user    0m10.276s
sys     0m0.003s
amodra@bns:~> time ./a.out

real    0m10.279s
user    0m10.276s
sys     0m0.003s

I also looked at the register version and the -DMEM=1 case with power7
simulators, finding that the register version had a delay of 12 cycles
from completion of the first instruction to completion of the last.
The -DMEM=1 case had a corresponding delay of 49 cycles, which matches
the loop timing above quite well.
*/
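
For readers who do not read PowerPC assembly, the register-only variant
amounts to the following portable C, given here as a sketch rather than
what the compiler emits.  The function names are made up for the
example.

```c
#include <stdint.h>

/* Byte-reverse one 32-bit word.  */
static uint32_t
swap32 (uint32_t x)
{
  return ((x >> 24) | ((x >> 8) & 0xff00u)
          | ((x << 8) & 0xff0000u) | (x << 24));
}

/* Byte-reverse a 64-bit value without going through memory: split it
   into halves, reverse each half, then recombine them swapped.  */
uint64_t
swap64_by_halves (uint64_t x)
{
  uint32_t hi = (uint32_t) (x >> 32);   /* like srdi */
  uint32_t lo = (uint32_t) x;

  /* like sldi followed by or */
  return ((uint64_t) swap32 (lo) << 32) | swap32 (hi);
}
```

Everything stays in registers here, which is why this form avoids the
load-hit-store penalty measured for the memory variants.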

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PowerPC] libffi fixes and support for PowerPC64 ELFv2

2013-12-01 Thread Alan Modra
Ping http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02793.html

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PING][PATCH] LRA: check_rtl modifies RTL instruction stream

2013-12-03 Thread Alan Modra
On Mon, Dec 02, 2013 at 11:04:39PM -0700, Jeff Law wrote:
> On 11/28/13 16:50, Alan Modra wrote:
> >
> >This is due to that innocuous seeming change of setting
> >lra_in_progress before calling check_rtl(), in combination with
> >previous changes Vlad made to the rs6000 backend here:
> >http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html
> >In particular the "Call legitimate_constant_pool_address_p in strict
> >mode for LRA" change, that sets "strict" when lra_in_progress.
> Is this still an issue?

No, the changes Vlad made fixed the powerpc problem.  Thanks!

> That code has gone through a couple revisions.  Robert's change was
> reverted and Vlad twiddled thigns to use recog_memoized instead of
> insn_invalid_p which prevents check_rtl from incorrectly adding
> CLOBBERs after the point where an insn's form is supposed to be
> fixed.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PowerPC] libffi fixes and support for PowerPC64 ELFv2

2013-12-03 Thread Alan Modra
On Tue, Dec 03, 2013 at 09:05:53AM -0500, David Edelsohn wrote:
> On Thu, Nov 21, 2013 at 9:57 PM, Alan Modra  wrote:
> > David,
> > Here comes the inevitable followup..  I broke backwards compatibility
> > when adding an extra field to ffi_cif.  I'd like to import again from
> > upstream, where I've already fixed the problem.
> >
> > https://sourceware.org/ml/libffi-discuss/2013/msg00220.html
> >
> > Actually, it's not a straight import because many files outside of
> > libffi/src/powerpc/ have diverged, but fortunately for me, not
> > significantly.  For the record, I've shown the files that need
> > patching below.  Identical patches went in upstream (except for
> > formatting differences in Makefile.am).  Bootstrapped etc.
> > powerpc64-linux and powerpc64le-linux.  OK to apply?
> >
> > libffi/
> > * src/powerpc/ffitarget.h: Import from upstream.
> > * src/powerpc/ffi_powerpc.h: Likewise.
> > * src/powerpc/ffi.c: Likewise.
> > * src/powerpc/ffi_sysv.c: Likewise.
> > * src/powerpc/ffi_linux64.c: Likewise.
> > * src/powerpc/sysv.S: Likewise.
> > * src/powerpc/ppc_closure.S: Likewise.
> > * src/powerpc/linux64.S: Likewise.
> > * src/powerpc/linux64_closure.S: Likewise.
> > * src/types.c: Likewise.
> > * Makefile.am (EXTRA_DIST): Add new src/powerpc files.
> > (nodist_libffi_la_SOURCES ): Likewise.
> > * configure.ac (HAVE_LONG_DOUBLE_VARIANT): Define for powerpc.
> > * include/ffi.h.in (ffi_prep_types): Declare.
> > * src/prep_cif.c (ffi_prep_cif_core): Call ffi_prep_types.
> > * configure: Regenerate.
> > * fficonfig.h.in: Regenerate.
> > * Makefile.in: Regenerate.
> > * man/Makefile.in: Regenerate.
> > * include/Makefile.in: Regenerate.
> > * testsuite/Makefile.in: Regenerate.
> 
> Have you tested this patch on targets other than powerpc-linux? Have
> you tested this patch on AIX?

I haven't tested on AIX or Darwin, sorry.  If you find any problems on
AIX, please let me know and I'll fix them.  I have tested on
powerpc-linux, powerpc64-linux, powerpc64le-linux and powerpc-freebsd,
which are the targets most affected by this change.  I have also
bootstrapped and regression tested x86_64-linux gcc with this patch
applied.

-- 
Alan Modra
Australia Development Lab, IBM


Two build != host fixes

2013-12-03 Thread Alan Modra
When build != host, the build compiler obviously isn't the same as the
host compiler, but BUILD_CXXFLAGS is currently ALL_CXXFLAGS.
ALL_CXXFLAGS contains flags for which we run configure tests on the
host compiler, which may not work on the build compiler.  I ran into
this with -Wno-narrowing.  It looks like this problem was hit and
solved before, when we used to build gcc with a C compiler, but the
solution needs extending now that we build with a C++ compiler.

The second problem is that GMPINC (specifying a host include path) may
not be correct for the build compiler.  I found this caused a lot of
build configure failures when attempting a build=powerpc64-linux,
host=powerpc64le-linux, target=powerpc64le-linux toolchain due to
mismatches in various function declarations between build and host.
The configure problems caused a later build failure.  (Hint for anyone
trying to debug canadian crosses: comment out "rm -rf $tempdir" just a
few lines below the first hunk in the patch below to see the build
config.log.)

The other little tweak here is to omit saving CFLAGS.  CFLAGS is set
here only in the environment of the called command, not the caller
shell.  So there is no need to save and restore.

Bootstrapped etc. powerpc64-linux.  OK mainline and 4.8 branch?

* configure.ac (BUILD_CXXFLAGS): Don't use ALL_CXXFLAGS for
build != host.
: Clear GMPINC.  Don't bother
saving CFLAGS.
* configure: Regenerate.

Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 205341)
+++ gcc/configure.ac(working copy)
@@ -1529,13 +1529,11 @@
/* | [A-Za-z]:[\\/]* ) realsrcdir=${srcdir};;
*) realsrcdir=../${srcdir};;
esac
-   saved_CFLAGS="${CFLAGS}"
CC="${CC_FOR_BUILD}" CFLAGS="${CFLAGS_FOR_BUILD}" \
-   LDFLAGS="${LDFLAGS_FOR_BUILD}" \
+   LDFLAGS="${LDFLAGS_FOR_BUILD}" GMPINC="" \
${realsrcdir}/configure \
--enable-languages=${enable_languages-all} \
--target=$target_alias --host=$build_alias --build=$build_alias
-   CFLAGS="${saved_CFLAGS}"
 
# We just finished tests for the build machine, so rename
# the file auto-build.h in the gcc directory.
@@ -1900,6 +1898,7 @@
 if test x$build != x$host || test "x$coverage_flags" != x
 then
 BUILD_CFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CFLAGS_FOR_BUILD)'
+BUILD_CXXFLAGS='$(INTERNAL_CFLAGS) $(T_CFLAGS) $(CXXFLAGS_FOR_BUILD)'
 BUILD_LDFLAGS='$(LDFLAGS_FOR_BUILD)'
 fi
 

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-03 Thread Alan Modra
On Tue, Dec 03, 2013 at 11:44:46PM -0500, DJ Delorie wrote:
> Alan Modra  writes:
> > Bootstrapped etc. powerpc64-linux.  OK mainline and 4.8 branch?
> >
> > * configure.ac (BUILD_CXXFLAGS) Don't use ALL_CXXFLAGS for
> > build != host.
> > : Clear GMPINC.  Don't bother
> > saving CFLAGS.
> > * configure: Regenerate.
> 
> Ok for mainline, up to the 4.8 release manager if it's OK there but it
> looks OK to me.
> 
> Do we need to add a CXXFLAGS= to that configure too?

We basically need whatever is used to make decisions for auto-host.h
(auto-build.h) contents.  I hadn't found CXXFLAGS necessary, but that
might just be similarity of build to host in my case or simply lack of
noticing a configury error..  I guess adding CXXFLAGS might be a good
idea for future-proofing.

Hmm, this is opening up a whole can of worms.  auto-host.h definitely
contains info about assembler capabilities, so to get a "good"
auto-build.h we really ought to set AS=$AS_FOR_BUILD too.  Similar
reasoning applies to LD= for all the HAVE_LD_* macros.  Whether a
"good" auto-build.h makes any difference over a "bad" one is another
question.

Maybe we should use most of BUILD_EXPORTS in the top level
Makefile.in?  What can go wrong with that? :)

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-04 Thread Alan Modra
On Wed, Dec 04, 2013 at 04:36:58PM +1030, Alan Modra wrote:
> Maybe we should use most of BUILD_EXPORTS in the top level
> Makefile.in?  What can go wrong with that? :)

I had a look at this, as it's easy to do, but I didn't find any
significant bug to justify such a change in stage3.  So I've committed
the original patch as posted, rev 205690.

I did run into other problems in the process:  Running configure in
the gcc directory rather than at the top level is apparently a bad
idea nowadays for a canadian cross.  For example, absent an
AS_FOR_TARGET or DEFAULT_ASSEMBLER definition, configure doesn't find
an assembler..

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFA: patch to fix 2 testsuite failures for LRA on PPC

2013-12-06 Thread Alan Modra
On Fri, Dec 06, 2013 at 05:23:28PM -0500, Vladimir Makarov wrote:
> On 12/6/2013, 2:40 PM, David Edelsohn wrote:
> >On Fri, Dec 6, 2013 at 2:02 PM, Vladimir Makarov  wrote:
> >> * config/rs6000/rs600.md (*bswapdi2_64bit): Remove ?? from the
> >> constraint.
> >
> >Okay, let's just remove the "??" modifier from the constraint.
> >
> >Thanks for your patience, explanations, and work on this, Vlad.
> >
> 
> Thanks, David.
> 
> Committed as rev. 205765.

/* -m32 -O2 -S -mlra */
long long swap64 (long long x)
{
  return __builtin_bswap64 (x);
}

Here too, I think.  OK to apply David?

* config/rs6000/rs6000.md (bswapdi2_32bit): Remove ?? from the
constraint.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 205767)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -2544,7 +2544,7 @@
 }")
 
 (define_insn "bswapdi2_32bit"
-  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,??&r")
+  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,&r")
(bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r")))
(clobber (match_scratch:SI 2 "=&b,&b,X"))]
   "!TARGET_POWERPC64 && (REG_P (operands[0]) || REG_P (operands[1]))"

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-08 Thread Alan Modra
On Tue, Dec 03, 2013 at 11:44:46PM -0500, DJ Delorie wrote:
> Alan Modra  writes:
> > Bootstrapped etc. powerpc64-linux.  OK mainline and 4.8 branch?
> >
> > * configure.ac (BUILD_CXXFLAGS) Don't use ALL_CXXFLAGS for
> > build != host.
> > : Clear GMPINC.  Don't bother
> > saving CFLAGS.
> > * configure: Regenerate.
> 
> Ok for mainline, up to the 4.8 release manager if it's OK there but it
> looks OK to me.

http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00304.html
Ping for 4.8?

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-11 Thread Alan Modra
On Wed, Dec 11, 2013 at 12:10:04PM +0100, Bernd Edlinger wrote:
> Hi,
> 
> I'm having problems with that patch.

Sorry to hear that.

> I try to start at X86_64-linux-gnu, and I want to get the GCC running on 
> arm-linux-gnueabihf.
> I grabbed system headers and libraries from the target and put it in the 
> prefix path.
> 
> In the first step I do
> 
> ../gcc-4.9-20131208/configure 
> --prefix=/home/ed/gnu/arm-linux-gnueabihf-linux64 
> --target=arm-linux-gnueabihf --enable-languages=c,c++,fortran 
> --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 
> --with-float=hard
> 
> This GCC runs on PC and generates arm-linux-gnueabihf executables.
> 
> Then I try this
> 
> ../gcc-4.9-20131208/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf-cross 
> --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf 
> --enable-languages=c,c++,fortran --with-arch=armv7-a --with-tune=cortex-a9 
> --with-fpu=vfpv3-d16 --with-float=hard
> 
> But It fails because auto-build.h contains nonsense. That is probably because 
> almost every check
> has a fatal error #include  not found.
> 
> I personally prefer to have gmp, mpfr, mpc in-tree (using 
> contrib/download_prerequisites).
> 
> I experimented a bit and at least this attached patch improves the situation 
> for me.
> 
> Maybe I never had any problems with GMP before, because the in-tree 
> configuration of GMP does -DNO_ASM ?

GMPINC really shouldn't be used to find build headers, since it is used
to find host headers.  See the top level Makefile.in.  When gmp has
been installed, using GMPINC means you pull in a whole lot of host
headers for the build compiler.  Which might work in rare cases, but
it's a lot more likely to fail.  Even with in-tree gmp, how do you get
things like GMP_LIMB_BITS correct if your build machine is 64-bit and
your host is 32-bit?  (Perhaps there is some build magic that allows
this to work, I'll investigate when I get back from vacation.)

Incidentally, we've been using a couple of other patches for
build != host that I haven't posted because I wasn't sure who authored
them.  It's possible the first one might help you.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index aad927c..7995e64 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -747,7 +747,8 @@ BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)
 
 # Native linker and preprocessor flags.  For x-fragment overrides.
 BUILD_LDFLAGS=@BUILD_LDFLAGS@
-BUILD_CPPFLAGS=$(ALL_CPPFLAGS)
+BUILD_CPPFLAGS= -I. -I$(@D) -I$(srcdir) -I$(srcdir)/$(@D) \
+  -I$(srcdir)/../include $(CPPINC)
 
 # Actual name to use when installing a native compiler.
 GCC_INSTALL_NAME := $(shell echo gcc|sed '$(program_transform_name)')
diff --git a/gcc/ada/gcc-interface/Make-lang.in b/gcc/ada/gcc-interface/Make-lang.in
index 57f9009..e1d3ed6 100644
--- a/gcc/ada/gcc-interface/Make-lang.in
+++ b/gcc/ada/gcc-interface/Make-lang.in
@@ -625,7 +625,7 @@ ada.tags: force
 ada/doctools/xgnatugn$(build_exeext): ada/xgnatugn.adb
-$(MKDIR) ada/doctools
$(CP) $^ ada/doctools
-   cd ada/doctools && $(GNATMAKE) -q xgnatugn
+   cd ada/doctools && gnatmake -q xgnatugn
 
 # Note that doc/gnat_ugn.texi and doc/projects.texi do not depend on
 # xgnatugn being built so we can distribute a pregenerated doc/gnat_ugn.info
diff --git a/gnattools/Makefile.in b/gnattools/Makefile.in
index 794d374..6b0d5e8 100644
--- a/gnattools/Makefile.in
+++ b/gnattools/Makefile.in
@@ -23,6 +23,7 @@ SHELL = @SHELL@
 srcdir = @srcdir@
 libdir = @libdir@
 build = @build@
+host = @host@
 target = @target@
 prefix = @prefix@
 INSTALL = @INSTALL@
@@ -31,6 +32,7 @@ INSTALL_PROGRAM = @INSTALL_PROGRAM@
 
 # Nonstandard autoconf-set variables.
 LN_S=@LN_S@
+host_alias=@host_alias@
 target_noncanonical=@target_noncanonical@
 
 # Variables for the user (or the top level) to override.
@@ -183,7 +185,11 @@ regnattools: $(GCC_DIR)/stamp-gnatlib-rts
 # put the host RTS dir first in the PATH to hide the default runtime
 # files that are among the sources
 # FIXME: This should be done in configure.
+ifeq ($(host), $(build))
 RTS_DIR:=$(strip $(subst \,/,$(shell gnatls -v | grep adalib )))
+else
+RTS_DIR:=$(strip $(subst \,/,$(shell $(host_alias)-gnatls -v | grep adalib )))
+endif
 gnattools-cross: $(GCC_DIR)/stamp-tools
# gnattools1-re
$(MAKE) -C $(GCC_DIR)/ada/tools -f ../Makefile \

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-11 Thread Alan Modra
On Wed, Dec 11, 2013 at 02:11:49PM +0100, Bernd Edlinger wrote:
> We need the auto-build only to build something that translates .md files to 
> .c,
> so I would'nt care about GMP, but some other things, like the right prototype 
> for
> printf make a difference.

Right, but when you get some of the HAVE_* wrong, libiberty for the
build compiler provides some of the "missing" functions and
declarations.  The declarations can clash with system header
declarations giving you bootstrap failures for no good reason..

> > diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> > index aad927c..7995e64 100644
> > --- a/gcc/Makefile.in
> > +++ b/gcc/Makefile.in
> > @@ -747,7 +747,8 @@ BUILD_LINKERFLAGS = $(BUILD_CXXFLAGS)
> >
> > # Native linker and preprocessor flags. For x-fragment overrides.
> > BUILD_LDFLAGS=@BUILD_LDFLAGS@
> > -BUILD_CPPFLAGS=$(ALL_CPPFLAGS)
> > +BUILD_CPPFLAGS= -I. -I$(@D) -I$(srcdir) -I$(srcdir)/$(@D) \
> > + -I$(srcdir)/../include $(CPPINC)
> >
> 
> I did not have this one.
> What is it good for?

It fixes another case of host header directories being searched for
the build compiler.  Important when GMPINC and other *INC point at
installed locations for the host compiler.  Trouble is, you don't just
get the headers you want (eg. gmp.h) but all the other host headers
too.

-- 
Alan Modra
Australia Development Lab, IBM


Re: question about REG_PARM_STACK_SPACE usage in expand_call

2013-12-14 Thread Alan Modra
On Sat, Dec 14, 2013 at 09:14:34PM +0100, Tom de Vries wrote:
> I wonder if OUTGOING_REG_PARM_STACK_SPACE makes a difference here.
> 
> If OUTGOING_REG_PARM_STACK_SPACE == 0, it is the responsibility of
> the callee to allocate the area reserved for arguments passed in
> registers. AFAIU, both functions a and b would do that in their own
> stack frame, and there's no need to test for reg_parm_stack_space !=
> REG_PARM_STACK_SPACE (current_function_decl).
> 
> If OUTGOING_REG_PARM_STACK_SPACE != 0, it is the responsibility of
> the caller to allocate the area reserved for arguments passed in
> registers. Which means that function a and b share the space
> allocated by the caller of function a.
> AFAIU, what is required is reg_parm_stack_space <=
> REG_PARM_STACK_SPACE (current_function_decl).

Hi Tom, I happened to be looking at this code a few weeks ago as part
of the PowerPC64 ELFv2 ABI work.  I missed seeing the fndecl /
current_function_decl bug, but came to the same conclusion as
you do above regarding reg_parm_stack_space.  In fact, I think you can
go a little further.  Not all changes in OUTGOING_REG_PARM_STACK_SPACE
are bad.  If the current function has OUTGOING_REG_PARM_STACK_SPACE
non-zero then a sibcall to a function with
OUTGOING_REG_PARM_STACK_SPACE zero ought to be OK.  So..

 #ifdef REG_PARM_STACK_SPACE
-  /* If outgoing reg parm stack space changes, we can not do sibcall.  */
-  || (OUTGOING_REG_PARM_STACK_SPACE (funtype)
- != OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl)))
-  || (reg_parm_stack_space != REG_PARM_STACK_SPACE (fndecl))
+  || (OUTGOING_REG_PARM_STACK_SPACE (funtype)
+ && (!OUTGOING_REG_PARM_STACK_SPACE (TREE_TYPE (current_function_decl))
+ || (reg_parm_stack_space
+ > REG_PARM_STACK_SPACE (current_function_decl))))
 #endif
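
Spelled out as a standalone predicate, the revised test reads roughly as
follows.  This is only a sketch: the function and parameter names are
hypothetical stand-ins for the macro values used in expand_call, not GCC code.

```c
#include <stdbool.h>

/* Hypothetical model of the revised sibcall check.  callee_outgoing and
   caller_outgoing stand in for OUTGOING_REG_PARM_STACK_SPACE of the called
   function type and of the current function; callee_space and caller_space
   stand in for the two REG_PARM_STACK_SPACE sizes in bytes.  */
bool
sibcall_blocked_by_reg_parm_space (bool callee_outgoing,
                                   bool caller_outgoing,
                                   int callee_space,
                                   int caller_space)
{
  /* A callee that allocates its own area (callee_outgoing false) is
     always fine.  Otherwise the caller of the current function must have
     provided an area, and it must be at least as large as the callee
     expects.  */
  return callee_outgoing
         && (!caller_outgoing || callee_space > caller_space);
}
```

With this model, going from a non-zero OUTGOING_REG_PARM_STACK_SPACE to a
zero one never blocks the sibcall, which is the relaxation described above.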

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] bswapdi2 pattern, reload and lra

2013-12-17 Thread Alan Modra
 (word1, src_si));
+  emit_insn (gen_bswapsi2 (word2, op3_si));
 }
   else
 {
-  word_high = change_address (dest, SImode, addr2);
-  word_low  = change_address (dest, SImode, addr1);
+  emit_insn (gen_bswapsi2 (word2, src_si));
+  emit_insn (gen_bswapsi2 (word1, op3_si));
 }
-  emit_insn (gen_bswapsi2 (word_high, src_si));
-  emit_insn (gen_bswapsi2 (word_low, op3_si));
 }")
 
 (define_split
   [(set (match_operand:DI 0 "gpc_reg_operand" "")
(bswap:DI (match_operand:DI 1 "gpc_reg_operand" "")))
(clobber (match_operand:DI 2 "gpc_reg_operand" ""))
-   (clobber (match_operand:DI 3 "gpc_reg_operand" ""))
-   (clobber (match_operand:DI 4 "" ""))]
+   (clobber (match_operand:DI 3 "gpc_reg_operand" ""))]
   "TARGET_POWERPC64 && reload_completed"
   [(const_int 0)]
   "
@@ -2544,7 +2540,7 @@
 }")
 
 (define_insn "bswapdi2_32bit"
-  [(set (match_operand:DI 0 "reg_or_mem_operand" "=&r,Z,&r")
+  [(set (match_operand:DI 0 "reg_or_mem_operand" "=r,Z,?&r")
(bswap:DI (match_operand:DI 1 "reg_or_mem_operand" "Z,r,r")))
(clobber (match_scratch:SI 2 "=&b,&b,X"))]
   "!TARGET_POWERPC64 && (REG_P (operands[0]) || REG_P (operands[1]))"
@@ -2573,7 +2569,8 @@
   if (GET_CODE (addr1) == PLUS)
 {
   emit_insn (gen_add3_insn (op2, XEXP (addr1, 0), GEN_INT (4)));
-  if (TARGET_AVOID_XFORM)
+  if (TARGET_AVOID_XFORM
+ || REGNO (XEXP (addr1, 1)) == REGNO (dest2))
{
  emit_insn (gen_add3_insn (op2, XEXP (addr1, 1), op2));
  addr2 = op2;
@@ -2581,7 +2578,8 @@
   else
addr2 = gen_rtx_PLUS (SImode, op2, XEXP (addr1, 1));
 }
-  else if (TARGET_AVOID_XFORM)
+  else if (TARGET_AVOID_XFORM
+  || REGNO (addr1) == REGNO (dest2))
 {
   emit_insn (gen_add3_insn (op2, addr1, GEN_INT (4)));
   addr2 = op2;
@@ -2596,6 +2594,8 @@
   word2 = change_address (src, SImode, addr2);
 
   emit_insn (gen_bswapsi2 (dest2, word1));
+  /* The REGNO (dest2) tests above ensure that addr2 has not been trashed,
+ thus allowing us to omit an early clobber on the output.  */
   emit_insn (gen_bswapsi2 (dest1, word2));
 }")
 
Index: gcc/testsuite/gcc.target/powerpc/pr53199.c
===
--- gcc/testsuite/gcc.target/powerpc/pr53199.c  (revision 206009)
+++ gcc/testsuite/gcc.target/powerpc/pr53199.c  (working copy)
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-options "-O2 -mcpu=power6 -mavoid-indexed-addresses" } */
-/* { dg-final { scan-assembler-times "lwbrx" 6 } } */
+/* { dg-final { scan-assembler-times "lwbrx" 12 } } */
 /* { dg-final { scan-assembler-times "stwbrx" 6 } } */
 
 /* PR 51399: bswap gets an error if -mavoid-indexed-addresses was used in
@@ -25,6 +25,24 @@
   return __builtin_bswap64 (p[i]);
 }
 
+long long
+load64_reverse_4 (long long dummy __attribute__ ((unused)), long long *p)
+{
+  return __builtin_bswap64 (*p);
+}
+
+long long
+load64_reverse_5 (long long dummy __attribute__ ((unused)), long long *p)
+{
+  return __builtin_bswap64 (p[1]);
+}
+
+long long
+load64_reverse_6 (long long dummy __attribute__ ((unused)), long long *p, int i)
+{
+  return __builtin_bswap64 (p[i]);
+}
+
 void
 store64_reverse_1 (long long *p, long long x)
 {
@@ -44,7 +62,13 @@
 }
 
 long long
-reg_reverse (long long x)
+reg_reverse_1 (long long x)
 {
   return __builtin_bswap64 (x);
 }
+
+long long
+reg_reverse_2 (long long dummy __attribute__ ((unused)), long long x)
+{
+  return __builtin_bswap64 (x);
+}
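
For reference, the splitters implement a 64-bit byte swap as two 32-bit byte
swaps with the words exchanged.  A plain C sketch of that identity follows;
bswap32 and bswap64_by_words are names made up for this example, and this is
not the RTL the patterns emit.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit word.  */
uint32_t
bswap32 (uint32_t x)
{
  return ((x & 0x000000ffu) << 24)
         | ((x & 0x0000ff00u) << 8)
         | ((x >> 8) & 0x0000ff00u)
         | (x >> 24);
}

/* Reverse the eight bytes of a 64-bit value using two 32-bit swaps:
   the swapped low word becomes the new high word and vice versa.  */
uint64_t
bswap64_by_words (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) bswap32 (lo) << 32) | bswap32 (hi);
}
```

In the 32-bit load case the first word swap writes one half of the
destination, which is why the address for the second word must not live in
that half.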

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-17 Thread Alan Modra
On Tue, Dec 17, 2013 at 01:14:23PM +0100, Bernd Edlinger wrote:
> the reason for this is overwriting GMPINC for the auto-build generation, 
> because
> many test scripts include  which fails now completely (it is not 
> installed,
> I have it in-tree).

Yes, I understand the reason why your setup is failing.  Please try
this patch.

Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 206009)
+++ gcc/configure.ac(working copy)
@@ -1529,8 +1529,13 @@
/* | [A-Za-z]:[\\/]* ) realsrcdir=${srcdir};;
*) realsrcdir=../${srcdir};;
esac
+   # Clearing GMPINC is necessary to prevent host headers being
+   # used by the build compiler.  Defining GENERATOR_FILE stops
+   # system.h from including gmp.h.
CC="${CC_FOR_BUILD}" CFLAGS="${CFLAGS_FOR_BUILD}" \
-   LDFLAGS="${LDFLAGS_FOR_BUILD}" GMPINC="" \
+   CXX="${CXX_FOR_BUILD}" CXXFLAGS="${CXXFLAGS_FOR_BUILD}" \
+   LD="${LD_FOR_BUILD}" LDFLAGS="${LDFLAGS_FOR_BUILD}" \
+   GMPINC="" CPPFLAGS="${CPPFLAGS} -DGENERATOR_FILE" \
${realsrcdir}/configure \
--enable-languages=${enable_languages-all} \
--target=$target_alias --host=$build_alias --build=$build_alias

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-19 Thread Alan Modra
On Wed, Dec 18, 2013 at 02:32:01PM +0100, Bernd Edlinger wrote:
> I wonder if the GMPINC="" is still necessary, as the actual host g++ 
> invocation also has GMP-directories
> but does not use them because of the -DGENERATOR_FILE.

The issue isn't finding gmp.h, it's finding other host headers you
don't want.  Clearly, --with-gmp and similar options are for finding
host libraries and headers.  install.texi even says so!

To give an example of what can go wrong, suppose someone specifies
--with-gmp=/sysroot_for_host for an installed gmp.  Now I know that
it's unnecessary to specify --with-gmp if it's at the default install
location, but many people blindly follow recipes.  What's more,
specifying --with-gmp to the default works fine when build == host.
So it doesn't seem wrong to me to specify --with-gmp (even needlessly
to the default) when build != host.

The trouble is that GMPINC is then /sysroot_for_host/include, which is
where you find all the other host headers, not just gmp.h..

-- 
Alan Modra
Australia Development Lab, IBM


Re: Two build != host fixes

2013-12-19 Thread Alan Modra
On Thu, Dec 19, 2013 at 11:50:02AM +0100, Bernd Edlinger wrote:
> Isn't the actual invocation of the build-g++ also including 
> /sysroot_for_host/include
> in that case? Why doesn't this cause problems then?

Yes, and that causes failures too.  BUILD_CPPFLAGS is the culprit.
See http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01149.html

-- 
Alan Modra
Australia Development Lab, IBM


Re: Fix IBM long double division inaccuracy (glibc bug 15396)

2014-01-03 Thread Alan Modra
On Thu, Jan 02, 2014 at 09:46:56PM +, Joseph S. Myers wrote:
> (Note that there remain other bugs in the IBM long double code, some
> causing glibc test failures, at least (a) invalid results in rounding
> modes other than FE_TONEAREST, (b) spurious overflow and underflow
> exceptions, mainly but not entirely where discontiguous mantissa bits
> are involved.)

Thing is, the algorithms in rs6000/ibm-ldouble.c require round to
nearest to generate correct results.  Quoting from the PowerPC64 ABI:

This "Extended precision" differs from the IEEE 754 Standard in the 
following ways:
 
 * The software support is restricted to round-to-nearest mode.  
   Programs that use extended precision must ensure that this rounding 
   mode is in effect when extended-precision calculations are performed.
 * Does not fully support the IEEE special numbers NaN and INF.  These
   values are encoded in the high-order double value only.  The 
   low-order value is not significant.  
 * Does not support the IEEE status flags for overflow, underflow, and 
   other conditions.  These flags have no meaning in this format.
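
To illustrate why the rounding mode matters: the format is a pair of doubles,
and the usual building block for arithmetic on such pairs is an error-free
addition that is only exact under round-to-nearest.  A minimal sketch, with
struct ibm_pair and both function names invented for this example (this is
not the code in ibm-ldouble.c):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the double-double layout: the value is
   hi + lo, with hi carrying the leading bits.  */
struct ibm_pair
{
  double hi;
  double lo;
};

/* Knuth's two-sum: returns s = fl(a + b) in hi and the rounding error
   of that addition in lo.  The error term is exact only when the FPU
   is in round-to-nearest mode.  */
struct ibm_pair
two_sum (double a, double b)
{
  struct ibm_pair r;
  double s = a + b;
  double bb = s - a;
  r.hi = s;
  r.lo = (a - (s - bb)) + (b - bb);
  return r;
}

/* NaN is encoded in the high-order double only, so the low-order
   double is ignored here.  */
bool
pair_isnan (struct ibm_pair x)
{
  return x.hi != x.hi;
}
```

In other rounding modes the error term computed by two_sum need not be
representable, so results built on it can be wrong.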


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] rs6000: Revamp rotate-and-mask and insert

2015-07-14 Thread Alan Modra
On Sun, Jul 12, 2015 at 04:18:31PM -0400, David Edelsohn wrote:
> On Sun, Jul 12, 2015 at 1:08 PM, Segher Boessenkool
>  wrote:
> > This rewrites all the rotate-and-mask and insert patterns.
> 
> This is great!  I'm glad that you completed this feature.

Compared to mainline the results do look good, and some of the bugs I
found in the old patterns have disappeared.  I particularly like the
fact that the "S" and "T" operand constraints are no longer needed.

There are one or two regressions related to a TODO that Segher added.
The following produces poorer code than mainline.

extern void lfoo (long);
void mask2_cond1 (long x)
{
  if ((x & 0x00fff00fL) > 0)
lfoo (0);
}
void mask2_cond2 (long x)
{
  if ((x & 0x00fff00fL) > 0)
lfoo (x & 0x00fff00fL);
}
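
For context on why this constant is awkward: a single rlwinm can apply a mask
that is one contiguous run of one bits (wrap-around allowed), and 0x00fff00f
has two such runs.  A small sketch that counts them, using a helper name made
up for this mail:

```c
#include <stdint.h>

/* Count the maximal runs of 1 bits in M, treating bit 31 and bit 0 as
   adjacent so that a wrap-around mask counts as a single run.  */
int
count_ones_runs (uint32_t m)
{
  if (m == UINT32_MAX)
    return 1;

  /* A run starts at bit i when bit i is set and bit i-1 (circularly)
     is clear.  */
  uint32_t rot = (m << 1) | (m >> 31);
  uint32_t starts = m & ~rot;

  int n = 0;
  while (starts != 0)
    {
      starts &= starts - 1;  /* Clear the lowest set bit.  */
      n++;
    }
  return n;
}
```

count_ones_runs (0x00fff00f) is 2, so the AND needs two mask operations
rather than one.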

Also, rs6000.md patterns use SImode for the rotate/shift count.
Segher has added some new insns that use DImode when 64-bit.  I think
that inconsistency ought to be fixed.

(I haven't completely analysed this) but won't

(define_insn_and_split "*and<mode>3_imm_dot_shifted"
[snip]
(lshiftrt:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,r")
  (match_operand:GPR 4 "const_int_operand" "n,n"))
 ^^^this
fail to match combined patterns generated from other rs6000.md
patterns like

(define_insn "lshr<mode>3"
  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
(lshiftrt:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
  (match_operand:SI 2 "reg_or_cint_operand" "rn")))]
 ^^this?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix PR66870 ppc64le, ppc64 split stack

2015-07-30 Thread Alan Modra
On Thu, Jul 30, 2015 at 03:30:12PM -0500, Lynn A. Boger wrote:
> PR66870
> * gcc/config/rs6000/rs6000.c:  Add check for no_split_stack
> function attribute along with flag_split_stack check to
> determine when to generate split stack prologue for
> ppc64 and ppc64le.

Looks good to me, except that the changelog entry should mention the
modified functions, for example:

PR target/66870
* gcc/config/rs6000/rs6000.c (rs6000_emit_prologue): Check for
no_split_stack function attribute along with flag_split_stack.
(rs6000_expand_split_stack_prologue): Likewise.

Also, formatting rules for gcc say to not split a line after an
operator.

> +  int using_split_stack = flag_split_stack &&
> +   (lookup_attribute ("no_split_stack", DECL_ATTRIBUTES (cfun->decl))
> + == NULL);

The "&&" belongs on the next line, with parentheses added so that emacs
and indent will line up the continuation nicely.

  int using_split_stack = (flag_split_stack
   && (lookup_attribute ("no_split_stack",
 DECL_ATTRIBUTES (cfun->decl))
   == NULL));


David, the following is another piece of the PR66870 fixes.  This
stops shrink-wrap from moving insns around in the first few blocks of
a function, in a way that is incorrect given that r12 is live.
Bootstrapped and regression tested powerpc64le-linux (and
powerpc64-linux by Lynn).

PR target/66870
* config/rs6000/rs6000.c (machine_function): Add split_stack_argp_used.
(rs6000_emit_prologue): Set it.
(rs6000_set_up_by_prologue): Specify r12 when split_stack_argp_used.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5d9ff88..dc2e20c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -166,6 +166,7 @@ typedef struct GTY(()) machine_function
   rtx sdmode_stack_slot;
   /* Alternative internal arg pointer for -fsplit-stack.  */
   rtx split_stack_arg_pointer;
+  bool split_stack_argp_used;
   /* Flag if r2 setup is needed with ELFv2 ABI.  */
   bool r2_setup_needed;
 } machine_function;
@@ -24458,6 +24459,7 @@ rs6000_emit_prologue (void)
 __morestack was called, it left the arg pointer to the old
 stack in r29.  Otherwise, the arg pointer is the top of the
 current frame.  */
+  cfun->machine->split_stack_argp_used = true;
   if (sp_adjust)
{
  rtx r12 = gen_rtx_REG (Pmode, 12);
@@ -33711,6 +33713,8 @@ rs6000_set_up_by_prologue (struct hard_reg_set_container *set)
   && TARGET_MINIMAL_TOC
   && get_pool_size () != 0)
 add_to_hard_reg_set (&set->set, Pmode, RS6000_PIC_OFFSET_TABLE_REGNUM);
+  if (cfun->machine->split_stack_argp_used)
+add_to_hard_reg_set (&set->set, Pmode, 12);
 }
 
 


-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] Don't pass --oformat to ld

2015-09-24 Thread Alan Modra
On Thu, Sep 24, 2015 at 02:24:25PM +1000, Michael Ellerman wrote:
> On Wed, 2015-09-02 at 11:05 +0930, Alan Modra wrote:
> > bugzilla.redhat.com/show_bug.cgi?id=1255946 shows that gcc built with
> > both powerpc64-linux and powerpc64le-linux support passes wrong linker
> > options when trying to link in the non-default endian.  A --oformat
> > option coming from LINK_TARGET_SPEC is only correct for 32-bit.
> > 
> > It turns out that GNU ld -m options select a particular ld emulation
> > (e*.c file in ld build dir) which provides compiled-in scripts or
> > selects a script from ldscripts/.  Each of these has an OUTPUT_FORMAT
> > statement, which does the same thing as --oformat.  --oformat is
> > therefore redundant when using GNU ld built this century, except
> > possibly when a user overrides the default ld script with -Wl,-T and
> > their script neglects OUTPUT_FORMAT, and it isn't the default output.
> > I don't think it's worth fixing this possible use case.
> > 
> > Bootstrap and testing in progress.  OK for mainline assuming all is
> > OK?
> > 
> > * config/rs6000/sysv4le.h (LINK_TARGET_SPEC): Don't define.
> > * config/rs6000/sysv4.h (LINK_TARGET_SPEC): Likewise.
> > (LINK_SPEC, SUBTARGET_EXTRA_SPECS): Delete link_target.
> 
> Hi Alan,
> 
> If you could please backport this to the gcc-5-branch, that would be helpful
> for us (kernel folks).

Bootstrapped and regression tested powerpc64le-linux.  Is this OK for
the branch too, David?

-- 
Alan Modra
Australia Development Lab, IBM


[RS6000] Make -msingle-pic-base remove the ELFv2 global entry code

2015-09-29 Thread Alan Modra
For other ABIs, -msingle-pic-base makes gcc omit loading of the PIC
register in function prologues.  This patch makes the option affect
ELFv2 too.

I wrote a patch like this during the initial ELFv2 effort, but there
were many more important patches to push and this one somehow got
dropped.  Dusted off and retested at the request of powerpc64 kernel
people who'd like an option to disable ELFv2 global entry code for
the kernel.  OK mainline?

* config/rs6000/rs6000.c (rs6000_emit_prologue): Don't set
r2_setup_needed when TARGET_SINGLE_PIC_BASE.
(rs6000_output_mi_thunk): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ae456ff..023f622 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24118,13 +24118,13 @@ rs6000_emit_prologue (void)
 #define NOT_INUSE(R) do {} while (0)
 #endif
 
-  if (DEFAULT_ABI == ABI_ELFv2)
+  if (DEFAULT_ABI == ABI_ELFv2
+  && !TARGET_SINGLE_PIC_BASE)
 {
   cfun->machine->r2_setup_needed = df_regs_ever_live_p (TOC_REGNUM);
 
   /* With -mminimal-toc we may generate an extra use of r2 below.  */
-  if (!TARGET_SINGLE_PIC_BASE
- && TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
+  if (TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
cfun->machine->r2_setup_needed = true;
 }
 
@@ -26800,7 +26800,8 @@ rs6000_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED,
   /* Ensure we have a global entry point for the thunk.   ??? We could
  avoid that if the target routine doesn't need a global entry point,
  but we do not know whether this is the case at this point.  */
-  if (DEFAULT_ABI == ABI_ELFv2)
+  if (DEFAULT_ABI == ABI_ELFv2
+  && !TARGET_SINGLE_PIC_BASE)
 cfun->machine->r2_setup_needed = true;
 
   /* Run just enough of rest_of_compilation to get the insns emitted.
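
As a rough model of the prologue decision after this patch, reduced to plain
booleans.  The function and parameter names are hypothetical and only mirror
the conditions visible in the first hunk above:

```c
#include <stdbool.h>

/* Hypothetical model of how rs6000_emit_prologue decides
   r2_setup_needed after the patch.  */
bool
prologue_needs_r2_setup (bool elfv2,
                         bool single_pic_base,
                         bool toc_reg_ever_live,
                         bool minimal_toc_pool_in_use)
{
  /* -msingle-pic-base now switches the whole ELFv2 block off, so no
     global entry code is requested.  */
  if (!elfv2 || single_pic_base)
    return false;

  /* Otherwise r2 setup is needed when r2 is live, or when
     -mminimal-toc with a non-empty constant pool adds a use of r2.  */
  return toc_reg_ever_live || minimal_toc_pool_in_use;
}
```

Before the patch, only the -mminimal-toc part was guarded by
!TARGET_SINGLE_PIC_BASE, so a live r2 still requested the setup.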

-- 
Alan Modra
Australia Development Lab, IBM


Re: [libffi] Correct powerpc sysv stack argument accounting (#194)

2015-09-30 Thread Alan Modra
On Thu, Sep 03, 2015 at 09:33:45PM -0400, Anthony Green wrote:
> Please go ahead. I've been on vacation for a while. Returning next week... 

Committed revision 228307.

>  Original message ----
> From: Alan Modra  
> Date: 09-03-2015  7:40 PM  (GMT-05:00) 
> To: Richard Henderson , gcc-patches@gcc.gnu.org 
> Cc: Anthony Green , David Edelsohn  
> Subject: Re: [libffi] Correct powerpc sysv stack argument accounting (#194) 
> 
> On Tue, Aug 04, 2015 at 08:23:46AM -0700, Richard Henderson wrote:
> > Looks good, Alan.  Thanks.  After this gets merged, I guess it's
> > worth merging back to gcc.
> 
> It's been a month since I created the pull request and posted
> https://sourceware.org/ml/libffi-discuss/2015/msg00079.html
> 
> Given that things are going rather slow in libffi land at the moment,
> perhaps I should merge this to gcc now?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] rs6000: Add "cannot_copy" attribute, use it (PR67788, PR67789)

2015-10-01 Thread Alan Modra
On Thu, Oct 01, 2015 at 12:18:08PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 01, 2015 at 12:14:44PM +0200, Richard Biener wrote:
> > So even if not "easy", can you try?
> 
> I did, and after half a day had a big mess and lots of things failing,
> no idea where this was headed, and in the meantime bootstrap still fails
> (on affected targets).

I had a look too, and while you can revise the load_toc_v4_PIC
patterns to use labels emitted the usual way (eg. as in
i386.c:ix86_init_large_pic_reg) they tend to wander away from the
insn.  I think that could be solved, but these labels which aren't
referred to by jump insns get converted to NOTE_INSN_DELETED_LABEL
somewhere, and that leads to further pain.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] PR66870 PowerPC64 Enable gold linker with split stack

2015-10-11 Thread Alan Modra
On Sat, Oct 10, 2015 at 11:25:38PM +0200, Andreas Schwab wrote:
> "Lynn A. Boger"  writes:
> 
> > Index: gcc/config/rs6000/sysv4.h
> > ===
> > --- gcc/config/rs6000/sysv4.h   (revision 228653)
> > +++ gcc/config/rs6000/sysv4.h   (working copy)
> > @@ -940,13 +940,15 @@ ncrtn.o%s"
> >  #undef TARGET_ASAN_SHADOW_OFFSET
> >  #define TARGET_ASAN_SHADOW_OFFSET rs6000_asan_shadow_offset
> >  
> > -/* On ppc64 and ppc64le, split stack is only support for
> > -   64 bit. */
> > +/* On ppc64 and ppc64le, split stack is only supported for
> > +   64 bit targets with a 64 bit compiler. */
> >  #undef TARGET_CAN_SPLIT_STACK_64BIT
> > +#if defined (__64BIT__) || defined (__powerpc64__) || defined (__ppc64__)
> 
> This doesn't make sense.  A target header cannot use host defines.

Right.  Here's a better fix.  A powerpc-linux biarch compiler can
default to either -m32 or -m64 so we need to take that into account,
and notice both -m32 and -m64 on the gccgo command line.  It's also
possible to build a -m64 only compiler, so in that case we can define
TARGET_CAN_SPLIT_STACK.

Bootstrapped etc. powerpc64-linux, powerpc-linux and
powerpc64le-linux.  OK?

gcc/
* config/rs6000/sysv4.h (TARGET_CAN_SPLIT_STACK_64BIT): Don't define.
* config/rs6000/linux64.h (TARGET_CAN_SPLIT_STACK): Define.
(TARGET_CAN_SPLIT_STACK_64BIT): Define.
gcc/go/
* gospec.c (saw_opt_m32): Rename to..
(is_m64): ..this, initialised by TARGET_CAN_SPLIT_STACK_64BIT.
Update uses.
(lang_specific_driver): Set is_m64 if OPT_m64, clear if OPT_m32.

diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
index 7b2f9bd..f48af43 100644
--- a/gcc/config/rs6000/sysv4.h
+++ b/gcc/config/rs6000/sysv4.h
@@ -940,14 +940,6 @@ ncrtn.o%s"
 #undef TARGET_ASAN_SHADOW_OFFSET
 #define TARGET_ASAN_SHADOW_OFFSET rs6000_asan_shadow_offset
 
-/* On ppc64 and ppc64le, split stack is only support for
-   64 bit. */
-#undef TARGET_CAN_SPLIT_STACK_64BIT
-#if TARGET_GLIBC_MAJOR > 2 \
-  || (TARGET_GLIBC_MAJOR == 2 && TARGET_GLIBC_MINOR >= 18)
-#define TARGET_CAN_SPLIT_STACK_64BIT
-#endif
-
 /* This target uses the sysv4.opt file.  */
 #define TARGET_USES_SYSV4_OPT 1
 
diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 9599735..28c83e41 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -245,6 +245,21 @@ extern int dot_symbols;
 #define MULTILIB_DEFAULTS { "m32" }
 #endif
 
+/* Split stack is only supported for 64 bit, and requires glibc >= 2.18.  */
+#if TARGET_GLIBC_MAJOR * 1000 + TARGET_GLIBC_MINOR >= 2018
+# ifndef RS6000_BI_ARCH
+#  define TARGET_CAN_SPLIT_STACK
+# else
+#  if DEFAULT_ARCH64_P
+/* Supported, and the default is -m64  */
+#   define TARGET_CAN_SPLIT_STACK_64BIT 1
+#  else
+/* Supported, and the default is -m32  */
+#   define TARGET_CAN_SPLIT_STACK_64BIT 0
+#  endif
+# endif
+#endif
+
 #ifndef RS6000_BI_ARCH
 
 /* 64-bit PowerPC Linux always has a TOC.  */
diff --git a/gcc/go/gospec.c b/gcc/go/gospec.c
index ca3c2d7..fbb55be 100644
--- a/gcc/go/gospec.c
+++ b/gcc/go/gospec.c
@@ -120,8 +120,10 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
   /* Whether the -S option was used.  */
   bool saw_opt_S = false;
 
-  /* Whether the -m32 option was used. */
-  bool saw_opt_m32 ATTRIBUTE_UNUSED = false;
+#ifdef TARGET_CAN_SPLIT_STACK_64BIT
+  /* Whether the -m64 option is in force. */
+  bool is_m64 = TARGET_CAN_SPLIT_STACK_64BIT;
+#endif
 
   /* The first input file with an extension of .go.  */
   const char *first_go_file = NULL;  
@@ -160,7 +162,11 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
 
 #ifdef TARGET_CAN_SPLIT_STACK_64BIT
case OPT_m32:
- saw_opt_m32 = true;
+ is_m64 = false;
+ break;
+
+   case OPT_m64:
+ is_m64 = true;
  break;
 #endif
 
@@ -253,7 +259,7 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options,
 #endif
 
 #ifdef TARGET_CAN_SPLIT_STACK_64BIT
-  if (!saw_opt_m32)
+  if (is_m64)
 supports_split_stack = 1;
 #endif
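
Reduced to plain C, the two pieces of logic in this patch look roughly like
this.  The function names are hypothetical, and the real driver walks decoded
options rather than argv strings:

```c
#include <stdbool.h>
#include <string.h>

/* Same arithmetic as the new linux64.h test: true for glibc 2.18 and
   later.  */
bool
glibc_has_split_stack_support (int major, int minor)
{
  return major * 1000 + minor >= 2018;
}

/* Model of is_m64 in lang_specific_driver: start from the configured
   default, then let each -m32 or -m64 on the command line override it,
   so the last one wins.  */
bool
is_m64_in_force (bool default_is_m64, int argc, const char *const *argv)
{
  bool is_m64 = default_is_m64;
  for (int i = 0; i < argc; i++)
    {
      if (strcmp (argv[i], "-m32") == 0)
        is_m64 = false;
      else if (strcmp (argv[i], "-m64") == 0)
        is_m64 = true;
    }
  return is_m64;
}
```

The old saw_opt_m32 flag could only record that -m32 had been seen, so it got
the answer wrong for a 32-bit default compiler given -m64, and for
"-m32 -m64" on any biarch compiler.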
 


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] PR66870 PowerPC64 Enable gold linker with split stack

2015-10-11 Thread Alan Modra
On Sun, Oct 11, 2015 at 11:29:36AM -0700, Ian Lance Taylor wrote:
> On Sun, Oct 11, 2015 at 7:43 AM, Andreas Schwab  wrote
> >
> > Please remind me why this logic isn't implemented as a target hook.
> >
> > supports_split_stack = TARGET_CAN_SPLIT_STACK;
> >
> > /* rs6000.h */
> > #define TARGET_CAN_SPLIT_STACK TARGET_64BIT
> 
> There is a target hook for split stack support in
> gcc/common/common-target.def.  The PPC version of it is in
> gcc/common/config/rs6000/rs6000-common.c.
> 
> But the issue here is that we need access from the gccgo driver
> program.  Can the driver program call the common target hooks?

Not the way the gccgo driver is currently written.  In
lang_specific_driver you get to see global_options as set up by
init_options_struct.  TARGET_64BIT, used by the hook, is at its
default value rather than what you'd see after command line option
processing.  This isn't at all surprising when you consider that
lang_specific_driver must run before option processing since one of
its jobs is to insert command line options.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] PR66870 PowerPC64 Enable gold linker with split stack

2015-10-12 Thread Alan Modra
On Mon, Oct 12, 2015 at 10:15:04AM -0500, Lynn A. Boger wrote:
> Thanks for doing this Alan.  I agree this looks better to me.
> 
> I assume by "etc" you mean you did biarch builds for your bootstraps on BE?

By "etc" I meant "and regression tested".

I built four configurations, powerpc-linux 32-bit only,
powerpc64le-linux 64-bit only, biarch powerpc-linux with 32-bit
default, and biarch powerpc64-linux with 64-bit default.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, rs6000] Pass --secure-plt to the linker

2015-10-19 Thread Alan Modra
On Thu, Oct 15, 2015 at 06:50:50PM +0100, Szabolcs Nagy wrote:
> A powerpc toolchain built with (or without) --enable-secureplt
> currently creates a binary that uses bss plt if
> 
> (1) any of the linked PIC objects have bss plt relocs
> (2) or all the linked objects are non-PIC or have no relocs,
> 
> because this is the binutils linker behaviour.
> 
> This patch passes --secure-plt to the linker which makes the linker
> warn in case (1) and produce a binary with secure plt in case (2).

The idea is OK I think, but

> @@ -574,6 +577,7 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
>  %{R*} \
>  %(link_shlib) \
>  %{!T*: %(link_start) } \
> +%{!static: %(link_secure_plt_default)} \
>  %(link_os)"

this change needs to be conditional on !mbss-plt too.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, rs6000] Pass --secure-plt to the linker

2015-10-19 Thread Alan Modra
On Mon, Oct 19, 2015 at 08:10:32PM +0100, Szabolcs Nagy wrote:
> On 19/10/15 14:04, Szabolcs Nagy wrote:
> >On 19/10/15 12:12, Alan Modra wrote:
> >>On Thu, Oct 15, 2015 at 06:50:50PM +0100, Szabolcs Nagy wrote:
> >>>A powerpc toolchain built with (or without) --enable-secureplt
> >>>currently creates a binary that uses bss plt if
> >>>
> >>>(1) any of the linked PIC objects have bss plt relocs
> >>>(2) or all the linked objects are non-PIC or have no relocs,
> >>>
> >>>because this is the binutils linker behaviour.
> >>>
> >>>This patch passes --secure-plt to the linker which makes the linker
> >>>warn in case (1) and produce a binary with secure plt in case (2).
> >>
> >>The idea is OK I think, but
> >>
> >>>@@ -574,6 +577,7 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
> >>>  %{R*} \
> >>>  %(link_shlib) \
> >>>  %{!T*: %(link_start) } \
> >>>+%{!static: %(link_secure_plt_default)} \
> >>>  %(link_os)"
> >>
> >>this change needs to be conditional on !mbss-plt too.
> >>
> >
> >OK, will change that.
> >
> >if -msecure-plt and -mbss-plt are supposed to affect
> >linking too (not just code gen) then shall i add
> >%{msecure-plt: --secure-plt} too?
> >
> 
> I added !mbss-plt only for now as a mix of -msecure-plt
> and -mbss-plt options do not cancel each other in gcc,

They do for code-gen since they share the same variable (see
sysv4.opt), but I guess you meant as far as spec parsing goes.  In
hindsight, it might have been better if I'd spelled -mbss-plt as
-mno-secure-plt.

> the patch only changes behaviour for a secureplt toolchain.
> 
> OK to commit?

Apologies for not thinking of this before when I first reviewed the
patch, but have you bootstrapped this patch on powerpc64-linux?  I'm
guessing not, because it occurs to me that --secure-plt is not a
powerpc64-linux-ld option.  So if you try to build a biarch compiler
with --enable-secure-plt then ld will complain when attempting to link
64-bit binaries.

You'll want this on top of your patch:

diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 9599735..01fb880 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -174,20 +174,24 @@ extern int dot_symbols;
 #undef ASM_DEFAULT_SPEC
 #undef ASM_SPEC
 #undef LINK_OS_LINUX_SPEC
+#undef LINK_SECURE_PLT_SPEC
 
 #ifndef RS6000_BI_ARCH
 #define ASM_DEFAULT_SPEC "-mppc64"
 #define ASM_SPEC "%(asm_spec64) %(asm_spec_common)"
 #define LINK_OS_LINUX_SPEC "%(link_os_linux_spec64)"
+#define LINK_SECURE_PLT_SPEC ""
 #else
 #if DEFAULT_ARCH64_P
 #define ASM_DEFAULT_SPEC "-mppc%{!m32:64}"
 #define ASM_SPEC "%{m32:%(asm_spec32)}%{!m32:%(asm_spec64)} %(asm_spec_common)"
 #define LINK_OS_LINUX_SPEC "%{m32:%(link_os_linux_spec32)}%{!m32:%(link_os_linux_spec64)}"
+#define LINK_SECURE_PLT_SPEC "%{m32: " LINK_SECURE_PLT_DEFAULT_SPEC "}"
 #else
 #define ASM_DEFAULT_SPEC "-mppc%{m64:64}"
 #define ASM_SPEC "%{!m64:%(asm_spec32)}%{m64:%(asm_spec64)} %(asm_spec_common)"
 #define LINK_OS_LINUX_SPEC "%{!m64:%(link_os_linux_spec32)}%{m64:%(link_os_linux_spec64)}"
+#define LINK_SECURE_PLT_SPEC "%{!m64: " LINK_SECURE_PLT_DEFAULT_SPEC "}"
 #endif
 #endif
 #endif
 
diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
index 93499e8..1bb400f 100644
--- a/gcc/config/rs6000/sysv4.h
+++ b/gcc/config/rs6000/sysv4.h
@@ -570,6 +570,7 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
: %(link_start_default) }"
 
 #define LINK_START_DEFAULT_SPEC ""
+#define LINK_SECURE_PLT_SPEC LINK_SECURE_PLT_DEFAULT_SPEC
 
 #undef LINK_SPEC
 #define LINK_SPEC "\
@@ -577,7 +578,7 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
 %{R*} \
 %(link_shlib) \
 %{!T*: %(link_start) } \
-%{!static: %{!mbss-plt: %(link_secure_plt_default)}} \
+%{!static: %{!mbss-plt: %(link_secure_plt)}} \
 %(link_os)"
 
 /* Shared libraries are not default.  */
@@ -893,7 +894,7 @@ ncrtn.o%s"
   { "link_os_openbsd", LINK_OS_OPENBSD_SPEC }, \
   { "link_os_default", LINK_OS_DEFAULT_SPEC }, \
   { "cc1_secure_plt_default",  CC1_SECURE_PLT_DEFAULT_SPEC },  \
-  { "link_secure_plt_default", LINK_SECURE_PLT_DEFAULT_SPEC }, \
+  { "link_secure_plt", LINK_SECURE_PLT_SPEC }, \
   { "cpp_os_ads",  CPP_OS_ADS_SPEC },  \
   { "cpp_os_yellowknife",  CPP_OS_YELLOWKNIFE_SPEC },  \
   { "cpp_os_mvme", CPP_OS_MVME_SPEC }, \
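
Put as a truth function, the combined specs are meant to behave like the
sketch below.  The function and parameter names are hypothetical;
secureplt_toolchain stands for a compiler configured so that
LINK_SECURE_PLT_DEFAULT_SPEC is non-empty.

```c
#include <stdbool.h>

/* Hypothetical model of when --secure-plt reaches the linker with both
   patches applied.  */
bool
linker_gets_secure_plt (bool secureplt_toolchain,
                        bool is_static,
                        bool mbss_plt,
                        bool is_64bit)
{
  /* %{!static: %{!mbss-plt: %(link_secure_plt)}}, where link_secure_plt
     is empty for 64-bit links because powerpc64 ld does not accept
     the option.  */
  return secureplt_toolchain && !is_static && !mbss_plt && !is_64bit;
}
```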


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH, rs6000] Pass --secure-plt to the linker

2015-10-20 Thread Alan Modra
On Tue, Oct 20, 2015 at 03:40:14PM -0500, David Edelsohn wrote:
> On Tue, Oct 20, 2015 at 3:38 PM, Szabolcs Nagy  wrote:
> > 2015-10-20  Gregor Richards  
> > Szabolcs Nagy  
> >
> > * config/rs6000/secureplt.h (LINK_SECURE_PLT_DEFAULT_SPEC): Define.
> > * config/rs6000/sysv4.h (LINK_SECURE_PLT_SPEC): Define.
> > (LINK_SPEC): Add %(link_secure_plt).
> > (SUBTARGET_EXTRA_SPECS): Add "link_secure_plt".
> > * config/rs6000/linux64.h (LINK_SECURE_PLT_SPEC): Redefine.
> >
> 
> I'm okay if Alan is satisfied.

Committed revision 229102.

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH 0/7] 64-bit obstack support in libiberty

2015-11-07 Thread Alan Modra
This patchset imports new obstack support to libiberty, to better
support 64-bit systems, and fix an old gdb bug.  Most of the necessary
changes outside of libiberty were committed October last year, but a
few more incompatibilities have crept in since then.  The first three
patches fix these problems.  Patch 4 does the import from gnulib, and
edits the docs as if they had been imported from glibc.  Patch 5 makes
modifications for libiberty.  Patch 6 is a warning fix that I'll see
about pushing upstream, and finally, patch 7 supplies a define used to
determine whether libiberty needs obstack.o.

The cumulative patch series was bootstrapped and regression tested on
x86_64-linux, and also after just the first three patches.

Alan Modra (7):
  New obstack_next_free is not an lvalue
  Correct libvtv obstack use
  Update libsanitizer obstack interceptors
  Copy gnulib obstack files
  Modify obstack.[hc] to avoid having to include other gnulib files
  Silence obstack.c -Wc++compat warning
  Configury changes for obstack optimization

 gcc/gensupport.c   |   9 +-
 gcc/objc/objc-encoding.c   |  10 +-
 include/obstack.h  | 910 ++---
 libiberty/configure|  58 ++
 libiberty/configure.ac |   1 +
 libiberty/obstack.c| 570 +
 libiberty/obstacks.texi| 257 +++---
 libsanitizer/Makefile.in   |   1 +
 libsanitizer/asan/Makefile.am  |   2 +-
 libsanitizer/asan/Makefile.in  |   3 +-
 libsanitizer/configure |  38 +-
 libsanitizer/configure.ac  |  24 +
 libsanitizer/interception/Makefile.in  |   1 +
 libsanitizer/libbacktrace/Makefile.in  |   1 +
 libsanitizer/lsan/Makefile.in  |   1 +
 libsanitizer/sanitizer_common/Makefile.in  |   1 +
 .../sanitizer_common_interceptors.inc  |  14 +-
 libsanitizer/tsan/Makefile.am  |   2 +-
 libsanitizer/tsan/Makefile.in  |   3 +-
 libsanitizer/ubsan/Makefile.in |   1 +
 libvtv/vtv_malloc.cc   |   7 +-
 21 files changed, 957 insertions(+), 957 deletions(-)


[PATCH 1/7] New obstack_next_free is not an lvalue

2015-11-07 Thread Alan Modra
New obstack.h casts obstack_next_free to (void *), resulting in it
being a non-lvalue, and warnings on pointer arithmetic.
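
For illustration only (not part of the patch), here is a minimal sketch
of the two idioms the hunks below switch to.  It assumes a glibc-style
<obstack.h> with malloc/free as chunk functions; trim_tail and tail_ptr
are hypothetical helper names.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* Chunk allocators required by the obstack macros.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Drop the last N bytes of the growing object.  Assigning to
   obstack_next_free () no longer compiles once the macro yields a
   cast expression, so shrink via obstack_blank_fast instead.  */
static void
trim_tail (struct obstack *ob, int n)
{
  obstack_blank_fast (ob, -n);
}

/* Pointer to the last N bytes of the growing object.  The cast keeps
   the arithmetic valid whether the macro yields char * (old header)
   or void * (new header).  */
static char *
tail_ptr (struct obstack *ob, int n)
{
  return (char *) obstack_next_free (ob) - n;
}
```

Both helpers should compile unchanged against either header version,
which is the property the patch is after.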

gcc/
* gensupport.c (add_mnemonic_string): Make len param a size_t.
(gen_mnemonic_setattr): Make "size" var a size_t.  Use
obstack_blank_fast to shrink obstack.  Cast obstack_next_free
return value.
gcc/objc/
* objc-encoding.c (encode_aggregate_within): Cast obstack_next_free
return value.

diff --git a/gcc/gensupport.c b/gcc/gensupport.c
index 0480e17..484ead2 100644
--- a/gcc/gensupport.c
+++ b/gcc/gensupport.c
@@ -2253,7 +2253,7 @@ htab_eq_string (const void *s1, const void *s2)
and a permanent heap copy of STR is created.  */
 
 static void
-add_mnemonic_string (htab_t mnemonic_htab, const char *str, int len)
+add_mnemonic_string (htab_t mnemonic_htab, const char *str, size_t len)
 {
   char *new_str;
   void **slot;
@@ -2306,7 +2306,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
   for (i = 0; *cp; )
 {
   const char *ep, *sp;
-  int size = 0;
+  size_t size = 0;
 
   while (ISSPACE (*cp))
cp++;
@@ -2333,8 +2333,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
{
  /* Don't set a value if there are more than one
 instruction in the string.  */
- obstack_next_free (&string_obstack) =
-   obstack_next_free (&string_obstack) - size;
+ obstack_blank_fast (&string_obstack, -size);
  size = 0;
 
  cp = sp;
@@ -2346,7 +2345,7 @@ gen_mnemonic_setattr (htab_t mnemonic_htab, rtx insn)
obstack_1grow (&string_obstack, '*');
   else
add_mnemonic_string (mnemonic_htab,
-obstack_next_free (&string_obstack) - size,
+(char *) obstack_next_free (&string_obstack) - size,
 size);
   i++;
 }
diff --git a/gcc/objc/objc-encoding.c b/gcc/objc/objc-encoding.c
index 4848021..9c577e9 100644
--- a/gcc/objc/objc-encoding.c
+++ b/gcc/objc/objc-encoding.c
@@ -495,13 +495,14 @@ encode_aggregate_within (tree type, int curtype, int format, int left,
 
   if (flag_next_runtime)
 {
-  if (ob_size > 0  &&  *(obstack_next_free (&util_obstack) - 1) == '^')
+  if (ob_size > 0
+ && *((char *) obstack_next_free (&util_obstack) - 1) == '^')
pointed_to = true;
 
   if ((format == OBJC_ENCODE_INLINE_DEFS || generating_instance_variables)
  && (!pointed_to || ob_size - curtype == 1
  || (ob_size - curtype == 2
- && *(obstack_next_free (&util_obstack) - 2) == 'r')))
+ && *((char *) obstack_next_free (&util_obstack) - 2) == 'r')))
inline_contents = true;
 }
   else
@@ -512,9 +513,10 @@ encode_aggregate_within (tree type, int curtype, int format, int left,
 comment above applies: in that case we should avoid encoding
 the names of instance variables.
   */
-  char c1 = ob_size > 1 ? *(obstack_next_free (&util_obstack) - 2) : 0;
-  char c0 = ob_size > 0 ? *(obstack_next_free (&util_obstack) - 1) : 0;
+  char c0, c1;
 
+  c1 = ob_size > 1 ? *((char *) obstack_next_free (&util_obstack) - 2) : 0;
+  c0 = ob_size > 0 ? *((char *) obstack_next_free (&util_obstack) - 1) : 0;
   if (c0 == '^' || (c1 == '^' && c0 == 'r'))
pointed_to = true;
 


[PATCH 2/7] Correct libvtv obstack use

2015-11-07 Thread Alan Modra
Fixes a compile error with both old and new obstacks due to
obstack_chunk_free having the wrong signature.  Also, setting chunk
size and alignment before obstack_init is pointless since they are
overwritten.
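
As a rough sketch of the intended usage (not the libvtv code itself):
the free callback takes the chunk pointer, and a single
obstack_specify_allocation call replaces the separate size, alignment
and init steps.  The demo_* names and the 4096-byte chunk size are
assumptions standing in for the libvtv allocator and VTV_PAGE_SIZE.

```c
#include <obstack.h>
#include <stdlib.h>

static int chunks_allocated;   /* counts calls, for the demo only */

/* Chunk allocator: takes a byte count, returns the chunk.  */
static void *
demo_chunk_alloc (size_t size)
{
  chunks_allocated++;
  return malloc (size);
}

/* Chunk deallocator: takes the chunk pointer, not a size.  */
static void
demo_chunk_free (void *chunk)
{
  free (chunk);
}

/* One call sets chunk size, alignment and both callbacks, and
   allocates the first chunk.  */
static void
demo_obstack_setup (struct obstack *ob)
{
  obstack_specify_allocation (ob, 4096, sizeof (long),
                              demo_chunk_alloc, demo_chunk_free);
}
```

Because the setup call itself initializes the obstack, nothing written
to the chunk size or alignment mask beforehand would survive it.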

* vtv_malloc.cc (obstack_chunk_free): Correct param type.
(__vtv_malloc_init): Use obstack_specify_allocation.

diff --git a/libvtv/vtv_malloc.cc b/libvtv/vtv_malloc.cc
index ecd07eb..ea26b82 100644
--- a/libvtv/vtv_malloc.cc
+++ b/libvtv/vtv_malloc.cc
@@ -194,7 +194,7 @@ obstack_chunk_alloc (size_t size)
 }
 
 static void
-obstack_chunk_free (size_t)
+obstack_chunk_free (void *)
 {
   /* Do nothing. For our purposes there should be very little
  de-allocation. */
@@ -217,14 +217,13 @@ __vtv_malloc_init (void)
 #endif
 VTV_error ();
 
-  obstack_chunk_size (&vtv_obstack) = VTV_PAGE_SIZE;
-  obstack_alignment_mask (&vtv_obstack) = sizeof (long) - 1;
   /* We guarantee that the obstack alloc failed handler will never be
  called because in case the allocation of the chunk fails, it will
  never return */
   obstack_alloc_failed_handler = NULL;
 
-  obstack_init (&vtv_obstack);
+  obstack_specify_allocation (&vtv_obstack, VTV_PAGE_SIZE, sizeof (long),
+ obstack_chunk_alloc, obstack_chunk_free);
   malloc_initialized = 1;
 }
 


[PATCH 3/7] Update libsanitizer obstack interceptors

2015-11-07 Thread Alan Modra
New obstack uses sensible types, size_t instead of int for length
params.  Since libsanitizer does not use prototypes from obstack.h to
call the real functions, it's necessary to update the libsanitizer
function declarations emitted by the INTERCEPTOR macro.
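
A small sketch of the type-selection idea, outside of autoconf: use
_OBSTACK_SIZE_T when the installed obstack.h defines it, and fall back
to int otherwise.  newchunk_fn and obstack_len_width are hypothetical
names used only for this illustration.

```c
#include <stddef.h>
#include <obstack.h>

/* Headers that predate the new obstack code do not define
   _OBSTACK_SIZE_T; their length parameters are plain int.  */
#ifndef _OBSTACK_SIZE_T
# define _OBSTACK_SIZE_T int
#endif

/* Hypothetical hand-written declaration: it has to use the same
   parameter type as the real function it stands in for.  */
typedef void (*newchunk_fn) (struct obstack *, _OBSTACK_SIZE_T);

/* Width in bytes of the length parameter the header expects.  */
static size_t
obstack_len_width (void)
{
  return sizeof (_OBSTACK_SIZE_T);
}
```

The configure check in the patch does the equivalent at build time by
preprocessing obstack.h and passing the result in via -D.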

As per the comment added to configure.ac, it would be nice if we could
update to a more recent autoconf, but what I have should do given the
limited target support for libsanitizer.

I'll be pushing this one upstream too, when I figure out something
reasonable for cmake.

* sanitizer_common/sanitizer_common_interceptors.inc: Update size
params for _obstack_begin_1, _obstack_begin, _obstack_newchunk
interceptors.
* configure.ac: Substitute OBSTACK_DEFS.
* asan/Makefile.am: Add OBSTACK_DEFS to DEFS.
* tsan/Makefile.am: Likewise.
* configure: Regenerate.
* Makefile.in: Regenerate.
* asan/Makefile.in: Regenerate.
* interception/Makefile.in: Regenerate.
* libbacktrace/Makefile.in: Regenerate.
* lsan/Makefile.in: Regenerate.
* sanitizer_common/Makefile.in: Regenerate.
* tsan/Makefile.in: Regenerate.
* ubsan/Makefile.in: Regenerate.

diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index 9b8c77e..92b9027 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -4874,8 +4874,9 @@ static void initialize_obstack(__sanitizer_obstack *obstack) {
 sizeof(*obstack->chunk));
 }
 
-INTERCEPTOR(int, _obstack_begin_1, __sanitizer_obstack *obstack, int sz,
-int align, void *(*alloc_fn)(uptr arg, uptr sz),
+INTERCEPTOR(int, _obstack_begin_1, __sanitizer_obstack *obstack,
+_OBSTACK_SIZE_T sz, _OBSTACK_SIZE_T align,
+void *(*alloc_fn)(uptr arg, SIZE_T sz),
 void (*free_fn)(uptr arg, void *p)) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, _obstack_begin_1, obstack, sz, align, alloc_fn,
@@ -4884,8 +4885,10 @@ INTERCEPTOR(int, _obstack_begin_1, __sanitizer_obstack *obstack, int sz,
   if (res) initialize_obstack(obstack);
   return res;
 }
-INTERCEPTOR(int, _obstack_begin, __sanitizer_obstack *obstack, int sz,
-int align, void *(*alloc_fn)(uptr sz), void (*free_fn)(void *p)) {
+INTERCEPTOR(int, _obstack_begin, __sanitizer_obstack *obstack,
+_OBSTACK_SIZE_T sz, _OBSTACK_SIZE_T align,
+void *(*alloc_fn)(SIZE_T sz),
+void (*free_fn)(void *p)) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, _obstack_begin, obstack, sz, align, alloc_fn,
free_fn);
@@ -4893,7 +4896,8 @@ INTERCEPTOR(int, _obstack_begin, __sanitizer_obstack *obstack, int sz,
   if (res) initialize_obstack(obstack);
   return res;
 }
-INTERCEPTOR(void, _obstack_newchunk, __sanitizer_obstack *obstack, int length) {
+INTERCEPTOR(void, _obstack_newchunk, __sanitizer_obstack *obstack,
+_OBSTACK_SIZE_T length) {
   void *ctx;
   COMMON_INTERCEPTOR_ENTER(ctx, _obstack_newchunk, obstack, length);
   REAL(_obstack_newchunk)(obstack, length);
diff --git a/libsanitizer/configure.ac b/libsanitizer/configure.ac
index 81fd46d..72b13a1 100644
--- a/libsanitizer/configure.ac
+++ b/libsanitizer/configure.ac
@@ -335,6 +335,30 @@ fi
 
 AC_SUBST([RPC_DEFS], [$rpc_defs])
 
+dnl If this file is processed by autoconf-2.67 or later then the CPPFLAGS
+dnl "-o conftest.iii" can disappear, conftest.iii be replaced with
+dnl conftest.i in the sed command line, and the rm deleted.
+dnl Not all cpp's accept -o, and gcc -E does not accept a second file
+dnl argument as the output file.
+AC_CACHE_CHECK([obstack params],
+[libsanitizer_cv_sys_obstack],
+[save_cppflags=$CPPFLAGS
+CPPFLAGS="-I${srcdir}/../include -o conftest.iii $CPPFLAGS"
+AC_PREPROC_IFELSE([AC_LANG_SOURCE([
+#include "obstack.h"
+#ifdef _OBSTACK_SIZE_T
+_OBSTACK_SIZE_T
+#else
+int
+#endif
+])],
+[libsanitizer_cv_sys_obstack=`sed -e '/^#/d;/^[ ]*$/d' conftest.iii | sed -e '$!d;s/size_t/SIZE_T/'`],
+[libsanitizer_cv_sys_obstack=int])
+CPPFLAGS=$save_cppflags
+rm -f conftest.iii
+])
+AC_SUBST([OBSTACK_DEFS], [-D_OBSTACK_SIZE_T=\"$libsanitizer_cv_sys_obstack\"])
+
 AM_CONDITIONAL(LIBBACKTRACE_SUPPORTED,
   [test "x${BACKTRACE_SUPPORTED}x${BACKTRACE_USES_MALLOC}" = "x1x0"])
 
diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index bd3cd73..4500e21 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -3,7 +3,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
-DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -DASAN_HAS_EXCEPTIONS=1 -DASAN_NEEDS_SEGV=1 -DCAN_SANITIZE_UB=0
+DEFS =

[PATCH 4/7] Copy gnulib obstack files

2015-11-07 Thread Alan Modra
This copies obstack.[ch] from gnulib, and updates the docs.  The next
patch should be applied if someone repeats the import at a later date.

include/
PR gdb/17133
* obstack.h: Import current gnulib file.
libiberty/
PR gdb/17133
* obstack.c: Import current gnulib file.
* obstacks.texi: Updated doc, from glibc's manual/memory.texi.

diff --git a/include/obstack.h b/include/obstack.h
index 9759af4..0ff3309 100644
--- a/include/obstack.h
+++ b/include/obstack.h
@@ -1,106 +1,102 @@
 /* obstack.h - object stack macros
Copyright (C) 1988-2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
 
-   NOTE: The canonical source of this file is maintained with the GNU C Library.
-   Bugs can be reported to bug-gl...@gnu.org.
-
-   This program is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published by the
-   Free Software Foundation; either version 2, or (at your option) any
-   later version.
-
-   This program is distributed in the hope that it will be useful,
+   The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
 
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301,
-   USA.  */
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
 
 /* Summary:
 
-All the apparent functions defined here are macros. The idea
-is that you would use these pre-tested macros to solve a
-very specific set of problems, and they would run fast.
-Caution: no side-effects in arguments please!! They may be
-evaluated MANY times!!
-
-These macros operate a stack of objects.  Each object starts life
-small, and may grow to maturity.  (Consider building a word syllable
-by syllable.)  An object can move while it is growing.  Once it has
-been "finished" it never changes address again.  So the "top of the
-stack" is typically an immature growing object, while the rest of the
-stack is of mature, fixed size and fixed address objects.
-
-These routines grab large chunks of memory, using a function you
-supply, called `obstack_chunk_alloc'.  On occasion, they free chunks,
-by calling `obstack_chunk_free'.  You must define them and declare
-them before using any obstack macros.
-
-Each independent stack is represented by a `struct obstack'.
-Each of the obstack macros expects a pointer to such a structure
-as the first argument.
-
-One motivation for this package is the problem of growing char strings
-in symbol tables.  Unless you are "fascist pig with a read-only mind"
---Gosper's immortal quote from HAKMEM item 154, out of context--you
-would not like to put any arbitrary upper limit on the length of your
-symbols.
-
-In practice this often means you will build many short symbols and a
-few long symbols.  At the time you are reading a symbol you don't know
-how long it is.  One traditional method is to read a symbol into a
-buffer, realloc()ating the buffer every time you try to read a symbol
-that is longer than the buffer.  This is beaut, but you still will
-want to copy the symbol from the buffer to a more permanent
-symbol-table entry say about half the time.
-
-With obstacks, you can work differently.  Use one obstack for all symbol
-names.  As you read a symbol, grow the name in the obstack gradually.
-When the name is complete, finalize it.  Then, if the symbol exists already,
-free the newly read name.
-
-The way we do this is to take a large chunk, allocating memory from
-low addresses.  When you want to build a symbol in the chunk you just
-add chars above the current "high water mark" in the chunk.  When you
-have finished adding chars, because you got to the end of the symbol,
-you know how long the chars are, and you can create a new object.
-Mostly the chars will not burst over the highest address of the chunk,
-because you would typically expect a chunk to be (say) 100 times as
-long as an average object.
-
-In case that isn't clear, when we have enough chars to make up
-the object, THEY ARE ALREADY CONTIGUOUS IN THE CHUNK (guaranteed)
-so we just point to it where it lies.  No moving of chars is
-needed and this is the second win: potentially 

[PATCH 5/7] Modify obstack.[hc] to avoid having to include other gnulib files

2015-11-07 Thread Alan Modra
Using the standard gnulib obstack source requires importing quite a
lot of other files from gnulib, and requires build changes.

If one did want to use gnulib obstack directly, then it would need to
go in a sub-directory and after ".../gnulib-tool --import obstack"
we'd have the following:

./lib:
alignof.h   gettext.hobstack.hstdlib.in.h unistd.in.h
exitfail.c  Makefile.am  stddef.in.h  sys_types.in.h
exitfail.h  obstack.cstdint.in.h  unistd.c

./m4:
00gnulib.m4 gnulib-comp.m4   obstack.m4   stdint.m4   wchar_t.m4
absolute-header.m4  gnulib-tool.m4   off_t.m4 stdlib_h.m4
extern-inline.m4include_next.m4  onceonly.m4  sys_types_h.m4
gnulib-cache.m4 longlong.m4  ssize_t.m4   unistd_h.m4
gnulib-common.m4multiarch.m4 stddef_h.m4  warn-on-use.m4

./snippet:
arg-nonnull.h  c++defs.h  _Noreturn.h  warn-on-use.h

include/
PR gdb/17133
* obstack.h (__attribute_pure__): Expand _GL_ATTRIBUTE_PURE.
libiberty/
PR gdb/17133
* obstack.c (__alignof__): Expand alignof_type from alignof.h.
(obstack_exit_failure): Don't use exitfail.h.
(_): Include libintl.h when HAVE_LIBINTL_H and nls enabled.
Provide default.  Don't include gettext.h.
(_Noreturn): Define.
* obstacks.texi: Adjust node references to external libc info files.

diff --git a/include/obstack.h b/include/obstack.h
index 0ff3309..0d13c72 100644
--- a/include/obstack.h
+++ b/include/obstack.h
@@ -142,7 +142,11 @@
 P, A)
 
 #ifndef __attribute_pure__
-# define __attribute_pure__ _GL_ATTRIBUTE_PURE
+# if defined __GNUC_MINOR__ && __GNUC__ * 1000 + __GNUC_MINOR__ >= 2096
+#  define __attribute_pure__ __attribute__ ((__pure__))
+# else
+#  define __attribute_pure__
+# endif
 #endif
 
 #ifdef __cplusplus
diff --git a/libiberty/obstack.c b/libiberty/obstack.c
index 3b99dfa..9f34da1 100644
--- a/libiberty/obstack.c
+++ b/libiberty/obstack.c
@@ -51,9 +51,14 @@
 /* If GCC, or if an oddball (testing?) host that #defines __alignof__,
use the already-supplied __alignof__.  Otherwise, this must be Gnulib
(as glibc assumes GCC); defer to Gnulib's alignof_type.  */
-# if !defined __GNUC__ && !defined __alignof__
-#  include <alignof.h>
-#  define __alignof__(type) alignof_type (type)
+# if !defined __GNUC__ && !defined __IBM__ALIGNOF__ && !defined __alignof__
+#  if defined __cplusplus
+template <class type> struct alignof_helper { char __slot1; type __slot2; };
+#   define __alignof__(type) offsetof (alignof_helper<type>, __slot2)
+#  else
+#   define __alignof__(type) \
+  offsetof (struct { char __slot1; type __slot2; }, __slot2)
+#  endif
 # endif
 # include 
 # include 
@@ -309,17 +314,34 @@ _obstack_memory_used (struct obstack *h)
 #  ifdef _LIBC
 int obstack_exit_failure = EXIT_FAILURE;
 #  else
-#   include "exitfail.h"
-#   define obstack_exit_failure exit_failure
+#   ifndef EXIT_FAILURE
+#define EXIT_FAILURE 1
+#   endif
+#   define obstack_exit_failure EXIT_FAILURE
 #  endif
 
-#  ifdef _LIBC
+#  if defined _LIBC || (HAVE_LIBINTL_H && ENABLE_NLS)
 #   include <libintl.h>
+#   ifndef _
+#define _(msgid) gettext (msgid)
+#   endif
 #  else
-#   include "gettext.h"
+#   ifndef _
+#define _(msgid) (msgid)
+#   endif
 #  endif
-#  ifndef _
-#   define _(msgid) gettext (msgid)
+
+#  if !(defined _Noreturn\
+|| (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112))
+#   if ((defined __GNUC__\
+&& (__GNUC__ >= 3 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 8)))\
+   || (defined __SUNPRO_C && __SUNPRO_C >= 0x5110))
+#define _Noreturn __attribute__ ((__noreturn__))
+#   elif defined _MSC_VER && _MSC_VER >= 1200
+#define _Noreturn __declspec (noreturn)
+#   else
+#define _Noreturn
+#   endif
 #  endif
 
 #  ifdef _LIBC
diff --git a/libiberty/obstacks.texi b/libiberty/obstacks.texi
index 1bfc878..b2d2403 100644
--- a/libiberty/obstacks.texi
+++ b/libiberty/obstacks.texi
@@ -93,7 +93,7 @@ them are freed.  These macros should appear before any use of obstacks
 in the source file.
 
 Usually these are defined to use @code{malloc} via the intermediary
-@code{xmalloc} (@pxref{Unconstrained Allocation}).  This is done with
+@code{xmalloc} (@pxref{Unconstrained Allocation, , , libc, The GNU C Library Reference Manual}).  This is done with
 the following pair of macro definitions:
 
 @smallexample
@@ -172,8 +172,8 @@ The value of this variable is a pointer to a function that
 @code{obstack} uses when @code{obstack_chunk_alloc} fails to allocate
 memory.  The default action is to print a message and abort.
 You should supply a function that either calls @code{exit}
-(@pxref{Program Termination}) or @code{longjmp} (@pxref{Non-Local
-Exits}) and doesn't return.
+(@pxref{Program Termination, , , libc, The GNU C Library Reference Manual}) or @code{longjmp} (@pxref{Non-Local
+Exits, , , 

[PATCH 6/7] Silence obstack.c -Wc++compat warning

2015-11-07 Thread Alan Modra
Fixes
warning: request for implicit conversion from ‘void *’ to ‘struct _obstack_chunk *’ not permitted in C++ [-Wc++-compat]

I moved the assignment to h->chunk to fix an overlong line, then
decided it would be better after the alloc failure check just to do
things the same way as in _obstack_newchunk.
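
The shape of the change, reduced to a sketch with hypothetical types
(struct pool and struct chunk are stand-ins, and plain malloc replaces
call_chunkfun).  Unlike the real code, which calls
obstack_alloc_failed_handler, this version just returns an error.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for struct _obstack_chunk and struct obstack.  */
struct chunk { struct chunk *prev; char contents[1]; };
struct pool { struct chunk *chunk; };

/* Returns 0 on success, -1 if allocation failed.  */
static int
pool_begin (struct pool *p, size_t size)
{
  /* malloc returns void *; the explicit cast is what C++ requires
     and what -Wc++-compat asks for in C.  */
  struct chunk *c = (struct chunk *) malloc (size);
  if (!c)
    return -1;          /* p->chunk is left untouched on failure */
  c->prev = NULL;
  p->chunk = c;         /* publish only after the check */
  return 0;
}
```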

* obstack.c (_obstack_newchunk): Silence -Wc++compat warning.
(_obstack_begin_worker): Likewise.  Move assignment to h->chunk
after alloc failure check.

diff --git a/libiberty/obstack.c b/libiberty/obstack.c
index 9f34da1..6d8d672 100644
--- a/libiberty/obstack.c
+++ b/libiberty/obstack.c
@@ -138,9 +138,10 @@ _obstack_begin_worker (struct obstack *h,
   h->chunk_size = size;
   h->alignment_mask = alignment - 1;
 
-  chunk = h->chunk = call_chunkfun (h, h->chunk_size);
+  chunk = (struct _obstack_chunk *) call_chunkfun (h, h->chunk_size);
   if (!chunk)
 (*obstack_alloc_failed_handler) ();
+  h->chunk = chunk;
   h->next_free = h->object_base = __PTR_ALIGN ((char *) chunk, chunk->contents,
alignment - 1);
   h->chunk_limit = chunk->limit = (char *) chunk + h->chunk_size;
@@ -202,7 +203,7 @@ _obstack_newchunk (struct obstack *h, _OBSTACK_SIZE_T length)
 
   /* Allocate and initialize the new chunk.  */
   if (obj_size <= sum1 && sum1 <= sum2)
-new_chunk = call_chunkfun (h, new_size);
+new_chunk = (struct _obstack_chunk *) call_chunkfun (h, new_size);
   if (!new_chunk)
 (*obstack_alloc_failed_handler)();
   h->chunk = new_chunk;


[PATCH 7/7] Configury changes for obstack optimization

2015-11-07 Thread Alan Modra
Provides defines used to determine whether glibc obstacks are
compatible.  Generally speaking, 32-bit targets won't need to use
obstack.o from libiberty if glibc is used, while 64-bit targets will,
until glibc gets the new obstack code.
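
To make the 32-bit versus 64-bit distinction concrete, here is a sketch
of the kind of test the new define allows.  The fallback definitions
are assumptions that only work with GCC-compatible compilers, and
OBSTACK_LEN_WIDTHS_MATCH is a hypothetical name, not the macro
libiberty actually uses.

```c
#include <stddef.h>

/* Stand-ins for the values configure would put in config.h.  */
#ifndef SIZEOF_INT
# define SIZEOF_INT __SIZEOF_INT__
#endif
#ifndef SIZEOF_SIZE_T
# define SIZEOF_SIZE_T __SIZEOF_SIZE_T__
#endif

/* Nonzero when an int-based obstack ABI and a size_t-based one
   pass lengths of the same width, as on typical 32-bit targets.  */
#if SIZEOF_INT == SIZEOF_SIZE_T
# define OBSTACK_LEN_WIDTHS_MATCH 1
#else
# define OBSTACK_LEN_WIDTHS_MATCH 0
#endif

static int
obstack_len_widths_match (void)
{
  return OBSTACK_LEN_WIDTHS_MATCH;
}
```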

* configure.ac: Check size of size_t.
* configure: Regenerate.

diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 868be8e..1ab5235 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -276,6 +276,7 @@ libiberty_AC_DECLARE_ERRNO
 # Determine sizes of some types.
 AC_CHECK_SIZEOF([int])
 AC_CHECK_SIZEOF([long])
+AC_CHECK_SIZEOF([size_t])
 
 # Check for presense of long long
 AC_CHECK_TYPE([long long],


POWERPC64_TOC_POINTER_ALIGNMENT

2015-11-17 Thread Alan Modra
David noticed that gcc112 was generating gcc/auto-host.h with
#define POWERPC64_TOC_POINTER_ALIGNMENT 32768

This is wrong; the correct value is either 8 or 256, depending on how
old ld is.  On investigating I found the cause is Fedora 21 modifying the
toolchain to default to -z relro.  ld -z relro puts the relro gap just
before .got (prior to my patches reordering sections for relro on
powerpc64).  That unfortunately aligns .got, defeating the deliberate
mis-alignment of .got in the testcase.
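
For readers not fluent in awk: the configure test takes the .TOC. line
from nm output and reports the last four hex digits of the address as
the alignment.  A rough C equivalent is sketched below;
toc_align_from_nm_line is a hypothetical name and the sample addresses
in the note are made up.

```c
#include <stdlib.h>
#include <string.h>

/* Given one line of nm output, return the low 16 bits of the address
   if the line names .TOC., or -1 otherwise.  */
static long
toc_align_from_nm_line (const char *line)
{
  char *end;
  unsigned long long addr;

  if (strstr (line, ".TOC.") == NULL)
    return -1;
  addr = strtoull (line, &end, 16);
  if (end == line)
    return -1;          /* no leading hex address */
  return (long) (addr & 0xffff);
}
```

With an address ending in 0008 or 0100 this gives 8 or 256; an address
ending in 8000 gives the bogus 32768 seen on gcc112.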

Fixed with the following obvious patch and committed to mainline.

Incidentally, bootstrap fails for me on powerpc64 due to "comparison
is always true due to limited range of data type [-Wtype-limits]"
&& GET_MODE_SIZE (mode) <= POWERPC64_TOC_POINTER_ALIGNMENT));
since my POWERPC64_TOC_POINTER_ALIGNMENT is 256 and mode_size is an
unsigned char array.  Grrr, so what code obfuscation do we use here to
work around this annoying warning?
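
A stand-alone reduction of the problem, with one possible local
workaround.  The table and macro below are assumed stand-ins for the
GCC internals, not the real definitions.

```c
/* Assumed stand-ins for POWERPC64_TOC_POINTER_ALIGNMENT and the
   mode_size array behind GET_MODE_SIZE.  */
#define TOC_POINTER_ALIGNMENT 256
static const unsigned char mode_size[] = { 1, 2, 4, 8, 16 };

static int
fits_toc_alignment (int mode)
{
  /* An unsigned char can never exceed 255, so GCC flags the test
     below under -Wtype-limits; the pragmas limit the suppression
     to this one comparison.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wtype-limits"
  return mode_size[mode] <= TOC_POINTER_ALIGNMENT;
#pragma GCC diagnostic pop
}
```

With the alignment at 8 the comparison is meaningful and no warning is
issued, which is why the problem only shows up with a newer ld.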

Cross compiling from x86_64..
/src/gcc-virgin/configure \
--with-sysroot=/powerpc64le-linux --prefix=/usr/local \
--target=powerpc64le-linux --with-cpu=power8 \
--enable-targets=powerpc64-linux,powerpc-linux,powerpcle-linux \
--disable-multilib --disable-nls --enable-__cxa_atexit \
--enable-gnu-indirect-function --enable-secureplt --with-long-double-128 \
--enable-languages=all,go

..fails with:
In file included from /src/gcc-virgin/libgcc/libgcov-driver.c:49:0:
/src/gcc-virgin/libgcc/../gcc/gcov-io.c: In function 'gcov_do_dump':
/src/gcc-virgin/libgcc/../gcc/gcov-io.c:731:51: internal compiler error: in convert_move, at expr.c:286
   r = sizeof (long long) * __CHAR_BIT__ - 1 - __builtin_clzll (v);
   ^~~
0x7d5250 convert_move(rtx_def*, rtx_def*, int)
/src/gcc-virgin/gcc/expr.c:286
0x8b0e67 expand_direct_optab_fn
/src/gcc-virgin/gcc/internal-fn.c:2132
0x6cfd9b expand_call_stmt
/src/gcc-virgin/gcc/cfgexpand.c:2565
0x6cfd9b expand_gimple_stmt_1
/src/gcc-virgin/gcc/cfgexpand.c:3525
0x6cfd9b expand_gimple_stmt
/src/gcc-virgin/gcc/cfgexpand.c:3688
0x6d171e expand_gimple_basic_block
/src/gcc-virgin/gcc/cfgexpand.c:5694
0x6d7f96 execute
/src/gcc-virgin/gcc/cfgexpand.c:6309

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 230508)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2015-11-18  Alan Modra  
+
+   * configure.ac (POWERPC64_TOC_POINTER_ALIGNMENT): Pass -z norelro
+   to ld.
+   * configure: Regenerate.
+
 2015-11-17  Tom de Vries  
 
* tree-ssa-loop.c (pass_tree_loop_init::execute): Improve comments.
Index: gcc/configure.ac
===
--- gcc/configure.ac(revision 230508)
+++ gcc/configure.ac(working copy)
@@ -5257,7 +5257,7 @@
 x: .quad .TOC.
 EOF
   if $gcc_cv_as -a64 -o conftest.o conftest.s > /dev/null 2>&1 \
- && $gcc_cv_ld $emul_name -o conftest conftest.o > /dev/null 2>&1; then
+ && $gcc_cv_ld $emul_name -z norelro -o conftest conftest.o > /dev/null 2>&1; then
 gcc_cv_ld_toc_align=`$gcc_cv_nm conftest | ${AWK} '/\.TOC\./ { match ($0, "0[[[:xdigit:]]]*", a); print strtonum ("0x" substr(a[[0]], length(a[[0]])-3)) }'`
   fi
   rm -f conftest conftest.o conftest.s

-- 
Alan Modra
Australia Development Lab, IBM


Re: POWERPC64_TOC_POINTER_ALIGNMENT

2015-11-17 Thread Alan Modra
On Tue, Nov 17, 2015 at 07:53:18PM -0500, Michael Meissner wrote:
> Here is the temporary patch I'm using to get past rs6000.c.  But I suspect the
> TOC alignment should never be 256.

Yes, it should be.  Recent GNU ld aligns .TOC. to a 256 byte boundary.
I have this patch in my tree.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index abc8eaa..e3ec042 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -8059,12 +8059,17 @@ rs6000_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 static bool
 use_toc_relative_ref (rtx sym, machine_mode mode)
 {
+  /* Silence complaint that the POWERPC64_TOC_POINTER_ALIGNMENT test
+ is always true.  */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wtype-limits"
   return ((constant_pool_expr_p (sym)
   && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (sym),
   get_pool_mode (sym)))
  || (TARGET_CMODEL == CMODEL_MEDIUM
  && SYMBOL_REF_LOCAL_P (sym)
  && GET_MODE_SIZE (mode) <= POWERPC64_TOC_POINTER_ALIGNMENT));
+#pragma GCC diagnostic pop
 }
 
 /* Our implementation of LEGITIMIZE_RELOAD_ADDRESS.  Returns a value to

-- 
Alan Modra
Australia Development Lab, IBM

