Re: RS6000/PowerPC -mpaired

2016-01-06 Thread Segher Boessenkool
On Tue, Jan 05, 2016 at 05:43:47PM +, Alan Lawrence wrote:
> Can anyone give me any pointers on how to use the -mpaired option ("PowerPC 
> 750CL paired-single instructions") ?

Configure your compiler with --target=powerpc-elf-paired, maybe add
--with-cpu=750 ?

> I'm trying to get it to use the 
> reduc_s{min,max,plus}_v2sf patterns in gcc/config/rs6000/paired.md, so far 
> I haven't managed to find a configuration that supports loading a vector of 
> floats but that doesn't also have superior altivec/v4sf support...

There are no processors with both paired and altivec.
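
For reference, the reduc_s{min,max,plus}_v2sf patterns named in the question correspond to min, max and sum reductions over floats. A minimal sketch of loops with that shape follows; the function names are hypothetical, and whether GCC selects the paired.md patterns for them depends on the target configuration, which this sketch does not demonstrate. A floating-point sum reduction generally also needs reassociation to be permitted (for example via -ffast-math) before it can be vectorized.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Sum reduction: the loop shape a vectorizer can map to a "plus" reduction.
float sum_floats(const float* v, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        acc += v[i];
    return acc;
}

// Max reduction; requires at least one element.
float max_floats(const float* v, std::size_t n) {
    assert(n >= 1);
    float best = v[0];
    for (std::size_t i = 1; i < n; ++i)
        best = std::max(best, v[i]);
    return best;
}

// Min reduction; requires at least one element.
float min_floats(const float* v, std::size_t n) {
    assert(n >= 1);
    float best = v[0];
    for (std::size_t i = 1; i < n; ++i)
        best = std::min(best, v[i]);
    return best;
}
```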


Segher


ivopts vs. garbage collection

2016-01-06 Thread Ian Lance Taylor
The bug report https://golang.org/issue/13662 describes a case in
which ivopts appears to be breaking garbage collection for the Go
compiler.  There is an array allocated in memory, and there is a loop
over that array.  The ivopts optimization is taking the only pointer
to the array and subtracting 8 from it before entering the loop.
There are function calls in between this subtraction and the actual
use of the array.  During those function calls, a garbage collection
occurs.  Since the only pointer to the array no longer points to the
actual memory being used, the array is unexpectedly freed.
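
The failure mode described above can be modelled with a toy sketch. It assumes a collector that treats an object as live only when some root holds an address inside the object; HeapObject, keeps_alive and biased_base are hypothetical names, and this is not how the Go runtime is actually implemented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one heap object as a half-open address range.
struct HeapObject {
    std::uintptr_t start;
    std::uintptr_t size;
};

// A collector that only recognizes addresses inside [start, start + size).
bool keeps_alive(const HeapObject& obj, std::uintptr_t root) {
    assert(obj.size > 0);
    return root >= obj.start && root - obj.start < obj.size;
}

// The value left in the only live register after the loop optimization:
// the array address biased by -8.
std::uintptr_t biased_base(const HeapObject& obj) {
    return obj.start - 8;
}
```

In this model the biased value does not keep the array alive, while adding 8 back (as the loop body eventually does) yields an address that would.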

That all seems clear enough, although I'm not sure how best to fix it.
What I'm wondering in particular is whether Java does anything to
avoid this kind of problem.  I don't see anything obvious.  Thanks for
any pointers.

Ian


Re: ivopts vs. garbage collection

2016-01-06 Thread Jeff Law

On 01/06/2016 08:17 AM, Ian Lance Taylor wrote:

> The bug report https://golang.org/issue/13662 describes a case in
> which ivopts appears to be breaking garbage collection for the Go
> compiler.  There is an array allocated in memory, and there is a loop
> over that array.  The ivopts optimization is taking the only pointer
> to the array and subtracting 8 from it before entering the loop.
> There are function calls in between this subtraction and the actual
> use of the array.  During those function calls, a garbage collection
> occurs.  Since the only pointer to the array no longer points to the
> actual memory being used, the array is unexpectedly freed.
>
> That all seems clear enough, although I'm not sure how best to fix it.
> What I'm wondering in particular is whether Java does anything to
> avoid this kind of problem.  I don't see anything obvious.  Thanks for
> any pointers.

The only solution here is for ivopts to keep a pointer to the array, not 
a pointer to some location near, but outside of the array.


Java doesn't do anything special to the best of my knowledge; it just 
relies on the fact that this kind of situation is very rare.


This is related to, but not the same as, the issue we have with Ada's 
virtual origins for array accesses, where the front-end would set up a 
similar situation, which resulted in a "pointer" that points outside the 
object.  When we actually dereference the pointer, it's done with a 
base+index access which brings the effective address back into the 
object.  That kind of scheme wreaks havoc with segmented targets where 
the segment selection may be from the base register rather than the full 
effective address.
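
A minimal sketch of the virtual-origin idea, using offsets relative to the start of the storage instead of raw pointers so that it stays well-defined C++. BoundedArray and its members are hypothetical names for illustration, not anything taken from the Ada front-end.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D array whose valid indices start at `first`,
// stored in a zero-based buffer.
struct BoundedArray {
    std::ptrdiff_t first;
    std::vector<int> storage;

    // Where element 0 would live, as an offset from the start of storage.
    // For first > 0 this is negative, i.e. outside the object.
    std::ptrdiff_t virtual_origin() const { return -first; }

    // base + index access: only the sum lands back inside the object.
    int& at(std::ptrdiff_t i) {
        std::ptrdiff_t off = virtual_origin() + i;
        assert(off >= 0 && static_cast<std::size_t>(off) < storage.size());
        return storage[static_cast<std::size_t>(off)];
    }
};
```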


jeff



GNU C library's libmvec and the GNU Compiler *Collection*.

2016-01-06 Thread Toon Moene

All,

I noticed, around half a year ago, that the incredible team around glibc 
found the time to implement vector math (libm) routines.


Previously, free software adherents like me were dependent on vendor 
libraries via the -mveclibabi={svml|acml} (on Intel/AMD) for instance.


However, the examples given on the glibc wiki 
(https://sourceware.org/glibc/wiki/libmvec, Example 1/Example 2) suggest 
that this is a C-only thing (this might make sense given that glibc is 
an implementation of the *C* library), but the above vendor-level 
options at least work for every front-end language, as far as I know.


Would it be possible to add an option -mveclibabi=glibc to cater for 
this *for all languages*, or is this too low level (after all, the glibc 
libmvec has code for multiple architectures)? If so, at what level 
should this be implemented?
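
For context, the kind of loop at stake is an elementwise math call over an array. The sketch below shows that shape in C++ for illustration only; cos_all is a hypothetical name. Whether such a loop is turned into calls to libmvec's vector variants depends on the compiler version, the glibc version (libmvec first shipped with glibc 2.22) and the options used, typically optimization with vectorization plus -ffast-math or OpenMP SIMD support; none of that is shown or guaranteed here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Elementwise cosine: a candidate loop for vectorized libm calls.
void cos_all(const std::vector<double>& in, std::vector<double>& out) {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = std::cos(in[i]);
}
```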


[ This is relevant for our code, because just the switch to *actual*
  single precision exp/log/sin/cos implementations in glibc's libm
  resulted in a decrease of the running time of our weather forecasting
  code by 25 % (this was in glibc 2.16, IIRC). ]

Thanks in advance for your suggestions.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news


Re: Strange C++ function pointer test

2016-01-06 Thread Jonathan Wakely
On 2 January 2016 at 11:42, Jonathan Wakely wrote:
> On 31 December 2015 at 18:49, James Dennett  wrote:
>> On Thu, Dec 31, 2015 at 4:42 AM, Jonathan Wakely 
>> wrote:
>>>
>>> On 31 December 2015 at 11:54, Dominik Vogt wrote:
>>> > Is there a requirement for a certain minimum Glibc version for
>>> > this to work?
>>>
>>> It doesn't work with any glibc, because it doesn't declare the C++
>>> overloads.
>>>
>>> Libstdc++ has an include/c_compatibility/math.h header that would
>>> include <cmath> (which declares the C++ overloads) and then pull them
>>> into the global namespace, but that isn't used on GNU/Linux, and would
>>> create other problems.
>>>
>>
>> What other problems?
>>
>> It's something of an assumption of the C++ Standard that it's practical for
>> C++ implementations to provide such wrappers to add overloads for C++.  If
>> that's causing some fundamental problem then we should document it (and
>> ideally address it).
>
> Not fundamental problems in the standard, just with the implementation
> of that header. It won't work as is and would need changes

Specifically, that header assumes that <cmath> is include/c/cmath, but
for GNU/Linux we use include/c_global/cmath, and IIUC it assumes that
the libc header defines some of the C++ overloads (but not all?),
which isn't true for glibc.

So the combination of include/c/cmath and
include/c_compatibility/math.h wouldn't work. We could change it to
work, but that might break targets using those headers already
(possibly just QNX? I don't know).

If we want to fix it in libstdc++ then I think we need a different
math.h, written from scratch, not starting from
include/c_compatibility/math.h


Re: Strange C++ function pointer test

2016-01-06 Thread Jonathan Wakely
On 4 January 2016 at 09:32, Florian Weimer wrote:
> On 12/31/2015 01:31 PM, Jonathan Wakely wrote:
>> On 31 December 2015 at 11:37, Marc Glisse wrote:
>>> That's what I called "bug" in my message (there are a few bugzilla PRs for
>>> this). It would probably work on Solaris.
>>
>> Yes, the <math.h> case is still a mess in the standard and in glibc.
>> The "only in namespace std in the second case" part is what I meant
>> was not accurate. C++11 changed to allow <cmath> to declare it in the
>> global namespace, but as you say didn't go far enough.
>
> What changes are needed in glibc?

(Should we move this to the libstdc++ list?)

The problem can either be solved purely in libstdc++, by providing our
own <math.h> that does #include_next <math.h> to get the libc header
and then adds the overloads required by C++, or it can be solved by
adding those overloads to the libc header directly.

The latter is what Solaris does, and is similar to what glibc already
does for strchr etc. in <string.h>. It would mean something like:

#if __cplusplus
namespace std {
  inline float abs(float __x) { return ::fabsf(__x); }
  inline double abs(double __x) { return ::fabs(__x); }
  inline long double abs(long double __x) { return ::fabsl(__x); }
  inline float fabs(float __x) { return ::fabsf(__x); }
  inline double fabs(double __x) { return ::fabs(__x); }
  inline long double fabs(long double __x) { return ::fabsl(__x); }

  // Similar overloads for other math functions
}
using std::abs;
// ...
#if __cplusplus >= 201103L
namespace std {
  // similar overloads for C99 isfinite etc.
}
using std::isfinite;
// ...
#endif
#endif

However, this is a lot more work for glibc than the relatively simple
strchr/memchr case.
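
As a self-contained illustration of what such overloads buy, the sketch below defines them in a hypothetical namespace named sketch (so as not to collide with the real std declarations) on top of the C functions from math.h. It only shows that overload resolution picks the variant matching the argument type and preserves that type in the result; it is not a proposal for the actual header contents.

```cpp
#include <math.h>
#include <type_traits>

namespace sketch {  // hypothetical; stands in for the overloads a header would add
inline float       abs(float x)       { return ::fabsf(x); }
inline double      abs(double x)      { return ::fabs(x); }
inline long double abs(long double x) { return ::fabsl(x); }
}

// Each overload returns the same floating-point type it was given.
static_assert(std::is_same<decltype(sketch::abs(1.0f)), float>::value,
              "float in, float out");
static_assert(std::is_same<decltype(sketch::abs(1.0L)), long double>::value,
              "long double in, long double out");
```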

I have been meaning to try solving it in libstdc++ with a new <math.h>
that includes the libc one and extends it, to see how well that works.
I haven't had time to try that, so it would be premature to ask for
changes to be made to glibc when I don't know if they are necessary or
would even be the best solution.


gcc-4.9-20160106 is now available

2016-01-06 Thread gccadmin
Snapshot gcc-4.9-20160106 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160106/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch 
revision 232115

You'll find:

 gcc-4.9-20160106.tar.bz2 Complete GCC

  MD5=8d55766e64963dd687907fd389070627
  SHA1=3e2ecdf0cec93b9c529b2d889233631e4fafe13f

Diffs from 4.9-20151230 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: GNU C library's libmvec and the GNU Compiler *Collection*.

2016-01-06 Thread Toon Moene

On 01/06/2016 07:46 PM, Toon Moene wrote:


> [ This is relevant for our code, because just the switch to *actual*
>   single precision exp/log/sin/cos implementations in glibc's libm
>   resulted in a decrease of the running time of our weather forecasting
>   code by 25 % (this was in glibc 2.16, IIRC). ]


The reference for this is (on an Ivy Bridge system):

https://gcc.gnu.org/ml/gcc-help/2013-01/msg00175.html

"I have made a home-build glibc-2.17 (on a core-avx system). It works 
great - linking against it (instead of using the current 
Debian-Testing's eglibc-2.13) brought the wall-clock time of my weather 
forecasting job down from 3:35 hours to 2:45 (mostly due to a more 
efficient implementation of powf, expf and logf)."


So, in minutes of compute time, this is (215 - 165) / 215 = 0.23 (23 %).
However, that number included a part that ran for an hour (60 minutes) 
in double precision.


Excluding that we get (155 - 105) / 155 = 0.32 (32 %) improvement in 
performance for the single precision part of our weather forecasting code.
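
The arithmetic can be checked with a small helper. relative_reduction is a hypothetical name; the inputs are the wall-clock times quoted above converted to minutes (3:35 = 215, 2:45 = 165), with the 60-minute double-precision part subtracted for the second figure.

```cpp
#include <cassert>

// Fractional reduction in run time going from `before` to `after`
// (both in the same unit).
double relative_reduction(double before, double after) {
    assert(before > 0.0);
    return (before - after) / before;
}
```

relative_reduction(215, 165) covers the whole job and relative_reduction(155, 105) the single-precision part alone.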


Kind regards,



Re: [RFC] MIPS ABI Extension for IEEE Std 754 Non-Compliant Interlinking

2016-01-06 Thread Maciej W. Rozycki
On Fri, 27 Nov 2015, Joseph Myers wrote:

> > I find it highly unlikely though that the writers will (be able to) chase 
> > individual targets and any obscure hardware-dependent options the targets 
> > may provide.  And we cannot expect people compiling software to be 
> 
> What that says to me is that there should be an architecture-independent 
> option (-fieee?) that, for architectures where the default configuration 
> may have architecture-specific deviations from the normal defaults 
> regarding conformance to IEEE 754 language bindings but there are options 
> to disable those deviations, disables those deviations.  For example, on 
> alpha that would imply -mieee-with-inexact.  On architectures without such 
> issues (beyond bugs that should be fixed unconditionally, not conditional 
> on a command-line option, or issues with the hardware ISA that are 
> infeasible to fix in software), that option would do nothing (beyond any 
> architecture-independent effects it might have such as implying 
> -fno-fast-math).

 I'm fine with `-fieee'/`-fno-ieee', and corresponding 
`--with-ieee=yes/no' configuration options.
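
 As an illustration of why a strict mode would want to imply `-fno-fast-math', here is a minimal sketch of code that depends on IEEE 754 NaN semantics.  It assumes IEEE 754 doubles; the function names are hypothetical.  With `-ffast-math' (which implies `-ffinite-math-only') the compiler is allowed to assume no NaNs occur, so a self-comparison test like this one may be optimised away.

```cpp
#include <limits>

// Relies on the IEEE 754 rule that a NaN compares unequal to
// everything, itself included.
bool is_nan_by_self_compare(double x) {
    return x != x;
}

// Produces a quiet NaN; only meaningful where the type has one.
double make_quiet_nan() {
    static_assert(std::numeric_limits<double>::has_quiet_NaN,
                  "this sketch assumes a quiet NaN is available");
    return std::numeric_limits<double>::quiet_NaN();
}
```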

 I have been aware of the Alpha processor's imprecise IEEE 754 exception 
mode and while working on the specification concerned here I had in my 
mind the possibility of expanding the semantics of the MIPS target's 
`-mieee=' option to cover the somewhat similar imprecise IEEE 754 
exception mode of the MIPS R8000 processor, or any optimisations of the 
same or a different kind the MIPS architecture might introduce in the 
future.

 Since (regrettably) by now the Alpha architecture has become a legacy one 
I didn't consider it important enough to create a generic option which 
would only control the Alpha target, in addition to the MIPS target being 
considered here.  So I am actually glad you proposed it as I agree it will 
make things cleaner.

> >  As you may see in the GCC patches I have just posted the `-mieee=strict' 
> > option I've implemented sets `-fno-fast-math', and `-mrelaxed-nan=none', 
> > the only target-specific option so far.  So this does exactly what I 
> > outlined above.
> 
> I am doubtful about the architecture-specific option setting 
> architecture-independent options here.  Having it the other way round as I 
> suggested above would make more sense to me.

 Agreed, as noted above.

> > "Any or all of these options may have effects beyond propagating the IEEE 
> > Std 754 compliance mode down to the assembler and the linker.  In 
> > particular `-mieee=strict' is expected to produce strictly compliant code, 
> > which in the context of this specification is defined as: following IEEE 
> > Std 754 as closely as the programming language binding to the standard 
> > (defined in the relevant language standard), the compiler implementation 
> > and target hardware permit.  This means the use of this option may affect 
> > code produced in ways beyond NaN representation only."
> > 
> > > >  Does this answer address your concerns?
> > > 
> > > No, the option concept as described seems too irremediably vague.
> > 
> >  Does this explanation give you a better idea of what I have in mind?  Do 
> > you still have concerns about the feasibility of the idea?
> 
> It's better defined, but I think it would be better for -fieee to imply 
> -mieee=strict -fno-fast-math (or whatever) rather than for -mieee=strict 
> to imply architecture-independent options.  Cf. i386 and sh where 
> -ffinite-math-only affects architecture-specific options.

 Thanks for the references.  I'll have a look in the course of updating 
the implementation.

 With `-fieee' in the picture I think we can get rid of GCC's 
target-specific high-level `-mieee=' option, as having become redundant, 
and retain the low-level `-mrelaxed-nan=' only, with the assumption that 
only power users will need to control this setting directly and they will 
necessarily have studied and understood all the implications.  Additional 
low-level options can be added in the future as needed to control the 
R8000 exception mode or other architectural features affecting IEEE 754 
arithmetic, wired to `-fieee' as appropriate.

 I'm going to retain the assembler and linker options and directives of 
the `*ieee*' form though, as their purpose is a bit different -- to set 
flags in a binary file rather than to affect code generation -- and I 
don't think it makes sense to expand the namespace there.  The effects of 
these options are cumulative rather than mutually exclusive, and it's the 
names of individual bits or enumeration fields within a binary file's 
control structures, referred to in the option's or directive's argument, 
that tell features apart.

 I'll be updating the specification and the proposed implementation 
shortly, and I also think the addition of `-fieee' will then better be 
done as a separate preparatory change, initially affecting the Alpha 
target only.  Please let me know if I missed anything, or if you have any 
other