Re: RS6000/PowerPC -mpaired
On Tue, Jan 05, 2016 at 05:43:47PM +, Alan Lawrence wrote:
> Can anyone give me any pointers on how to use the -mpaired option ("PowerPC
> 750CL paired-single instructions") ?

Configure your compiler with --target=powerpc-elf-paired, maybe add
--with-cpu=750 ?

> I'm trying to get it to use the
> reduc_s{min,max,plus}_v2sf patterns in gcc/config/rs6000/paired.md, so far
> I haven't managed to find a configuration that supports loading a vector of
> floats but that doesn't also have superior altivec/v4sf support...

There are no processors with both paired and altivec.

Segher
ivopts vs. garbage collection
The bug report https://golang.org/issue/13662 describes a case in which ivopts appears to be breaking garbage collection for the Go compiler. There is an array allocated in memory, and there is a loop over that array. The ivopts optimization is taking the only pointer to the array and subtracting 8 from it before entering the loop. There are function calls in between this subtraction and the actual use of the array. During those function calls, a garbage collection occurs. Since the only pointer to the array no longer points to the actual memory being used, the array is unexpectedly freed.

That all seems clear enough, although I'm not sure how best to fix it. What I'm wondering in particular is whether Java does anything to avoid this kind of problem. I don't see anything obvious.

Thanks for any pointers.

Ian
Re: ivopts vs. garbage collection
On 01/06/2016 08:17 AM, Ian Lance Taylor wrote:
> The bug report https://golang.org/issue/13662 describes a case in which ivopts appears to be breaking garbage collection for the Go compiler. There is an array allocated in memory, and there is a loop over that array. The ivopts optimization is taking the only pointer to the array and subtracting 8 from it before entering the loop. There are function calls in between this subtraction and the actual use of the array. During those function calls, a garbage collection occurs. Since the only pointer to the array no longer points to the actual memory being used, the array is unexpectedly freed.
>
> That all seems clear enough, although I'm not sure how best to fix it. What I'm wondering in particular is whether Java does anything to avoid this kind of problem. I don't see anything obvious.
>
> Thanks for any pointers.

The only solution here is for ivopts to keep a pointer to the array, not a pointer to some location near, but outside of the array. Java doesn't do anything special to the best of my knowledge; it just relies on the fact that this kind of situation is very rare.

This is related to, but not the same as, the issue we have with Ada's virtual origins for array accesses, where the front-end would set up a similar situation, which resulted in a "pointer" that points outside the object. When we actually dereference the pointer, it's done with a base+index access which brings the effective address back into the object. That kind of scheme wreaks havoc with segmented targets where the segment selection may be from the base register rather than the full effective address.

jeff
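To make the failure mode concrete, here is a minimal C++ sketch of the shape of the transformation discussed in this thread. It is not actual ivopts output. The function names, the 1-based loop, and the padding slot are all assumptions made for illustration; the padding slot exists only so that the biased pointer stays inside an allocated object and the snippet remains valid C++.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Straightforward form: the live pointer `base` points at the first
// element, so a collector scanning for pointers can still find the array.
std::int64_t sum_plain(const std::int64_t* base, std::size_t n) {
    std::int64_t total = 0;
    for (std::size_t i = 1; i <= n; ++i)   // 1-based induction variable
        total += base[i - 1];
    return total;
}

// Shape of the rewritten form: the "- 1" is folded into the pointer once,
// before the loop. After that, the only pointer the loop needs is
// `biased`, which sits one element (8 bytes) before the data it reads.
std::int64_t sum_biased(const std::int64_t* base, std::size_t n) {
    const std::int64_t* biased = base - 1; // hoisted adjustment
    std::int64_t total = 0;
    for (std::size_t i = 1; i <= n; ++i)
        total += biased[i];                // base+index lands back inside
    return total;
}

// Hypothetical driver: storage[0] is padding so that `base - 1` above is a
// valid pointer in this demonstration. In the reported bug there is no
// such padding, and the biased pointer refers to no live object at all.
std::int64_t demo_sum(bool use_biased) {
    std::vector<std::int64_t> storage = {0, 10, 20, 30, 40};
    const std::int64_t* base = storage.data() + 1;
    return use_biased ? sum_biased(base, 4) : sum_plain(base, 4);
}
```

Both forms compute the same sum. The difference that matters to a garbage collector is which address is live across any calls made between the adjustment and the first access.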
GNU C library's libmvec and the GNU Compiler *Collection*.
All,

I noticed, around half a year ago, that the incredible team around glibc found the time to implement vector math (libm) routines.

Previously, free software adherents like me were dependent on vendor libraries via the -mveclibabi={svml|acml} (on Intel/AMD) for instance.

However, the examples given on the glibc wiki (https://sourceware.org/glibc/wiki/libmvec, Example 1/Example 2) suggest that this is a C-only thing (this might make sense given that glibc is an implementation of the *C* library), but the above vendor-level options at least work for every front-end language, as far as I know.

Would it be possible to add an option -mveclibabi=glibc to cater for this *for all languages*; or is this too low level (after all, the glibc libmvec has code for multiple architectures). If so, at what level should this be implemented ?

[ This is relevant for our code, because just the switch to *actual* single precision exp/log/sin/cos implementations in glibc's libm resulted in a decrease of the running time of our weather forecasting code by 25 % (this was in glibc 2.16, IIRC). ]

Thanks in advance for your suggestions.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
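For readers unfamiliar with the kind of loop at stake, the following C++ sketch shows an element-wise math call over a contiguous array, which is the pattern a vector math library is meant to speed up. The function name `sin_all` is hypothetical. Whether a given compiler actually replaces the scalar calls with vector library entry points depends on the compiler and glibc versions, the target, and the options used; nothing in this snippet checks or guarantees that.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Applies the single-precision sine to every element. Each iteration is
// independent of the others, which is what makes the loop a candidate for
// vectorization with a vector math routine.
std::vector<float> sin_all(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = std::sin(in[i]);   // float overload of std::sin
    return out;
}
```

The same loop shape arises naturally from array expressions in Fortran, which is why an option that works across front ends is being asked about here.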
Re: Strange C++ function pointer test
On 2 January 2016 at 11:42, Jonathan Wakely wrote:
> On 31 December 2015 at 18:49, James Dennett wrote:
>> On Thu, Dec 31, 2015 at 4:42 AM, Jonathan Wakely
>> wrote:
>>>
>>> On 31 December 2015 at 11:54, Dominik Vogt wrote:
>>> > Is there a requirement for a certain minimum Glibc version for
>>> > this to work?
>>>
>>> It doesn't work with any glibc, because it doesn't declare the C++
>>> overloads.
>>>
>>> Libstdc++ has an include/c_compatibility/math.h header that would
>>> include <cmath> (which declares the C++ overloads) and then pull them
>>> into the global namespace, but that isn't used on GNU/Linux, and would
>>> create other problems.
>>>
>>
>> What other problems?
>>
>> It's something of an assumption of the C++ Standard that it's practical for
>> C++ implementations to provide such wrappers to add overloads for C++. If
>> that's causing some fundamental problem then we should document it (and
>> ideally address it).
>
> Not fundamental problems in the standard, just with the implementation
> of that header. It won't work as is and would need changes

Specifically, that header assumes that <cmath> is include/c/cmath, but for GNU/Linux we use include/c_global/cmath, and IIUC it assumes that the libc header defines some of the C++ overloads (but not all?), which isn't true for glibc. So the combination of include/c/cmath and include/c_compatibility/math.h wouldn't work. We could change it to work, but that might break targets using those headers already (possibly just QNX? I don't know).

If we want to fix it in libstdc++ then I think we need a different math.h, written from scratch, not starting from include/c_compatibility/math.h
Re: Strange C++ function pointer test
On 4 January 2016 at 09:32, Florian Weimer wrote:
> On 12/31/2015 01:31 PM, Jonathan Wakely wrote:
>> On 31 December 2015 at 11:37, Marc Glisse wrote:
>>> That's what I called "bug" in my message (there are a few bugzilla PRs for
>>> this). It would probably work on Solaris.
>>
>> Yes, the case is still a mess in the standard and in glibc.
>> The "only in namespace std in the second case" part is what I meant
>> was not accurate. C++11 changed to allow <cmath> to declare it in the
>> global namespace, but as you say didn't go far enough.
>
> What changes are needed in glibc?

(Should we move this to the libstdc++ list?)

The problem can either be solved purely in libstdc++, by providing our own <math.h> that does #include_next <math.h> to get the libc header and then adds the overloads required by C++, or it can be solved by adding those overloads to the libc header directly. The latter is what Solaris does, and is similar to what glibc already does for strchr etc. in <string.h>. It would mean something like:

  #if __cplusplus
  namespace std {
    inline float abs(float __x) { return ::fabsf(__x); }
    inline double abs(double __x) { return ::fabs(__x); }
    inline long double abs(long double __x) { return ::fabsl(__x); }
    inline float fabs(float __x) { return ::fabsf(__x); }
    inline double fabs(double __x) { return ::fabs(__x); }
    inline long double fabs(long double __x) { return ::fabsl(__x); }
    // Similar overloads for other math functions
  }
  using std::abs;
  // ...
  #if __cplusplus >= 201103L
  namespace std {
    // similar overloads for C99 isfinite etc.
  }
  using std::isfinite;
  // ...
  #endif
  #endif

However, this is a lot more work for glibc than the relatively simple strchr/memchr case.

I have been meaning to try solving it in libstdc++ with a new <math.h> that includes the libc one and extends it, to see how well that works. I haven't had time to try that, so it would be premature to ask for changes to be made to glibc when I don't know if they are necessary or would even be the best solution.
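The effect of such overloads can be tried out without touching any system header. The sketch below puts three of them in a hypothetical namespace `sketch` (standing in for `std`, which user code must not extend this way) and checks at compile time that each argument type keeps its own precision. It is an illustration of the idea only, not the libstdc++ or glibc implementation.

```cpp
#include <math.h>
#include <type_traits>

namespace sketch {   // hypothetical stand-in for namespace std
    // Each overload forwards to the C function of matching precision.
    inline float       abs(float x)       { return ::fabsf(x); }
    inline double      abs(double x)      { return ::fabs(x); }
    inline long double abs(long double x) { return ::fabsl(x); }
}

// The result type follows the argument type, so a float argument is not
// silently widened to double and a long double is not narrowed.
static_assert(std::is_same<decltype(sketch::abs(1.0f)), float>::value,
              "float stays float");
static_assert(std::is_same<decltype(sketch::abs(1.0)), double>::value,
              "double stays double");
static_assert(std::is_same<decltype(sketch::abs(1.0L)), long double>::value,
              "long double stays long double");
```

Without such overloads, a call like `abs(-2.5f)` that only sees the C declarations can resolve to the integer `abs` and truncate its argument.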
gcc-4.9-20160106 is now available
Snapshot gcc-4.9-20160106 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160106/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 232115

You'll find:

  gcc-4.9-20160106.tar.bz2    Complete GCC

    MD5=8d55766e64963dd687907fd389070627
    SHA1=3e2ecdf0cec93b9c529b2d889233631e4fafe13f

Diffs from 4.9-20151230 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: GNU C library's libmvec and the GNU Compiler *Collection*.
On 01/06/2016 07:46 PM, Toon Moene wrote:
> [ This is relevant for our code, because just the switch to *actual* single precision exp/log/sin/cos implementations in glibc's libm resulted in a decrease of the running time of our weather forecasting code by 25 % (this was in glibc 2.16, IIRC). ]

The reference for this is (on an Ivy Bridge system):

https://gcc.gnu.org/ml/gcc-help/2013-01/msg00175.html

"I have made a home-build glibc-2.17 (on a core-avx system). It works great - linking against it (instead of using the current Debian-Testing's eglibc-2.13) brought the wall-clock time of my weather forecasting job down from 3:35 hours to 2:45 (mostly due to a more efficient implementation of powf, expf and logf)."

So, in minutes of compute time (3:35 hours = 215 minutes, 2:45 hours = 165 minutes), this is (215 - 165) / 215 = 0.23 (23 %).

However, that number included a part that ran for an hour (60 minutes) in double precision. Excluding that we get (155 - 105) / 155 = 0.32 (32 %) improvement in performance for the single precision part of our weather forecasting code.

Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
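The two percentages can be re-derived mechanically. The helper name `time_saved` and the constant names below are made up for this sketch. The second figure assumes, as the message does, that the double precision part took the same 60 minutes in both runs.

```cpp
// Fraction of wall-clock time saved when a job goes from `before_min`
// minutes to `after_min` minutes.
constexpr double time_saved(double before_min, double after_min) {
    return (before_min - after_min) / before_min;
}

// Whole job: 3:35 hours (215 min) down to 2:45 hours (165 min).
constexpr double whole_job = time_saved(215.0, 165.0);

// Single precision part only: subtract the 60 double-precision minutes
// from both runs before comparing.
constexpr double single_part = time_saved(215.0 - 60.0, 165.0 - 60.0);
```

The absolute saving is 50 minutes in both cases; only the baseline it is measured against changes, which is why the relative figure rises from about 23 % to about 32 %.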
Re: [RFC] MIPS ABI Extension for IEEE Std 754 Non-Compliant Interlinking
On Fri, 27 Nov 2015, Joseph Myers wrote:
> > I find it highly unlikely though that the writers will (be able to) chase
> > individual targets and any obscure hardware-dependent options the targets
> > may provide. And we cannot expect people compiling software to be
>
> What that says to me is that there should be an architecture-independent
> option (-fieee?) that, for architectures where the default configuration
> may have architecture-specific deviations from the normal defaults
> regarding conformance to IEEE 754 language bindings but there are options
> to disable those deviations, disables those deviations. For example, on
> alpha that would imply -mieee-with-inexact. On architectures without such
> issues (beyond bugs that should be fixed unconditionally, not conditional
> on a command-line option, or issues with the hardware ISA that are
> infeasible to fix in software), that option would do nothing (beyond any
> architecture-independent effects it might have such as implying
> -fno-fast-math).

I'm fine with `-fieee'/`-fno-ieee', and corresponding `--with-ieee=yes/no' configuration options.

I have been aware of the Alpha processor's imprecise IEEE 754 exception mode and while working on the specification concerned here I had in my mind the possibility of expanding the semantics of the MIPS target's `-mieee=' option to cover the somewhat similar imprecise IEEE 754 exception mode of the MIPS R8000 processor, or any optimisations of the same or a different kind the MIPS architecture might introduce in the future.

Since (regrettably) by now the Alpha architecture has become a legacy one I didn't consider it important enough to create a generic option which would only control the Alpha target, in addition to the MIPS target being considered here. So I am actually glad you proposed it as I agree it will make things cleaner.
> > As you may see in the GCC patches I have just posted the `-mieee=strict'
> > option I've implemented sets `-fno-fast-math', and `-mrelaxed-nan=none',
> > the only target-specific option so far. So this does exactly what I
> > outlined above.
>
> I am doubtful about the architecture-specific option setting
> architecture-independent options here. Having it the other way round as I
> suggested above would make more sense to me.

Agreed, as noted above.

> > "Any or all of these options may have effects beyond propagating the IEEE
> > Std 754 compliance mode down to the assembler and the linker. In
> > particular `-mieee=strict' is expected to produce strictly compliant code,
> > which in the context of this specification is defined as: following IEEE
> > Std 754 as closely as the programming language binding to the standard
> > (defined in the relevant language standard), the compiler implementation
> > and target hardware permit. This means the use of this option may affect
> > code produced in ways beyond NaN representation only."
> >
> > > > Does this answer address your concerns?
> > >
> > > No, the option concept as described seems too irremediably vague.
> >
> > Does this explanation give you a better idea of what I have in mind? Do
> > you still have concerns about the feasibility of the idea?
>
> It's better defined, but I think it would be better for -fieee to imply
> -mieee=strict -fno-fast-math (or whatever) rather than for -mieee=strict
> to imply architecture-independent options. Cf. i386 and sh where
> -ffinite-math-only affects architecture-specific options.

Thanks for the references. I'll have a look in the course of updating the implementation.
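As a small illustration of why a strict-IEEE option would want `-fno-fast-math' behaviour, consider the classic NaN self-comparison below. The function names are hypothetical. In GCC, `-ffast-math' enables `-ffinite-math-only', which allows the compiler to assume that no NaNs occur, so a test of this form is no longer guaranteed to detect them. The sketch is meant to be built without those options.

```cpp
#include <limits>

// Under IEEE 754 comparison semantics a NaN compares unequal to every
// value, including itself, so `x != x` is true exactly for NaNs.
bool is_nan_by_compare(double x) {
    return x != x;
}

// Convenience helper producing a quiet NaN for the check above.
double quiet_nan() {
    return std::numeric_limits<double>::quiet_NaN();
}
```

Code that relies on this identity is one example of what "following IEEE Std 754 as closely as the language binding permits" protects.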
With `-fieee' in the picture I think we can get rid of GCC's target-specific high-level `-mieee=' option, as having become redundant, and retain the low-level `-mrelaxed-nan=' only, with the assumption that only power users will need to control this setting directly and they will necessarily have studied and understood all the implications. Additional low-level options can be added in the future as needed to control the R8000 exception mode or other architectural features affecting IEEE 754 arithmetic, wired to `-fieee' as appropriate.

I'm going to retain the assembler and linker options and directives of the `*ieee*' form though as their purpose is a bit different -- to set flags in a binary file rather than affecting code generation -- and I don't think it makes sense to expand the namespace there. The effects of these options are cumulative rather than mutually exclusive and it's the names of individual bits or enumeration fields within a binary file's control structures, referred to in the option's or directive's argument, that tell features apart.

I'll be updating the specification and the proposed implementation shortly, and I also think the addition of `-fieee' will then better be done as a separate preparatory change, initially affecting the Alpha target only.

Please let me know if I missed anything, or if you have any other