Re: Issues when emitting sjlj dispatch table

2017-09-12 Thread Martin Liška
On 09/05/2017 01:18 PM, Richard Biener wrote:
> On Tue, Sep 5, 2017 at 12:20 PM, Claudiu Zissulescu wrote:
>> Hi guys,
>>
>> I found an ICE when emitting sjlj dispatch table for ARC. Namely, in 
>> sjlj_emit_dispatch_table() function, we create a dispatch table where the 
>> case elements have their high value set to NULL (except.c:1326). 
>> Later, these case statements are used by expand_sjlj_dispatch_table() 
>> (stmt.c:1006) where we create a case list requiring also the high element 
>> (stmt.c:1066). This leads to an error when we try to compute the high bounds 
>> in emit_case_dispatch_table() (stmt.c:786), due to the fact that high value 
>> is null.
>>
>> In gcc7.x, we were initializing the high value of case elements in 
>> sjlj_emit_dispatch_table() with the CASE_LOW. Shouldn't we do the same 
>> thing, or am I missing something?
> 
> A NULL CASE_HIGH means the case covers a single value, CASE_LOW.
> 
> Probably broken by Martin's reorg.

Yes, it's mine, and it's probably a dup of PR82154. I've got a tested patch that 
will land soon in @gcc-patches.

Martin

> 
> Richard.
> 
>> A test is attached, the error is visible for ARC backend, the option for the 
>> compiler should be -O2.
>>
>> Thanks,
>> Claudiu
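
The convention Richard describes can be sketched in plain C. This is only an illustration with a hypothetical `case_range` struct; GCC itself represents case labels as tree nodes, and the names `effective_high` and `range_width` are invented here, not taken from except.c or stmt.c. A consumer that needs an upper bound falls back to the low value when the high pointer is NULL:

```c
#include <stddef.h>

/* Hypothetical stand-in for a case label; GCC itself uses tree nodes. */
struct case_range {
    long low;          /* plays the role of CASE_LOW */
    const long *high;  /* plays the role of CASE_HIGH; NULL means "single value" */
};

/* Upper bound of the case, treating a NULL high as equal to low. */
long effective_high(const struct case_range *c)
{
    return c->high != NULL ? *c->high : c->low;
}

/* Number of values the case covers. */
long range_width(const struct case_range *c)
{
    return effective_high(c) - c->low + 1;
}
```

Code that dereferences the high bound unconditionally, as the reported ICE suggests, would need either this kind of fallback or a producer that always fills in the high value.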



Invalid free in standard library in trivial example with C++17 on gcc 7.2

2017-09-12 Thread Shane Matley
Hi,

Apologies if I am going about this the wrong way, I am new to the
mailing list. During our preliminary work to upgrade to gcc 7.2 (from
6.3) at my workplace, we have come across a bug that is blocking our
move to C++17. I have raised a bug report here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82172.

There is an invalid free in a string within basic_stringbuf when
inserting a character into an empty stringbuf when using the pre CXX11
ABI, LTO, O1 and C++17.

Could anyone offer some advice on diagnosing this further, or working
around this issue that doesn't involve moving to the CXX11 ABI?

Thanks in advance,

-- Shane


Byte swapping support

2017-09-12 Thread Jürg Billeter
Hi,

To support applications that assume big-endian memory layout on little-
endian systems, I'm considering adding support for reversing the
storage order to GCC. In contrast to the existing scalar storage order
support for structs, the goal is to reverse the storage order for all
memory operations to achieve maximum compatibility with the behavior on
big-endian systems, as far as observable by the application.

The plan is to insert byte swapping instructions as part of the RTL
expansion of GIMPLE assignments that access memory. This would leverage
code that was added for -fsso-struct, keeping the code simple and
maintainable. It should not be necessary to insert byte swapping
instructions for spilled registers.

Is this something that GCC upstream would be interested to accept?

To facilitate byte swapping at endian boundaries (kernel or libraries),
I'm also considering developing a new GCC builtin that can byte-swap
whole structs in memory. There are limitations to this, e.g., unions
could not be supported in general. However, I still expect this to be
very useful.

Any comments or suggestions?

Best regards,
Jürg


Re: Help out/New to the Project

2017-09-12 Thread Nathan Sidwell

On 09/09/2017 08:00 AM, Ramana Radhakrishnan wrote:


There are a few getting started guides in the wiki
(http://gcc.gnu.org/wiki - Look for tutorials) which can help you get
started in terms of reading up on the internals, there are a few Easy
hacks listed in the wiki page here https://gcc.gnu.org/wiki/EasyHacks


I've just added a link to EasyHacks from GettingStarted (it wasn't 
obvious to me how to find it).


Is there a different page for 'more involved changes'?  We could link to 
that from EasyHacks.


nathan
--
Nathan Sidwell


RFC: Improving GCC8 default option settings

2017-09-12 Thread Wilco Dijkstra
Hi all,

At the GNU Cauldron I was inspired by several interesting talks about improving
GCC in various ways. While GCC has many great optimizations, a common theme is
that its default settings are rather conservative. As a result users are 
required to enable several additional optimizations by hand to get good code.
Other compilers enable more optimizations at -O2 (loop unrolling in LLVM was
mentioned repeatedly) which GCC could/should do as well.

Here are a few concrete proposals to improve GCC's option settings which will
enable better code generation for most targets:

* Make -fno-math-errno the default - this mostly affects the code generated for
  sqrt, which should be treated just like floating point division and not set
  errno by default (unless you explicitly select C89 mode).

* Make -fno-trapping-math the default - another obvious one. From the docs:
  "Compile code assuming that floating-point operations cannot generate 
   user-visible traps."
  There isn't a lot of code that actually uses user-visible traps (if any -
  many CPUs don't even support user traps as it's an optional IEEE feature). 
  So assuming trapping math by default is way too conservative since there is
  no obvious benefit to users. 

* Make -fno-common the default - this was originally needed for pre-ANSI C, but
  is optional in C (not sure whether it is still in C99/C11). This can
  significantly improve code generation on targets that use anchors for globals
  (note the linker could report a more helpful message when ancient code that
  requires -fcommon fails to link).

* Make -fomit-frame-pointer the default - various targets already do this at
  higher optimization levels, but this could easily be done for all targets.
  Frame pointers haven't been needed for debugging for decades, however if there
  are still good reasons to keep it enabled with -O0 or -O1 (I can't think of any
  unless it is for last-resort backtrace when there is no unwind info at a crash),
  we could just disable the frame pointer from -O2 onwards.

These are just a few ideas to start. What do people think? I'd welcome discussion
and other proposals for similar improvements.

Wilco


Re: Byte swapping support

2017-09-12 Thread Paul.Koning

> On Sep 12, 2017, at 5:32 AM, Jürg Billeter wrote:
> 
> Hi,
> 
> To support applications that assume big-endian memory layout on little-
> endian systems, I'm considering adding support for reversing the
> storage order to GCC. In contrast to the existing scalar storage order
> support for structs, the goal is to reverse the storage order for all
> memory operations to achieve maximum compatibility with the behavior on
> big-endian systems, as far as observable by the application.

I've done this in the past by C++ type magic.  As a general setting it doesn't 
make sense that I can see.  As an attribute applied to a particular data item, 
it does.  But I'm not sure why you'd put this in the compiler when programmers 
can do it easily enough by defining a "big endian int32" class, etc.

paul



Re: Byte swapping support

2017-09-12 Thread David Brown
On 12/09/17 16:15, paul.kon...@dell.com wrote:
> 
>> On Sep 12, 2017, at 5:32 AM, Jürg Billeter wrote:
>>
>> Hi,
>>
>> To support applications that assume big-endian memory layout on little-
>> endian systems, I'm considering adding support for reversing the
>> storage order to GCC. In contrast to the existing scalar storage order
>> support for structs, the goal is to reverse the storage order for all
>> memory operations to achieve maximum compatibility with the behavior on
>> big-endian systems, as far as observable by the application.
> 
> I've done this in the past by C++ type magic. As a general setting
> it doesn't make sense that I can see. As an attribute applied to a
> particular data item, it does. But I'm not sure why you'd put this in
> the compiler when programmers can do it easily enough by defining a "big
> endian int32" class, etc.
> 

Some people use the compiler for C rather than C++ ...

If someone wants to improve on the endianness support in gcc, I can
think of a few ideas that /I/ think might be useful.  I have no idea how
difficult they might be to put in practice, and can't say if they would
be of interest to others.

First, I would like to see endianness given as a named address space,
rather than as a type attribute.  A key point here is that named address
spaces are effectively qualifiers, like "const" and "volatile" - and you
can then use them in pointers:

big_endian uint32_t be_buffer[20];
little_endian uint32_t le_buffer[20];

void copy_buffers(const big_endian uint32_t * src,
                  little_endian uint32_t * dest)
{
    for (int i = 0; i < 20; i++) {
        dest[i] = src[i];   // Swaps endianness on copy
    }
}

That would also let you use them for scalar types, not just structs, and
you could use typedefs:

typedef big_endian uint32_t be_uint32_t;
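
For comparison, the same copy can be written today in portable C without any qualifier support. This is a minimal sketch, not the proposed feature: the helper names `load_be32`, `store_le32`, and `copy_be_to_le` are made up for illustration, and the buffers are treated as raw bytes so the code behaves the same on any host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit value stored most-significant byte first. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value least-significant byte first. */
void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)(v >> 24);
}

/* Copy `count` 32-bit words from a big-endian buffer to a little-endian one. */
void copy_be_to_le(const unsigned char *src, unsigned char *dest, size_t count)
{
    for (size_t i = 0; i < count; i++)
        store_le32(dest + 4 * i, load_be32(src + 4 * i));
}
```

The qualifier-based version would let the compiler insert the equivalent of these helpers at each access, which is the convenience being asked for.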


Secondly, I would add more endian types.  As well as big_endian and
little_endian, I would add native_endian and reverse_endian.  These
could let you write slightly clearer definitions sometimes.  And ideally
I would like mixed endian with big-endian 16-bit ordering and
little-endian ordering for bigger types (i.e., 0x87654321 would be
stored 0x43, 0x21, 0x87, 0x65).  That order matches some protocols, such
as Modbus.
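
The mixed ordering described above can be spelled out in portable C. This is a sketch of that byte layout only; the function names `store_mixed32` and `load_mixed32` are hypothetical, and nothing here is taken from a Modbus library.

```c
#include <stdint.h>

/* Store v with the low 16-bit word first, each word most-significant byte first. */
void store_mixed32(unsigned char *p, uint32_t v)
{
    uint16_t low_word  = (uint16_t)(v & 0xffff);
    uint16_t high_word = (uint16_t)(v >> 16);

    p[0] = (unsigned char)(low_word >> 8);
    p[1] = (unsigned char)(low_word & 0xff);
    p[2] = (unsigned char)(high_word >> 8);
    p[3] = (unsigned char)(high_word & 0xff);
}

/* Inverse of store_mixed32. */
uint32_t load_mixed32(const unsigned char *p)
{
    uint32_t low_word  = ((uint32_t)p[0] << 8) | p[1];
    uint32_t high_word = ((uint32_t)p[2] << 8) | p[3];

    return (high_word << 16) | low_word;
}
```

Storing 0x87654321 with `store_mixed32` yields the bytes 0x43, 0x21, 0x87, 0x65, matching the layout given in the text.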

Third, I'd like to be able to attach the attribute to particular
variables and to scalars, not just struct and union types.

Fourth, I would like type-punning through unions with different ordering
to be allowed.  I'd like to be able to define:

union U {
    __attribute__((scalar_storage_order("big-endian"))) uint16_t
        protocol[32];
    __attribute__((scalar_storage_order("little-endian"))) struct {
        uint32_t id;
        uint16_t command;
        uint16_t param1;
        ...
    }
}

and then I could access data in little ordering in the structures, then
in 16-bit big-endian lumps via the "protocol" array.


David





Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Theodore Papadopoulo
Another one that might be interesting is -funsafe-loop-optimizations.
In most cases people write loops assuming simple finite loops (no
overflow). Crippling optimization for the small amount of people (system
programmers ?) that use such strange loops seems counterproductive. It
would be best if such loops can be marked with an attribute in some way
and that the general case just assumes that all loops are finite...
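
As an illustration of the kind of "strange loop" meant here, consider an inclusive upper bound on an unsigned counter. This is a sketch with a hypothetical function name; the explicit guard is added only so the example itself always terminates.

```c
#include <limits.h>

/* Counts the iterations of `for (i = 0; i <= n; i++)`.
   Without the guard below, the loop would never end for n == UINT_MAX,
   because i wraps around to 0 instead of ever exceeding n. */
unsigned long long count_inclusive(unsigned n)
{
    unsigned long long iterations = 0;

    for (unsigned i = 0; i <= n; i++) {
        iterations++;
        if (i == UINT_MAX)
            break;  /* guard against the wrap-around case */
    }
    return iterations;
}
```

A compiler that assumes the loop is finite can compute the trip count as n + 1 up front; a compiler that must honour the wrap-around case cannot.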




Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Joseph Myers
On Tue, 12 Sep 2017, Wilco Dijkstra wrote:

> * Make -fno-trapping-math the default - another obvious one. From the docs:
>   "Compile code assuming that floating-point operations cannot generate 
>    user-visible traps."
>   There isn't a lot of code that actually uses user-visible traps (if any -
>   many CPUs don't even support user traps as it's an optional IEEE feature). 
>   So assuming trapping math by default is way too conservative since there is
>   no obvious benefit to users. 

"traps" here means "raising IEEE exception flags" not just "invoking trap 
handlers".  That is, -ftrapping-math disables a range of local 
transformations that would change the set of flags raised by an operation.  
(Transformations that change the nonzero number of times a flag is raised 
to a different nonzero number are always OK; that is, the possibility of a 
trap handler counting how many times it is invoked is never considered.  
Transformations that might move flag raising across function calls or asms 
that might inspect or modify the flags should not be OK, at least with a 
stricter version of -ftrapping-math that might be another option, but we 
don't have that stricter version at present; -ftrapping-math generally 
does not disable code movement, or removal of code that is dead apart from 
its effect on exception flags.)

That is, lack of trap support on processors that only support exception 
flags is not relevant to -ftrapping-math, beyond any question of whether 
-ftrapping-math should disable transformations that only affect whether an 
exact underflow exception occurs (the case where default exception 
handling does not raise the flag), if we have any such transformations 
(constant folding on exact underflow?).

It's true that a stricter version of -ftrapping-math that inhibits code 
movement and removal would probably inhibit *more* optimizations than 
-frounding-math (which is off by default), as -frounding-math only makes 
floating-point operations read thread-local state but -ftrapping-math 
makes them write it as well.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Andrew Pinski
On Tue, Sep 12, 2017 at 8:29 AM, Theodore Papadopoulo wrote:
> Another one that might be interesting is -funsafe-loop-optimizations.
> In most cases people write loops assuming simple finite loops (no
> overflow). Crippling optimization for the small amount of people (system
> programmers ?) that use such strange loops seems counterproductive. It
> would be best if such loops can be marked with an attribute in some way
> and that the general case just assumes that all loops are finite...

-funsafe-loop-optimizations is a nop in GCC 7 and above.
Since https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00956.html .

Thanks,
Andrew


Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Theodore Papadopoulo
On 09/12/2017 05:32 PM, Andrew Pinski wrote:
> On Tue, Sep 12, 2017 at 8:29 AM, Theodore Papadopoulo wrote:
>> Another one that might be interesting is -funsafe-loop-optimizations.
>> In most cases people write loops assuming simple finite loops (no
>> overflow). Crippling optimization for the small amount of people (system
>> programmers ?) that use such strange loops seems counterproductive. It
>> would be best if such loops can be marked with an attribute in some way
>> and that the general case just assumes that all loops are finite...
> 
> -funsafe-loop-optimizations is a nop in GCC 7 and above.
> Since https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00956.html .
> 
> Thanks,
> Andrew
> 

Thanks for the notice. For some reason, I missed that piece of
information... Too bad that making such an assumption generates bogus
code in some common cases.

Theo.




Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Joseph Myers
On Tue, 12 Sep 2017, Wilco Dijkstra wrote:

> * Make -fno-math-errno the default - this mostly affects the code generated for
>   sqrt, which should be treated just like floating point division and not set
>   errno by default (unless you explicitly select C89 mode).
> 
> * Make -fno-trapping-math the default - another obvious one. From the docs:

Note these would both have implications for library math_errhandling 
settings (since the compiler options can affect built-in functions).  In 
the absence of -ffast-math glibc defines it to (MATH_ERRNO | 
MATH_ERREXCEPT).  __NO_MATH_ERRNO__ exists since GCC 5 to allow it to be 
defined to just MATH_ERREXCEPT in the -fno-math-errno case, but the header 
needs updating to respect that.  And we don't have a macro for 
-fno-trapping-math to say whether MATH_ERREXCEPT should be part of the 
value.

My assumption is that with changed defaults glibc would need to compile 
with -ftrapping-math just as it uses -frounding-math; code may expect 
transformations that add exceptions not to occur.  It probably does not 
require -fmath-errno (glibc functions do not generally rely on other 
functions setting errno).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Byte swapping support

2017-09-12 Thread H.J. Lu
On Tue, Sep 12, 2017 at 2:32 AM, Jürg Billeter wrote:
> Hi,
>
> To support applications that assume big-endian memory layout on little-
> endian systems, I'm considering adding support for reversing the
> storage order to GCC. In contrast to the existing scalar storage order
> support for structs, the goal is to reverse the storage order for all
> memory operations to achieve maximum compatibility with the behavior on
> big-endian systems, as far as observable by the application.
>
> The plan is to insert byte swapping instructions as part of the RTL
> expansion of GIMPLE assignments that access memory. This would leverage
> code that was added for -fsso-struct, keeping the code simple and
> maintainable. It should not be necessary to insert byte swapping
> instructions for spilled registers.
>
> Is this something that GCC upstream would be interested to accept?
>
> To facilitate byte swapping at endian boundaries (kernel or libraries),
> I'm also considering developing a new GCC builtin that can byte-swap
> whole structs in memory. There are limitations to this, e.g., unions
> could not be supported in general. However, I still expect this to be
> very useful.
>
> Any comments or suggestions?
>
> Best regards,
> Jürg

Can you use __attribute__ ((scalar_storage_order)) in GCC 7?
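
For readers unfamiliar with the attribute, a minimal sketch of its use on a struct follows. It assumes a GCC version that supports scalar_storage_order (other compilers may ignore or reject it), and the struct and function names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* GCC-specific: scalar fields of this struct are stored big-endian
   regardless of the target's native byte order. */
struct __attribute__((scalar_storage_order("big-endian"))) be_header {
    uint32_t id;
    uint16_t command;
    uint16_t param1;
};

/* Fill a header with ordinary assignments and copy out its in-memory bytes. */
void header_bytes(uint32_t id, uint16_t command, uint16_t param1,
                  unsigned char out[8])
{
    struct be_header h;

    h.id = id;            /* the compiler inserts any byte swap needed */
    h.command = command;
    h.param1 = param1;
    memcpy(out, &h, sizeof h);
}
```

Note that this covers only scalars inside the annotated struct, which is the limitation the original proposal wants to lift.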

-- 
H.J.


Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Alexander Monakov
On Tue, 12 Sep 2017, Wilco Dijkstra wrote:
> * Make -fno-math-errno the default - this mostly affects the code generated for
>   sqrt, which should be treated just like floating point division and not set
>   errno by default (unless you explicitly select C89 mode).

(note that this can be selectively enabled by targets where libm never sets
errno in the first place, docs call out Darwin as one such target, but musl-libc
targets have this property too)

> * Make -fno-trapping-math the default - another obvious one. From the docs:
>   "Compile code assuming that floating-point operations cannot generate 
>    user-visible traps."
>   There isn't a lot of code that actually uses user-visible traps (if any -
>   many CPUs don't even support user traps as it's an optional IEEE feature). 
>   So assuming trapping math by default is way too conservative since there is
>   no obvious benefit to users. 

OTOH -O options are understood to _never_ sacrifice standards compliance, with
the exception of -Ofast.  I believe that's an important property to keep.

Maybe it's possible to treat -fno-trapping-math similar to -ffp-contract=fast,
i.e. implicitly enable it in the default C-with-GNU-extensions mode, keeping
strict-compliance mode (-std=c11 as opposed to gnu11) untouched?

In any case it shouldn't be hard to issue a warning if fenv.h functions are
used when -fno-trapping-math/-fno-rounding-math is enabled.

If the above doesn't fly, I believe adopting and promoting a single option for
non-value-changing math optimizations (-fno-math-errno -fno-trapping-math, plus
-fno-rounding-math -fno-signaling-nans when they're no longer default) would
be nice.

> * Make -fno-common the default - this was originally needed for pre-ANSI C, but
>   is optional in C (not sure whether it is still in C99/C11). This can
>   significantly improve code generation on targets that use anchors for globals
>   (note the linker could report a more helpful message when ancient code that
>   requires -fcommon fails to link).

I think in ISO C situations where -fcommon allows link to succeed fall under
undefined behavior, which in GNU toolchain is defined to match the historical
behavior.

I assume the main issue with this is the amount of legacy code that would cause
a link failure if -fno-common is made default - thus, is there anybody in
position to trigger a full-distro rebuild with gcc patched to enable
-fno-common, and compare before/after build failure stats?

Thanks.
Alexander


Re: Byte swapping support

2017-09-12 Thread Michael Meissner
On Tue, Sep 12, 2017 at 05:26:29PM +0200, David Brown wrote:
> On 12/09/17 16:15, paul.kon...@dell.com wrote:
> > 
> >> On Sep 12, 2017, at 5:32 AM, Jürg Billeter wrote:
> >>
> >> Hi,
> >>
> >> To support applications that assume big-endian memory layout on little-
> >> endian systems, I'm considering adding support for reversing the
> >> storage order to GCC. In contrast to the existing scalar storage order
> >> support for structs, the goal is to reverse the storage order for all
> >> memory operations to achieve maximum compatibility with the behavior on
> >> big-endian systems, as far as observable by the application.
> > 
> > I've done this in the past by C++ type magic. As a general setting
> > it doesn't make sense that I can see. As an attribute applied to a
> > particular data item, it does. But I'm not sure why you'd put this in
> > the compiler when programmers can do it easily enough by defining a "big
> > endian int32" class, etc.
> > 
> 
> Some people use the compiler for C rather than C++ ...
> 
> If someone wants to improve on the endianness support in gcc, I can
> think of a few ideas that /I/ think might be useful.  I have no idea how
> difficult they might be to put in practice, and can't say if they would
> be of interest to others.
> 
> First, I would like to see endianness given as a named address space,
> rather than as a type attribute.  A key point here is that named address
> spaces are effectively qualifiers, like "const" and "volatile" - and you
> can then use them in pointers:

When I gave the talk at the 2009 GCC summit on the named address support, I
thought that it could be used to add endianness support.  In fact at one time, I
had a trial PowerPC compiler that added endianness support.  Unfortunately, that
was in 2009, and I lost the directory of the work.  I tried again a few years
ago, but I didn't get far enough into it to get a working compiler before being
pulled back into work.

Back when I worked at Cygnus Solutions, we used to get requests to add endian
support every so often, but nobody wanted to pay the cost that we were then
quoting to add the support.  Now that named address support is in, it could be
done better.

I suspect however, you want to do this at the higher tree level, adding in the
endianness bits in a separate area from the named address support.  Or perhaps,
growing the named address support, and adding several standard named addresses.

The paper where I talked about the named address support was from the 2009 GCC
summit.  You can download the proceedings of the 2009 summit from here (my
paper is pages 67-74):
https://en.wikipedia.org/wiki/GCC_Summit

> big_endian uint32_t be_buffer[20];
> little_endian uint32_t le_buffer[20];
> 
> void copy_buffers(const big_endian uint32_t * src,
>   little_endian uint32_t * dest)
> {
>   for (int i = 0; i < 20; i++) {
>   dest[i] = src[i];   // Swaps endianness on copy
>   }
> }
> 
> That would also let you use them for scalar types, not just structs, and
> you could use typedefs:
> 
>   typedef big_endian uint32_t be_uint32_t;
> 
> 
> Secondly, I would add more endian types.  As well as big_endian and
> little_endian, I would add native_endian and reverse_endian.  These
> could let you write a little clearer definitions sometimes.  And ideally
> I would like mixed endian with big-endian 16-bit ordering and
> little-endian ordering for bigger types (i.e., 0x87654321 would be
> stored 0x43, 0x21, 0x87, 0x65).  That order matches some protocols, such
> as Modbus.

It depends, you can add so many different combinations, that in the end you
don't add the support you want because of the 53 other variants.

Note, if you use it in named addresses, you are currently limited to 15 new
keywords for adding named address support.  This can be grown, but you should
know about the limit ahead of time.  Of course if you add it in a parallel,
machine independent version, then you don't have to worry about the existing
limits.

As I write this, another usage for named addresses occurs to me -- and that is
restricting addressing for memory mapped I/O regions that don't allow certain
types of accesses.

> Third, I'd like to be able to attach the attribute to particular
> variables and to scalars, not just struct and union types.
> 
> Fourth, I would like type-punning through unions with different ordering
> to be allowed.  I'd like to be able to define:
> 
> union U {
> __attribute__((scalar_storage_order("big-endian"))) uint16_t
>   protocol[32];
> __attribute__((scalar_storage_order("little-endian"))) struct {
>   uint32_t id;
>   uint16_t command;
>   uint16_t param1;
>   ...
> }
> }

This definitely requires support at the higher levels of the compiler.

> and then I could access data in little ordering in the structures, then
> in 16-bit big-endian lumps via the "protocol" array.

One of the things you have to do is be prepared to do a 

Successful bootstrap and install of gcc (GCC) 7.2.0 on hppa2.0-unknown-linux-gnu

2017-09-12 Thread Aaro Koskinen
Hi,

Here's a report of a successful build and install of GCC:

$ gcc-7.2.0/config.guess
hppa2.0-unknown-linux-gnu

$ newcompiler/bin/gcc -v
Using built-in specs.
COLLECT_GCC=newcompiler/bin/gcc
COLLECT_LTO_WRAPPER=/home/aaro/gcctest/newcompiler/libexec/gcc/hppa-unknown-linux-gnu/7.2.0/lto-wrapper
Target: hppa-unknown-linux-gnu
Configured with: ../gcc-7.2.0/configure --disable-nls 
--prefix=/home/aaro/gcctest/newcompiler --enable-languages=c,c++ 
--host=hppa-unknown-linux-gnu --build=hppa-unknown-linux-gnu 
--target=hppa-unknown-linux-gnu --with-system-zlib --with-sysroot=/
Thread model: posix
gcc version 7.2.0 (GCC) 

-- Build environment --

host: hp-c3700
distro:   los.git rootfs=dc818 native=dc818
kernel:   Linux 4.13.0-los_dc818
binutils: GNU binutils 2.29
make: GNU Make 4.2.1
libc: GNU C Library (GNU libc) stable release version 2.26
zlib: 1.2.11
mpfr: 3.1.3
gmp:  60102

-- Time consumed --

configure:  real    0m 25.55s
            user    0m 12.21s
            sys     0m 11.55s

bootstrap:  real    22h 30m 18s
            user    21h 32m 25s
            sys     54m 12.95s

install:    real    1m 19.96s
            user    0m 22.57s
            sys     0m 54.13s

-- Hardware details ---

MemTotal: 3108288 kB

processor   : 0
cpu family  : PA-RISC 2.0
cpu : PA8700 (PCX-W2)
cpu MHz : 750.00
capabilities: os32 os64 iopdir_fdc nva_supported (0x07)
model   : 9000/785/C3700
model name  : Allegro W2
hversion: 0x5dc0
sversion: 0x0481
I-cache : 768 KB
D-cache : 1536 KB (WB, direct mapped)
ITLB entries: 240
DTLB entries: 240 - shared with ITLB
bogomips: 1495.85
software id : 2004755634

A.


Re: Power 8 in-core crypto not working as expected

2017-09-12 Thread Segher Boessenkool
On Thu, Sep 07, 2017 at 10:35:18AM -0400, Jeffrey Walton wrote:
> We are using the key and subkey schedule from FIPS 197, Appendix A. We
> are using it because the key schedule is fully specified.
> 
> We lack the known answers for a single round using a subkey like one
> specified in FIPS 197. IBM does not appear to provide them.

197 appendices B and C are full of such examples though.  First see if
you can get a single round (one vcipher call) working; you have to get
byte ordering correct, etc.

> I don't have access to Power ISA 3.0B. It seems to be hidden behind a
> paywall. https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0.

It's not a paywall, you just have to register.  Yes, not ideal.


Segher


Re: Byte swapping support

2017-09-12 Thread Eric Botcazou
> To support applications that assume big-endian memory layout on little-
> endian systems, I'm considering adding support for reversing the
> storage order to GCC.

That was also the goal of the scalar_storage_order attribute.

> In contrast to the existing scalar storage order support for structs, the
> goal is to reverse the storage order for all memory operations to achieve
> maximum compatibility with the behavior on big-endian systems, as far as
> observable by the application.

I presume that you're well aware of this, but you cannot just reverse the 
storage order for any memory operation; for example, an array of 4 chars in C 
is stored the same way in big-endian and little-endian order, so you ought not 
to do byte swapping when you access it as a whole.  So the above sentence must 
be read as "to reverse the storage order for all scalar memory operations".
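
The difference between a scalar and a char array can be shown with a small sketch. The helper names are hypothetical; the code simply produces the byte image each object would have under a given storage order.

```c
#include <stdint.h>
#include <string.h>

/* Byte image of a 32-bit scalar; big_endian != 0 selects MSB-first order. */
void scalar_image(uint32_t v, int big_endian, unsigned char out[4])
{
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 24 - 8 * i : 8 * i;
        out[i] = (unsigned char)(v >> shift);
    }
}

/* Byte image of a 4-char array: element order is fixed by the language,
   so the storage order has no effect on it. */
void char_array_image(const char tag[4], int big_endian, unsigned char out[4])
{
    (void)big_endian;
    memcpy(out, tag, 4);
}
```

Blindly swapping every 4-byte access would reverse the char array as well, which would not match what a real big-endian system does.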

When the scalar_storage_order attribute was designed, discussions led to the 
conclusion that doing the swapping for any scalar memory operation, as opposed 
to any access to a scalar within a structure, would not be a significant step 
forward to warrant the significantly more complex implementation (or the big 
performance penalty if you do things very roughly).

> The plan is to insert byte swapping instructions as part of the RTL
> expansion of GIMPLE assignments that access memory. This would leverage
> code that was added for -fsso-struct, keeping the code simple and
> maintainable.

How do you discriminate scalars stored in native order and scalars stored in 
reverse order though?  That's the main difficulty of the implementation.

-- 
Eric Botcazou


Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Joseph Myers
On Tue, 12 Sep 2017, Alexander Monakov wrote:

> > * Make -fno-trapping-math the default - another obvious one. From the docs:
> >   "Compile code assuming that floating-point operations cannot generate 
> >    user-visible traps."
> >   There isn't a lot of code that actually uses user-visible traps (if any -
> >   many CPUs don't even support user traps as it's an optional IEEE feature). 
> >   So assuming trapping math by default is way too conservative since there is
> >   no obvious benefit to users. 
> 
> OTOH -O options are understood to _never_ sacrifice standards compliance, with
> the exception of -Ofast.  I believe that's an important property to keep.

ISO C allows the FENV_ACCESS pragma to be OFF by default (we don't support 
the standard pragmas, but FENV_ACCESS ON is equivalent to a stricter 
version of -frounding-math -ftrapping-math).  Thus, this is not a 
standards compliance issue (unlike various parts of -ffast-math that break 
IEEE 754 semantics even with FENV_ACCESS OFF).  And since 
-fno-rounding-math is the default, the default is already a form of 
FENV_ACCESS OFF.

I don't think any -O implication of -fno-trapping-math was proposed; the 
proposal was about the default (independent of -O options).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Michael Clark

> On 13 Sep 2017, at 1:57 AM, Wilco Dijkstra wrote:
> 
> Hi all,
> 
> At the GNU Cauldron I was inspired by several interesting talks about improving
> GCC in various ways. While GCC has many great optimizations, a common theme is
> that its default settings are rather conservative. As a result users are 
> required to enable several additional optimizations by hand to get good code.
> Other compilers enable more optimizations at -O2 (loop unrolling in LLVM was
> mentioned repeatedly) which GCC could/should do as well.

There are some nuances to -O2. Please consider -O2 users who wish to use it like 
Clang/LLVM’s -Os (-O2 without loop vectorisation IIRC).

Clang/LLVM has an -Os that is like -O2, so optimisations that increase 
code size can be left out of -Os without drastically affecting performance.

This is not the case with GCC, where -Os is a size-at-all-costs optimisation 
mode. GCC users' option for size not at the expense of speed is to use -O2.

Clang   GCC
-Oz ~=  -Os
-Os ~=  -O2

So if adding optimisations to -O2 that increase code size, please consider 
adding an -O2s that maintains the compact code size of -O2 (-O2 generates 
pretty compact code, as many performance optimisations tend to reduce code size), 
or otherwise add optimisations that increase code size to -O3. Adding loop 
unrolling only makes sense in the Clang/LLVM context where they have a compact 
code model with good performance, i.e. -Os. In GCC this is -O2.

So if you want to enable more optimisations at -O2, please copy -O2 
optimisations to -O2s or rename -Os to -Oz and copy -O2 optimisation defaults 
to a new -Os.

The present reality is that any project that wishes to optimize for size at all 
costs will need to run a configure test for -Oz, and then fall back to -Os, 
given the current disparity between Clang/LLVM and GCC flags here.

> Here are a few concrete proposals to improve GCC's option settings which will
> enable better code generation for most targets:
> 
> * Make -fno-math-errno the default - this mostly affects the code generated for
>  sqrt, which should be treated just like floating point division and not set
>  errno by default (unless you explicitly select C89 mode).
> 
> * Make -fno-trapping-math the default - another obvious one. From the docs:
>  "Compile code assuming that floating-point operations cannot generate 
>   user-visible traps."
>  There isn't a lot of code that actually uses user-visible traps (if any -
>  many CPUs don't even support user traps as it's an optional IEEE feature). 
>  So assuming trapping math by default is way too conservative since there is
>  no obvious benefit to users. 
> 
> * Make -fno-common the default - this was originally needed for pre-ANSI C, but
>  is optional in C (not sure whether it is still in C99/C11). This can
>  significantly improve code generation on targets that use anchors for globals
>  (note the linker could report a more helpful message when ancient code that
>  requires -fcommon fails to link).
> 
> * Make -fomit-frame-pointer the default - various targets already do this at
>  higher optimization levels, but this could easily be done for all targets.
>  Frame pointers haven't been needed for debugging for decades, however if there
>  are still good reasons to keep it enabled with -O0 or -O1 (I can't think of any
>  unless it is for last-resort backtrace when there is no unwind info at a crash),
>  we could just disable the frame pointer from -O2 onwards.
> 
> These are just a few ideas to start. What do people think? I'd welcome discussion
> and other proposals for similar improvements.
> 
> Wilco



gcc-5-20170912 is now available

2017-09-12 Thread gccadmin
Snapshot gcc-5-20170912 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/5-20170912/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-5-branch 
revision 252045

You'll find:

 gcc-5-20170912.tar.xz           Complete GCC

  SHA256=4e0afc2d86fa9bb3b25b64fd9a47ae045a51382fe8baffabcf03df6fddf6ab28
  SHA1=e33aa0044e10d45a84a6b88e3b0922df64245858

Diffs from 5-20170905 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Invalid free in standard library in trivial example with C++17 on gcc 7.2

2017-09-12 Thread Dave Gittins
I confirmed this issue on x86_64 CentOS, and independently here:
https://wandbox.org/permlink/ncWqA9Zu3YEofqri

Also fails on gcc trunk.

Possibly related to bug 81338 "stringstream remains empty after being
moved into multiple times"? Although I see that one is fixed by Mr
Wakely.

Dave



On Tue, Sep 12, 2017 at 5:59 PM, Shane Matley  wrote:
> Hi,
>
> Apologies if I am coming about this in the wrong way, I am new to the
> mailing list. During our preliminary work to upgrade to gcc 7.2 (from
> 6.3) at my workplace, we have come across a bug that is blocking our
> move to C++17. I have raised a bug report here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82172.
>
> There is an invalid free in a string within basic_stringbuf when
> inserting a character into an empty stringbuf when using the pre CXX11
> ABI, LTO, O1 and C++17.
>
> Could anyone offer some advice on diagnosing this further, or working
> around this issue that doesn't involve moving to the CXX11 ABI?
>
> Thanks in advance,
>
> -- Shane


Re: Help out/New to the Project

2017-09-12 Thread Segher Boessenkool
On Tue, Sep 12, 2017 at 08:10:13AM -0400, Nathan Sidwell wrote:
> On 09/09/2017 08:00 AM, Ramana Radhakrishnan wrote:
> 
> >There are a few getting started guides in the wiki
> >(http://gcc.gnu.org/wiki - Look for tutorials) which can help you get
> >started in terms of reading up on the internals, there are a few Easy
> >hacks listed in the wiki page here https://gcc.gnu.org/wiki/EasyHacks
> 
> I've just added a link to EasyHacks from GettingStarted (it wasn't 
> obvious to me how to find it).
> 
> Is there a different page for 'more involved changes'?  We could link to 
> that from EasyHacks.

There is always https://gcc.gnu.org/wiki/CC0Transition ;-)

Also there is https://gcc.gnu.org/wiki/ImprovementProjects , which links
to many more pages.


Segher


Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Segher Boessenkool
On Wed, Sep 13, 2017 at 09:27:22AM +1200, Michael Clark wrote:
> > Other compilers enable more optimizations at -O2 (loop unrolling in LLVM was
> > mentioned repeatedly) which GCC could/should do as well.
> 
> There are some nuances to -O2. Please consider -O2 users who wish to use it like 
> Clang/LLVM’s -Os (-O2 without loop vectorisation IIRC).
> 
> Clang/LLVM has an -Os that is like -O2 so adding optimisations that increase 
> code size can be skipped from -Os without drastically affecting performance.
> 
> This is not the case with GCC where -Os is a size at all costs optimisation 
> mode. GCC users' option for size not at the expense of speed is to use -O2.

"Size not at the expense of speed" exists in neither compiler.  Just the
tradeoffs are different between GCC and LLVM.  It would be a silly
optimisation target -- it's exactly the same as just "speed"!  Unless
"speed" means "let's make it faster, and bigger just because" ;-)

GCC's -Os is not "size at all costs" either; there are many options (mostly
--params) that can decrease code size significantly.  To tune code size
down for your particular program you have to play with options a bit.  This
shouldn't be news to anyone.

'-Os'
 Optimize for size.  '-Os' enables all '-O2' optimizations that do
 not typically increase code size.  It also performs further
 optimizations designed to reduce code size.

> So if adding optimisations to -O2 that increase code size, please consider 
> adding an -O2s that maintains the compact code size of -O2. -O2 generates 
> pretty compact code as many performance optimisations tend to reduce code 
> size, or otherwise add optimisations that increase code size to -O3. Adding 
> loop unrolling makes sense in the Clang/LLVM context where they have a 
> compact code model with good performance i.e. -Os. In GCC this is -O2.
> 
> So if you want to enable more optimisations at -O2, please copy -O2 
> optimisations to -O2s or rename -Os to -Oz and copy -O2 optimisation defaults 
> to a new -Os.

'-O2'
 Optimize even more.  GCC performs nearly all supported
 optimizations that do not involve a space-speed tradeoff.  As
 compared to '-O', this option increases both compilation time and
 the performance of the generated code.

> The present reality is that any project that wishes to optimize for size at 
> all costs will need to run a configure test for -Oz, and then fall back to 
> -Os, given the current disparity between Clang/LLVM and GCC flags here.

The present reality is that any project that wishes to support both GCC and
LLVM needs to do configure tests, because LLVM chose to do many things
differently (sometimes unavoidably).  If GCC would change some options
to be more like LLVM, all users only ever using GCC would be affected,
while all other incompatibilities would remain.  Not a good tradeoff at
all.


Segher


Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Michael Clark

> On 13 Sep 2017, at 12:47 PM, Segher Boessenkool wrote:
> 
> On Wed, Sep 13, 2017 at 09:27:22AM +1200, Michael Clark wrote:
>>> Other compilers enable more optimizations at -O2 (loop unrolling in LLVM was
>>> mentioned repeatedly) which GCC could/should do as well.
>> 
>> There are some nuances to -O2. Please consider -O2 users who wish to use it 
>> like Clang/LLVM’s -Os (-O2 without loop vectorisation IIRC).
>> 
>> Clang/LLVM has an -Os that is like -O2 so adding optimisations that increase 
>> code size can be skipped from -Os without drastically affecting performance.
>> 
>> This is not the case with GCC where -Os is a size at all costs optimisation 
>> mode. GCC users' option for size not at the expense of speed is to use -O2.
> 
> "Size not at the expense of speed" exists in neither compiler.  Just the
> tradeoffs are different between GCC and LLVM.  It would be a silly
> optimisation target -- it's exactly the same as just "speed"!  Unless
> "speed" means "let's make it faster, and bigger just because" ;-)

I would like to be able to quantify stats on a well-known benchmark suite, say 
SPECint 2006 or SPECint 2017, but in my own small benchmark suite I saw only a 
small difference in size between -O2 and -Os, yet a significant drop 
in performance with -Os compared to -O2.

- https://rv8.io/bench#optimisation
- https://rv8.io/bench#executable-file-sizes

-O2 is 98% perf of -O3 on x86-64
-Os is 81% perf of -O3 on x86-64

-O2 saves 5% space on -O3 on x86-64
-Os saves 8% space on -Os on x86-64

17% drop in performance for 3% saving in space is not a good trade for a 
“general” size optimisation. It’s more like executable compression.

-O2 seems to be a sweet spot for size versus speed.

I could only recommend GCC’s -Os if the user is trying to squeeze something 
down to fit the last few bytes of a ROM and -Oz seems like a more appropriate 
name.

-O2 is the current sweet spot in GCC and is likely closest in semantics to 
LLVM/Clang -Os, and I’d like -O2 binaries to stay lean.

I don’t think -O2 should slow down, nor should the binaries get larger. Turning up 
knobs that affect code size should be reserved for -O3 until GCC makes a 
distinction between -O2/-O2s and -Os/-Oz.

On RISC-V I believe we could shrink binaries at -O2 further with no sacrifice 
in performance, perhaps with a performance improvement by reducing icache 
bandwidth…

BTW -O2 gets great compression and performance improvements compared to -O0 ;-D 
It’s the points after -O2 where the trade-offs don’t correlate.

I like -O2

My 2c.

> GCC's -Os is not "size at all costs" either; there are many options (mostly
> --params) that can decrease code size significantly.  To tune code size
> down for your particular program you have to play with options a bit.  This
> shouldn't be news to anyone.
> 
> '-Os'
> Optimize for size.  '-Os' enables all '-O2' optimizations that do
> not typically increase code size.  It also performs further
> optimizations designed to reduce code size.
> 
>> So if adding optimisations to -O2 that increase code size, please 
>> consider adding an -O2s that maintains the compact code size of -O2. -O2 
>> generates pretty compact code as many performance optimisations tend to 
>> reduce code size, or otherwise add optimisations that increase code size to 
>> -O3. Adding loop unrolling makes sense in the Clang/LLVM context where 
>> they have a compact code model with good performance i.e. -Os. In GCC this 
>> is -O2.
>> 
>> So if you want to enable more optimisations at -O2, please copy -O2 
>> optimisations to -O2s or rename -Os to -Oz and copy -O2 optimisation 
>> defaults to a new -Os.
> 
> '-O2'
> Optimize even more.  GCC performs nearly all supported
> optimizations that do not involve a space-speed tradeoff.  As
> compared to '-O', this option increases both compilation time and
> the performance of the generated code.
> 
>> The present reality is that any project that wishes to optimize for size at 
>> all costs will need to run a configure test for -Oz, and then fall back to 
>> -Os, given the current disparity between Clang/LLVM and GCC flags here.
> 
> The present reality is that any project that wishes to support both GCC and
> LLVM needs to do configure tests, because LLVM chose to do many things
> differently (sometimes unavoidably).  If GCC would change some options
> to be more like LLVM, all users only ever using GCC would be affected,
> while all other incompatibilities would remain.  Not a good tradeoff at
> all.
> 
> 
> Segher



Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Michael Clark

> On 13 Sep 2017, at 1:15 PM, Michael Clark  wrote:
> 
> - https://rv8.io/bench#optimisation
> - https://rv8.io/bench#executable-file-sizes
> 
> -O2 is 98% perf of -O3 on x86-64
> -Os is 81% perf of -O3 on x86-64
> 
> -O2 saves 5% space on -O3 on x86-64
> -Os saves 8% space on -Os on x86-64
> 
> 17% drop in performance for 3% saving in space is not a good trade for a 
> “general” size optimisation. It’s more like executable compression.

Sorry fixed typo:

-O2 is 98% perf of -O3 on x86-64
-Os is 81% perf of -O3 on x86-64

-O2 saves 5% space on -O3 on x86-64
-Os saves 8% space on -O3 on x86-64

The extra ~3% space saving for ~17% drop in performance doesn’t seem like a 
good general option for size based on the cost in performance.

Again, I really like GCC’s -O2 and hope that its binaries neither grow in size 
nor slow down.
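
The arithmetic behind the corrected figures can be spelled out as a small sketch in C. The four constants are the percentages quoted in the message, all relative to -O3; the function names point_gap and cost_per_point_x100 are hypothetical and introduced only for this illustration.

```c
#include <assert.h>

/* Figures quoted in the message above, as percentages relative to -O3. */
enum {
    PERF_O2 = 98,   /* -O2 performance, % of -O3 */
    PERF_OS = 81,   /* -Os performance, % of -O3 */
    SAVE_O2 = 5,    /* -O2 size saving versus -O3, % */
    SAVE_OS = 8     /* -Os size saving versus -O3, % */
};

/* Difference between two percentages, in percentage points. */
int point_gap(int a, int b)
{
    assert(a >= 0 && b >= 0);
    return a - b;
}

/* Performance points given up per point of extra size saving,
   scaled by 100 so the calculation stays in integer arithmetic. */
int cost_per_point_x100(int perf_lost, int size_gained)
{
    assert(size_gained > 0);
    return perf_lost * 100 / size_gained;
}
```

With these inputs, point_gap(PERF_O2, PERF_OS) is 17 and point_gap(SAVE_OS, SAVE_O2) is 3, so going from -O2 to -Os gives up roughly 5.7 points of performance for each point of size saved on this benchmark set.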

Re: RFC: Improving GCC8 default option settings

2017-09-12 Thread Jeffrey Walton
> * Make -fomit-frame-pointer the default - various targets already do this at
>   higher optimization levels, but this could easily be done for all targets.
>   Frame pointers haven't been needed for debugging for decades, however if there
>   are still good reasons to keep it enabled with -O0 or -O1 (I can't think of any
>   unless it is for last-resort backtrace when there is no unwind info at a crash),
>   we could just disable the frame pointer from -O2 onwards.

Given there's an -Og now, maybe frame pointers could be enabled for -O0
and -Og, off by default otherwise.

I like to use -O1 to kick in the analysis engine and start catching
warnings. It seems like -O1 should be closer to -O2/-O3 with respect to
frame pointers, since it could help find issues and tickle problems
with hand-crafted ASM.

Jeff