Re: Overwhelmed by GCC frustration

2017-08-01 Thread Richard Biener
On Mon, Jul 31, 2017 at 7:08 PM, Andrew Haley  wrote:
> On 31/07/17 17:12, Oleg Endo wrote:
>> On Mon, 2017-07-31 at 15:25 +0200, Georg-Johann Lay wrote:
>>> Around 2010, someone who used a code snippet that I published in
>>> a wiki reported that the code didn't work and hung in an
>>> endless loop.  Soon I found out that it was due to some GCC
>>> problem, and I got interested in fixing the compiler so that
>>> it worked with my code.
>>>
>>> 1 1/2 years later, in 2011, [...]
>>
>> I could probably write a similar rant.  This is the life of a
>> "minority target programmer".  Most development efforts are being
>> done with primary targets in mind.  And as a result, most changes
>> are being tested only on such targets.
>>
>> To improve the situation, we'd need a lot more target specific tests
>> which test for those regressions that you have mentioned.  Then of
>> course somebody has to run all those tests on all those various
>> targets.  I think that's the biggest problem.  But still, with a
>> test case at hand, it's much easier to talk to people who have
>> silently introduced a regression on some "other" targets.  Most of
>> the time they just don't know.
>
> It's a fundamental problem for compilers, in general: every
> optimization pass wants to be the last one, and (almost?) no-one who
> writes a pass knows all the details of all the subsequent passes.  The
> more sophisticated and subtle an optimization, the more possibility
> there is of messing something up or confusing someone's back end or a
> later pass.  We've seen this multiple times, with apparently
> straightforward control flow at the source level turning into a mess
> of spaghetti in the resulting assembly.  But we know that the
> optimization makes sense for some kinds of program, or at least that
> it did at the time the optimization was written.  However, it is
> inevitable that some programs will be made worse by some
> optimizations.  We hope that they will be few in number, but it
> really can't be helped.
>
> So what is to be done?  We could abandon the eternal drive for more
> and more optimizations, back off, and concentrate on simplicity and
> robustness at the expense of ultimate code quality.  Should we?  It
> would take courage, and there will be an eternal pressure to improve
> code.  And, of course, we'd risk someone forking GCC and creating the
> "superoptimized GCC" project, starving FSF GCC of developers.  That's
> happened before, so it's not an imaginary risk.

Heh.  I suspect -Os would benefit from a separate compilation pipeline
such as -Og.  Nowadays the early optimization pipeline is what you
want (mostly simple CSE & jump optimizations, focused on code
size improvements).  That doesn't get you any loop optimizations but
loop optimizations always have the chance to increase code size
or register pressure.
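
A minimal sketch of that code-size risk (an illustrative example, not
from the mail above):

    /* Compare, e.g.:
         avr-gcc -Os -S sum.c
         avr-gcc -O3 -funroll-loops -S sum.c
       With unrolling enabled, the loop body is typically replicated
       several times, trading size for speed.  */
    unsigned sum (const unsigned char *p, unsigned n)
    {
      unsigned s = 0;
      while (n--)
        s += *p++;
      return s;
    }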

But yes, targeting an architecture like AVR which is neither primary
nor secondary (so very low priority) _plus_ being quite special in
target abilities (it seems to be very easy to mess up things) is hard.

SUSE does have some testers doing (also) code size monitoring,
but however much data we have, somebody needs to monitor it,
bisect further, and report regressions deemed worthwhile.  It's
hard to avoid slow creep -- compile-time and memory use are a
similar issue here.

Richard.

> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. 
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


RE: Overwhelmed by GCC frustration

2017-08-01 Thread Matthew Fortune
Richard Biener  writes:
> On Mon, Jul 31, 2017 at 7:08 PM, Andrew Haley  wrote:
> > On 31/07/17 17:12, Oleg Endo wrote:
> >> On Mon, 2017-07-31 at 15:25 +0200, Georg-Johann Lay wrote:
> >>> Around 2010, someone who used a code snippet that I published in a
> >>> wiki reported that the code didn't work and hung in an endless
> >>> loop.  Soon I found out that it was due to some GCC problem, and I
> >>> got interested in fixing the compiler so that it worked with my
> >>> code.
> >>>
> >>> 1 1/2 years later, in 2011, [...]
> >>
> >> I could probably write a similar rant.  This is the life of a
> >> "minority target programmer".  Most development efforts are being
> >> done with primary targets in mind.  And as a result, most changes are
> >> being tested only on such targets.
> >>
> >> To improve the situation, we'd need a lot more target specific tests
> >> which test for those regressions that you have mentioned.  Then of
> >> course somebody has to run all those tests on all those various
> >> targets.  I think that's the biggest problem.  But still, with a test
> >> case at hand, it's much easier to talk to people who have silently
> >> introduced a regression on some "other" targets.  Most of the time
> >> they just don't know.
> >
> > It's a fundamental problem for compilers, in general: every
> > optimization pass wants to be the last one, and (almost?) no-one who
> > writes a pass knows all the details of all the subsequent passes.  The
> > more sophisticated and subtle an optimization, the more possibility
> > there is of messing something up or confusing someone's back end or a
> > later pass.  We've seen this multiple times, with apparently
> > straightforward control flow at the source level turning into a mess
> > of spaghetti in the resulting assembly.  But we know that the
> > optimization makes sense for some kinds of program, or at least that
> > it did at the time the optimization was written.  However, it is
> > inevitable that some programs will be made worse by some
> > optimizations.  We hope that they will be few in number, but it really
> > can't be helped.
> >
> > So what is to be done?  We could abandon the eternal drive for more
> > and more optimizations, back off, and concentrate on simplicity and
> > robustness at the expense of ultimate code quality.  Should we?  It
> > would take courage, and there will be an eternal pressure to improve
> > code.  And, of course, we'd risk someone forking GCC and creating the
> > "superoptimized GCC" project, starving FSF GCC of developers.  That's
> > happened before, so it's not an imaginary risk.
> 
> Heh.  I suspect -Os would benefit from a separate compilation pipeline
> such as -Og.  Nowadays the early optimization pipeline is what you want
> (mostly simple CSE & jump optimizations, focused on code size
> improvements).  That doesn't get you any loop optimizations but loop
> optimizations always have the chance to increase code size or register
> pressure.
> 
> But yes, targeting an architecture like AVR which is neither primary nor
> secondary (so very low priority) _plus_ being quite special in target
> abilities (it seems to be very easy to mess up things) is hard.
> 
> SUSE does have some testers doing (also) code size monitoring, but
> however much data we have, somebody needs to monitor it, bisect
> further, and report regressions deemed worthwhile.  It's hard to avoid
> slow creep -- compile-time and memory use are a similar issue here.

Towards the end of last year we ran a code size analysis over time for
MIPS GCC (microMIPSR3, I believe, to be specific) between Oct 2013 and
Aug 2016, taking every 50th commit if memory serves. I have a whole
bunch of graphs for open source benchmarks that I may be able to share.
The net effect was a significant code size reduction, with just a few
short (<2 months) regressions. Not all benchmarks ended up at their
best-ever code size, and some regressions were countered by different
optimisations than the ones that introduced them (so the issue wasn't
strictly fixed in all cases). Over this period I would therefore be
surprised if GCC had caused significant code size regressions in
general. I don't have the detailed analysis to hand, but a significant
code size reduction happened around Mar/Apr 2014; I can't remember why
that was. I do remember a spike when changing to LRA, but that settled
down (mostly).

Matthew


Re: Overwhelmed by GCC frustration

2017-08-01 Thread Eric Gallager
On 8/1/17, Richard Biener  wrote:
> On Mon, Jul 31, 2017 at 7:08 PM, Andrew Haley  wrote:
>> On 31/07/17 17:12, Oleg Endo wrote:
>>> On Mon, 2017-07-31 at 15:25 +0200, Georg-Johann Lay wrote:
 Around 2010, someone who used a code snippet that I published in
 a wiki reported that the code didn't work and hung in an
 endless loop.  Soon I found out that it was due to some GCC
 problem, and I got interested in fixing the compiler so that
 it worked with my code.

 1 1/2 years later, in 2011, [...]
>>>
>>> I could probably write a similar rant.  This is the life of a
>>> "minority target programmer".  Most development efforts are being
>>> done with primary targets in mind.  And as a result, most changes
>>> are being tested only on such targets.
>>>
>>> To improve the situation, we'd need a lot more target specific tests
>>> which test for those regressions that you have mentioned.  Then of
>>> course somebody has to run all those tests on all those various
>>> targets.  I think that's the biggest problem.  But still, with a
>>> test case at hand, it's much easier to talk to people who have
>>> silently introduced a regression on some "other" targets.  Most of
>>> the time they just don't know.
>>
>> It's a fundamental problem for compilers, in general: every
>> optimization pass wants to be the last one, and (almost?) no-one who
>> writes a pass knows all the details of all the subsequent passes.  The
>> more sophisticated and subtle an optimization, the more possibility
>> there is of messing something up or confusing someone's back end or a
>> later pass.  We've seen this multiple times, with apparently
>> straightforward control flow at the source level turning into a mess
>> of spaghetti in the resulting assembly.  But we know that the
>> optimization makes sense for some kinds of program, or at least that
>> it did at the time the optimization was written.  However, it is
>> inevitable that some programs will be made worse by some
>> optimizations.  We hope that they will be few in number, but it
>> really can't be helped.
>>
>> So what is to be done?  We could abandon the eternal drive for more
>> and more optimizations, back off, and concentrate on simplicity and
>> robustness at the expense of ultimate code quality.  Should we?  It
>> would take courage, and there will be an eternal pressure to improve
>> code.  And, of course, we'd risk someone forking GCC and creating the
>> "superoptimized GCC" project, starving FSF GCC of developers.  That's
>> happened before, so it's not an imaginary risk.
>
> Heh.  I suspect -Os would benefit from a separate compilation pipeline
> such as -Og.  Nowadays the early optimization pipeline is what you
> want (mostly simple CSE & jump optimizations, focused on code
> size improvements).  That doesn't get you any loop optimizations but
> loop optimizations always have the chance to increase code size
> or register pressure.
>

Maybe in addition to the -Os optimization level, GCC mainline could
also add the -Oz optimization level like Apple's GCC had, and clang
still has? Basically -Os is -O2 with additional code size focus,
whereas -Oz is -O0 with the same code size focus. Adding it to the
FSF's GCC, too, could help reduce code size even further than -Os
currently does.

> But yes, targeting an architecture like AVR which is neither primary
> nor secondary (so very low priority) _plus_ being quite special in
> target abilities (it seems to be very easy to mess up things) is hard.
>
> SUSE does have some testers doing (also) code size monitoring,
> but however much data we have, somebody needs to monitor it,
> bisect further, and report regressions deemed worthwhile.  It's hard to
> avoid slow creep -- compile-time and memory use are a similar
> issue here.
>
> Richard.
>
>> --
>> Andrew Haley
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. 
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>


Re: Overwhelmed by GCC frustration

2017-08-01 Thread Jakub Jelinek
On Tue, Aug 01, 2017 at 07:08:41AM -0400, Eric Gallager wrote:
> > Heh.  I suspect -Os would benefit from a separate compilation pipeline
> > such as -Og.  Nowadays the early optimization pipeline is what you
> > want (mostly simple CSE & jump optimizations, focused on code
> > size improvements).  That doesn't get you any loop optimizations but
> > loop optimizations always have the chance to increase code size
> > or register pressure.
> >
> 
> Maybe in addition to the -Os optimization level, GCC mainline could
> also add the -Oz optimization level like Apple's GCC had, and clang
> still has? Basically -Os is -O2 with additional code size focus,
> whereas -Oz is -O0 with the same code size focus. Adding it to the
> FSF's GCC, too, could help reduce code size even further than -Os
> currently does.

No, lack of optimizations certainly doesn't reduce the code size.
For small code, you need lots of optimizations, but preferably code-size
aware ones.  For RTL that is usually easier, because you can often compare
the sizes of the old and new sequences and choose the smaller; for GIMPLE
optimizations it is often just a wild guess as to which optimizations
generally result in smaller and which in larger code.
There are too many following passes to know for sure, and finding the right
heuristics is hard.
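
A hedged sketch of that RTL-level decision (hypothetical types and
names, invented purely for illustration -- this is not actual GCC
internals):

    /* Two candidate encodings of the same computation, each with a
       known byte size.  All names here are made up.  */
    struct seq { const unsigned char *insns; unsigned nbytes; };

    /* Pick whichever candidate is smaller; keep the old sequence on
       a tie, so code is not churned for no gain.  */
    static const struct seq *
    choose_smaller (const struct seq *old_seq, const struct seq *new_seq)
    {
      return new_seq->nbytes < old_seq->nbytes ? new_seq : old_seq;
    }

The GIMPLE-level problem is exactly that no such byte count exists yet
at that stage, so a pass has to guess.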

Jakub


Re: Overwhelmed by GCC frustration

2017-08-01 Thread David Brown
On 01/08/17 13:08, Eric Gallager wrote:
> On 8/1/17, Richard Biener  wrote:

>>
>> Heh.  I suspect -Os would benefit from a separate compilation pipeline
>> such as -Og.  Nowadays the early optimization pipeline is what you
>> want (mostly simple CSE & jump optimizations, focused on code
>> size improvements).  That doesn't get you any loop optimizations but
>> loop optimizations always have the chance to increase code size
>> or register pressure.
>>
> 
> Maybe in addition to the -Os optimization level, GCC mainline could
> also add the -Oz optimization level like Apple's GCC had, and clang
> still has? Basically -Os is -O2 with additional code size focus,
> whereas -Oz is -O0 with the same code size focus. Adding it to the
> FSF's GCC, too, could help reduce code size even further than -Os
> currently does.
> 

I would not expect that to be good at all.  With no optimisation (-O0),
gcc produces quite poor code - local variables are not put in registers
or "optimised away", there is no strength reduction, etc.  For an
architecture like the AVR with a fair number of registers (32, albeit
8-bit registers) and relatively inefficient stack access, -O0 produces
/terrible/ code.
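
A minimal example of what that means in practice (illustrative, not
from the original mail):

    /* Compare:  avr-gcc -S -O0 sum_sq.c   vs.   avr-gcc -S -Os sum_sq.c
       At -O0 the locals live on the stack and a[i] is recomputed from
       i on every iteration; at -Os the locals typically stay in
       registers and the indexing is strength-reduced to a pointer
       increment.  */
    unsigned sum_sq (const unsigned *a, unsigned n)
    {
      unsigned s = 0;
      for (unsigned i = 0; i < n; i++)
        s += a[i] * a[i];
      return s;
    }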

There is also the body of existing code, projects, practice and
knowledge - all of that says "-Os" is for optimised code with an
emphasis on size, and it is the flag of choice on a large proportion of
embedded projects (not just for the AVR).

The ideal solution is to fix gcc so that -Os gives (close to) minimal
code size, or at least code as small as it used to give - while
retaining the benefits of the newer optimisations and features of
later gcc versions.

The question is, can this be done in a way that is practical,
maintainable, achievable with the limited resources of the AVR port, and
without detriment to any of the other ports?  As has been noted, the AVR
port is considered a minor port for gcc (though it is vital for the AVR
development community), and other ports have seen a trend of improvement
in code size - gcc can't take changes to improve the AVR if it makes
things worse for MIPS.

Is it possible to get some improvement in AVR generation by enabling or
disabling specific combinations of optimisation in addition to -Os?  Are
there tunables that could be fiddled with to improve matters (either
"--param" options, or in the AVR backend code)?  Can the "-fdump-rtl" or
"-fopt-info" flags be used to get an idea of which passes lead to code
increase?  If these led to improvements, then it should be possible to
better the situation by simply changing the defaults used by the AVR port.
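
As a concrete starting point (these are standard GCC options; which
params are actually worth tweaking for the AVR is an open question):

    avr-gcc -Os -fopt-info-all=opt.log -c game.c
    avr-gcc -Os -fdump-rtl-all -c game.c
    avr-gcc -Os --param max-inline-insns-auto=20 -c game.c

The -fdump-rtl-all run leaves one dump file per RTL pass, which makes
it possible to bisect by pass and see where the code starts to grow.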

Of course, this sort of analysis would require significant effort - but
not many changes to the gcc code.  Could Microchip (who now own Atmel,
the AVR manufacturer) provide the resources for the work?  It is clearly
in their interest that the AVR port of gcc is as good as it can be -
they ship it as part of their standard development tools.



Re: Overwhelmed by GCC frustration

2017-08-01 Thread David Brown
On 31/07/17 15:25, Georg-Johann Lay wrote:

> This weekend I un-mothballed an old project, just an innocent game on
> a cathode-ray-tube, driven by some AVR µC.  After preparing the
> software so that it compiled with good old v3.4, the results
> overwhelmed me with complete frustration:  Any version from v4.7 up
> to v8 that I fed into the compiler (six major versions to be precise)
> produced code > 25% larger than compiled with v3.4.6, and the code is
> not only bigger, it's all needless bloat that will also slow down the
> result; optimizing for speed might bloat and slow even more.
> 

At the risk of stirring up a hornets' nest, have you tried the AVR
backend of LLVM/clang?  It is still a work in progress, and I haven't
tried it at all myself (my experience with clang is very minimal).

Ultimately, the aim of the AVR gcc user community is to have free,
cross-platform tools that will work with their existing and future code
and generate good AVR object code.  Few users really care if it is gcc
or clang (especially if the clang binaries are named "avr-gcc").  And
ultimately the aim of Microchip, who make and sell the chips themselves,
should be to keep the users happy.

If the structure of llvm/clang is such that it makes more sense for
Microchip to put resources into that project, and to move to it as
their default toolchain once it is good enough, then the quality of
AVR code generation in gcc becomes a non-issue.

(Personally, I like to see friendly competition between llvm/clang and
gcc, as I think it has helped both projects, but there are severe limits
to the development resources that can be put into a minor target like this.)



Re: Overwhelmed by GCC frustration

2017-08-01 Thread Eric Gallager
On 8/1/17, Jakub Jelinek  wrote:
> On Tue, Aug 01, 2017 at 07:08:41AM -0400, Eric Gallager wrote:
>> > Heh.  I suspect -Os would benefit from a separate compilation pipeline
>> > such as -Og.  Nowadays the early optimization pipeline is what you
>> > want (mostly simple CSE & jump optimizations, focused on code
>> > size improvements).  That doesn't get you any loop optimizations but
>> > loop optimizations always have the chance to increase code size
>> > or register pressure.
>> >
>>
>> Maybe in addition to the -Os optimization level, GCC mainline could
>> also add the -Oz optimization level like Apple's GCC had, and clang
>> still has? Basically -Os is -O2 with additional code size focus,
>> whereas -Oz is -O0 with the same code size focus. Adding it to the
>> FSF's GCC, too, could help reduce code size even further than -Os
>> currently does.
>
> No, lack of optimizations certainly doesn't reduce the code size.
> For small code, you need lots of optimizations, but preferably code-size
> aware ones.  For RTL that is usually easier, because you can often compare
> the sizes of the old and new sequences and choose the smaller; for GIMPLE
> optimizations it is often just a wild guess as to which optimizations
> generally result in smaller and which in larger code.
> There are too many following passes to know for sure, and finding the right
> heuristics is hard.
>
>   Jakub
>

Upon rereading the relevant docs, I guess it was a mistake to
compare -Oz to -O0. Let me quote from the apple-gcc "Optimize Options"
page:

-Oz
(APPLE ONLY) Optimize for size, regardless of performance. -Oz
enables the same optimization flags that -Os uses, but -Oz also
enables other optimizations intended solely to reduce code size.
In particular, instructions that encode into fewer bytes are
preferred over longer instructions that execute in fewer cycles.
-Oz on Darwin is very similar to -Os in FSF distributions of GCC.
-Oz employs the same inlining limits and avoids string instructions
just like -Os.

Meanwhile, their description of -Os as contrasted to -Oz reads:

-Os
Optimize for size, but not at the expense of speed. -Os enables all
-O2 optimizations that do not typically increase code size.
However, instructions are chosen for best performance, regardless
of size. To optimize solely for size on Darwin, use -Oz (APPLE
ONLY).

And the clang docs for -Oz say:

-Oz Like -Os (and thus -O2), but reduces code size further.

So -Oz does actually still optimize; it's more like -O2 than -O0
after all, just even more size-focused than -Os.


Re: Overwhelmed by GCC frustration

2017-08-01 Thread James Greenhalgh
On Tue, Aug 01, 2017 at 11:12:12AM -0400, Eric Gallager wrote:
> On 8/1/17, Jakub Jelinek  wrote:
> > On Tue, Aug 01, 2017 at 07:08:41AM -0400, Eric Gallager wrote:
> >> > Heh.  I suspect -Os would benefit from a separate compilation pipeline
> >> > such as -Og.  Nowadays the early optimization pipeline is what you
> >> > want (mostly simple CSE & jump optimizations, focused on code
> >> > size improvements).  That doesn't get you any loop optimizations but
> >> > loop optimizations always have the chance to increase code size
> >> > or register pressure.
> >> >
> >>
> >> Maybe in addition to the -Os optimization level, GCC mainline could
> >> also add the -Oz optimization level like Apple's GCC had, and clang
> >> still has? Basically -Os is -O2 with additional code size focus,
> >> whereas -Oz is -O0 with the same code size focus. Adding it to the
> >> FSF's GCC, too, could help reduce code size even further than -Os
> >> currently does.
> >
> > No, lack of optimizations certainly doesn't reduce the code size.
> > For small code, you need lots of optimizations, but preferably code-size
> > aware ones.  For RTL that is usually easier, because you can often compare
> > the sizes of the old and new sequences and choose the smaller; for GIMPLE
> > optimizations it is often just a wild guess as to which optimizations
> > generally result in smaller and which in larger code.
> > There are too many following passes to know for sure, and finding the right
> > heuristics is hard.
> >
> > Jakub
> >
> 
> Upon rereading the relevant docs, I guess it was a mistake to
> compare -Oz to -O0. Let me quote from the apple-gcc "Optimize Options"
> page:
> 
> -Oz
> (APPLE ONLY) Optimize for size, regardless of performance. -Oz
> enables the same optimization flags that -Os uses, but -Oz also
> enables other optimizations intended solely to reduce code size.
> In particular, instructions that encode into fewer bytes are
> preferred over longer instructions that execute in fewer cycles.
> -Oz on Darwin is very similar to -Os in FSF distributions of GCC.
> -Oz employs the same inlining limits and avoids string instructions
> just like -Os.
> 
> Meanwhile, their description of -Os as contrasted to -Oz reads:
> 
> -Os
> Optimize for size, but not at the expense of speed. -Os enables all
> -O2 optimizations that do not typically increase code size.
> However, instructions are chosen for best performance, regardless
> of size. To optimize solely for size on Darwin, use -Oz (APPLE
> ONLY).
> 
> And the clang docs for -Oz say:
> 
> -Oz Like -Os (and thus -O2), but reduces code size further.
> 
> So -Oz does actually still optimize; it's more like -O2 than -O0
> after all, just even more size-focused than -Os.

The relationship between -Os and -Oz is like the relationship between -O2
and -O3.

If -O3 says, try everything you can to increase performance even at the
expense of code-size and compile time, then -Oz says, try everything you
can to reduce the code size, even at the expense of performance and
compile time.
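
(For anyone who wants to see the difference on a given file, the two
levels can be compared directly with clang and the binutils size tool;
foo.c here is a placeholder:

    clang -Os -c foo.c -o foo_os.o && size foo_os.o
    clang -Oz -c foo.c -o foo_oz.o && size foo_oz.o

Note that FSF GCC at this point accepts -Os but not -Oz.)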

Thanks,
James



gcc-5-20170801 is now available

2017-08-01 Thread gccadmin
Snapshot gcc-5-20170801 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/5-20170801/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-5-branch 
revision 250802

You'll find:

 gcc-5-20170801.tar.xz   Complete GCC

  SHA256=e4bd00e52fd2fe2d1b5aa950b7351d60200d3fc05d2a83af9f7da48db6e05c8b
  SHA1=9e7efad644d2c365cd40bd256234a80e7f1870f6

Diffs from 5-20170725 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.