Re: issue: unexpected results in optimizations

2023-12-12 Thread Jonathan Wakely via Gcc
Ignore the troll

On Mon, 11 Dec 2023, 17:28 Dave Blanchard wrote:

> Hi Jingwen,
>
> This is the same GCC which in recent versions produces something like two
> dozen extraneous, useless, no-op instructions when doing a simple 64-bit
> math operation on 32-bit systems, and does not use SSE properly either. In
> each major release these problems get worse. The code generator is clearly
> in a state of slow degradation, starting about GCC version 5 or 6--not
> coincidentally the same time when the major version numbers started
> increasing so rapidly, although it really has been junk since the
> beginning.
>
> Stefan Kanthak hammered this point home numerous times on this list, much
> to the ire of people like Jonathan Wakely who called him a noob, telling
> him to "go file a bug" in a filing cabinet in some obscure corner of a
> disused lavatory so that it can be safely ignored, and so on.
>
> It seems that if correct code generation and optimization is important to
> you (as it should be), GCC is NOT the compiler to be using. I'm all the
> time discovering new and crazy problems with this convoluted pile of junk.
> My recent foray into bootstrapping GNAT (Ada) has opened up yet another can
> of worms. It's broken on GCC 10, and even more broken on GCC 9, and this
> despite 30+ years of development.
>
> Sometimes these days I even blame GCC when it wasn't at fault after all,
> because it's making itself into more and more of a likely suspect as the
> years go by.
>
> I haven't examined the code output of Clang to see how it compares, but
> it's worth serious investigation.
>
> Dave
>


Re: issue: unexpected results in optimizations

2023-12-12 Thread David Brown via Gcc

Hi,

First, please ignore everything Dave Blanchard writes.  I don't know 
why, but he likes to post angry, rude and unhelpful messages to this list.


Secondly, this is the wrong list.  gcc-help would be the correct list, 
as you are asking for help with gcc.  This list is for discussions on 
the development of gcc.


Thirdly, if you want help, you need to provide something that other 
people can comprehend.  There is very little that anyone can do to 
understand lumps of randomly generated code, especially when it cannot 
compile without headers and additional files or libraries that we do not 
have.


So your task is to write a /minimal/ piece of stand-alone code that 
demonstrates the effect that concerns you.  It is fine to use standard 
headers, but no external headers like this "csmith" 
stuff.  Aim to make it small enough to be included directly in the text 
of the post, not as an attachment.  Include the compiler version(s) you 
tried, the command line flags, what you expect the results to give, and 
what wrong results you got.


Always do development compiles with comprehensive sets of warnings.  I 
managed to take a section of your code (part that was different between 
the "initial.c" and "transformed.c") and compile it - there were lots of 
warnings.  There are a lot of overflows in initialisations, pointless 
calculations on the left of commas, and other indications of badly 
written code.  There were also static warnings about undefined behaviour 
in some calculations - and that, most likely, is key.


When code has undefined behaviour, you cannot expect the compiler to 
give any particular results.  It's all down to luck.  And luck varies 
with the details, such as optimisation levels.  It's "garbage in, 
garbage out", and that is the explanation for differing results.


So compile with "-Wall -Wextra -std=c99 -Wpedantic -O2" and check all 
the warnings.  (Static warnings work better when optimising the code.) 
If you have fixed the immediate problems in the code, add the 
"-fsanitize=undefined" flag before running it.  That will do run-time 
undefined behaviour checks.
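
As a minimal sketch of the kind of problem these flags are meant to catch (a made-up example, not taken from the csmith files in this thread; the file and function names are hypothetical), the first function below has undefined behaviour for one input and may give different answers at different optimisation levels, while the other two stay within defined behaviour:

```c
/* Hypothetical file ub.c; not taken from the csmith files in this thread.
 * One way to check it with the warnings described above:
 *   gcc -Wall -Wextra -std=c99 -Wpedantic -O2 -c ub.c
 * To use -fsanitize=undefined, link it into a program with a main()
 * that calls these functions, then run that program.
 */
#include <assert.h>
#include <limits.h>

/* Undefined behaviour when x == INT_MAX (signed overflow). The optimiser may
 * assume x + 1 never wraps and fold the comparison to 0, so the answer for
 * INT_MAX can differ between optimisation levels. */
int next_is_smaller_ub(int x)
{
    return x + 1 < x;
}

/* Well-defined replacement: test the limit before doing any arithmetic. */
int would_overflow(int x)
{
    return x == INT_MAX;
}

/* Increment that refuses to run into the undefined case. */
int checked_increment(int x)
{
    assert(!would_overflow(x));
    return x + 1;
}
```

Calling next_is_smaller_ub(INT_MAX) is the case where results are down to luck; a run-time undefined behaviour check is expected to report it, whereas would_overflow(INT_MAX) is always 1.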


If you have a code block that is small enough to comprehend, and that 
you are now confident has no undefined behaviour, and you get different 
results with different optimisations, post it to the gcc-help list. 
Then people can try it and give opinions - maybe there is a gcc bug.


I hope that all helps.

David





On 11/12/2023 18:14, Jingwen Wu via Gcc wrote:

Hello, I'm sorry to bother you. And I have some gcc compiler optimization
questions to ask you.
First of all, I used csmith tools to generate c files randomly. Meanwhile,
the final running result was the checksum for global variables in a c file.
For the two c files in the attachment, I performed the equivalent
transformation of loop from *initial.c* to *transformed.c*. And the two
files produced different results (i.e. different checksum values) when
using *-Os* optimization level, while the results of both were the same
when using other levels of optimization such as *-O0*, -O1, -O2, -O3,
*-Ofast*.
Please help me to explain why this is, thank you.

command line: *gcc file.c -Os -lm -I $CSMITH_HOME/include && ./a.out*
version: gcc 12.2.0
os: ubuntu 22.04





Re: issue: unexpected results in optimizations

2023-12-12 Thread Jonathan Wakely via Gcc
On Mon, 11 Dec 2023, 17:08 Jingwen Wu via Gcc wrote:

> Hello, I'm sorry to bother you. And I have some gcc compiler optimization
> questions to ask you.
> First of all, I used csmith tools to generate c files randomly. Meanwhile,
> the final running result was the checksum for global variables in a c file.
> For the two c files in the attachment, I performed the equivalent
> transformation of loop from *initial.c* to *transformed.c*. And the two
> files produced different results (i.e. different checksum values) when
> using *-O2* optimization level, while the results of both were the same
> when using other levels of optimization such as *-O0*, *-O1*, *-O3*, *-Os*,
> *-Ofast*.
> Please help me to explain why this is, thank you.
>

Sometimes csmith can generate invalid code that gets miscompiled. It looks
like you're compiling with no warnings, which is a terrible idea:


> command line: *gcc file.c -O2 -lm -I $CSMITH_HOME/include && ./a.out*
>

You should **at least** enable warnings and make sure gcc isn't pointing
out any problems in the code.

You should also try the options suggested at http://gcc.gnu.org/bugs/ which
help identify invalid code.
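
To illustrate what "invalid code" can mean here, the following is a rough sketch (invented for this note, not from the attached files; all names are hypothetical) of an aliasing violation, one of the things an option such as -fno-strict-aliasing can expose. If the program's result changes when that option is added, the code is probably at fault rather than the compiler.

```c
#include <assert.h>
#include <string.h>

/* Invalid if i and f refer to the same storage: the int object would be
 * written through a float lvalue. With strict aliasing the compiler may
 * assume the two stores cannot overlap and simply return 1. */
int store_both(int *i, float *f)
{
    *i = 1;
    *f = 0.0f;
    return *i;
}

/* Valid even when p points at *i: memcpy may modify any object's bytes,
 * so the compiler has to reload *i after the copy. */
int store_both_memcpy(int *i, void *p)
{
    float zero = 0.0f;

    *i = 1;
    memcpy(p, &zero, sizeof zero);
    return *i;
}

/* Distinct objects: both functions are fine and the int stays 1. */
int demo_distinct(void)
{
    int x;
    float y;

    return store_both(&x, &y);
}

/* Overlapping storage, done the valid way. Assumes float and int have the
 * same size and that 0.0f is all-zero bits (true for IEEE 754). */
int demo_overlap(void)
{
    int x;

    assert(sizeof(float) == sizeof(int));
    return store_both_memcpy(&x, &x);
}
```

Passing the same address to both parameters of store_both (through a cast) is the invalid case, and its result may differ between optimisation levels.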


> version: gcc 12.2.0
> os: ubuntu 22.04
>


Re: issue: unexpected results in optimizations

2023-12-12 Thread Alexander Monakov via Gcc


On Tue, 12 Dec 2023, Jonathan Wakely via Gcc wrote:

> On Mon, 11 Dec 2023, 17:08 Jingwen Wu via Gcc wrote:
> 
> > Hello, I'm sorry to bother you. And I have some gcc compiler optimization
> > questions to ask you.
> > First of all, I used csmith tools to generate c files randomly. Meanwhile,
> > the final running result was the checksum for global variables in a c file.
> > For the two c files in the attachment, I performed the equivalent
> > transformation of loop from *initial.c* to *transformed.c*. And the two
> > files produced different results (i.e. different checksum values) when
> > using *-O2* optimization level, while the results of both were the same
> > when using other levels of optimization such as *-O0*, *-O1*, *-O3*, *-Os*,
> > *-Ofast*.
> > Please help me to explain why this is, thank you.
> >
> 
> Sometimes csmith can generate invalid code that gets miscompiled. It looks
> like you're compiling with no warnings, which is a terrible idea:
> 
> 
> > command line: *gcc file.c -O2 -lm -I $CSMITH_HOME/include && ./a.out*
> >
> 
> You should **at least** enable warnings and make sure gcc isn't pointing
> out any problems in the code.
> 
> You should also try the options suggested at http://gcc.gnu.org/bugs/ which
> help identify invalid code.

Let me also link the "Testing Compilers Using Csmith" page, which is
currently available via the Wayback Machine, but not its original URL:

https://web.archive.org/web/20230316072811/http://embed.cs.utah.edu/csmith/using.html

It was written by the developers of Csmith.

Alexander


-fcf-protection default on x86-64 (also for -fhardened)

2023-12-12 Thread Florian Weimer via Gcc
Currently, -fcf-protection defaults to both shadow stack and indirect
branch tracking (IBT) on x86_64-linux-gnu, and -fhardened follows that.
I think it should only enable shadow stack at this point.

I'm not sure the current default is a good idea because there will likely be no
userspace support for IBT when GCC 14 releases, so these binaries will
not be tested.  They will carry markup that indicates compatibility with
IBT, though.  If there turns out to be a problem, we'd have to revision
the markup and disable IBT for all existing binaries (because we don't
know which ones have the toolchain fix applied).

I think we can keep the shadow stack markup because there will be ways
to test for compatibility fairly soon.  The risk is also fairly reduced
for shadow stack because there are no code generation changes in generic
code, while for IBT every function that has its address taken needs a
different prologue.
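
For illustration only, here is a small sketch (the function names are made up) of the kind of code whose generated output differs under IBT: add and mul have their addresses taken and are reached through an indirect call, so with -fcf-protection=branch or -fcf-protection=full they are expected to start with an ENDBR64 landing pad, while -fcf-protection=return (shadow stack only) should leave their prologues alone.

```c
#include <assert.h>

typedef int (*binop_fn)(int, int);

/* Address taken in pick() below, so these are indirect-branch targets. */
static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

/* Returning a function pointer is what makes add/mul address-taken. */
binop_fn pick(int use_mul)
{
    return use_mul ? mul : add;
}

/* The call through fn is an indirect branch, which IBT checks at run time. */
int apply(binop_fn fn, int a, int b)
{
    assert(fn != 0);
    return fn(a, b);
}
```

Comparing the assembly from gcc -S with the different -fcf-protection values is one way to see the difference.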

As far as I understand it, there won't be any i386 GNU/Linux support for
shadow stacks, so -fhardened shouldn't enable it on that target.
Furthermore, ENDBR32 is incompatible with the i386 baseline ISA because
it's a long NOP.

Thanks,
Florian



Register allocation problem

2023-12-12 Thread Andrew Stubbs

Hi all,

I'm trying to solve an infinite loop in the "reload" pass (LRA). I need 
early-clobber on my load instructions and it goes wrong when register 
pressure is high.


Is there a proper way to fix this? Or do I need to do something "hacky" 
like fixing a register for use with reloads?


Here's the background.

AMD GCN has a thing called XNACK mode in which load instructions can be 
interrupted (by a page miss, for example) and therefore need to be 
written such that they are "restartable". This basically means that the 
output must not overwrite the input registers (it can happen that a load 
is partially successful, especially for vectors, but I believe 
overwriting the address and offsets is never safe, even for scalars). Up 
to now we've not needed this mode, but it will be needed for Unified 
Shared Memory (and theoretically for APU devices).


So I have added new alternatives into my machine description that use 
early-clobber set:


  [v   ,RF  ;flat ,*   ,12,*,off] flat_load%o1\t%0, %A1%O1%g1
  [&v  ,RF  ;flat ,*   ,12,*,on ] ^

(The "on" and "off" represent the XNACK mode.)
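
To make the restartability requirement concrete, here is a toy model in C. It is only a sketch under invented assumptions: the register file, memory, and two-word "load" are hypothetical and do not model real GCN semantics. It shows why the destination must not overlap the address register when a partially completed load is restarted, which is what the early-clobber alternative asks the register allocator to guarantee.

```c
#include <assert.h>
#include <stdint.h>

enum { NREGS = 4, MEMWORDS = 8 };

struct machine {
    uint32_t reg[NREGS];
    uint32_t mem[MEMWORDS];
};

/* One attempt at a two-word load: reg[dst], reg[dst+1] <- mem[a], mem[a+1],
 * where a is read from reg[addr] at the start of the attempt. When
 * fault_after_first is nonzero the attempt stops after the first word,
 * modelling a partially successful load. Returns 1 on completion. */
static int try_load2(struct machine *m, int dst, int addr, int fault_after_first)
{
    uint32_t a = m->reg[addr] % MEMWORDS;   /* wrap so the model stays in bounds */

    m->reg[dst] = m->mem[a];
    if (fault_after_first)
        return 0;
    m->reg[dst + 1] = m->mem[(a + 1) % MEMWORDS];
    return 1;
}

/* Run a load that faults once and is then restarted from scratch. Returns 1
 * if the destination registers end up holding the intended words. */
int restarted_load_is_correct(int dst, int addr)
{
    struct machine m = { {0}, {10, 11, 12, 13, 14, 15, 16, 17} };
    const uint32_t base = 2;

    assert(dst >= 0 && dst + 1 < NREGS && addr >= 0 && addr < NREGS);
    m.reg[addr] = base;

    if (!try_load2(&m, dst, addr, 1))   /* first attempt is interrupted */
        try_load2(&m, dst, addr, 0);    /* restart re-reads reg[addr] */

    return m.reg[dst] == m.mem[base] && m.reg[dst + 1] == m.mem[base + 1];
}
```

With separate registers (dst 0, addr 2) the restarted load produces the right data; with dst equal to addr the first, partial attempt has already overwritten the address, so the restart reads from the wrong place.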

LRA then generates a register "Assignment" section in the dump, but it's 
not happy for some reason and generates another, and another, each with 
more and more pseudo registers and insns, and it goes on forever until 
the dump file is gigabytes and I kill it.


This is a vague description, sorry, because I don't really understand 
what's going on here and the dump files are huge with tens of thousands 
of pseudo registers to wade through. I'm hoping somebody recognises the 
issue without me spending days on it.


I have a workaround because there's no known failure on devices that 
have the AVGPR register file (they use it as spill space and therefore 
don't need the memory loads) and I actually don't need XNACK on the 
older devices at this time, but probably this is just pushing the 
problem further down the road so if there's a better solution then I'd 
like to find it.


Thanks in advance

Andrew