https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71942

James Greenhalgh <jgreenhalgh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jgreenhalgh at gcc dot gnu.org

--- Comment #4 from James Greenhalgh <jgreenhalgh at gcc dot gnu.org> ---
(In reply to Albi from comment #3)
> Agreed; after considerably more Google research, it seems a lot of people
> complain about this.
> 
> Nevertheless, this poses a big problem, since it halves the performance of
> every load of a sub-32-bit datatype.
> 
> IMHO the problem isn't in the optimizer... I think the fundamental problem
> is that the instruction set is not aware of the implicit unsigned
> zero-extension that's already done by the load instruction.
> 
> With that knowledge, the idea of inserting a (redundant) zero-extending
> instruction like "uxth" wouldn't even come up, even without optimization.
> 
> This is at least true for every unsigned short or unsigned char.
> Signed types need the sign extension, of course.
> 
> In other words: there is actually no reason the instruction should be issued
> in the first place, and therefore the optimizer should not really need to
> remove it afterwards.  It's fundamentally redundant.

But that isn't really how a compiler like GCC works! Think of the model as
first generating correct code, no matter the redundancy, then iteratively
transforming it down to the final output. Try compiling with -O0 to see just
how much redundancy the compiled code starts with.
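
For a concrete picture of that starting redundancy, here is a minimal sketch
in the spirit of this report (the file name "widen.c" is made up, and the
exact assembly depends on the target; this assumes an AArch64-style target
whose half-word load already zero-extends its result):

  /* widen.c: compare the output of
         gcc -O0 -S widen.c   (load followed by an explicit zero-extend)
         gcc -O2 -S widen.c   (the redundant extension is cleaned up)
  */
  unsigned int widen(unsigned short *p)
  {
      return *p;   /* implicit zero-extension from 16 to 32 bits */
  }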

Why do it this way? It gives a nicer structure to compilation. A language
front-end doesn't need to know much about the eventual target, and in
particular doesn't need to know which instructions are available.

As food for thought, imagine a compiler that chose to do this optimisation
immediately while parsing. Such a compiler would still need to repeat the
logic that spots the redundant code in some future optimisation pass (imagine
a case where it isn't immediately obvious that the code is just a load
followed by a zero-extend, perhaps because some other redundant maths sits in
the way), so for the sake of program design you'd probably want to factor the
common code out. For maximum effect you'd want the code that spots the
opportunities you describe to have visibility of what comes both before and
after each individual fragment that it parses, which means you'd want to run
it after parsing has finished. So you'd have a split between parsing and your
new redundancy-removing pass. At that point you've designed a compiler which
first parses, then optimises, and you're back to the model that most
production compilers use.
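
To make that design argument concrete, here is a toy sketch in plain C of
such a post-parse redundancy-removing pass. This has nothing to do with
GCC's real internals; every name and the tiny IR are hypothetical. The point
is that the pass needs to see each value's producer, which is only
convenient once the whole function has been parsed:

  #include <stdio.h>

  enum op { LOAD_U16, ZEXT16, RET };

  struct insn {
      enum op op;
      int dest, src;   /* virtual register numbers */
      int live;        /* cleared when the pass deletes the insn */
  };

  /* The pass: delete a ZEXT16 whose input was produced by a LOAD_U16,
     since that load already zero-extends its result. */
  static void remove_redundant_extends(struct insn *code, int n)
  {
      for (int i = 0; i < n; i++) {
          if (code[i].op != ZEXT16 || !code[i].live)
              continue;
          for (int j = i - 1; j >= 0; j--) {   /* find the producer */
              if (code[j].dest != code[i].src || !code[j].live)
                  continue;
              if (code[j].op == LOAD_U16) {
                  for (int k = i + 1; k < n; k++)  /* forward the register */
                      if (code[k].src == code[i].dest)
                          code[k].src = code[i].src;
                  code[i].live = 0;
              }
              break;
          }
      }
  }

  int main(void)
  {
      /* What a naive front-end might emit for `return *p;` on an
         unsigned short: the load, then an explicit (redundant) extend. */
      struct insn code[] = {
          { LOAD_U16, 1, 0, 1 },   /* r1 = load_u16 [r0] */
          { ZEXT16,   2, 1, 1 },   /* r2 = zext16 r1     */
          { RET,      0, 2, 1 },   /* ret r2             */
      };
      int n = (int)(sizeof code / sizeof code[0]);
      static const char *name[] = { "load_u16", "zext16", "ret" };

      remove_redundant_extends(code, n);

      for (int i = 0; i < n; i++) {
          if (!code[i].live)
              continue;
          if (code[i].op == RET)
              printf("ret r%d\n", code[i].src);
          else
              printf("r%d = %s r%d\n", code[i].dest,
                     name[code[i].op], code[i].src);
      }
      return 0;
  }

Run standalone, this prints the function with the extension gone. The check
wants a whole-function view of producers and consumers, which is exactly
what a pass that runs after parsing gets for free.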
