https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930
--- Comment #23 from Linus Torvalds <torva...@linux-foundation.org> ---
(In reply to Jakub Jelinek from comment #22)
>
> If the wider registers are narrowed before register allocation, it is just
> a pair like (reg:SI 123) (reg:SI 256) and it can be allowed anywhere.

That was more what I was thinking - why is the DImode information being kept so long?

I realize that you want to do a lot of the early CSE etc operations at that higher level, but by the time you are actually allocating registers and thinking about spilling them, why is it still a DImode thing?

And this now brings back my memory of the earlier similar discussion - it wasn't about DImode code generation, it was about bitfield code generation being horrendous, where gcc was keeping the whole "this is a bitfield" information around for a long time and as a result generating truly horrendous code. It looked like it should instead just have turned the access into a load and shift early, and then done all the sane optimizations at that level (i.e. rewriting simple bitfield code to just do loads and shifts generated *much* better code than using bitfields).

But this is just my personal curiosity at this point - it looks like Roger Sayle's patch has fixed the immediate problem, so the big issue is solved. And maybe the fact that clang is doing so much better is due to something else entirely - it just _looks_ like it might be this artificial constraint by gcc that makes it make bad register allocation and spill choices.
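
To make the bitfield remark concrete, here is a minimal sketch of what "rewriting simple bitfield code to just do loads and shifts" could look like at the source level. Everything in it is an assumption made for illustration and none of it comes from the earlier discussion: the 5-bit field at bit 3 of a 32-bit word, the struct flags layout, and the helper names are all hypothetical. It only shows the two styles side by side and makes no claim about what code any particular compiler version generates for either.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: a 5-bit field starting at bit 3 of a 32-bit word. */
#define FIELD_SHIFT 3u
#define FIELD_WIDTH 5u
#define FIELD_MASK  ((1u << FIELD_WIDTH) - 1u)

/* Bitfield style: the compiler owns the layout (implementation-defined),
 * and the access stays "a bitfield" until it decides to lower it. */
struct flags {
    unsigned int low   : 3;
    unsigned int field : 5;
    unsigned int rest  : 24;
};

static inline unsigned int get_field_bitfield(const struct flags *f)
{
    return f->field;
}

/* Explicit style: a plain word load followed by a shift and a mask. */
static inline uint32_t get_field_shift(uint32_t word)
{
    return (word >> FIELD_SHIFT) & FIELD_MASK;
}

/* Explicit store: clear the field's bits, then OR in the new value. */
static inline uint32_t set_field_shift(uint32_t word, uint32_t value)
{
    return (word & ~(FIELD_MASK << FIELD_SHIFT))
         | ((value & FIELD_MASK) << FIELD_SHIFT);
}

/* Sanity check: a value written with the explicit store reads back intact. */
static inline void check_roundtrip(uint32_t word, uint32_t value)
{
    assert(get_field_shift(set_field_shift(word, value)) == (value & FIELD_MASK));
}
```

With the explicit form the field access is ordinary integer arithmetic from the start, so the usual optimizations see nothing but a load, a shift and a mask. The raw word and the struct are deliberately never compared against each other here, since the bit order of a bitfield is up to the implementation.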