On 25 Sep 13:14, Jeff Law wrote:
> On 09/01/14 04:29, Ilya Tocar wrote:
> >>>
> >>>AVX512 added new 16 xmm registers (xmm16-xmm31).
> >>>Those registers require evex encoding.
> >>>Only 512-bit wide versions of instructions have evex encoding with
> >>>avx512f, but all versions have it with avx512vl.
> >>>Most instructions have same macroized pattern for 128/256/512 vector
> >>>length. They all use constraint 'v', which corresponds to
> >>>class ALL_SSE_REGS (xmm0 - xmm31). To disallow e. g. xmm20 in
> >>>256-bit case (avx512f) and allow it only in avx512vl case we have
> >>>HARD_REGNO_MODE_OK checking for regno being evex-only and
> >>>disallowing it if mode is not 512-bit.
> >>Generally this kind of thing has been handled by splitting the register
> >>class into two classes. I strongly suspect there are numerous places where
> >>we assume that two regs in the same class are interchangeable.
> >I'm not sure that there are many places where we replace hard regs
> >without checks. E. g. in regrename we have HARD_REGNO_RENAME_OK.
> >As far as I understand, idea behind HARD_REGNO_RENAME_OK is that we
> >should always check when substituting hard reg. Why is regcprop
> >different, and what's the point of HARD_REGNO_MODE_OK if it is ignored
> >by some passes?
> >
> >>
> >>I realize that's going to require some work in the x86 machine description,
> >>but I think that's going to be a much better approach and save you work in
> >>the long run.
> >>
> >
> >This will approximately double sse.md, as we will need to split all
> >patterns with 512-bit versions in 2 (512 and 128/256 cases) and play
> >games with enabling/disabling alternatives depending on flags.
> >Are you sure that this better than honoring HARD_REGNO_MODE_OK?
> >As far as I understand, honoring HARD_REGNO_MODE_OK shouldn't produce
> >worse code.
> I don't see how it doubles the size. You split the class into two classes.
> Whatever letter your second class has, you use it in conjunction with 'v'
> that you're already using. Note you do not need different alternatives, you
> use them in the same alternative.

I'm not sure how this will help. Consider add<V2DF,V4DF,V8DF>: right now
they are described in one pattern. In the AVX512F (without AVX512VL) case
we can use xmm16 for V8DF, but not for V2DF or V4DF. If we keep them in one
pattern, they will have the same alternatives for all modes. So we will
need to either split V2DF and V4DF into a separate pattern (doubling the
number of patterns), or disallow particular modes depending on flags
(which is what we do now).
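
To make the add<V2DF,V4DF,V8DF> case concrete, here is a toy model of the
kind of per-(register, mode) check I mean. It is only a sketch with made-up
names (toy_mode, toy_flags, toy_regno_mode_ok), not code from the i386
backend, and it assumes a 64-bit target where 128/256-bit modes are always
valid in xmm0-xmm15.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names come from GCC's i386 backend. */

/* Vector modes from the example; each enumerator's value is its width in bits. */
enum toy_mode { TOY_V2DF = 128, TOY_V4DF = 256, TOY_V8DF = 512 };

/* Hypothetical stand-in for the avx512f / avx512vl target flags. */
struct toy_flags {
    bool avx512f;
    bool avx512vl;
};

/* xmm16..xmm31 are reachable only through EVEX encoding. */
static bool toy_evex_only_reg(int xmm)
{
    return xmm >= 16;
}

/* Per-(register, mode) check in the spirit of HARD_REGNO_MODE_OK.
   Simplifying assumptions: 64-bit target, and 128/256-bit modes are
   always valid in xmm0..xmm15.  */
static bool toy_regno_mode_ok(int xmm, enum toy_mode mode, struct toy_flags f)
{
    assert(xmm >= 0 && xmm <= 31);

    /* 512-bit modes need AVX512F no matter which register is used. */
    if (mode == TOY_V8DF)
        return f.avx512f;

    /* 128/256-bit modes in the legacy registers need no EVEX encoding. */
    if (!toy_evex_only_reg(xmm))
        return true;

    /* 128/256-bit modes in xmm16..xmm31 need EVEX at that width: AVX512VL. */
    return f.avx512f && f.avx512vl;
}
```

With only avx512f set, toy_regno_mode_ok(16, TOY_V8DF, ...) is true, while
the same register with TOY_V2DF or TOY_V4DF is rejected. xmm16 is in the
same class in both cases, so the answer depends on the mode and the flags,
not on the class.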
>
> It's not a question of performance, but of design.

Obviously, but I still fail to see why honoring HARD_REGNO_MODE_OK is bad
design. I suspect that even without the avx512 changes, not honoring it
will bite us sooner or later.

> I suspect you're really
> just at the tip of the iceberg with this stuff if you continue to go down
> the path of having registers in the same class, some of which are
> allocatable and some of which are not.

Having a class where some registers are not available is an old approach:
consider the SSE_REGS class, where half of the registers are not available
in the 32-bit case. The problem is with different modes being valid in
those registers, depending on flags. And it has worked fine for the
previous ~year in gcc 4.9.

In my opinion, if we check in the original patch we will harm no one and
fix a correctness problem. If we later discover some new problem that is
not fixable by a simple patch, we may rework the whole avx512
implementation. As bugs of this kind will never generate incorrect code
(all errors will be caught by the assembler), I see no reason not to check
it in.

>
> The other approach that I believe has been taken has been to mark the new
> registers as fixed when compiling for hardware where they're not available.
> But I'm not sure offhand if that would be sufficient to fix this problem.

It will not help. The registers are available; it is just that some modes
are not supported in them.