erichkeane added a comment.

In D108643#3000540 <https://reviews.llvm.org/D108643#3000540>, @rjmccall wrote:

> The question is whether you can rely on extension at places that receive an
> arbitrary ABI-compatible value, like function parameters or loads. If nobody
> who receives such a value can rely on extension having been done, then the
> ABI is not meaningfully "leaving it unconstrained": it is constraining some
> places by the lack of constraint elsewhere. That is why this is a trade-off.

I see, thanks for that clarification!

> Okay, but this is a general-purpose feature being added to general-purpose
> targets. Clang cannot directly emit AMD x86-64 microcode, and I doubt that
> an `and; and; cmp` instruction sequence gets fused into a masked compare in
> microcode. Those masked comparisons are probably just used to implement
> 8/16/32-bit comparisons, selected directly on encountering a compare
> instruction of that width.

I don't work on the microcode; that is just what I was told when we asked about this. So until someone can clarify, I have no idea.

> This still doesn't make any sense. If you're transmitting a `_BitInt(17)` as
> exactly 17 bits on a dedicated FPGA<->x86 bus, then of course continue to do
> that. The ABI rules govern the representation of values in the places that
> affect the interoperation of code, such as calling conventions and in-memory
> representations. They do not cover bus protocols.

Again, that was an argument made at the time, and it is outside my direct expertise, so if you have experience with mixed FPGA/traditional core interfaces, I'll have to defer to you.

> This entire discussion is about what the ABI rules should be for implementing
> this feature on general-purpose devices that doesn't directly support e.g.
> 17-bit arithmetic. Custom hardware that does support native 17-bit
> arithmetic obviously doesn't need to care about those parts of the ABI and is
> not being "punished". At some point, 17-bit values will come from that
> specialized hardware and get exposed to general-purpose hardware by e.g.
> being written into a GPR; this is the first point at which the ABI even
> starts dreaming of being involved. Now, it's still okay under a
> mandatory-extension ABI if that GPR has its upper bits undefined: you're in
> the exact same situation as you would be after an addition, where it's fine
> to turn around and use that in some other operation that doesn't care about
> the upper bits (like a multiplication), but if you want to use it in
> something that cares about those bits (like a comparison), you need to
> zero/sign-extend them away first. The only difference between an ABI that
> leaves the upper bits undefined and one that mandates extension is that
> places which might expose the value outside of the function -- like returning
> the value, passing the value as an argument, and writing the value into a
> pointer -- have to be considered places that care about the upper bits; and
> then you get to rely on that before you do things like comparisons.
>
> Again, I'm not trying to insist that a mandatory-extension ABI is the right
> way to go. I just want to make sure that we've done a fair investigation
> into this trade-off. Right now, my concern is that it sounds like that
> investigation invented a lot of extra constraints for mandatory-extension
> ABIs, like that somehow mandatory extension meant that you would need to
> transmit a bunch of zero bits between your FPGA and the main CPU. I am not a
> hardware specialist, but I know enough to know that this doesn't check out.

Again, at the time, my FPGA-CPU interconnect experts raised concerns about making the extra bits 0. What I relayed is filtered through my memory of the "ELI5" explanation that was given to me, so I apologize that it didn't come through correctly.

> I have a lot of concerns about turning "whatever LLVM does when you pass an
> i17 as an argument" into platform ABI. My experience is that LLVM does a lot
> of things that you wouldn't expect when you push it outside of simple cases
> like power-of-two integers.
> Different targets may even use different rules,
> because the IR specification doesn't define this stuff.

That seems like a better argument for leaving them unspecified, I would think. If we can't count on our backends to act consistently, then forcing a decision on them is obviously going to cost some level of behavior change or performance hit.
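
For reference, the trade-off in the quoted text (operations that ignore the upper bits versus operations that need them extended first) can be sketched in plain C. This is only an illustration under stated assumptions: it models an unsigned 17-bit value held in a 32-bit register as a `uint32_t`, and every helper name (`zext17`, `add17_raw`, `mul17_raw`, `eq17`, `ret17_extended`) is hypothetical. It is not what Clang or any LLVM backend actually emits for `_BitInt(17)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 17-bit unsigned value lives in the low bits of a 32-bit
 * register, and bits 17..31 may hold garbage after arithmetic. */
#define BITS17 17u
#define MASK17 ((1u << BITS17) - 1u)

/* Zero-extend: clear everything above the low 17 bits. */
static inline uint32_t zext17(uint32_t reg) { return reg & MASK17; }

/* Addition done at register width; a carry can spill into bit 17. */
static inline uint32_t add17_raw(uint32_t a, uint32_t b) { return a + b; }

/* Multiplication does not care about the upper bits of its inputs: the low
 * 17 bits of the product depend only on the low 17 bits of the operands. */
static inline uint32_t mul17_raw(uint32_t a, uint32_t b) { return a * b; }

/* Comparison does care, so it extends both operands first. */
static inline bool eq17(uint32_t a, uint32_t b) {
    return zext17(a) == zext17(b);
}

/* Under a mandatory-extension ABI, a boundary such as a return would also
 * have to extend (hypothetical model of that rule). */
static inline uint32_t ret17_extended(uint32_t reg) { return zext17(reg); }

/* Small self-check of the behavior described above. */
static inline void demo(void) {
    uint32_t sum = add17_raw(0x1FFFFu, 1u); /* wraps to 0 in 17 bits */
    assert(sum != 0u);      /* the raw register still holds bit 17 */
    assert(eq17(sum, 0u));  /* correct once both sides are extended */
    /* 0x20003 is the value 3 with a stray bit 17 set. */
    assert(zext17(mul17_raw(0x20003u, 5u)) == 15u);
}
```

In this model the only difference between the two ABIs is whether `ret17_extended` (and the equivalent for arguments and stores) is required; a callee that can rely on it could skip the `zext17` calls inside `eq17` for incoming values.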