erichkeane added a comment.

In D108643#3000540 <https://reviews.llvm.org/D108643#3000540>, @rjmccall wrote:

> The question is whether you can rely on extension at places that receive an 
> arbitrary ABI-compatible value, like function parameters or loads.  If nobody 
> who receives such a value can rely on extension having been done, then the 
> ABI is not meaningfully "leaving it unconstrained": it is constraining some 
> places by the lack of constraint elsewhere.  That is why this is a trade-off.

I see, thanks for that clarification!
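
For readers following along, here is a minimal sketch of the point as I understand it, not anything Clang actually emits.  It models a 17-bit unsigned value carried in a 32-bit register with plain `uint32_t`; the function names and the two "ABIs" are hypothetical and exist only for illustration.  The callee that cannot rely on extension has to mask both operands before comparing, while the callee that does rely on it is only correct if every caller extended first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 17-bit unsigned value lives in the low bits of a 32-bit register. */
#define BITS17_MASK UINT32_C(0x1FFFF)

/* Hypothetical callee under an ABI that leaves bits 17..31 unspecified:
 * it cannot trust the upper bits, so it masks both operands before comparing. */
bool eq17_unspecified_abi(uint32_t a, uint32_t b) {
    return (a & BITS17_MASK) == (b & BITS17_MASK);
}

/* Hypothetical callee under a mandatory zero-extension ABI: it compares the
 * registers directly, which is only correct if every caller extended. */
bool eq17_extended_abi(uint32_t a, uint32_t b) {
    return a == b;
}

/* What a caller would owe the callee under mandatory zero-extension. */
uint32_t zext17_for_call(uint32_t reg) {
    return reg & BITS17_MASK;
}

/* Self-check: the same 17-bit value (5), once clean and once with garbage above bit 16. */
void check_parameter_conventions(void) {
    uint32_t clean = UINT32_C(0x00000005);
    uint32_t dirty = UINT32_C(0xFFFE0005);

    /* Masking callee gets the right answer regardless of the upper bits. */
    assert(eq17_unspecified_abi(clean, dirty));

    /* Direct compare is wrong if the caller did not extend ... */
    assert(!eq17_extended_abi(clean, dirty));

    /* ... and right once the caller honours the extension rule. */
    assert(eq17_extended_abi(zext17_for_call(clean), zext17_for_call(dirty)));
}
```

Calling `check_parameter_conventions()` from a `main` exercises all three cases.  The cost simply moves: either every receiver masks, or every producer at an ABI boundary does.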

> Okay, but this is a general-purpose feature being added to general-purpose 
> targets.  Clang cannot directly emit AMD x86-64 microcode, and I doubt that 
> an `and; and; cmp` instruction sequence gets fused into a masked compare in 
> microcode.  Those masked comparisons are probably just used to implement 
> 8/16/32-bit comparisons, selected directly on encountering a compare 
> instruction of that width.

I don't work on the microcode; this is just what I was told when we asked about 
it.  So until someone can clarify, I have no idea.

> This still doesn't make any sense.  If you're transmitting a `_BitInt(17)` as 
> exactly 17 bits on a dedicated FPGA<->x86 bus, then of course continue to do 
> that.  The ABI rules govern the representation of values in the places that 
> affect the interoperation of code, such as calling conventions and in-memory 
> representations.  They do not cover bus protocols.

Again, it was an argument made at the time that is outside of my direct 
expertise, so if you have experience with mixed FPGA/traditional core 
interfaces, I'll have to defer to your expertise.

> This entire discussion is about what the ABI rules should be for implementing 
> this feature on general-purpose devices that don't directly support e.g. 
> 17-bit arithmetic.  Custom hardware that does support native 17-bit 
> arithmetic obviously doesn't need to care about those parts of the ABI and is 
> not being "punished".  At some point, 17-bit values will come from that 
> specialized hardware and get exposed to general-purpose hardware by e.g. 
> being written into a GPR; this is the first point at which the ABI even 
> starts dreaming of being involved.  Now, it's still okay under a 
> mandatory-extension ABI if that GPR has its upper bits undefined: you're in 
> the exact same situation as you would be after an addition, where it's fine 
> to turn around and use that in some other operation that doesn't care about 
> the upper bits (like a multiplication), but if you want to use it in 
> something that cares about those bits (like a comparison), you need to 
> zero/sign-extend them away first.  The only difference between an ABI that 
> leaves the upper bits undefined and one that mandates extension is that 
> places which might expose the value outside of the function — like returning 
> the value, passing the value as an argument, and writing the value into a 
> pointer — have to be considered places that care about the upper bits; and 
> then you get to rely on that before you do things like comparisons.
>
> Again, I'm not trying to insist that a mandatory-extension ABI is the right 
> way to go.  I just want to make sure that we've done a fair investigation 
> into this trade-off.  Right now, my concern is that it sounds like that 
> investigation invented a lot of extra constraints for mandatory-extension 
> ABIs, like that somehow mandatory extension meant that you would need to 
> transmit a bunch of zero bits between your FPGA and the main CPU.  I am not a 
> hardware specialist, but I know enough to know that this doesn't check out.

Again, at the time, my FPGA-CPU interconnect experts raised concerns about 
forcing the extra bits to 0.  What I've relayed is filtered through my memory 
and the "ELI5" explanation I was given, so I apologize that it didn't come 
through correctly.
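
To make the add/multiply/compare distinction in the quoted paragraph concrete, here is a rough sketch under the same kind of assumption: a 17-bit unsigned value modelled in a `uint32_t`, with helper names that are purely illustrative.  It is not a description of any real backend's lowering.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 17-bit unsigned value occupies the low bits of a 32-bit register. */
#define LOW17 UINT32_C(0x1FFFF)

/* Zero-extend: clear every bit above bit 16. */
uint32_t zext17(uint32_t reg) { return reg & LOW17; }

/* Arithmetic done at register width; the upper bits of the result are "undefined"
 * from the 17-bit point of view (for example, an add can carry into bit 17). */
uint32_t add_in_reg(uint32_t a, uint32_t b) { return a + b; }
uint32_t mul_in_reg(uint32_t a, uint32_t b) { return a * b; }

/* A comparison cares about the upper bits, so it extends both sides first. */
int less17(uint32_t a, uint32_t b) { return zext17(a) < zext17(b); }

/* Under a hypothetical mandatory-extension ABI, a return is a place that
 * "cares about the upper bits", so the value is extended on the way out. */
uint32_t add17_returning_extended(uint32_t a, uint32_t b) { return zext17(a + b); }

void check_upper_bit_handling(void) {
    /* 0x1FFFF + 1 wraps to 0 in 17 bits, but the register holds 0x20000. */
    uint32_t sum = add_in_reg(LOW17, 1);
    assert(sum == UINT32_C(0x20000));
    assert(zext17(sum) == 0);

    /* Multiplication does not care: garbage above bit 16 in an operand
     * never reaches the low 17 bits of the product. */
    uint32_t dirty_three = UINT32_C(0xFFFE0003);
    assert(zext17(mul_in_reg(dirty_three, 7)) == 21);

    /* Comparison does care: the raw register says 0x20000 is not below 1,
     * while the 17-bit value 0 is below 1. */
    assert(!(sum < 1));
    assert(less17(sum, 1));

    /* At the ABI boundary the extended form is what escapes the function. */
    assert(add17_returning_extended(LOW17, 1) == 0);
}
```

Calling `check_upper_bit_handling()` from a `main` runs the checks.  The only thing that changes between the two ABI styles in this model is whether returns, arguments, and stores count as operations that need `zext17` first.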

> I have a lot of concerns about turning "whatever LLVM does when you pass an 
> i17 as an argument" into platform ABI.  My experience is that LLVM does a lot 
> of things that you wouldn't expect when you push it outside of simple cases 
> like power-of-two integers.  Different targets may even use different rules, 
> because the IR specification doesn't define this stuff.

That seems like a better argument for leaving them unspecified, I would think.  
If we can't count on our backends to act consistently, then forcing a decision 
on them is obviously going to cause some level of behavior change or perf hit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108643/new/

https://reviews.llvm.org/D108643
