[clang] Lower _BitInt(129+) to a different type in LLVM IR (PR #91364)

Aaron Ballman via cfe-commits Thu, 11 Jul 2024 05:51:07 -0700

AaronBallman wrote:

> This is generally looking great, and I think it's ready to go as soon as you 
> can finish the tests. (You said you weren't able to update all the tests — 
> did you have questions about the remaining tests?)
> 
> I did have a thought, though. Are we confident that the in-memory layout that 
> LLVM is using for these large integer types matches the layout specified by 
> the ABI? I know this patch makes the overall sizes match, but there's also an 
> endianness question. When LLVM stores an i96, I assume it always stores them 
> using the overall endianness of the target; for example, on i386, it might do 
> three 32-bit stores with the low 32 bits at offset 0, the middle 32 bits at 
> offset 4, and the high 32 bits at offset 8. I just want to make sure that the 
> ABI specification for _BitInt always matches that. In particular, I'm worried 
> that it might do some middle-endian thing where it breaks the integer into 
> chunks and then stores those chunks in little-endian order even on a 
> big-endian machine. (That is generally the right thing to do for BigInt types 
> because most arithmetic operations access the chunks in little-endian order, 
> and doing adjacent memory accesses in increasing order is generally more 
> architecture-friendly.)


FWIW, I was chasing down ABI documents yesterday, and found:

x86-64 (https://gitlab.com/x86-psABIs/x86-64-ABI): 
```
_BitInt(N) types are signed by default, and unsigned _BitInt(N) types are
unsigned.
• _BitInt(N) types are stored in little-endian order in memory. Bits in each
  byte are allocated from right to left.
• For N <= 64, they have the same size and alignment as the smallest of
  (signed and unsigned) char, short, int, long and long long types that can
  contain them.
• For N > 64, they are treated as struct of 64-bit integer chunks. The number
  of chunks is the smallest number that can contain the type. _BitInt(N)
  types are byte-aligned to 64 bits. The size of these types is the smallest
  multiple of the 64-bit chunks greater than or equal to N.
• The value of the unused bits beyond the width of the _BitInt(N) value but
  within the size of the _BitInt(N) are unspecified when stored in memory or
  register.
```
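
Read literally, the size and alignment rules quoted above could be sketched in C as follows. This is only an illustration of the quoted text, not code from the psABI or from Clang; the function names `x86_64_bitint_size` and `x86_64_bitint_align` are hypothetical, and the sketch assumes the usual LP64 sizes for `char`, `short`, `int` and `long`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: storage size in bytes of _BitInt(n), following the
 * x86-64 psABI wording quoted above. Assumes LP64 type sizes. */
size_t x86_64_bitint_size(unsigned n) {
    assert(n >= 1);
    if (n <= 8)  return 1;  /* char */
    if (n <= 16) return 2;  /* short */
    if (n <= 32) return 4;  /* int */
    if (n <= 64) return 8;  /* long / long long */
    /* N > 64: smallest number of 64-bit chunks that can hold n bits. */
    return (size_t)((n + 63u) / 64u) * 8u;
}

/* Hypothetical helper: alignment in bytes. For N <= 64 it matches the
 * chosen integer type; for N > 64 the quoted text says 64 bits. */
size_t x86_64_bitint_align(unsigned n) {
    size_t size = x86_64_bitint_size(n);
    return size < 8 ? size : 8;
}
```

Under this reading, `_BitInt(129)` would occupy three 64-bit chunks (24 bytes) with 8-byte alignment.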

ARM 32-bit (https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst):
```
_BitInt(N <= 64)
    Machine Type: Smallest of the signed Fundamental Integral Data Types
    where byte-size*8 >= N.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. Non-significant bits within the Machine Type are
    sign-extended.

unsigned _BitInt(N <= 64)
    Machine Type: Smallest of the unsigned Fundamental Integral Data Types
    where byte-size*8 >= N.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. Non-significant bits within the Machine Type are
    zero-extended.

_BitInt(N > 64)
    Machine Type: Allocated as if unsigned int64_t[M] array where M*64 >= N.
    Last element contains sign bit.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. The lower addressed double-word contains the
    least significant bits of the type on a little-endian view and the most
    significant bits of the type in a big-endian view. Non-significant bits
    within the last double-word are sign-extended.

unsigned _BitInt(N > 64)
    Machine Type: Allocated as if unsigned int64_t[M] where M*64 >= N.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. The lower addressed double-word contains the
    least significant bits of the type on a little-endian view and the most
    significant bits of the type in a big-endian view. Non-significant bits
    within the last double-word are zero-extended.
```
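
The "lower addressed double-word" sentence in the AAPCS32 text is the part that bears on the middle-endian question in the quoted comment. A minimal sketch of one reading of that sentence is below; the function name `aapcs32_chunk_offset` is hypothetical and the code models only the ordering of the 64-bit double-words, not the byte order inside each one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the double-word that holds value bits
 * [64*k, 64*k + 63] of a _BitInt(n) with n > 64, where k = 0 is the least
 * significant double-word. Based on one reading of the AAPCS32 text above. */
size_t aapcs32_chunk_offset(unsigned n, unsigned k, bool big_endian) {
    unsigned m = (n + 63u) / 64u;   /* number of double-words, M*64 >= N */
    assert(n > 64 && k < m);
    /* Little-endian view: lowest address holds the least significant bits.
     * Big-endian view: lowest address holds the most significant bits. */
    unsigned index = big_endian ? (m - 1u - k) : k;
    return (size_t)index * 8u;
}
```

If this reading is right, the double-words follow the overall endianness of the target in both views, rather than being stored least-significant-first on a big-endian machine.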

ARM 64-bit (https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst):
```
_BitInt(N <= 128)
    Machine Type: Smallest of the signed Fundamental Integral Data Types
    where byte-size*8 >= N.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. Non-significant bits within the Machine Type are
    unspecified.

unsigned _BitInt(N <= 128)
    Machine Type: Smallest of the unsigned Fundamental Integral Data Types
    where byte-size*8 >= N.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. Non-significant bits within the Machine Type are
    unspecified.

_BitInt(N > 128)
    Machine Type: Mapped as if unsigned __int128[M] array where M*128 >= N.
    Last element contains sign bit.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. The lower addressed quad-word contains the least
    significant bits of the type on a little-endian view and the most
    significant bits of the type in a big-endian view. Non-significant bits
    within the last quad-word are unspecified.

unsigned _BitInt(N > 128)
    Machine Type: Mapped as if unsigned __int128[M] where M*128 >= N.
    Notes: C2x Only. Significant bits are allocated from least significant
    end of the Machine Type. The lower addressed quad-word contains the least
    significant bits of the type on a little-endian view and the most
    significant bits of the type in a big-endian view. Non-significant bits
    within the last quad-word are unspecified.
```
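
For comparison with the x86-64 text, here is a small sketch of the storage size the AAPCS64 wording seems to imply. The function name `aapcs64_bitint_size` is hypothetical, and the sketch assumes the Fundamental Integral Data Types are 1, 2, 4, 8 and 16 bytes wide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: storage size in bytes of _BitInt(n) under one reading
 * of the AAPCS64 text quoted above. */
size_t aapcs64_bitint_size(unsigned n) {
    assert(n >= 1);
    if (n <= 8)   return 1;   /* byte */
    if (n <= 16)  return 2;   /* halfword */
    if (n <= 32)  return 4;   /* word */
    if (n <= 64)  return 8;   /* double-word */
    if (n <= 128) return 16;  /* quad-word */
    /* N > 128: as if unsigned __int128[M] with M*128 >= N. */
    return (size_t)((n + 127u) / 128u) * 16u;
}
```

Under this reading, `_BitInt(129)` takes two quad-words (32 bytes) on AArch64, whereas the x86-64 rule of 64-bit chunks would give 24 bytes for the same type.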

The latest RISC-V, LoongArch, and CSKY ABI documents I could find did not 
mention `_BitInt`. I could not find any modern ABI document for PowerPC 
(power.org no longer seems to be about PowerPC), and the one on the Internet 
Archive does not mention `_BitInt` either.

https://github.com/llvm/llvm-project/pull/91364
