rsmith added a comment.
In D68115#1946668 <https://reviews.llvm.org/D68115#1946668>,
@hubert.reinterpretcast wrote:
> It sounds like we are looking for `-fzero-union-padding`. That's been where
> the discussion has left off twice for months.
I believe the state of Clang prior to this patch is actually wrong. We reach
this code regardless of the kind of initialization used for the union, and some
forms of initialization require zeroing of padding whereas some do not. If we
want to model this precisely, we'll need to track on `APValue` whether the
padding bits of the union constant are zeroed or not. For example, given:
union U { char c; int n; };
U u = U(); // value-initialization
U v = U{}; // aggregate initialization
Clang emits `{ i8 0, [3 x i8] undef }` for both `u` and `v`. That's correct for
`v`, but incorrect for `u`: it should emit `{ i8 0, [3 x i8] zeroinitializer }`
or perhaps simply `zeroinitializer`, because value-initialization invokes
zero-initialization, which zeroes the padding.
Fixing this properly is likely not very hard, but it would require more changes
than are present in this patch. (This patch is conservatively correct, but
initializes more than we need to initialize.)
We also need to track in `APValue` whether padding bits are zeroed in order to
correctly support `bit_cast` from structs with padding. Per discussion in
committee, the intended behavior for `bit_cast` is:
> A bit in the value representation of the result is indeterminate if does not
> correspond to a bit in the value representation of from or corresponds to a
> bit of an object that is not within its lifetime or has an indeterminate
> value ([basic.indet]). For each bit in the value representation of the result
> that is indeterminate, the smallest object containing that bit has an
> indeterminate value; the behavior is undefined unless that object is of
> unsigned ordinary character type or std::byte type. The result does not
> otherwise contain any indeterminate values.
So in particular:
struct A { char c; int n; };
constexpr long n = bit_cast<long>(A()); // ok, 0
constexpr long m = bit_cast<long>(A{}); // ill-formed, indeterminate value
due to uninitialized padding between c and n
So I would propose we take the following path:
1. Extend Clang's constant evaluator and `APValue` to track, for a struct or
union value, whether all padding bits are zeroed. (Should always be `true` for
a value with no padding bits.)
2. Land this patch behind a flag to zero all padding bits for unions, ideally
extended to cover struct padding as well as union padding.
3. After doing (1), extend `__builtin_bit_cast` support to properly handle
padding bits.
4. After doing (1) and (2), extend constant aggregate emission to always zero
padding when required by the language standard. (If you want, make the flag be
three-way: never zero, zero as required by language standard, always zero,
maybe: `-fzero-padding=never` / `-fzero-padding=std`, `-fzero-padding=always`.)
Note that (1) and (2) are independent, so I don't think we need to block this
patch (2) on the implementation of (1), but we should be aware that we're not
done here until we do steps (3) and (4).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D68115/new/
https://reviews.llvm.org/D68115
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits