[clang] [clang][CodeGen] Zero init unspecified fields in initializers in C (PR #97121)

via cfe-commits Thu, 01 Aug 2024 16:50:14 -0700

================
@@ -361,6 +368,13 @@ CodeGenFunction::AddInitializerToStaticVarDecl(const 
VarDecl &D,
     }
     return GV;
   }
+  if (!getLangOpts().CPlusPlus) {
+    // In C, when an initializer is given, the Linux kernel relies on clang to
+    // zero-initialize all members not explicitly initialized, including 
padding
+    // bits.
----------------
yabinc wrote:


> But I assume that similar code is required in order to explicitly zero out 
> any padding when dynamically initializing a local variable.

I guess you mean variables assigned by Compound literals after variable 
definition. An example is in https://godbolt.org/z/caaEKnjWe. From the ast 
dump, it is treated as an assignment from InitListExpr. The source code seems 
to be in AggExprEmitter::VisitCXXParenListOrInitListExpr() in 
https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGExprAgg.cpp#L1757.
 Each field of the InitListExpr is stored to the variable separately. One 
solution is to always do memset in CheckAggExprForMemSetUse() in 
https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGExprAgg.cpp#L1974.
 The memset can be optimized away later if not needed. 
Another place is compound literals for global values. It seems handled in 
CGM.GetAddrOfConstantCompoundLiteral().
To not introduce a big change at once, Shall I handle them in a follow up patch?

> Also, is adding constant padding fields retroactively really the best way to 
> handle this instead of just expecting the constant emitter to fill them in 
> itself?

Currently I limit the change to C and reuse code for trivial auto var zero 
init. So it's easier to do it outside the constant emitter. If the change can 
be extended to C++, I feel it's better to always do it in the constant emitter. 

> Comments in the compiler generally shouldn't refer to specific projects. We 
> are making a guarantee based on our interpretation of the standard.

The standard is ambiguous on this, so can't be used as a reason for the change. 
Please let me know if you have better idea about how to reword the comment.

> Similarly, the PR title is also inappropriate. It is fine to note in the PR 
> description that Linux relies on this for correctness, but the PR title 
> should describe the semantic change.

Changed the PR title back to the commit title, which describes the semantic 
change.

> Also, I think we should do this unconditionally in all language modes. It's 
> not like this is forbidden in C++.

I agree. But when I was trying to do it before, I need to fix a lot more tests. 
And the Linux kernel only has C code. I don't have strong reason to push this 
to C++. So I want to limit the change to C at first. If reviewers think we 
should also do it in C++, shall I do it in a follow up change?


https://github.com/llvm/llvm-project/pull/97121
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][CodeGen] Zero init unspecified fields in initializers in C (PR #97121)

Reply via email to