================
@@ -546,6 +546,13 @@ class TargetTransformInfo {
   /// optimize away.
   LLVM_ABI unsigned getFlatAddressSpace() const;
 
+  /// Return the maximum size in bytes of a homogeneous struct that SROA should
+  /// canonicalize to a vector type. This enables better optimization of
+  /// tightly-packed structs on targets where scratch memory is expensive.
+  ///
+  /// \returns 0 to disable the transformation, or the maximum struct size.
----------------
yxsamliu wrote:

Thanks for flagging this. I looked into the delta-rs regression and found the 
root cause. When SROA splits a heterogeneous struct like { ptr, i64, i64, i64 
}, the sub-partition at [16,32) gets a synthetic type { i64, i64 } from 
getTypePartition, and tryCanonicalizeStructToVector was converting it to <2 x 
i64> even though the partition was in the non-promotable fallback path. That 
vector type then propagated through memcpy splits to other allocas, adding 
insertelement/extractelement overhead and changing function cost profiles 
enough to affect inlining decisions, which is where most of the +10 lines came 
from. I've pushed a fix that restricts the conversion so that it fires only 
when the partition spans the full alloca and the alloca is actually involved in 
phi/select patterns or has non-splittable typed uses. I've also added a lit 
test to cover this case and re-triggered the run at 
https://github.com/dtcxzyw/llvm-opt-benchmark/issues/1312
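
For readers following along, the guard described above can be sketched roughly as 
follows. This is a standalone illustration, not the code in the patch: 
`PartitionInfo`, `AllocaInfo`, `coversWholeAlloca`, and 
`shouldCanonicalizeStructToVector` are hypothetical names, the two boolean flags 
stand in for whatever use analysis SROA actually performs, and the size check 
against the target hook's return value is an assumption based on its doc comment 
(0 disables the transformation).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for SROA's internal state; not LLVM's real types.
struct PartitionInfo {
  uint64_t BeginOffset; // inclusive byte offset into the alloca
  uint64_t EndOffset;   // exclusive byte offset into the alloca
};

struct AllocaInfo {
  uint64_t SizeInBytes;          // allocated size of the alloca
  bool UsedInPhiOrSelect;        // assumed result of a phi/select use scan
  bool HasNonSplittableTypedUse; // assumed result of a typed-use scan
};

// True when the partition [Begin, End) covers the entire alloca.
bool coversWholeAlloca(const PartitionInfo &P, const AllocaInfo &A) {
  assert(P.BeginOffset <= P.EndOffset && "malformed partition");
  return P.BeginOffset == 0 && P.EndOffset == A.SizeInBytes;
}

// Sketch of the restricted condition: convert a homogeneous struct to a
// vector only for full-alloca partitions that are expected to benefit.
// MaxStructSize models the target hook's return value (0 means disabled).
bool shouldCanonicalizeStructToVector(const PartitionInfo &P,
                                      const AllocaInfo &A,
                                      uint64_t MaxStructSize) {
  if (MaxStructSize == 0)
    return false; // target opted out
  if (!coversWholeAlloca(P, A))
    return false; // sub-partition such as [16,32) of { ptr, i64, i64, i64 }
  if (A.SizeInBytes > MaxStructSize)
    return false; // assumed size limit from the hook
  return A.UsedInPhiOrSelect || A.HasNonSplittableTypedUse;
}
```

Under these assumptions, the [16,32) sub-partition of a 32-byte 
{ ptr, i64, i64, i64 } alloca is rejected because it does not start at offset 0, 
so its synthetic { i64, i64 } type is never turned into <2 x i64>.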

https://github.com/llvm/llvm-project/pull/165159
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
