[clang] [clang] Better bitfield access units (PR #65742)

John McCall via cfe-commits Mon, 08 Jan 2024 16:40:59 -0800

================
@@ -442,79 +455,235 @@ 
CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
     return;
   }
 
-  // Check if OffsetInRecord (the size in bits of the current run) is better
-  // as a single field run. When OffsetInRecord has legal integer width, and
-  // its bitfield offset is naturally aligned, it is better to make the
-  // bitfield a separate storage component so as it can be accessed directly
-  // with lower cost.
-  auto IsBetterAsSingleFieldRun = [&](uint64_t OffsetInRecord,
-                                      uint64_t StartBitOffset) {
-    if (!Types.getCodeGenOpts().FineGrainedBitfieldAccesses)
-      return false;
-    if (OffsetInRecord < 8 || !llvm::isPowerOf2_64(OffsetInRecord) ||
-        !DataLayout.fitsInLegalInteger(OffsetInRecord))
-      return false;
-    // Make sure StartBitOffset is naturally aligned if it is treated as an
-    // IType integer.
-    if (StartBitOffset %
-            Context.toBits(getAlignment(getIntNType(OffsetInRecord))) !=
-        0)
-      return false;
-    return true;
+  // The SysV ABI can overlap bitfield storage units with both other bitfield
+  // storage units /and/ other non-bitfield data members. Such overlap, in the
+  // absence of packing, is always complete -- one storage unit is entirely
+  // within another. However, llvm cannot represent that -- it's structures are
+  // entirely flat. We place bitfields in 'access units', which are similar to
+  // the SysV storage units, but a clang-specific concept.
----------------
rjmccall wrote:


Hmm.  I'm not really sure this is right — it seems to be contradicted by 
`struct { char array[5]; int bitfield: 24; };` — but I don't think it matters 
anyway.

If I had to explain the philosophy here, I would say something like:

> We split runs of bit-fields into a sequence of "access units".  When we emit 
> a load or store of a bit-field, we'll load/store the entire containing access 
> unit.  The standard requires that these loads and stores must not interfere 
> with accesses to other memory locations, and it defines the bit-field's 
> memory location as the current run of non-zero-width bit-fields.  So an 
> access unit must never overlap with non-bit-field storage or cross a 
> zero-width bit-field.  Otherwise, we're free to draw the lines as we see fit.
>
> Drawing these lines well can be complicated. LLVM generally can't modify a 
> program to access memory that it didn't before, so using very narrow access 
> units can prevent the compiler from using optimal access patterns.  For 
> example, suppose a run of bit-fields occupies four bytes in a struct.  If we 
> split that into four 1-byte access units, then a sequence of assignments that 
> doesn't touch all four bytes may have to be emitted with multiple 8-bit 
> stores instead of a single 32-bit store. On the other hand, if we use very 
> wide access units, we may find ourselves emitting accesses to bit-fields we 
> didn't really need to touch, just because LLVM was unable to clean up after 
> us.

And then you can get into explaining your algorithm.

https://github.com/llvm/llvm-project/pull/65742
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] Better bitfield access units (PR #65742)

Reply via email to