Hi!

I think I've mentioned it earlier, but -ftrivial-auto-var-init= doesn't
work at all for C++.
With C++26 P2795R5 being voted in, if we were to default to say
-ftrivial-auto-var-init=zero for -std=c++26/-std=gnu++26, that would mean
the paper isn't really implemented.

A short testcase:
struct S { S () : a (42) {} int a, b, c; };
struct T : S { T () : d (43) {} int d, e, f; };
struct U : virtual S { U () : g (44) {} int g, h, i; };
struct V : virtual U, T { V () : j (45) {} int j, k, l; };
void bar (S &, T &, V &, int &);

int
foo ()
{
  S s;
  T t;
  V v;
  int i;
  int j;
  bar (s, t, v, i);
  return j;
}
-ftrivial-auto-var-init= adds a .DEFERRED_INIT ifn call for automatic
vars which don't have DECL_INITIAL, and .DEFERRED_INIT is turned into
clearing of the var (or pattern initialization) during RTL expansion,
but for -Wuninitialized it acts as if the var is uninitialized.
I guess it can kind of work in LLVM, which doesn't have -flifetime-dse=2,
but with -flifetime-dse=2 it doesn't really work at all.

Take a look at the testcase after inlining ctors but before DSE1:
  s = .DEFERRED_INIT (12, 2, &"s"[0]);
  s ={v} {CLOBBER(bob)};
  s.a = 42;
  t = .DEFERRED_INIT (24, 2, &"t"[0]);
  t ={v} {CLOBBER(bob)};
  MEM[(struct S *)&t] ={v} {CLOBBER(bob)};
  MEM[(struct S *)&t].a = 42;
  t.d = 43;
  v = .DEFERRED_INIT (80, 2, &"v"[0]);
  v ={v} {CLOBBER(bob)};
  MEM[(struct S *)&v + 68B] ={v} {CLOBBER(bob)};
  MEM[(struct S *)&v + 68B].a = 42;
  MEM[(struct U *)&v + 48B].g = 44;
  MEM[(struct T *)&v + 8B] ={v} {CLOBBER(bob)};
  MEM[(struct S *)&v + 8B] ={v} {CLOBBER(bob)};
  MEM[(struct S *)&v + 8B].a = 42;
  MEM[(struct T *)&v + 8B].d = 43;
  v._vptr.V = &MEM <int (*) ()[7]> [(void *)&_ZTV1V + 32B];
  MEM[(struct U *)&v + 48B]._vptr.U = &MEM <int (*) ()[7]> [(void *)&_ZTV1V + 56B];
  v.j = 45;
  _1 = .DEFERRED_INIT (4, 2, &"i"[0]);
  i = _1;
  j_10 = .DEFERRED_INIT (4, 2, &"j"[0]);
  bar (&s, &t, &v, &i);
Obviously, DSE1 will remove the .DEFERRED_INIT calls for s, t and v,
because they are followed by clobbers of the vars.
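
To make that reasoning concrete, here is a toy model in C++ of why those
calls die.  It is only a sketch: Stmt, toy_dse, example and count_inits are
made-up names for illustration and have nothing to do with GCC's actual DSE
pass.  It merely assumes that a clobber covering a whole earlier store, with
no read of those bytes in between, makes that store dead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy statement kinds, loosely mirroring the GIMPLE dump above.
enum class Kind { DeferredInit, Clobber, Store, Use };

struct Stmt {
    Kind kind;
    std::string var;     // variable the statement refers to
    std::size_t offset;  // start of the affected byte range
    std::size_t size;    // length of the affected byte range
};

// True if a's byte range lies entirely inside b's byte range.
static bool covered_by(const Stmt &a, const Stmt &b) {
    return a.var == b.var && b.offset <= a.offset &&
           a.offset + a.size <= b.offset + b.size;
}

// True if the two statements touch at least one common byte.
static bool overlaps(const Stmt &a, const Stmt &b) {
    return a.var == b.var && a.offset < b.offset + b.size &&
           b.offset < a.offset + a.size;
}

// Drop every DeferredInit that is fully clobbered later on before any
// Use of its bytes.  All other statements are kept unchanged.
std::vector<Stmt> toy_dse(const std::vector<Stmt> &in) {
    std::vector<Stmt> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        bool dead = false;
        if (in[i].kind == Kind::DeferredInit) {
            for (std::size_t j = i + 1; j < in.size(); ++j) {
                if (in[j].kind == Kind::Use && overlaps(in[j], in[i]))
                    break;  // the initial value is observed, keep it
                if (in[j].kind == Kind::Clobber && covered_by(in[i], in[j])) {
                    dead = true;  // discarded before anyone reads it
                    break;
                }
            }
        }
        if (!dead)
            out.push_back(in[i]);
    }
    return out;
}

// Hypothetical encoding of the statements for `s` and `i` from the dump.
std::vector<Stmt> example() {
    return {
        {Kind::DeferredInit, "s", 0, 12},  // s = .DEFERRED_INIT (12, ...)
        {Kind::Clobber, "s", 0, 12},       // s ={v} {CLOBBER(bob)}
        {Kind::Store, "s", 0, 4},          // s.a = 42
        {Kind::DeferredInit, "i", 0, 4},   // i = .DEFERRED_INIT (4, ...)
        {Kind::Use, "s", 0, 12},           // bar (&s, ...)
        {Kind::Use, "i", 0, 4},            // bar (..., &i)
    };
}

// Number of DeferredInit statements left for the given variable.
std::size_t count_inits(const std::vector<Stmt> &v, const std::string &var) {
    std::size_t n = 0;
    for (const Stmt &s : v)
        if (s.kind == Kind::DeferredInit && s.var == var)
            ++n;
    return n;
}
```

Running toy_dse (example ()) in this model drops the initialization of s,
which has a ctor and thus a bob clobber, and keeps the one for the plain
int i, which matches what the dump suggests will happen.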

Now, one possibility is to drop -flifetime-dse=2 for -std=c++26/-std=gnu++26,
but besides regressing optimizations it also means we won't be able to
diagnose some invalid code anymore, e.g. when the .DEFERRED_INIT isn't
visible but something that wasn't yet constructed is accessed and the
bob clobber would help diagnose it.
Another one is to emit the .DEFERRED_INIT calls also (or only, for
types with non-trivial ctors) in the ctors, right after the bob CLOBBERs,
and add some optimization which attempts to merge smaller
.DEFERRED_INIT calls into a larger one, dropping the subobject clobbers
if there are no accesses to the memory in between, so that with all ctors
inlined it could turn well-defined code like
  v = .DEFERRED_INIT (80, 2, &"v"[0]);
  v ={v} {CLOBBER(bob)};
  v = .DEFERRED_INIT (80, 2, 0);
  MEM[(struct S *)&v + 68B] ={v} {CLOBBER(bob)};
  MEM[(struct S *)&v + 68B] = .DEFERRED_INIT (12, 2, 0);
  MEM[(struct S *)&v + 68B].a = 42;
  MEM[(struct U *)&v + 48B].g = 44;
  MEM[(struct T *)&v + 8B] ={v} {CLOBBER(bob)};
  MEM[(struct T *)&v + 8B] = .DEFERRED_INIT (24, 2, 0);
  MEM[(struct S *)&v + 8B] ={v} {CLOBBER(bob)};
  MEM[(struct S *)&v + 8B] = .DEFERRED_INIT (12, 2, 0);
  MEM[(struct S *)&v + 8B].a = 42;
  MEM[(struct T *)&v + 8B].d = 43;
  v._vptr.V = &MEM <int (*) ()[7]> [(void *)&_ZTV1V + 32B];
  MEM[(struct U *)&v + 48B]._vptr.U = &MEM <int (*) ()[7]> [(void *)&_ZTV1V + 56B];
  v.j = 45;
into
  v ={v} {CLOBBER(bob)};
  v = .DEFERRED_INIT (80, 2, &"v"[0]);
  MEM[(struct S *)&v + 68B].a = 42;
  MEM[(struct U *)&v + 48B].g = 44;
  MEM[(struct S *)&v + 8B].a = 42;
  MEM[(struct T *)&v + 8B].d = 43;
  v._vptr.V = &MEM <int (*) ()[7]> [(void *)&_ZTV1V + 32B];
  MEM[(struct U *)&v + 48B]._vptr.U = &MEM <int (*) ()[7]> [(void *)&_ZTV1V + 56B];
  v.j = 45;

One problem with this is [[indeterminate]] vars: if .DEFERRED_INIT is
emitted in the ctors, the vars will be cleared even if they are
[[indeterminate]] (unless the ctors are inlined and some optimization
figures out that these vars are [[indeterminate]] and drops all
.DEFERRED_INIT calls for those and their subparts).  On the other side,
dropping -flifetime-dse=2 would mean we don't optimize even heap
allocations, even when those are UB and not erroneous behavior.
And another possibility would be to just change the behavior of bob CLOBBERs
if some option is enabled (either -ftrivial-auto-var-init= or whatever
is implied for -std=c++26/-std=gnu++26) and not treat those as completely
discarding previous content if the previous store to the same memory
is .DEFERRED_INIT (or some other bob CLOBBERs with .DEFERRED_INIT at the
end).

In the C++26 paper, padding bits are still considered indeterminate, so
the __builtin_clear_padding calls added during gimplification are
unnecessary unless users ask for -ftrivial-auto-var-init=
explicitly.  If even padding bits were erroneous, then is_var_need_auto_init
ignores empty types (which would be wrong for non-zero size empty types),
and cases like x86 long double with padding bits inside of it are handled
incorrectly by the current -ftrivial-auto-var-init= code - unless those
vars are address taken, they are in SSA form and so their .DEFERRED_INIT
is irrelevant, but what matters is when they are stored into memory,
which is where HW just writes 10 bytes and not the 12 or 16.
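
As a rough illustration of the two cases above, here is a small C++ sketch.
The helper names and the Empty type are made up for this mail, and the
10 byte figure is an assumption that only holds for the x87 extended format,
which the code guesses from the 64-digit mantissa; for any other format it
simply assumes there is no padding.

```cpp
#include <cstddef>
#include <limits>

// Bytes carrying value bits in a long double.  Assumes the x87 extended
// format (10 bytes) when the mantissa has 64 digits; otherwise assumes
// the whole object is value bits, which may not hold on every target.
constexpr std::size_t long_double_value_bytes() {
    return std::numeric_limits<long double>::digits == 64
               ? 10
               : sizeof(long double);
}

// Bytes inside a long double object that a plain store would leave alone
// under the assumption above (2 for a 12 byte object, 6 for a 16 byte one).
constexpr std::size_t long_double_padding_bytes() {
    return sizeof(long double) - long_double_value_bytes();
}

// An empty class still occupies storage as a complete object, so a
// variable of this type has at least one byte that nothing initializes.
struct Empty {};
static_assert(sizeof(Empty) >= 1, "complete objects have non-zero size");
```

On targets where long_double_padding_bytes () is non-zero, those are the
bytes that an SSA-form long double never gets a chance to have cleared.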

        Jakub
