https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113419
Bug ID: 113419
Summary: SRA should replace some aggregate copies by load/store
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

For example gcc.dg/tree-ssa/pr94969.c has

int a = 0, b = 0, c = 0;
struct S
{
  signed m : 7;
  signed e : 2;
};
struct S f[2] = {{0, 0}, {0, 0}};
struct S g = {0, 0};
void __attribute__((noinline))
k()
{
  for (; c <= 1; c++)
    {
      f[b] = g;
      f[b].e ^= 1;
    }
}

The aggregate copy f[b] = g isn't touched by SRA because it involves
global variables.  For locals we'd end up with something like

  <unnamed-signed:2> g$e;
  <unnamed-signed:7> g$m;
  struct S g;
  struct S a[2];

  <bb 2> :
  g$m_7 = 0;
  g$e_8 = 0;
  MEM[(struct S *)&a + 4B].m = g$m_7;
  MEM[(struct S *)&a + 4B].e = g$e_8;

i.e. bit-precision integers.  That might be good, especially if there
are field uses around.  When the global-variable variant is expanded
to RTL we see a simple SImode load and a SImode store.  That means we
should ideally treat aggregate copies like
memcpy (&dest, &src, sizeof (dest)) and then fold them that way.  But
we should let SRA have a chance at decomposing first, so this
shouldn't be done as part of general folding but instead by late SRA
for aggregate copies it didn't touch.

For the gcc.dg/tree-ssa/pr94969.c testcase this then allows GIMPLE
invariant motion to hoist the load from g; otherwise we rely on RTL
PRE for this, which is prone to PR113395.