> On 10.06.2025 at 22:18, Andrew MacLeod <amacl...@redhat.com> wrote:
>
> I had a question asked of me, and now I'm passing the buck.
>
> extern void *memcpy(void *, const void *, unsigned int);
> extern int memcmp(const void *, const void *, unsigned int);
> typedef unsigned long bits32;
> typedef unsigned char byte;
>
> static const byte orig[10] = {
>   'J', '2', 'O', 'Z', 'F', '5', '0', 'F', 'Y', 'L' };
>
> static byte test[10];
>
> int
> verify (void)
> {
>   return 0 == memcmp (test, orig, 10 * sizeof (orig[0]));
> }
>
> int
> benchmark (void)
> {
>   memcpy (test, orig, 10 * sizeof (orig[0]));
>   return 0;
> }
>
> Target is arm-none-eabi, and when compiled with -Os
>
> After the gimple lowering, the verify routine remains the same, but the
> benchmark () routine is transformed from a memcpy and becomes:
>
> ;; Function benchmark (benchmark, funcdef_no=1, decl_uid=4718, cgraph_uid=4, symbol_order=3)
>
> int benchmark ()
> {
>   int D.4726;
>
>   MEM <unsigned char[10]> [(char * {ref-all})&test] = MEM <unsigned char[10]> [(char * {ref-all})&orig];
>   D.4726 = 0;
>   goto <D.4727>;
>   <D.4727>:
>   return D.4726;
> }
>
> It appears that forwprop is then transforming the statement to
>
>   <bb 2> :
>   MEM <unsigned char[10]> [(char * {ref-all})&test] = "J2OZF50FYL";
>   return 0;
>
> And in the final output, there are now 2 copies of the original character
> data:
>
> orig:
>         .ascii  "J2OZF50FYL"
>         .space  2
> .LC0:
>         .ascii  "J2OZF50FYL"
>         .bss
>
> and I presume that new string is a copy of the orig text that forwprop has
> created for some reason.
>
> Whats going on, and is there a way to disable this? Either at the lowering
> stage or in forwprop? At -Os, they are not thrilled that a bunch more
> redundant text is being generated in the object file. This is a reduced
> testcase to demonstrate a much larger problem.

The hope is the static var can be elided and the read might be just a small part. In this case heuristics are misfiring I guess.
You’d have to track down where exactly in folding we are replacing the RHS of an aggregate copy. I can’t recall off the top of my head.
Richard

> I don't see this happening on my x86 box. the memcpy's are not lowered to
> MEMs there under any circumstances I can find.
>
> This is true for at least gcc13 through trunk. It was not true back in the
> heyday of gcc8.
>
> Andrew
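
For anyone who wants to poke at this, here is a minimal sketch of the quoted testcase in a form that also builds on a hosted compiler. It is an illustration, not the original file: it assumes <string.h> in place of the hand-written prototypes (so the size argument is size_t on any target), drops the unused bits32 typedef, and uses sizeof orig for the length.

```c
/* Variant of the reduced testcase from the quoted message.
   Assumption: <string.h> replaces the hand-written prototypes so the
   length parameter has type size_t regardless of the target. */
#include <string.h>

typedef unsigned char byte;

/* Constant source data: the array whose contents end up duplicated
   as .LC0 in the assembly shown in the quoted message. */
static const byte orig[10] = {
  'J', '2', 'O', 'Z', 'F', '5', '0', 'F', 'Y', 'L' };

/* Zero-initialized destination, so it differs from orig until
   benchmark () has run. */
static byte test[10];

/* Returns 1 when test matches orig, 0 otherwise. */
int
verify (void)
{
  return 0 == memcmp (test, orig, sizeof orig);
}

/* The aggregate copy that is lowered to a MEM = MEM assignment. */
int
benchmark (void)
{
  memcpy (test, orig, sizeof orig);
  return 0;
}
```

Compiling this with -Os -S -fdump-tree-all writes one dump file per tree pass, which should make it possible to see in which pass the RHS first becomes the string constant. One experiment that might be worth trying is -fno-builtin-memcpy, which makes GCC treat memcpy as an ordinary call rather than a builtin. I have not checked whether that is acceptable for the larger code base, or what it does to code size overall.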