https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
--- Comment #16 from Jeffrey A. Law <jeffreyalaw at gmail dot com> ---
On 11/8/23 03:09, manolis.tsamis at vrull dot eu wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415
>
> --- Comment #15 from Manolis Tsamis <manolis.tsamis at vrull dot eu> ---
> (In reply to Sam James from comment #13)
>> Created attachment 56527 [details]
>> compile.c.323r.fold_mem_offsets.bad.xz
>>
>> Output from
>> ```
>> hppa2.0-unknown-linux-gnu-gcc -c -DNDEBUG -g -fwrapv -O3 -Wall -O2
>> -std=c11 -Werror=implicit-function-declaration -fvisibility=hidden
>> -I/home/sam/git/cpython/Include/internal -IObjects -IInclude -IPython -I.
>> -I/home/sam/git/cpython/Include -DPy_BUILD_CORE -o Python/compile.o
>> /home/sam/git/cpython/Python/compile.c -fdump-rtl-fold_mem_offsets-all
>> ```
>>
>> If I instrument certain functions in compile.c with no optimisation
>> attribute or build the file with -fno-fold-mem-offsets, Python works, so I'm
>> reasonably sure this is the relevant object.
>
> Thanks for the dump file! There are 66 folded/eliminated instructions in this
> object file; I did look at each case and there doesn't seem to be anything
> strange. In fact most of the transformations are straightforward:
>
> - All except a couple of cases don't involve any arithmetic, so it's just
> moving a constant around.
> - The majority of the transformations are 'trivial' and consist of a single
> add and then a memory operation: a sequence like X = Y + Const, R = MEM[X + 0]
> is folded to X = Y, R = MEM[X + Const]. I wonder why so many of these exist
> and are not optimized elsewhere.
> - There are some cases with negative offsets, but the calculations look
> correct.
> - There are a few more complicated cases, but I've done these on paper and
> they also look correct.

The PA port is "weird". Its addressing modes aren't a good match for GCC
(they're not symmetrical across loads vs stores and across fp vs integer)
and they have the implicit space register problem.
But I don't immediately recall needing to avoid propagation of constants
into memory references or anything like that.

I'd probably continue with the process of narrowing down what code is
affected using the attributes. We already know the file; narrowing it down
to a function might help considerably with the evaluation effort.

Note that QEMU has a functional PA port. So you might be able to just take
a root filesystem, add the tarball referenced earlier and play around to
narrow things down further.

I haven't done work on the PA in about 20 years at this point, but I can
probably still grok its code. Between David and myself I'm sure we can
help interpret what's going on.

Jeff
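
For anyone following along, the per-function narrowing mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `SUSPECT_NO_OPT`, `struct entry`, `sum_values` and the other names are made up for this example and do not come from compile.c. The only real feature used is GCC's `optimize` function attribute. Tagging a function with it compiles that one function at -O0, where RTL optimisation passes such as fold_mem_offsets are not expected to run on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro name, not part of CPython or GCC. Under GCC it expands
   to the per-function optimize attribute; elsewhere it expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__)
#define SUSPECT_NO_OPT __attribute__((optimize("O0")))
#else
#define SUSPECT_NO_OPT
#endif

/* Made-up stand-in for a structure read at a non-zero constant offset. */
struct entry {
    int pad[4];
    int value;   /* sits at a constant offset from the base pointer */
};

/* Suspect function: tagged so that GCC compiles it without optimisation. */
SUSPECT_NO_OPT
int sum_values(const struct entry *entries, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += entries[i].value;   /* load from base + constant offset */
    return total;
}

/* Control function: identical source, built at the command-line -O level. */
int sum_values_optimized(const struct entry *entries, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += entries[i].value;
    return total;
}

/* Both builds of the same source must agree if the compiler is correct. */
void check_agreement(const struct entry *entries, size_t count)
{
    assert(sum_values(entries, count) == sum_values_optimized(entries, count));
}
```

In practice one would tag half of the functions in the affected file, rebuild, rerun the failing test, and keep halving until a single function flips the result. On a compiler that has the option, `optimize("no-fold-mem-offsets")` might be a narrower tag than `optimize("O0")`, but that variant is an untested assumption here.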