> On 29 May 2025, at 5:58 pm, Jan Hubicka <hubi...@ucw.cz> wrote:
>
>> Hi,
>> autofdo tests are now running only for x86. This patch makes it
>> run for aarch64 too. Verified that perf and create_gcov are running
>> as expected.
>>
>> gcc/ChangeLog:
>>
>> * config/aarch64/gcc-auto-profile: Make script executable.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * lib/target-supports.exp: Enable autofdo tests for aarch64.
>>
>> Is this OK?
> OK.
> What is your set of failures?
> I now get on AMD
>
> FAIL: gcc.dg/tree-prof/indir-call-prof-2.c scan-ipa-dump afdo "Inlining add1/1 into main/4."
> FAIL: gcc.dg/tree-prof/indir-call-prof-2.c scan-ipa-dump afdo "Inlining sub1/2 into main/4."
> FAIL: gcc.dg/tree-prof/indir-call-prof.c scan-tree-dump-not optimized "Invalid sum"
> FAIL: gcc.dg/tree-prof/inliner-1.c scan-tree-dump optimized "cold_function ..;"
> FAIL: gcc.dg/tree-prof/peel-1.c scan-tree-dump cunroll "Peeled loop ., 1 times"
> FAIL: gcc.dg/tree-prof/peel-2.c scan-tree-dump cunroll "Peeled loop 2, 1 times"
>
> and on Intel
>
> ./testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/indir-call-prof-2.c scan-ipa-dump afdo "Inlining add1/1 into main/4."
> ./testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/indir-call-prof-2.c scan-ipa-dump afdo "Inlining sub1/2 into main/4."
> ./testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/inliner-1.c scan-tree-dump optimized "cold_function ..;"
> ./testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/peel-1.c scan-tree-dump cunroll "Peeled loop ., 1 times"
> ./testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/peel-2.c scan-tree-dump ch2 "Peeled all exits: decreased number of iterations of loop 2"
> ./testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/peel-2.c scan-tree-dump ch2 "Peeled likely exits: likely decreased number of iterations of loop 1"
> ./testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/peel-2.c scan-tree-dump cunroll "Peeled loop 2, 1 times"
>
> (i.e. some of the failures are gone after fixing the autofdo 0 issues)
>
> The peeling tests have a loop with a low iteration count which is not
> visible to the inliner, and they test that profile feedback determines it.
> I do not see how auto-FDO (at least in its current form) can do this
> reliably. Even if we measure taken branches, their count will differ,
> e.g. with unrolling or vectorization. So I think we can just disable
> those tests for AFDO. Not sure what happens with indir call and inliner yet.
>
> The difference there is that Intel produces more events than AMD (which
> is probably due to a different default sampling count).

I am seeing:

./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/indir-call-prof-2.c scan-ipa-dump afdo "Inlining add1/1 into main/4."
./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/indir-call-prof-2.c scan-ipa-dump afdo "Inlining sub1/2 into main/4."
./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/inliner-1.c scan-tree-dump optimized "cold_function ..;"
./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/peel-1.c scan-tree-dump cunroll "Peeled loop ., 1 times"
./gcc/testsuite/gcc/gcc.sum:FAIL: gcc.dg/tree-prof/peel-2.c scan-tree-dump cunroll "Peeled loop 2, 1 times"

I also noticed that some tests are only enabled for x86. I am also seeing:

./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/pr66295.c
./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/split-1.c
./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-10.c
./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-7.c
./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/pr66295.c
./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/split-1.c
./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-10.c
./gcc/testsuite/gcc/gcc.sum:UNSUPPORTED: gcc.dg/tree-prof/val-prof-7.c

Thanks,
Kugan

> Honza
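
For comparing the failure sets between machines (AMD, Intel, aarch64), a small helper along these lines might be handy. It is only a sketch: the names `failures` and `compare` are hypothetical, and it assumes each result line contains `FAIL: ` followed by the test name, optionally preceded by a grep-style `path:` prefix as in the lists above.

```python
def failures(sum_text):
    """Return the set of FAIL entries found in gcc.sum-style text.

    Accepts both bare "FAIL: ..." lines and grep output of the form
    "./path/gcc.sum:FAIL: ..." (assumed format, see lead-in).
    """
    marker = "FAIL: "
    found = set()
    for line in sum_text.splitlines():
        line = line.strip()
        # Drop an optional "file:" prefix left behind by grep.
        if not line.startswith(marker) and ":" + marker in line:
            line = line.split(":" + marker, 1)[1]
            found.add(line)
        elif line.startswith(marker):
            found.add(line[len(marker):])
    return found


def compare(a_text, b_text):
    """Split failures into (only in a, only in b, common to both)."""
    a, b = failures(a_text), failures(b_text)
    return sorted(a - b), sorted(b - a), sorted(a & b)
```

Feeding it the contents of two `gcc.sum` files (or two grep outputs) gives the failures specific to each machine and the ones they share. Lines such as `XFAIL:` or `UNSUPPORTED:` are ignored.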