https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68523
Bug ID: 68523 Summary: CFG Expansion Computes Incorrect Block Frequencies for Nested Loops Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kelvin at gcc dot gnu.org Target Milestone: --- With a debug version of gcc, compile the following program with the command-line options: -S -O3 -fno-tree-vectorize -da void foo(double *d, unsigned long int n) { unsigned long int i, j; for (j = 0; j < 10000002; j++) { for (i=0;i<n;i++) { d[i*2] = 0.0; } } } Note that the outer-loop iteration count is a compile-time constant and the inner-nested loop's iteration count is determined at run time. The dump file with .expand file name extension that is produced by the gcc compiler represents the inner-nested loop as block B5, and the outer-nested loop as blocks B7, B5, and B6. As reported by the dump file, the complete function, following control-flow graph expansion, is represented by the following control-flow graph: ENTRY:9 -> B2(100%) B2:9 -> B4(100%) B4:9 -> B7(100%) B7:900 -> B5(100%) B5:9100 -> B5(91%); B6(9%) B6:900 -> B7(99%); B9(1%) B9:9 -> EXIT(100%) EXIT:9 There are several inconsistencies in the reported frequencies. Note that the predecessors of B5 have combined frequency 900 + 91% of 9100 = 900 + 8281 = 9181 which does not equal the frequency of B5 (9100). Also, note that predecessors of B6 have combined frequency 9% of 9100 = 819, which does not equal 900, the frequency of B6. To correct the error, block B5 should have frequency 10,000. The value 10,000 can be computed by dividing 900, the combined frequency of edges entering this run-time bounded loop, by 0.09. Dividing by this magic number 0.09 corresponds to the heuristic that each conditional exit from a loop whose iteration bounds are determined at run-time has probability 9% of exiting the loop and 91% of remaining inside the loop.