On 9/16/22 13:05, Jørgen Kvalsvik wrote: > Gentle ping. Any idea if the edge split is still useful and/or how to test > for it?
@Honza: Can you please reply here? Thanks, Martin > > Thanks, > Jørgen > > On 08/09/2022 12:30, Jørgen Kvalsvik wrote: >> On 02/09/2022 14:22, Richard Biener wrote: >>> On Fri, Sep 2, 2022 at 11:50 AM Jørgen Kvalsvik <j...@lambda.is> wrote: >>>> >>>> >>>> Hello, >>>> >>>> I played some more with odd programs and the effect on control flow >>>> graph construction (as a part of condition coverage support [1]) and >>>> came across this: >>>> >>>> int fn (int a, int b, int c) { >>>> int x = 0; >>>> if (a && b) { >>>> if (c) { >>>> goto a_; >>>> } else { >>>> x = a; >>>> } >>>> } else { >>>> a_: >>>> x = (a - b); >>>> } >>>> >>>> return x; >>>> } >>>> >>>> Run through gcov --conditions I get: >>>> >>>> 4: 5: if (a && b) { >>>> condition outcomes covered 2/2 >>>> condition outcomes covered 2/2 >>>> 2: 6: if (c) { >>>> condition outcomes covered 2/2 >>>> >>>> Which is clearly not correct. So I started digging into why and dump the >>>> CFG as the coverage profiling sees it https://i.imgur.com/d0q72rA.png >>>> [2]. I apologize for the labeling, but A2 = a, A3 = b, A5 = c and A9 the >>>> else block. The problem, which is what confuses the algorithm, is that a >>>> and b don't share A9 as a successor (on false) as I would expect. >>>> >>>> If I add some operation before the label the problem disappears and a >>>> and b share false-destination again https://i.imgur.com/PSrfaLC.png [3]. >>>> >>>> } else { >>>> x++; >>>> a_: >>>> x = (a - b); >>>> } >>>> >>>> 4: 5: if (a && b) { >>>> condition outcomes covered 4/4 >>>> 2: 6: if (c) { >>>> condition outcomes covered 2/2 >>>> >>>> >>>> When dumping the cfg in the former case with -fdump-tree-cfg-graph I get >>>> a CFG without the split destinations in a and b >>>> https://i.imgur.com/05MCjzp.png [3]. I would assume from this that the >>>> graph dump happens after _more_ CFG transformations than the branch >>>> profiling. >>>> >>>> So my questions are: >>>> >>>> 1. Is the control flow graph expected to be constructed as such where a >>>> and b don't share outcome, or is it to be considered a bug? >>>> 2. If yes, would it be problematic to push the branch coverage and >>>> condition profiling to a later stage where the cfg has been fixed? >>> >>> I would say you should only see more nodes merged. It's a bit hard to >>> follow >>> what you say with the namings - I usually run cc1 in gdb, breaking at >>> execute_build_cfg where you can do, after build_gimple_cfg finished >>> (and before cleanup_tree_cfg ()) do a 'dot-fn' in gdb which produces a nice >>> picture of the CFG and code with graphviz. >>> >>> It looks like I would have expected, in particular we do not force a >>> new basic-block to be generated for a_: after the D.1991: artificial >>> label we have for the else. That might be premature optimization >>> for your case (but the cleanup_tree_cfg () would immediately do >>> that as well). >>> >>> Richard. >> >> I did some more digging into this and have isolated the problem to edge >> splitting inside the branch_prob () function itself. >> >> gcc/profile.cc:1248 >> >> if (last >> && gimple_has_location (last) >> && !RESERVED_LOCATION_P (e->goto_locus) >> && !single_succ_p (bb) >> && (LOCATION_FILE (e->goto_locus) >> != LOCATION_FILE (gimple_location (last)) >> || (LOCATION_LINE (e->goto_locus) >> != LOCATION_LINE (gimple_location (last))))) >> { >> basic_block new_bb = split_edge (e); >> edge ne = single_succ_edge (new_bb); >> ne->goto_locus = e->goto_locus; >> } >> >> Based on the cleaned-up cfg that gcc dumps later it looks like this split >> only lives through the branch coverage/profiling phase (it may bleed >> slightly later but it shouldn't be of significance). >> >> Out of curiosity I removed the splitting altogether and no tests failed when >> running make check-gcc check-g++ RUNTESTFLAGS="gcov.exp". Either it was not >> covered by tests in the first place, or whatever behaviour this check is >> meant to fix is resolved elsewhere. I have to admit I don't really see a >> difference with/without this patch, but I don't know what to look for. >> >> The check was first introduced in 2005 by Jan (cc): >> >> commit d783b2a2dc91e1d2c1fea78cac2b6c6c73b3680d >> Author: Jan Hubicka <j...@suse.cz> >> Date: Thu Aug 4 00:10:54 2005 +0200 >> >> profile.c (branch_prob): Split edges with goto locus on them to get >> proper line counts. >> >> >> * profile.c (branch_prob): Split edges with goto locus on them >> to get proper line counts. >> * tree-cfg.c (make_cond_expr_edges): Record user goto locuses, >> if any. >> >> * gcov-1.C: Fix switch counts. >> * gcov-4b.c: Likewise. >> >> What stands out to me in the check is that it uses location-file and >> location-line to decide if to split the edge. I added a few prints to see >> when the file/line is set: >> >> 2 int goto1 (int a) { >> >> >> 3 if (a) >> 4 goto end; >> 5 >> 6 return 1; >> 7 end: >> 8 x += a; >> 9 return 0; >> 10 } >> >> if (a_5(D) != 0) >> >> >> edge (true) >> last goto2.c:3 >> goto (null):0 >> >> if (a_5(D) != 0) >> edge (false) >> last goto2.c:3 >> goto (null):0 >> >> // predicted unlikely by goto predictor. >> edge (fallthru) >> last goto2.c:4 >> goto goto2.c:4 >> >> The goto statement is the only with with a location for both the basic block >> and the edge. >> >> 12 int goto2 (int a) { >> >> >> 13 if (a) { goto end; } >> 14 else { label: a++; } >> 15 >> 16 return 1; >> 17 end: >> 18 x += a; >> 19 return 0; >> 20 } >> >> if (a_5(D) != 0) >> edge (true) >> last goto2.c:13 >> goto (null):0 >> >> if (a_5(D) != 0) >> edge (false) >> last goto2.c:13 >> goto goto2.c:14 >> >> // predicted unlikely by goto predictor. >> edge (fallthru) >> last goto2.c:13 >> goto goto2.c:13 >> >> Now the else block has two locations as well, with the edge label >> e->goto_locus being inside the else block. Note that this label is >> _unrelated_ to the edge jump a (false) -> else. >> >> Now a function without gotos: >> >> 22 int goto3 (int a, int b) { >> 23 if (a && b) { >> 24 x += a * b; >> 25 } else { >> 26 x -= 1; >> 27 } >> 28 return 0; >> 29 } >> >> if (a_7(D) != 0) >> edge (true) >> last goto2.c:23 >> goto (null):0 >> >> if (a_7(D) != 0) >> edge (false) >> last goto2.c:23 >> goto (null):0 >> >> if (b_8(D) != 0) >> edge (true) >> last goto2.c:23 >> goto (null):0 >> >> if (b_8(D) != 0) >> edge (false) >> last goto2.c:23 >> goto (null):0 >> >> x = _3; >> edge (fallthru) >> last goto2.c:24 >> goto goto2.c:24 >> >> x = _5; >> edge (fallthru) >> last goto2.c:26 >> goto (null):0 >> >> Now the checks if (a) and (b) don't have goto_locus on the edges. For >> completeness I included the implied jumps in the then/else blocks which _do_ >> have edge locus in the then case. >> >> Finally, the case that expose the problem for me: >> >> 31 int goto4 (int a, int b) { >> 32 if (a) { >> 33 if (b) goto elseblock; >> 34 else a++; >> 35 } else { >> 36 elseblock: >> 37 a--; >> 38 } >> 39 >> 40 return a; >> 41 } >> >> if (a_2(D) != 0) >> edge (true) >> last goto2.c:32 >> goto (null):0 >> >> if (a_2(D) != 0) >> edge (false) >> last goto2.c:32 >> goto goto2.c:36 >> >> if (b_3(D) != 0) >> edge (true) >> last goto2.c:33 >> goto (null):0 >> >> if (b_3(D) != 0) >> edge (false) >> last goto2.c:33 >> goto (null):0 >> >> Again, a (false) has a goto_locus because there is an unrelated label at the >> top of the else block. For completeness, it also applies happens when >> there's a label on top of then: >> >> 43 int goto5 (int a, int b) { >> 44 if (a) { >> 45 then: >> 46 a++; >> 47 } else { >> 48 a--; >> >> >> 49 } >> 50 >> 51 return a; >> 52 } >> >> if (a_2(D) != 0) >> edge (true) >> last goto2.c:44 >> goto goto2.c:45 >> >> if (a_2(D) != 0) >> edge (false) >> last goto2.c:44 >> goto (null):0 >> >> This causes the edge to split which probably isn't a problem for the branch >> coverage, but it is problematic for my condition coverage algorithm. So how >> do we solve this? >> >> 1. Is the edge splitting necessary? I didn't find a test that covers this >> (there might be one, any idea?). If the edge splitting is not necessary >> anymore then removing it should be fine for my coverage needs. >> 2. Assuming the edge split is necessary making a decision on source file + >> line seems vulnerable to source formatting: >> >> 54 int goto6 (int a) { >> 55 if (a) label: goto end; >> >> >> 56 return 0; >> 57 end: >> 58 return 1; >> 59 >> 60 } >> >> if (a_2(D) != 0) >> edge (true) >> last goto2.c:55 >> goto goto2.c:55 >> >> if (a_2(D) != 0) >> edge (false) >> last goto2.c:55 >> goto (null):0 >> >> // predicted unlikely by goto predictor. >> edge (fallthru) >> last goto2.c:55 >> goto goto2.c:55 >> >> Now unrelated gotos all have the same file/line signature. Since the edge to >> the goto_locus is unrelated to the label itself. >> 3. Assuming the test is fine and necessary I *could* work around the problem >> by recording the original edge somewhere (it is very important that all >> conditions in an expression share the same then/else basic blocks), or >> enough metadata to make a virtual edge, but that is a hack I hope I don't >> have to do. >> >> Anyway, the problem is not with the cfg construction itself, but an edge >> splitting that happens specifically for profiling. >> >> CC Martin, maybe you have any idea? >> >> Thanks, >> Jørgen. >