Richard Sandiford <[email protected]> writes: > Alexandre Oliva <[email protected]> writes: >> On Nov 25, 2011, Jakub Jelinek <[email protected]> wrote: >>> The numbers I got with your patch (RTL checking) are below, seems >>> the cumulative numbers other than 100% are all bigger with patched stage2, >>> which means unfortunately debug info quality degradation. >> >> Not really. I found some actual degradation after finally getting back >> to it. In some cases, I failed to reset NO_LOC_P, and this caused >> expressions that depended on it to resort to alternate values or end up >> unset. In other cases, we created different cselib values for debug >> temps and implicit ptrs, and merging them at dataflow confluences no >> longer found a common value because the common value was in cselib's >> static equivalence table. I've fixed (and added an assertion to catch) >> left-over NO_LOC_Ps, and arranged for values created for debug exprs, >> implicit ptr, entry values and parameter refs to be preserved across >> basic blocks as constants within cselib. >> >> With that, the debug info we get is a strict improvement in terms of >> coverage, even though a bunch of .o files still display a decrease in >> 100% coverage. In the handful files I examined, the patched compiler >> was emitting a loc list without full coverage, while the original >> compiler was emitting a single loc expr, that implicitly got full >> coverage even though AFAICT it should really cover a narrower range. >> Full coverage was a false positive, and less-than-100% coverage in these >> cases is not a degradation, but rather an improvement. >> >> Now, the reason why we emit additional expressions now is that the new >> algorithm is more prone to emitting different (and better) expressions >> when entering basic block, because we don't try as hard as before to >> keep on with the same location expression. Instead we recompute all the >> potentially-changed expressions, which will tend to select better >> expressions if available. >> >>> Otherwise the patch looks good to me. >> >> Thanks. After the updated comparison data below, you can find the patch >> I'm checking in, followed by the small interdiff from the previous >> patch. >> >> >> Happy GNU Year! :-) >> >> >> The results below can be reproduced with r182723. >> >> stage1 sources are patched, stage2 and stage3 aren't, so >> stage2 is built with a patched compiler, stage3 isn't. >> >> $ wc -l obj-{x86_64,i686}-linux-gnu/stage[23]-gcc/cc1plus.ev >> 100784 obj-x86_64-linux-gnu/stage2-gcc/cc1plus.ev >> 102406 obj-x86_64-linux-gnu/stage3-gcc/cc1plus.ev >> 33275 obj-i686-linux-gnu/stage2-gcc/cc1plus.ev >> 33944 obj-i686-linux-gnu/stage3-gcc/cc1plus.ev >> >> $ wc -l obj-{x86_64,i686}-linux-gnu/stage[23]-gcc/cc1plus.csv >> 523647 obj-x86_64-linux-gnu/stage2-gcc/cc1plus.csv >> 523536 obj-x86_64-linux-gnu/stage3-gcc/cc1plus.csv >> 521276 obj-i686-linux-gnu/stage2-gcc/cc1plus.csv >> 521907 obj-i686-linux-gnu/stage3-gcc/cc1plus.csv >> >> $ diff -yW80 obj-x86_64-linux-gnu/stage[23]-gcc/cc1plus.ls >> cov% samples cumul cov% samples cumul >> 0.0 150949/30% 150949/30% | 0.0 150980/30% 150980/30% >> 0..5 6234/1% 157183/31% | 0..5 6254/1% 157234/31% >> 6..10 5630/1% 162813/32% | 6..10 5641/1% 162875/32% >> 11..15 4675/0% 167488/33% | 11..15 4703/0% 167578/33% >> 16..20 5041/1% 172529/34% | 16..20 5044/1% 172622/34% >> 21..25 5435/1% 177964/35% | 21..25 5466/1% 178088/35% >> 26..30 4249/0% 182213/36% | 26..30 4269/0% 182357/36% >> 31..35 4666/0% 186879/37% | 31..35 4674/0% 187031/37% >> 36..40 6939/1% 193818/38% | 36..40 6982/1% 194013/38% >> 41..45 7824/1% 201642/40% | 41..45 7859/1% 201872/40% >> 46..50 8538/1% 210180/42% | 46..50 8536/1% 210408/42% >> 51..55 7585/1% 217765/43% | 51..55 7611/1% 218019/43% >> 56..60 6088/1% 223853/44% | 56..60 6108/1% 224127/44% >> 61..65 5545/1% 229398/45% | 61..65 5574/1% 229701/46% >> 66..70 7151/1% 236549/47% | 66..70 7195/1% 236896/47% >> 71..75 8068/1% 244617/49% | 71..75 8104/1% 245000/49% >> 76..80 18852/3% 263469/52% | 76..80 18879/3% 263879/52% >> 81..85 11958/2% 275427/55% | 81..85 11954/2% 275833/55% >> 86..90 15201/3% 290628/58% | 86..90 15145/3% 290978/58% >> 91..95 16814/3% 307442/61% | 91..95 16727/3% 307705/61% >> 96..99 17121/3% 324563/65% | 96..99 16991/3% 324696/65% >> 100 174515/34% 499078/100% | 100 173994/34% 498690/100% >> >> $ diff -yW80 obj-i686-linux-gnu/stage[23]-gcc/cc1plus.ls >> cov% samples cumul cov% samples cumul >> 0.0 145453/27% 145453/27% | 0.0 145480/27% 145480/27% >> 0..5 6594/1% 152047/29% | 0..5 6603/1% 152083/29% >> 6..10 5664/1% 157711/30% | 6..10 5671/1% 157754/30% >> 11..15 4982/0% 162693/31% | 11..15 4997/0% 162751/31% >> 16..20 6155/1% 168848/32% | 16..20 6169/1% 168920/32% >> 21..25 5038/0% 173886/33% | 21..25 5057/0% 173977/33% >> 26..30 4925/0% 178811/34% | 26..30 4918/0% 178895/34% >> 31..35 4359/0% 183170/35% | 31..35 4372/0% 183267/35% >> 36..40 6977/1% 190147/36% | 36..40 6972/1% 190239/36% >> 41..45 8138/1% 198285/38% | 41..45 8148/1% 198387/38% >> 46..50 8538/1% 206823/39% | 46..50 8538/1% 206925/39% >> 51..55 5607/1% 212430/40% | 51..55 5610/1% 212535/40% >> 56..60 6629/1% 219059/41% | 56..60 6629/1% 219164/42% >> 61..65 5232/1% 224291/42% | 61..65 5242/1% 224406/43% >> 66..70 8827/1% 233118/44% | 66..70 8824/1% 233230/44% >> 71..75 6240/1% 239358/45% | 71..75 6245/1% 239475/45% >> 76..80 8573/1% 247931/47% | 76..80 8577/1% 248052/47% >> 81..85 8235/1% 256166/49% | 81..85 8236/1% 256288/49% >> 86..90 13385/2% 269551/51% | 86..90 13365/2% 269653/51% >> 91..95 21427/4% 290978/55% | 91..95 21397/4% 291050/55% >> 96..99 20791/3% 311769/59% | 96..99 20739/3% 311789/59% >> 100 209906/40% 521675/100% | 100 209781/40% 521570/100% >> >> >> for gcc/ChangeLog >> from Alexandre Oliva <[email protected]> >> >> * cselib.h (cselib_add_permanent_equiv): Declare. >> (canonical_cselib_val): New. >> * cselib.c (new_elt_loc_list): Rework to support value >> equivalences. Adjust all callers. >> (preserve_only_constants): Retain value equivalences. >> (references_value_p): Retain preserved values. >> (rtx_equal_for_cselib_1): Handle value equivalences. >> (cselib_invalidate_regno): Use canonical value. >> (cselib_add_permanent_equiv): New. >> * alias.c (find_base_term): Reset locs lists while recursing. >> * var-tracking.c (val_bind): New. Don't add equivalences >> present in cselib table, compared with code moved from... >> (val_store): ... here. >> (val_resolve): Use val_bind. >> (VAL_EXPR_HAS_REVERSE): Drop. >> (add_uses): Do not create MOps for addresses. Do not mark >> non-REG non-MEM expressions as requiring resolution. >> (reverse_op): Record reverse as a cselib equivalence. >> (add_stores): Use it. Do not create MOps for addresses. >> Do not require resolution for non-REG non-MEM expressions. >> Simplify support for reverse operations. >> (compute_bb_dataflow): Drop reverse support. >> (emit_notes_in_bb): Likewise. >> (create_entry_value): Rename to... >> (record_entry_value): ... this. Use cselib equivalences. >> (vt_add_function_parameter): Adjust. > > This has significantly increased the compile time of 20001226-1.c -O3 -g > on mips64-linux-gnu: > > time for -O3: > real 0m25.656s > user 0m25.106s > sys 0m0.544s > > time for -O3 -g with this patch and the follow-ons reverted: > real 0m28.212s > user 0m27.586s > sys 0m0.624s > > time for -O3 -g with today's trunk: > CTRL-Ced after (sorry, I got bored): > real 15m11.725s > user 15m11.177s > sys 0m0.496s > > I realise that 20001226-1.c isn't exactly the most representative > testcase in the world, but it does look like a real problem. > > The testcase has a lot of identical "load high, add low" sequences. > Even though there are only two values here (the high part and > the result), we end up with thousands of value rtxs and thousands > of distinct "reverse" locations attached to the "load high" value. > > One source of inefficiency seems to be that if an as-yet-unseen value > is passed to cselib_add_permanent_equiv, we create a new value for it, > then make the previous value (ELT) equivalent to it. AIUI, ELT is still > supposed to be the "canonical" representation of the value in this > situation, so most (but presumably not all, given the bug) instances > of the new value will be replaced by the old. > > Which led to the patch below. However, it triggers an infinite loop > in alias.c:memrefs_conflict_p when compiling libgcov.c, because we end > up recording that > > value V1 = value V2 + OFFSET > value V2 = value V1 - OFFSET > > and the function cycles endlessly between V1 and V2. > > And this is what makes me a bit nervous about the original patch. > AIUI, cselib values were always supposed to form a dag, whereas > the patch above is deliberately recording cycles. It seems a bit > dangerous to change something like that at the end of stage 3. > I couldn't convince myself that, without this patch, all these > cycles were harmless, especially given the increased use of > canonical_cselib_val in the follow-on patches. > > The follow-on patches also made me realise that I don't understand > when cselib is supposed to use "canonical" values and when it should > use (or can make do with) the non-canonical ones.
Now filed as PR 52001 to make sure it doesn't get lost. Richard
