On Wed, 23 Sep 2020, Tom de Vries wrote: > On 9/23/20 9:28 AM, Richard Biener wrote: > > On Tue, 22 Sep 2020, Tom de Vries wrote: > > > >> [ was: Re: [Patch] [middle-end & nvptx] gcc/tracer.c: Don't split BB > >> with SIMT LANE [PR95654] ] > >> > >> On 9/16/20 8:20 PM, Alexander Monakov wrote: > >>> > >>> > >>> On Wed, 16 Sep 2020, Tom de Vries wrote: > >>> > >>>> [ cc-ing author omp support for nvptx. ] > >>> > >>> The issue looks familiar. I recognized it back in 2017 (and LLVM people > >>> recognized it too for their GPU targets). In an attempt to get agreement > >>> to fix the issue "properly" for GCC I found a similar issue that affects > >>> all targets, not just offloading, and filed it as PR 80053. > >>> > >>> (yes, there are no addressable labels involved in offloading, but > >>> nevertheless > >>> the nature of the middle-end issue is related) > >> > >> Hi Alexander, > >> > >> thanks for looking into this. > >> > >> Seeing that the attempt to fix things properly is stalled, for now I'm > >> proposing a point-fix, similar to the original patch proposed by Tobias. > >> > >> Richi, Jakub, OK for trunk? > > > > I notice that we call ignore_bb_p many times in tracer.c but one call > > is conveniently early in tail_duplicate (void): > > > > int n = count_insns (bb); > > if (!ignore_bb_p (bb)) > > blocks[bb->index] = heap.insert (-bb->count.to_frequency (cfun), > > bb); > > > > where count_insns already walks all stmts in the block. It would be > > nice to avoid repeatedly walking all stmts, maybe adjusting the above > > call is enough and/or count_insns can compute this and/or the ignore_bb_p > > result can be cached (optimize_bb_for_size_p might change though, > > but maybe all other ignore_bb_p calls effectively just are that, > > checks for blocks that became optimize_bb_for_size_p). > > > > This untested follow-up patch tries something in that direction. > > Is this what you meant?
Yeah, sort of. +static bool +cached_can_duplicate_bb_p (const_basic_block bb) +{ + if (can_duplicate_bb) is there any path where can_duplicate_bb would be NULL? + { + unsigned int size = SBITMAP_SIZE (can_duplicate_bb); + /* Assume added bb's should be ignored. */ + if ((unsigned int)bb->index < size + && bitmap_bit_p (can_duplicate_bb_computed, bb->index)) + return !bitmap_bit_p (can_duplicate_bb, bb->index); yes, newly added bbs should be ignored so, } - return false; + bool val = compute_can_duplicate_bb_p (bb); + if (can_duplicate_bb) + cache_can_duplicate_bb_p (bb, val); no need to compute & cache for them, just return true (because we did duplicate them)? Thanks, Richard. > Thanks, > - Tom > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imend