https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120941

--- Comment #38 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 30 Jul 2025, hjl.tools at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120941
> 
> --- Comment #37 from H.J. Lu <hjl.tools at gmail dot com> ---
> (In reply to Richard Biener from comment #35)
> > (In reply to H.J. Lu from comment #33)
> > > Created attachment 61995 [details]
> > > An updated patch
> > > 
> > > Please try this.
> > 
> > Looking at the patch I do wonder about
> > 
> > static void 
> > ix86_place_single_vector_set (rtx dest, rtx src, bitmap bbs,
> >                               rtx inner_scalar = nullptr)
> > {                        
> >   basic_block bb = nearest_common_dominator_for_set (CDI_DOMINATORS, bbs);
> >   while (bb->loop_father->latch              
> >          != EXIT_BLOCK_PTR_FOR_FN (cfun))
> >     bb = get_immediate_dominator (CDI_DOMINATORS,
> >                                   bb->loop_father->header);
> > 
> > when the nearest common dominator is a BB in a loop nest like
> > 
> >  loop {
> >    loop {
> >    }
> > 
> >    loop {
> >       BB;
> >    }
> >    BB';
> >  }
> > 
> > this will skip an arbitrary number of earlier sibling loops.  I think
> > if we want to do such additional hoisting at all - for a splat of a
> > non-constant we have to ensure the set of the source we splat is still
> > dominating the insertion point (where's that done?) - it IMO only
> > makes sense (without extra costing) to hoist the set out of a perfect
> > nest, thus never across earlier sibling loops.  Even for BB' this is
> > likely problematic.
> 
> Since my patch works, I'd like to keep it as is.  Will it work for you?

Sure.  Where is it ensured the splat isn't inserted before it is set?

Reply via email to