On Wed, Jun 12, 2019 at 11:00:43AM +0200, Richard Biener wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux and furthermore tested
> > with x86_64-linux to nvptx-none offloading with the nvptx.c PR90811 fix
> > reverted to verify the overaligned variables are gone. Ok for trunk?
>
> Isn't there hunk missing that actually passes false?
It is that
+ add_stack_var (origvar, really_expand);
expand_one_var is called 3 times:
expand_one_var (t, toplevel, true);
size += expand_one_var (var, true, false);
expand_one_var (var, true, true);
where the second one is the estimated_stack_frame_size call where it passes
down false as really_expand. And the patch passes it down through to
align_local_variable.
> I guess only CCPs bit-value/alignment tracking sees the difference.
>
> Note we are already aligning variables in stor-layout.c with target
> information so in some way this isn't a real fix. Offloading as
> implemented right now really leaks target dependent decisions from
> the target to the offload target.
I was thinking about only honoring DECL_ALIGN for non-FIELD_DECLs when
streaming in ACCEL_COMPILER only if DECL_USER_ALIGN, the risk is if
we bump DECL_ALIGN somewhere pre-IPA on the host side and some optimization
actually optimizes code using that DECL_ALIGN, later on we stream offloading
out and in, downgrade DECL_ALIGN and suddenly the optimization was invalid.
> Not opposed to the change though a comment in the align_local_variable
> hunk as to why we only adjust DECL_ALIGN late would be appreciated.
Sure, that can be done:
> > --- gcc/cfgexpand.c.jj 2019-05-20 11:39:34.059816432 +0200
> > +++ gcc/cfgexpand.c 2019-06-11 11:28:08.720932421 +0200
> > @@ -361,7 +361,7 @@ static bool has_short_buffer;
> > we can't do with expected alignment of the stack boundary. */
> >
> > static unsigned int
> > -align_local_variable (tree decl)
> > +align_local_variable (tree decl, bool really_expand)
> > {
> > unsigned int align;
> >
> > @@ -370,7 +370,8 @@ align_local_variable (tree decl)
> > else
> > {
> > align = LOCAL_DECL_ALIGNMENT (decl);
> > - SET_DECL_ALIGN (decl, align);
+ /* Don't change DECL_ALIGN when called from estimated_stack_frame_size.
+ That is done before IPA and could bump alignment based on host
+ backend even for offloaded code which wants different
+ LOCAL_DECL_ALIGNMENT. */
> > + if (really_expand)
> > + SET_DECL_ALIGN (decl, align);
> > }
> > return align / BITS_PER_UNIT;
> > }
Jakub