On Tue, Jan 8, 2019 at 5:17 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Tue, Jan 8, 2019 at 6:54 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Tue, Jan 8, 2019 at 3:39 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Mon, Jan 7, 2019 at 11:12 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > >
> > > > On Mon, Jan 7, 2019 at 6:40 PM H.J. Lu <hongjiu...@intel.com> wrote:
> > > > >
> > > > > There is no need to generate vzeroupper if caller uses upper bits of
> > > > > > AVX/AVX512 registers.  We track caller's avx_u128_state and avoid
> > > > > vzeroupper when caller's avx_u128_state is AVX_U128_DIRTY.
> > > > >
> > > > > Tested on i686 and x86-64 with and without --with-arch=native.
> > > > >
> > > > > OK for trunk?
> > > >
> > > > In principle OK, but I think we don't have to cache the result of
> > > > ix86_avx_u128_mode_entry. Simply call the function from
> > > > ix86_avx_u128_mode_exit; it is a simple function, so I guess we can
> > > > afford to re-call it one more time per function.
> > >
> > > Do we really need ix86_avx_u128_mode_entry?  We can just
> > > set entry state to AVX_U128_CLEAN and set exit state to
> > > AVX_U128_DIRTY if caller returns an AVX/AVX512 register or passes
> > > AVX/AVX512 registers to callee.
> > >
> > > Does this patch look OK?
> >
> > No, the compiler is then free to move the optimal insertion point to the
> > beginning of the function.
> >
>
> Here is the updated patch.  OK for trunk?

OK with the comment fix.

Thanks,
Uros.

-  return AVX_U128_CLEAN;
+  /* Entry mode is set to AVX_U128_DIRTY if there are 256bit or 512bit

s/Entry/Exit/

+     modes used in function arguments.  */

... , otherwise return AVX_U128_CLEAN.

+  return ix86_avx_u128_mode_entry ();
 }
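
For readers following along, here is a rough standalone model of what the
hunk above appears to do once the comment fix is applied: the exit mode
falls back to the entry mode instead of unconditionally returning
AVX_U128_CLEAN.  This is only a sketch, not the GCC source.  The struct
fn_summary and the model_* function names are hypothetical, and the
return-register check is an assumption based on the discussion earlier in
the thread, since the hunk does not show the rest of the function.

```c
#include <stdbool.h>

/* Simplified stand-in for the two states discussed in the thread. */
enum avx_u128_state { AVX_U128_CLEAN, AVX_U128_DIRTY };

/* Hypothetical per-function summary; not a GCC data structure. */
struct fn_summary
{
  bool args_use_wide_regs;   /* some argument uses a 256bit or 512bit mode */
  bool return_uses_wide_reg; /* return value uses a 256bit or 512bit mode */
};

/* Model of the entry mode: DIRTY if 256bit or 512bit modes are used in
   function arguments, otherwise CLEAN.  */
static enum avx_u128_state
model_mode_entry (const struct fn_summary *fn)
{
  return fn->args_use_wide_regs ? AVX_U128_DIRTY : AVX_U128_CLEAN;
}

/* Model of the exit mode.  The return-register check is an assumption;
   the final fallback mirrors the "+ return ix86_avx_u128_mode_entry ();"
   line in the hunk, so no vzeroupper is requested at exit when the
   arguments already made the upper bits dirty.  */
static enum avx_u128_state
model_mode_exit (const struct fn_summary *fn)
{
  if (fn->return_uses_wide_reg)
    return AVX_U128_DIRTY;

  /* Exit mode is DIRTY if 256bit or 512bit modes are used in function
     arguments, otherwise CLEAN.  */
  return model_mode_entry (fn);
}
```

With this model, a function that takes a 256bit argument and returns void
gets a DIRTY exit mode, where the old "return AVX_U128_CLEAN;" would have
reported CLEAN.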
