https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64896

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 34685
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34685&action=edit
gcc5-pr64896.patch

I think if !aggregate_value_p, we really should be using a temporary var rather
than RESULT_DECL.

That said, if it doesn't generate optimal code, we should optimize it, but that
wouldn't be IPA's or expand_thunk's task; such a problem would affect all similar
user-written code.  But on this exact testcase, with the patch I get assembly
identical to -fno-ipa-icf.  The
        movl    $0, -24(%rsp)
        movl    $0, -20(%rsp)
        xorl    %edx, %edx
        movq    -24(%rsp), %rax
is of course not optimal (xorl %eax, %eax; xorl %edx, %edx would do too), but
improving this is a matter for some RTL or late GIMPLE optimization.
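
For reference, a minimal C sketch of the kind of function that goes through
this return path.  This is not the PR testcase; struct S and zero_s are
hypothetical names, and the sketch assumes the x86_64 SysV ABI, where a 12-byte
struct of ints is returned in %rax:%rdx rather than in memory:

```c
/* Hypothetical example, not the PR testcase: a small struct that is
   returned in registers (%rax:%rdx on x86_64 SysV), i.e. a case where
   aggregate_value_p would be false.  */
struct S
{
  int a, b, c;
};

struct S
zero_s (void)
{
  /* A local temporary that is zero-initialized and then returned by value.  */
  struct S tmp = { 0, 0, 0 };
  return tmp;
}
```

Compiling something like this with -O2 -S is one way to check whether the
zeros are stored to the stack and reloaded or simply materialized in the
return registers; the exact output depends on the compiler version.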

But, related to this, I've noticed that:
1) pass_nrv doesn't seem to work very well on x86_64; apparently the
temporaries usually have DECL_ALIGN bumped by LOCAL_DECL_ALIGNMENT to 128 bits,
while RESULT_DECL typically doesn't get that "optimization", so the nrv pass
gives up.  I wonder why, at least for the case where the decl isn't addressable,
we would care about DECL_ALIGN of the temporary (rather than just TYPE_ALIGN).
2) even on i386, where tree nrv usually works, I see on a testcase like:
struct A
{
  int m_x, m_y;
};
struct Q
{
  struct A m_location;
  int m_size;
  long m_foo;
};
struct Q foo ();
struct Q bar ()
{
  struct Q x = foo ();
  return x;
}
(in C, so that C++ nrv doesn't trigger) unnecessary stack adjustments
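
To illustrate what the nrv pass is aiming for on the testcase above, here is a
hand-written sketch, assuming struct Q is returned in memory through a hidden
pointer (as on i386).  bar_nrv, its ret parameter and the body of foo are
hypothetical stand-ins, not compiler output:

```c
struct A
{
  int m_x, m_y;
};
struct Q
{
  struct A m_location;
  int m_size;
  long m_foo;
};

/* Hypothetical definition so the sketch links; the testcase only
   declares foo.  */
struct Q
foo (void)
{
  struct Q q = { { 1, 2 }, 3, 4L };
  return q;
}

/* As written in the testcase: x is a separate local that conceptually
   has to be copied into the return slot.  */
struct Q
bar (void)
{
  struct Q x = foo ();
  return x;
}

/* Roughly the effect of a successful NRV, written by hand: the result
   of foo is stored directly into the caller-provided return slot, with
   no intermediate local.  */
void
bar_nrv (struct Q *ret)
{
  *ret = foo ();
}
```

Both variants yield the same value; the difference is only whether an extra
local copy (and the stack space for it) is needed.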
