http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55829



--- Comment #9 from Uros Bizjak <ubizjak at gmail dot com> 2013-01-09 17:52:19 UTC ---

GCC now generates:



        movq    p1(%rip), %r12  # 56    *movdi_internal_rex64/2 [length = 7]

        movq    %r12, (%rsp)    # 57    *movdi_internal_rex64/4 [length = 4]

        movddup (%rsp), %xmm1   # 23    *vec_concatv2df/3       [length = 5]



Is there a reason not to load directly from p1 and avoid the two extra moves (through %r12 and the stack slot)? For example:



        movddup p1(%rip), %xmm1
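
For readers who want to experiment with this kind of code generation, here is a minimal sketch of a source-level broadcast of one double into both lanes of a two-element vector. It is not the testcase from this bug report. The type name v2df, the functions broadcast_p1 and check_broadcast, the assumption that p1 is a global double, and its value 1.5 are all hypothetical. Compiled with something like -O2 -msse3 on x86-64, GCC may choose movddup with a memory operand for the broadcast, but the exact instructions depend on the compiler version and options.

```c
#include <assert.h>

/* Two doubles in one 16-byte vector (GCC/Clang vector extension). */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical global; the real p1 in the bug report may differ. */
double p1 = 1.5;

/* Broadcast p1 into both lanes: a concatenation of two equal doubles. */
v2df broadcast_p1(void)
{
    v2df v = { p1, p1 };
    return v;
}

/* Sanity check: both lanes must hold the value loaded from p1. */
void check_broadcast(void)
{
    v2df v = broadcast_p1();
    assert(v[0] == p1 && v[1] == p1);
}
```

Inspecting the output of gcc -S for broadcast_p1 shows whether the load is folded into the broadcast or routed through a general-purpose register and the stack first.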
