https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116312

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from ktkachov at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #1)
> >but we could implement it as a simple final assembly output template change 
> >for minimal invasion.
> 
> No you can't since ldp and ld2 mean 2 different things.
> 
> ld2 is basically a perm to unmix the two registers. that is load lanes.
> 
> Note in the GCC case there is only one fadd while in LLVM there are 2 though
> independent.
> 
> so the question becomes is the ldp better than ld2 here? overall or just
> looking at the ldp vs ld2?

Yeah you're right, it's too early in the morning for me...
It could be that LLVM's vectorisation approach here is better, but that's a
separate discussion.
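
To make the ldp/ld2 distinction from comment #1 concrete, here is a minimal
scalar sketch in C of what each load leaves in its two destination registers,
assuming 4 x 32-bit float lanes (the .4s arrangement). The function names
model_ldp and model_ld2 are hypothetical and exist only for illustration; this
models the architectural behaviour and is not GCC or LLVM code.

```c
#include <stddef.h>

/* Hypothetical model of "ldp q0, q1, [mem]":
   two contiguous 128-bit loads, no lane shuffling. */
static void model_ldp(const float mem[8], float q0[4], float q1[4])
{
    for (size_t i = 0; i < 4; i++) {
        q0[i] = mem[i];       /* elements 0..3 */
        q1[i] = mem[4 + i];   /* elements 4..7 */
    }
}

/* Hypothetical model of "ld2 {v0.4s, v1.4s}, [mem]":
   a de-interleaving load that splits even and odd elements. */
static void model_ld2(const float mem[8], float v0[4], float v1[4])
{
    for (size_t i = 0; i < 4; i++) {
        v0[i] = mem[2 * i];       /* elements 0, 2, 4, 6 */
        v1[i] = mem[2 * i + 1];   /* elements 1, 3, 5, 7 */
    }
}
```

Both read the same 32 bytes, but the register contents differ, which is why
swapping the mnemonic in the output template alone would change the meaning of
the code that follows the load.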
