> >
> > Please can you try that on trunk and report back.
> 
> OK, this is trunk, and I'm no longer seeing that happen.
> 
> However, I am seeing:
> 
>    0x0000007fb76dc82c <+160>: adrp    x25, 0x7fb7c80000
>    0x0000007fb76dc830 <+164>: add     x25, x25, #0x480
>    0x0000007fb76dc834 <+168>: fmov    d8, x0
>    0x0000007fb76dc838 <+172>: add     x0, x29, #0x160
>    0x0000007fb76dc83c <+176>: fmov    d9, x0
>    0x0000007fb76dc840 <+180>: add     x0, x29, #0xd8
>    0x0000007fb76dc844 <+184>: fmov    d10, x0
>    0x0000007fb76dc848 <+188>: add     x0, x29, #0xf8
>    0x0000007fb76dc84c <+192>: fmov    d11, x0
> 
> followed later by:
> 
>    0x0000007fb76dd224 <+2712>:        fmov    x0, d9
>    0x0000007fb76dd228 <+2716>:        add     x6, x29, #0x118
>    0x0000007fb76dd22c <+2720>:        str     x20, [x0,w27,sxtw #3]
>    0x0000007fb76dd230 <+2724>:        fmov    x0, d10
>    0x0000007fb76dd234 <+2728>:        str     w28, [x0,w27,sxtw #2]
>    0x0000007fb76dd238 <+2732>:        fmov    x0, d11
>    0x0000007fb76dd23c <+2736>:        str     w19, [x0,w27,sxtw #2]
> 
> which seems a bit suboptimal, given that these double registers now
> have to be saved in the prologue.
> 

Thanks for doing that.  Many AArch64 improvements have gone in since
4.8 was released.

I think we'd have to see the output for the whole function to
determine whether that code is sane. Is the source code shareable,
or do you have a testcase for this that you can share?

Cheers,
Ian


