On 2 March 2016 at 11:35, Edward Nevill <edward.nev...@linaro.org> wrote:
>         cmp     x2, 8           <<< (1)
> (1) If count as a 64 bit unsigned is <= 8 then it is probably still <= 8 as a 
> 32 bit unsigned.

Do you mean using "cmp w2, 8" instead? Is there any difference in practice?
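
As far as I can tell, the two forms differ only in which bits of the register they inspect. Here is a rough C sketch of that, not taken from the OpenJDK source; the function names are made up for illustration, and I'm assuming count arrives as a 64-bit unsigned value:

```c
#include <stdint.h>

/* Hypothetical helper: compares all 64 bits, analogous to "cmp x2, 8"
 * followed by an unsigned "lower or same" test. */
static int le8_as_u64(uint64_t count) {
    return count <= 8;
}

/* Hypothetical helper: compares only the low 32 bits, analogous to
 * "cmp w2, 8". The upper 32 bits of count are ignored. */
static int le8_as_u32(uint64_t count) {
    return (uint32_t)count <= 8;
}
```

The implication only runs one way: anything that is <= 8 as a 64-bit value is also <= 8 once truncated, but a value such as 0x100000001 passes the 32-bit test while failing the 64-bit one. So the narrower compare looks safe only if the upper half of the register is known to be zero.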


> (2) Nowhere in the function does it store anything on the stack, so why
> drop and restore the stack every time. Also, minor quibble in the
> disass, why does sub use #64 whereas add uses just '64' (appreciate this
> is probably binutils, not gcc).

My reading of the AAPCS64 is that it's not necessary to have a frame
at all, only that if you do, the stack pointer must stay quad-word
(16-byte) aligned.

Clang/LLVM doesn't seem to bother with the push and pop, but it also
uses "cmp x".


> .L15:
>         adrp    x3, .L4
>         add     x3, x3, :lo12:.L4
>         ldr     x2, [x3, x2, lsl #3]
>         br      x2

Hum, this is *exactly* what Clang generates... :)
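
For anyone following along, the quoted sequence reads to me as an ordinary jump-table dispatch: form the table address (adrp/add), load the 8-byte entry at index count (the "lsl #3" scales the index by 8), then branch to it. A minimal C sketch of the same idea, assuming an LP64 target where pointers are 8 bytes; the handler names and the three-entry table are hypothetical and much smaller than the real .L4 table:

```c
#include <stddef.h>

/* Hypothetical handler type; each handler stands in for one case label. */
typedef size_t (*copy_case_fn)(void);

static size_t case0(void) { return 0; }
static size_t case1(void) { return 1; }
static size_t case2(void) { return 2; }

/* Stand-in for the .L4 table: one code address per small count. */
static copy_case_fn const case_table[] = { case0, case1, case2 };

/* Bounds check first (the "cmp" in the disassembly), then an indexed
 * load and an indirect call (the "ldr ... lsl #3" and "br"). */
static long dispatch(size_t count) {
    if (count >= sizeof case_table / sizeof case_table[0])
        return -1; /* out of range: would take the general path */
    return (long)case_table[count]();
}
```

With only a handful of dense cases, both compilers picking a table over a compare chain is not surprising.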


> (4) Seems to be something wrong with the load scheduler here? Why not
> move the stp x2, x3 to the end. It does this repeatedly.

Again, Clang seems to do what you want...

Have you tried building OpenJDK with Clang?

cheers,
--renato
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain
