https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85366

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For arm64, it is obvious why it is not optimized into one divide:
.L2:
        udiv    w2, w0, w3
        msub    w2, w2, w3, w0
        cbnz    w2, .L5
        sdiv    w2, w0, w3
        .p2align 2
.L4:
        mov     w0, w2
        str     w3, [x1], 4
        sdiv    w2, w2, w3
        msub    w4, w2, w3, w0
        cbz     w4, .L4
.L5:
        add     w3, w3, 1
        cmp     w3, w0
        ble     .L2
.L1:
        ret

--- CUT ---
In the first case (the udiv/msub pair that tests divisibility) the division is
unsigned, while in the second case (the sdiv that computes the quotient) it is
signed, so the compiler cannot combine the two divides into one.
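A minimal C sketch of the kind of source that produces this code (a hypothetical
reconstruction for illustration, not the exact testcase from the PR): the
divisibility test is done on unsigned operands, so GCC emits udiv/msub for the
remainder, while the quotient n / d is computed on signed operands with sdiv,
and the mismatch blocks reuse of a single divide.

```c
#include <assert.h>

/* Collect the prime factors of n into out, returning how many were stored.
   The cast to unsigned in the divisibility test makes that remainder an
   unsigned operation (udiv + msub on arm64), while the inner n % d and
   n /= d stay signed (sdiv + msub), matching the assembly above. */
static int factor(int n, int *out)
{
    int cnt = 0;
    for (int d = 2; d <= n; d++) {
        if ((unsigned)n % (unsigned)d == 0) {   /* unsigned: udiv, msub */
            while (n % d == 0) {                /* signed: sdiv, msub  */
                out[cnt++] = d;                 /* str w3, [x1], 4     */
                n /= d;                         /* signed: sdiv        */
            }
        }
    }
    return cnt;
}
```

If both the test and the division used the same signedness, the quotient from
the divisibility check could feed the division directly and one divide would
suffice.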
