On July 23, 2018 7:01:23 PM GMT+02:00, Tamar Christina <tamar.christ...@arm.com> wrote:
>Hi All,
>
>This allows copy_blkmode_to_reg to perform larger copies when it is
>safe to do so by calculating the bitsize per iteration doing the
>maximum copy allowed that does not read more than the amount of bits
>left to copy.
>
>Strictly speaking, this copying is only done if:
>
>  1. the target supports fast unaligned access
>  2. no padding is being used.
>
>This should avoid the issues of the first patch (PR85123) but still
>work for targets that are safe to do so.
>
>Original patch https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01088.html
>Previous respin https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00239.html
>
>
>This produces for the copying of a 3 byte structure:
>
>fun3:
>	adrp	x1, .LANCHOR0
>	add	x1, x1, :lo12:.LANCHOR0
>	mov	x0, 0
>	sub	sp, sp, #16
>	ldrh	w2, [x1, 16]
>	ldrb	w1, [x1, 18]
>	add	sp, sp, 16
>	bfi	x0, x2, 0, 16
>	bfi	x0, x1, 16, 8
>	ret
>
>whereas before it was producing
>
>fun3:
>	adrp	x0, .LANCHOR0
>	add	x2, x0, :lo12:.LANCHOR0
>	sub	sp, sp, #16
>	ldrh	w1, [x0, #:lo12:.LANCHOR0]
>	ldrb	w0, [x2, 2]
>	strh	w1, [sp, 8]
>	strb	w0, [sp, 10]
>	ldr	w0, [sp, 8]
>	add	sp, sp, 16
>	ret
>
>Cross compiled and regtested on
>  aarch64_be-none-elf
>  armeb-none-eabi
>and no issues
>
>Bootstrapped and regtested
>  aarch64-none-linux-gnu
>  x86_64-pc-linux-gnu
>  powerpc64-unknown-linux-gnu
>  arm-none-linux-gnueabihf
>
>and found no issues.
>
>OK for trunk?

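For readers following the thread, the per-iteration size selection described in the quoted message can be sketched roughly as below. This is only an illustration and not the code in expr.c: the names pick_chunk_bits and plan_copy are hypothetical, and the sketch assumes the total size is a multiple of 8 bits and that max_bits is a power of two no smaller than 8.

```c
#include <stddef.h>

/* Hypothetical helper (not a GCC function): the widest power-of-two
   chunk, in bits, that is at most max_bits and does not read past the
   bits still left to copy.  */
static unsigned pick_chunk_bits(size_t bits_left, unsigned max_bits)
{
    unsigned chunk = max_bits;
    while (chunk > 8 && chunk > bits_left)
        chunk /= 2;
    /* Guard for the case where fewer than 8 bits remain.  */
    if (chunk > bits_left)
        chunk = (unsigned) bits_left;
    return chunk;
}

/* Hypothetical driver: record the chunk width chosen on each iteration
   in sizes[] (at most cap entries) and return how many were used.  */
static size_t plan_copy(size_t total_bits, unsigned max_bits,
                        unsigned *sizes, size_t cap)
{
    size_t n = 0;
    while (total_bits > 0 && n < cap) {
        unsigned chunk = pick_chunk_bits(total_bits, max_bits);
        sizes[n++] = chunk;
        total_bits -= chunk;
    }
    return n;
}
```

For a 3 byte structure (24 bits) with a 64-bit maximum, this picks a 16-bit chunk followed by an 8-bit chunk, which corresponds to the ldrh/ldrb pair in the quoted assembly.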
How does this affect store-to-load forwarding when the source is initialized piecewise? IMHO we should avoid larger loads but generate larger stores when possible.

How do non-x86 architectures behave with respect to STLF?

Richard.

>Thanks,
>Tamar
>
>gcc/
>2018-07-23  Tamar Christina  <tamar.christ...@arm.com>
>
>	* expr.c (copy_blkmode_to_reg): Perform larger copies when safe.
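The store-to-load forwarding question above concerns source patterns like the following minimal sketch, in which a small structure is initialized one member at a time and then returned by value. The names s3 and make_s3 are hypothetical and do not come from the patch or its testcases. Whether the member stores and the later copy actually go through memory depends on the optimization level and the target. If they do, a single load wider than the individual pending byte stores is the case in which forwarding may fail on some cores.

```c
/* Hypothetical 3 byte structure, the same size as the one in the
   quoted fun3 example.  */
struct s3 { unsigned char a, b, c; };

/* The source object is initialized piecewise: three separate
   byte-sized member stores.  */
struct s3 make_s3(unsigned char a, unsigned char b, unsigned char c)
{
    struct s3 s;
    s.a = a;
    s.b = b;
    s.c = c;
    /* Returning by value copies the structure into the return
       register(s); this is the kind of copy that may be performed
       with loads wider than the stores above.  */
    return s;
}
```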