On Mon, Oct 24, 2016 at 03:27:10PM +0100, Kyrill Tkachov wrote:
> Hi all,
>
> When storing a 64-bit immediate that has equal bottom and top halves we
> currently synthesize the repeating 32-bit pattern twice and perform a
> single X-store.
> With this patch we synthesize the 32-bit pattern once into a W register and
> store that twice using an STP. This reduces codesize bloat from synthesising
> the same constant multiple times at the expense of converting a store to a
> store-pair.
> It will only trigger if we can save two or more instructions, so it will only
> transform:
>     mov x1, 49370
>     movk x1, 0xc0da, lsl 32
>     str x1, [x0]
>
> into:
>
>     mov w1, 49370
>     stp w1, w1, [x0]
>
> when optimising for -Os, whereas it will always transform a 4-insn synthesis
> sequence into a two-insn sequence + STP (see comments in the patch).
>
> This patch triggers already but will trigger more with the store merging pass
> that I'm working on since that will generate more of these repeating 64-bit
> constants.
> This helps improve codegen on 456.hmmer where store merging can sometimes
> create very complex repeating constants and target-specific expand needs to
> break them down.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?

Hi Kyrill,

Does this do the right thing for:

void bar(u64 *x)
{
	*(volatile u64 *)x = 0xabcdef10abcdef10;
}

C.f. https://lore.kernel.org/lkml/20190821103200.kpufwtviqhpbuv2n@willie-the-truck/

i.e. is this optimization still valid for volatile?

Thanks,
James

>
> Thanks,
> Kyrill
>
> 2016-10-24 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
>     * config/aarch64/aarch64.md (mov<mode>): Call
>     aarch64_split_dimode_const_store on DImode constant stores.
>     * config/aarch64/aarch64-protos.h (aarch64_split_dimode_const_store):
>     New prototype.
>     * config/aarch64/aarch64.c (aarch64_split_dimode_const_store): New
>     function.
>
> 2016-10-24 Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>
>     * gcc.target/aarch64/store_repeating_constant_1.c: New test.
>     * gcc.target/aarch64/store_repeating_constant_2.c: Likewise.
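
For reference, a minimal C sketch of the property the quoted patch relies on. This is not the patch itself: the helper names (has_repeating_halves, store_as_u64, store_as_u32_pair, check_equivalence) are made up for illustration and are not GCC functions. It only shows that, for a plain store, writing the 32-bit half twice leaves the same bytes in memory as one 64-bit store. It says nothing about the number or width of the accesses, which is what matters for the volatile case asked about above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true when the low and high 32-bit halves match. */
static bool has_repeating_halves(uint64_t imm)
{
    return (uint32_t)imm == (uint32_t)(imm >> 32);
}

/* Models the single X-register store (one 64-bit access). */
static void store_as_u64(unsigned char *dst, uint64_t imm)
{
    memcpy(dst, &imm, sizeof imm);
}

/* Models STP w, w: the same 32-bit value written to two adjacent words. */
static void store_as_u32_pair(unsigned char *dst, uint32_t half)
{
    memcpy(dst, &half, sizeof half);
    memcpy(dst + sizeof half, &half, sizeof half);
}

/* For a repeating constant, both forms leave identical bytes in memory.
   Because the halves are equal, this holds regardless of endianness. */
static void check_equivalence(uint64_t imm)
{
    unsigned char wide[8], pair[8];

    assert(has_repeating_halves(imm));
    store_as_u64(wide, imm);
    store_as_u32_pair(pair, (uint32_t)imm);
    assert(memcmp(wide, pair, sizeof wide) == 0);
}
```

Calling check_equivalence(0x0000c0da0000c0daULL) exercises the constant from the quoted assembly, and check_equivalence(0xabcdef10abcdef10ULL) the one from bar() above.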