On Tue, Jun 05, 2018 at 11:32:06AM -0500, Kyrill Tkachov wrote:
>
> On 04/06/18 18:40, Kyrill Tkachov wrote:
> > Hi all,
> >
> > This patch adds support for generating LDPs and STPs of Q-registers.
> > This allows for more compact code generation and makes better use of the
> > ISA.
> >
> > It's implemented in a straightforward way by allowing 16-byte modes in the
> > sched-fusion machinery and adding appropriate peepholes in aarch64-ldpstp.md
> > as well as the patterns themselves in aarch64-simd.md.
> >
> > I didn't see any non-noise performance effect on SPEC2017 on Cortex-A72 and
> > Cortex-A53.
>
> Adding some folks who know more about other CPUs as well.
> Are you okay with enabling these instructions in AArch64?
>
> If you could give this a spin on some benchmarks you
> care about on your platforms it would be really useful data.

From an architecture perspective, I think this is the right thing for us to do.

Given the feedback from Andrew and Siddhesh I think we should support this
patch, defaulting to on; but behind a tuning flag for those who want to
disable it for their -mcpu tuning.

If you can respin it behind a tuning parameter and give the community another
48 hours or so to respond, I think we'd have a good patch here. I'm also
adding some more contributors to the AArch64 cores file for their thoughts on
the proposal.

Thanks,
James