On Fri, 2024-08-16 at 09:41 +0200, Stefan Schulze Frielinghaus wrote: > Currently subregs originating from *tf_to_fprx2_0 and *tf_to_fprx2_1 > survive register allocation. This in turn leads to wrong register > renaming. Keeping the current approach would mean we need two insns > for > *tf_to_fprx2_0 and *tf_to_fprx2_1, respectively. Something along the > lines > > (define_insn "*tf_to_fprx2_0" > [(set (subreg:DF (match_operand:FPRX2 0 "nonimmediate_operand" > "=f") 0) > (unspec:DF [(match_operand:TF 1 "general_operand" "v")] > UNSPEC_TF_TO_FPRX2_0))] > "TARGET_VXE" > "#") > > (define_insn "*tf_to_fprx2_0" > [(set (match_operand:DF 0 "nonimmediate_operand" "=f") > (unspec:DF [(match_operand:TF 1 "general_operand" "v")] > UNSPEC_TF_TO_FPRX2_0))] > "TARGET_VXE" > "vpdi\t%v0,%v1,%v0,1 > [(set_attr "op_type" "VRR")]) > > and similar for *tf_to_fprx2_1. Note, pre register allocation > operand 0 > has mode FPRX2 and afterwards DF once subregs have been eliminated. > > Since we always copy a whole vector register into a floating-point > register pair, another way to fix this is to merge *tf_to_fprx2_0 and > *tf_to_fprx2_1 into a single insn which means we don't have to use > subregs at all. The downside of this is that the assembler template > contains two instructions, now. The upside is that we don't have to > come up with some artificial insn before RA which might be more > readable/maintainable. That is implemented by this patch. > > In commit r11-4872-ge627cda5686592, the output operand specifier %V > was > introduced which is used in tf_to_fprx2 only, now. I didn't come up > with its counterpart like %F for floating-point registers. Instead I > printed the register pair in the output function directly. This > spares > us a new and "rare" format specifier for a single insn. I don't have > a > strong opinion which option to choose, however, we should either add > %F > in order to mimic the same behaviour as %V or getting rid of %V and > inline the logic in the output function. I lean towards the latter. > Any preferences? > --- > gcc/config/s390/s390.md | 2 + > gcc/config/s390/vector.md | 66 +++++++++++--------- > -- > gcc/testsuite/gcc.target/s390/pr115860-1.c | 26 +++++++++ > 3 files changed, 60 insertions(+), 34 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/s390/pr115860-1.c
[...] > + char buf[64]; > + switch (which_alternative) > + { > + case 0: > + if (REGNO (operands[0]) == REGNO (operands[1])) > + return "vpdi\t%V0,%v1,%V0,5"; > + else > + return "ldr\t%f0,%f1;vpdi\t%V0,%v1,%V0,5"; > + case 1: > + { > + const char *reg_pair = reg_names[REGNO (operands[0]) + 1]; > + snprintf (buf, sizeof (buf), "ld\t%%f0,%%1;ld\t%%%s,8+%%1", > reg_pair); I wonder if there is a corner case where 8+ does not fit into short displacement? [...]