Hi Folks, (this is a bug reported against Fortran, but actually is a generic insn selection problem for m64 powerpc-darwin. In fact, I’d say it renders the 64b multilib unusable).
The solution proposed is Darwin-local (it's a long-standing bug and it would be intended to back-port it). However, I'd welcome views on it. The current Darwin load/store lo_sum patterns have neither predicate nor constraint. This means that most parts of the backend, which rely on recog() to validate the rtx, can produce invalid combinations/selections. For 32bit cases this isn't a problem since we can load/store to unaligned addresses using D-mode insns. Conversely, for 64bit instructions that use DS mode, this can manifest as assemble errors (for an assembler that checks the LO14 relocations), or as crashes caused by wrong offsets (or worse, wrong content for the two LSBs). Trying to find the right place to fix this has been tricky, there's discussion in the PR trail. There doesn't seem to be any way to deal with this in legitimize_address (since most of the damage is done by the wide open patterns that are in effect after the legitimizer has run). Likewise, extending the checks in legitimate_address_p() and mode_dependent_address doesn't help if the constraint is not accurate in the end. What we want to check for "Y" on Darwin is: - that the alignment of the Symbols' target is sufficient for DS mode - that the offset is suitable for DS mode. (while looking through the Mach-O PIC unspecs). Proposed: 1) Drop the Darwin-specific lo_sum patterns. Actually, this produces an immediate progression in quite a number of tests, we begin using the movdi_internal64 patterns. However there are also regressions caused by instructions unable to satisfy their constraints. 2) To resolve this we need to extend the handling of the mem_operand_gpr to allow looking through Mach-O PIC UNSPECs in the lo_sum cases. - note, that rs6000_offsettable_memref_p () will not handle these so that would return early, producing the issue with unsatisfiable constraints. - I do wonder if that's also the case for some non-Darwin lo_sum cases. (some things might be hard to detect, since the code will generally fall back to doing " la Rx xxx@l ; ld Ry 0(Rx)" so it won't fail - just be less efficient than it could be). I’ve tested this on powerpc-darwin9, and powerpc64-linux-gnu. It progresses around 120 tests across the testsuite (many in Fortran but also in C and C++) thoughts? Iain P.S. Probably the other crufty old lo_sum patterns can go too - but for another day. gcc/ChangeLog: 2019-10-12 Iain Sandoe <i...@sandoe.co.uk> * config/rs6000/darwin.md (movdi_low, movsi_low_st): Delete (movdi_low_st): Delete. * config/rs6000/rs6000.c (darwin_rs6000_legitimate_lo_sum_const_p): New. (mem_operand_gpr): Validate Mach-O LO_SUM cases separately. * config/rs6000/rs6000.md (movsi_low): Delete. diff --git a/gcc/config/rs6000/darwin.md b/gcc/config/rs6000/darwin.md index 3994447f38..16f710b28b 100644 --- a/gcc/config/rs6000/darwin.md +++ b/gcc/config/rs6000/darwin.md @@ -122,33 +122,6 @@ You should have received a copy of the GNU General Public License [(set_attr "type" "store")]) ;; 64-bit MachO load/store support -(define_insn "movdi_low" - [(set (match_operand:DI 0 "gpc_reg_operand" "=r,*!d") - (mem:DI (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,b") - (match_operand 2 "" ""))))] - "TARGET_MACHO && TARGET_64BIT" - "@ - ld %0,lo16(%2)(%1) - lfd %0,lo16(%2)(%1)" - [(set_attr "type" "load")]) - -(define_insn "movsi_low_st" - [(set (mem:SI (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand 2 "" ""))) - (match_operand:SI 0 "gpc_reg_operand" "r"))] - "TARGET_MACHO && ! TARGET_64BIT" - "stw %0,lo16(%2)(%1)" - [(set_attr "type" "store")]) - -(define_insn "movdi_low_st" - [(set (mem:DI (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,b") - (match_operand 2 "" ""))) - (match_operand:DI 0 "gpc_reg_operand" "r,*!d"))] - "TARGET_MACHO && TARGET_64BIT" - "@ - std %0,lo16(%2)(%1) - stfd %0,lo16(%2)(%1)" - [(set_attr "type" "store")]) ;; Mach-O PIC. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index d1434a9f74..bc22573174 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -7329,6 +7329,103 @@ address_offset (rtx op) return NULL_RTX; } +/* This tests that a lo_sum {constant, symbol, symbol+offset} is valid for + the mode. If we can't find (or don't know) the alignment of the symbol + we assume (optimistically) that it's sufficiently aligned [??? maybe we + should be pessimistic]. Offsets are validated in the same way as for + reg + offset. */ +static bool +darwin_rs6000_legitimate_lo_sum_const_p (rtx x, machine_mode mode) +{ + /* We should not get here with this. */ + gcc_checking_assert (! mode_supports_dq_form (mode)); + + if (GET_CODE (x) == CONST) + x = XEXP (x, 0); + + if (GET_CODE (x) == UNSPEC && XINT (x, 1) == UNSPEC_MACHOPIC_OFFSET) + x = XVECEXP (x, 0, 0); + + rtx sym = NULL_RTX; + unsigned HOST_WIDE_INT offset = 0; + + if (GET_CODE (x) == PLUS) + { + sym = XEXP (x, 0); + if (! SYMBOL_REF_P (sym)) + return false; + if (!CONST_INT_P (XEXP (x, 1))) + return false; + offset = INTVAL (XEXP (x, 1)); + } + else if (SYMBOL_REF_P (x)) + sym = x; + else if (CONST_INT_P (x)) + offset = INTVAL (x); + else if (GET_CODE (x) == LABEL_REF) + offset = 0; // We assume code labels are suitably aligned + else + return false; // not sure what we have here. + + /* If we don't know the alignment of the thing to which the symbol refers, + we assume optimistically it is "enough". + ??? maybe we should be pessimistic instead. */ + unsigned align = 0; + + if (sym) + { + tree decl = SYMBOL_REF_DECL (sym); +#if TARGET_MACHO + if (MACHO_SYMBOL_INDIRECTION_P (sym)) + /* The decl in an indirection symbol is the original one, which might + be less aligned than the indirection. Our indirections are always + pointer-aligned. */ + ; + else +#endif + if (decl && DECL_ALIGN (decl)) + align = DECL_ALIGN_UNIT (decl); + } + + unsigned int extra = 0; + switch (mode) + { + case E_DFmode: + case E_DDmode: + case E_DImode: + /* If we are using VSX scalar loads, restrict ourselves to reg+reg + addressing. */ + if (VECTOR_MEM_VSX_P (mode)) + return false; + + if (!TARGET_POWERPC64) + extra = 4; + else if ((offset & 3) || (align & 3)) + return false; + break; + + case E_TFmode: + case E_IFmode: + case E_KFmode: + case E_TDmode: + case E_TImode: + case E_PTImode: + extra = 8; + if (!TARGET_POWERPC64) + extra = 12; + else if ((offset & 3) || (align & 3)) + return false; + break; + + default: + break; + } + + /* We only care if the access(es) would cause a change to the high part. */ + offset = ((offset & 0xffff) ^ 0x8000) - 0x8000; + return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra); +} + /* Return true if the MEM operand is a memory operand suitable for use with a (full width, possibly multiple) gpr load/store. On powerpc64 this means the offset must be divisible by 4. @@ -7366,7 +7463,13 @@ mem_operand_gpr (rtx op, machine_mode mode) if (address_is_prefixed (addr, mode, NON_PREFIXED_DS)) return true; - /* Don't allow non-offsettable addresses. See PRs 83969 and 84279. */ + /* We need to look through Mach-O PIC unspecs to determine if a lo_sum is + really OK. Doing this early avoids teaching all the other machinery + about them. */ + if (TARGET_MACHO && GET_CODE (addr) == LO_SUM) + return darwin_rs6000_legitimate_lo_sum_const_p (XEXP (addr, 1), mode); + + /* Only allow offsettable addresses. See PRs 83969 and 84279. */ if (!rs6000_offsettable_memref_p (op, mode, false)) return false; diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 2dca269744..29dd616520 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -6894,13 +6894,6 @@ ;; do the load 16-bits at a time. We could do this by loading from memory, ;; and this is even supposed to be faster, but it is simpler not to get ;; integers in the TOC. -(define_insn "movsi_low" - [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (mem:SI (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b") - (match_operand 2 "" ""))))] - "TARGET_MACHO && ! TARGET_64BIT" - "lwz %0,lo16(%2)(%1)" - [(set_attr "type" "load")]) ;; MR LA LWZ LFIWZX LXSIWZX ;; STW STFIWX STXSIWX LI LIS