https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70873
--- Comment #23 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Uroš Bizjak from comment #22)
> Created attachment 38412 [details]
> Proposed patch
>
> This patch moves all TARGET_SSE_PARTIAL_REG_DEPENDENCY FP conversion
> splitters to a later split pass. Plus, the patch substantially cleans these
> and related patterns.
>
> The functionality of post-reload conversion splitters goes this way:
>
> - process FP conversions for TARGET_USE_VECTOR_FP_CONVERTS in an early
> post-reload splitter. This pass will rewrite FP conversions to vector insns
> and is thus incompatible with the next two passes. AMDFAM10 processors
> depend on this transformation.
>
> - process FP conversions for TARGET_SPLIT_MEM_OPND_FOR_FP_CONVERTS in a
> peephole2 pass. This will transform mem->reg insns to reg->reg insns, and
> these insn could be processed by the next pass. Some Intel processors depend
> on this transformation.
>
> - process FP conversions for TARGET_SSE_PARTIAL_REG_DEPENDENCY in a late
> post-reload splitter, when allocated registers are stable. AMD and Intel
> processors depend on this pass, so it is part of generic tuning.

We need to move those special SSE SF->DF splitters before

(define_split
  [(set (match_operand 0 "any_fp_register_operand")
        (float_extend (match_operand 1 "memory_operand")))]
  "reload_completed
   && (GET_MODE (operands[0]) == TFmode
       || GET_MODE (operands[0]) == XFmode
       || GET_MODE (operands[0]) == DFmode)"
  [(set (match_dup 0) (match_dup 2))]
{
  operands[2] = find_constant_src (curr_insn);

  if (operands[2] == NULL_RTX
      || (SSE_REGNO_P (REGNO (operands[0]))
          && standard_sse_constant_p (operands[2],
                                      GET_MODE (operands[0])) != 1)
      || (STACK_REGNO_P (REGNO (operands[0]))
          && standard_80387_constant_p (operands[2]) < 1))
    FAIL;
})

Otherwise, they may not be used on a memory operand, since the general SSE
float_extend splitter on memory operands will be used instead.
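
To illustrate the ordering concern, here is a toy model in Python, not GCC code. It assumes that splitters are tried in definition order and that the first one to produce a result wins. The function names, the dictionary layout and the match conditions are all hypothetical stand-ins, much simpler than the real patterns.

```python
from typing import Callable, List, Optional

# A toy "splitter" returns a replacement label, or None to mimic
# "pattern does not match" / FAIL. All names here are hypothetical.
Splitter = Callable[[dict], Optional[str]]


def general_mem_float_extend(insn: dict) -> Optional[str]:
    # Stand-in for the general float_extend-from-memory splitter.
    # Simplified condition: any float_extend whose source is memory.
    if insn["op"] == "float_extend" and insn["src"] == "mem":
        return "general-mem-split"
    return None


def sse_sf_to_df(insn: dict) -> Optional[str]:
    # Stand-in for the special SSE SF->DF splitters.
    if (insn["op"] == "float_extend"
            and insn["dest"] == "sse"
            and insn["mode"] == "DF"):
        return "sse-sf-df-split"
    return None


def split(insn: dict, splitters: List[Splitter]) -> Optional[str]:
    # Assumed behaviour: try splitters in order, first success wins.
    for splitter in splitters:
        result = splitter(insn)
        if result is not None:
            return result
    return None


# An SF->DF extension from memory into an SSE register matches both.
insn = {"op": "float_extend", "src": "mem", "dest": "sse", "mode": "DF"}

general_first = split(insn, [general_mem_float_extend, sse_sf_to_df])
special_first = split(insn, [sse_sf_to_df, general_mem_float_extend])
```

Under these assumptions, `general_first` is `"general-mem-split"` and `special_first` is `"sse-sf-df-split"`: the special splitter only gets a chance at the memory-operand case when it is listed first.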