Hello!

The problem here was with the x86 FP move patterns, which allowed FP constants for the CM_MEDIUM and CM_LARGE memory models, but not for the others. This interfered with ifcvt cmove detection, since the FP constant was rematerialized in the output register instead of being copied to it.

According to ix86_in_large_data_p, automatic variables are never large data, so there is no justification for treating the CM_MEDIUM memory model any differently from the CM_SMALL model. In the CM_SMALL model, constants are always expanded as loads from memory, so CSE passes can do their job, and after registers are allocated, we simplify relevant loads from memory to a simple constant move insn anyway.

In contrast, CM_LARGE{,_PIC} models have costly memory access, so we should avoid loads from memory as much as possible. These models should be treated like the -Os case, where we already avoid memory loads, *unless* we are sure that the constant will result in a simple constant move insn.

Unfortunately, the above-mentioned ifcvt cmove detection interference will trigger for -Os and CM_LARGE models, but hopefully the STRICT_MIN/MAX_EXPR patches [1] (not yet fully reviewed and committed) will resolve this problem.

2016-05-14  Uros Bizjak  <ubiz...@gmail.com>

	PR target/71097
	* config/i386/i386.md (*movtf_internal): Before register allocation,
	do not allow FP constants for CM_MEDIUM memory model, allow only
	standard FP constants for CM_LARGE and CM_LARGE_PIC models.
	(*movxf_internal): Ditto.
	(*movdf_internal): Ditto.
	(*movsf_internal): Ditto.

Patch was bootstrapped and regression tested on x86_64-linux-gnu/-mcmodel=medium/{,-fpic}.

Committed to mainline SVN.

[1] https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00330.html

Uros.

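For illustration only, the effect of the change on the insn conditions can be modelled in plain C. This is a simplified sketch, not GCC code: the enum, the struct move_query and the functions old_cond, new_cond and check_model are hypothetical names, each boolean field merely stands in for the GCC predicate named in its comment, and the remaining alternatives of the real conditions (those starting with !TARGET_MEMORY_MISMATCH_STALL) are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC internals; these are NOT the real
   definitions from the i386 backend.  */
enum cmodel { CM_SMALL, CM_MEDIUM, CM_LARGE, CM_LARGE_PIC };

struct move_query
{
  enum cmodel cmodel;      /* models ix86_cmodel */
  bool can_create_pseudo;  /* models can_create_pseudo_p () */
  bool src_is_fp_const;    /* models CONST_DOUBLE_P (operands[1]) */
  bool optimize_size;      /* models optimize_function_for_size_p (cfun) */
  bool standard_const;     /* models the standard_*_constant_p test */
  bool dest_is_mem;        /* models memory_operand (operands[0], mode) */
};

/* Model of the condition before the patch: CM_MEDIUM and CM_LARGE
   accept any FP constant source.  */
bool
old_cond (const struct move_query *q)
{
  return (!q->can_create_pseudo
          || q->cmodel == CM_MEDIUM || q->cmodel == CM_LARGE
          || !q->src_is_fp_const
          || (q->optimize_size && q->standard_const && !q->dest_is_mem));
}

/* Model of the condition after the patch: CM_MEDIUM gets no special
   treatment; CM_LARGE and CM_LARGE_PIC are handled like -Os and accept
   only standard constants moved to a non-memory destination.  */
bool
new_cond (const struct move_query *q)
{
  return (!q->can_create_pseudo
          || !q->src_is_fp_const
          || ((q->optimize_size
               || q->cmodel == CM_LARGE || q->cmodel == CM_LARGE_PIC)
              && q->standard_const && !q->dest_is_mem));
}

/* Small usage example for the model above.  */
void
check_model (void)
{
  /* Non-standard FP constant, CM_MEDIUM, pseudos still available.  */
  struct move_query q = { CM_MEDIUM, true, true, false, false, false };
  assert (old_cond (&q));   /* accepted before the patch */
  assert (!new_cond (&q));  /* rejected after the patch */

  /* A standard constant under CM_LARGE is still accepted ...  */
  q.cmodel = CM_LARGE;
  q.standard_const = true;
  assert (new_cond (&q));

  /* ... but a non-standard one is not.  */
  q.standard_const = false;
  assert (!new_cond (&q));
}
```

Calling check_model () from a test driver exercises the three cases described in the ChangeLog entry. Once can_create_pseudo is false, both models accept every operand combination, which corresponds to loads being simplified to constant moves after register allocation.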
Index: i386.md
===================================================================
--- i386.md	(revision 236239)
+++ i386.md	(working copy)
@@ -3114,9 +3114,9 @@
   "(TARGET_64BIT || TARGET_SSE)
    && !(MEM_P (operands[0]) && MEM_P (operands[1]))
    && (!can_create_pseudo_p ()
-       || (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE)
        || !CONST_DOUBLE_P (operands[1])
-       || (optimize_function_for_size_p (cfun)
+       || ((optimize_function_for_size_p (cfun)
+            || (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
            && standard_sse_constant_p (operands[1], TFmode) == 1
            && !memory_operand (operands[0], TFmode))
        || (!TARGET_MEMORY_MISMATCH_STALL
@@ -3200,9 +3200,9 @@
          "fm,f,G,roF,r , *roF,*r,F ,C,roF,rF"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
    && (!can_create_pseudo_p ()
-       || (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE)
        || !CONST_DOUBLE_P (operands[1])
-       || (optimize_function_for_size_p (cfun)
+       || ((optimize_function_for_size_p (cfun)
+            || (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
            && standard_80387_constant_p (operands[1]) > 0
            && !memory_operand (operands[0], XFmode))
        || (!TARGET_MEMORY_MISMATCH_STALL
@@ -3273,9 +3273,9 @@
          "Yf*fm,Yf*f,G ,roF,r ,*roF,*r,F ,rm,rC,C ,F ,C,v,m,v,C ,*x,m ,*x,Yj,r ,roF,rF,rmF,rC"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
    && (!can_create_pseudo_p ()
-       || (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE)
        || !CONST_DOUBLE_P (operands[1])
-       || (optimize_function_for_size_p (cfun)
+       || ((optimize_function_for_size_p (cfun)
+            || (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
            && ((!(TARGET_SSE2 && TARGET_SSE_MATH)
                 && standard_80387_constant_p (operands[1]) > 0)
                || (TARGET_SSE2 && TARGET_SSE_MATH
@@ -3475,9 +3475,9 @@
          "Yf*fm,Yf*f,G ,rmF,rF,C,v,m,v,Yj,r ,*y ,m ,*y,*Yn,r ,rmF,rF"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
    && (!can_create_pseudo_p ()
-       || (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE)
        || !CONST_DOUBLE_P (operands[1])
-       || (optimize_function_for_size_p (cfun)
+       || ((optimize_function_for_size_p (cfun)
+            || (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
            && ((!TARGET_SSE_MATH
                 && standard_80387_constant_p (operands[1]) > 0)
                || (TARGET_SSE_MATH