http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519

--- Comment #6 from Yukhin Kirill <kirill.yukhin at intel dot com> 2011-06-30 15:11:41 UTC ---
I've looked into the tail-call optimization. It seems we should not perform it
at all if the new and old stack addresses of the parameters overlap. BTW, I
think it is too conservative, anyway...
We have a call through a pointer that passes 5 params. The last param is not of
interest to us, but the first 4 are.
We have in expand:
  GIMPLE snippet:
    D.172468_17 = MEM[(struct cons &)&arg_refs + 12].head;
    D.172469_18 = MEM[(struct cons &)&arg_refs + 8].head;
    D.172470_19 = MEM[(struct cons &)&arg_refs + 4].head;
    D.172471_20 = MEM[(struct cons &)&arg_refs];
    D.172462_21 = (sizetype) fun_ptr$__delta_26;
    D.172463_22 = obj_3(D) + D.172462_21;
    fun_ptr$__pfn_23 (D.172463_22, D.172471_20, D.172470_19, D.172469_18, D.172468_17); [tail call]

Expanding it subsequently, we get this RTL:
(insn 19 18 20 4 (set (reg/f:SI 80)
        (mem/s/f/j/c:SI (plus:SI (reg/f:SI 53 virtual-incoming-args)
                (const_int 28 [0x1c])) [0 MEM[(struct cons &)&arg_refs +
12].head+0 S4 A32])) include/base/thread_management.h:1534 -1
     (nil))

(insn 20 19 21 4 (set (mem:SI (plus:SI (reg/f:SI 53 virtual-incoming-args)
                (const_int 16 [0x10])) [0 S4 A32])
        (reg/f:SI 80)) include/base/thread_management.h:1534 -1
     (nil))

(insn 21 20 22 4 (set (reg/f:SI 81)
        (mem/s/f/j/c:SI (plus:SI (reg/f:SI 53 virtual-incoming-args)
                (const_int 24 [0x18])) [0 MEM[(struct cons &)&arg_refs +
8].head+0 S4 A32])) include/base/thread_management.h:1534 -1
     (nil))

(insn 22 21 23 4 (set (mem:SI (plus:SI (reg/f:SI 53 virtual-incoming-args)
                (const_int 12 [0xc])) [0 S4 A32])
        (reg/f:SI 81)) include/base/thread_management.h:1534 -1
     (nil))

(insn 23 22 24 4 (set (reg/f:SI 82)
        (mem/s/f/j/c:SI (plus:SI (reg/f:SI 53 virtual-incoming-args)
                (const_int 20 [0x14])) [0 MEM[(struct cons &)&arg_refs +
4].head+0 S4 A32])) include/base/thread_management.h:1534 -1
     (nil))

(insn 24 23 25 4 (set (mem:SI (plus:SI (reg/f:SI 53 virtual-incoming-args)
                (const_int 8 [0x8])) [0 S4 A32])
        (reg/f:SI 82)) include/base/thread_management.h:1534 -1
     (nil))

(insn 25 24 26 4 (parallel [
            (set (reg:SI 83)
                (plus:SI (reg/f:SI 53 virtual-incoming-args)
                    (const_int 16 [0x10])))
            (clobber (reg:CC 17 flags))
        ]) step-14.cc:4271 -1
     (nil))

(insn 26 25 27 4 (set (reg/f:SI 84)   <----
        (mem/f/c:SI (reg:SI 83) [0 MEM[(struct cons &)&arg_refs]+0 S4 A32]))
include/base/thread_management.h:1534 -1 <----
     (nil))

(insn 27 26 28 4 (set (mem:SI (plus:SI (reg/f:SI 53 virtual-incoming-args)
                (const_int 4 [0x4])) [0 S4 A32])
        (reg/f:SI 84)) include/base/thread_management.h:1534 -1
     (nil))

(insn 28 27 29 4 (parallel [
            (set (reg:SI 85)
                (plus:SI (reg/v/f:SI 77 [ obj ])
                    (reg:SI 74 [ fun_ptr$__delta ])))
            (clobber (reg:CC 17 flags))
        ]) include/base/thread_management.h:1534 -1
     (nil))

(insn 29 28 30 4 (set (mem:SI (reg/f:SI 53 virtual-incoming-args) [0 S4 A32])
        (reg:SI 85)) include/base/thread_management.h:1534 -1
     (nil))

(call_insn/j 30 29 31 4 (call (mem:QI (reg/f:SI 59 [ fun_ptr$__pfn ]) [0
*fun_ptr$__pfn_23 S1 A8])
        (const_int 20 [0x14])) include/base/thread_management.h:1534 -1
     (nil)
    (expr_list:REG_DEP_TRUE (use (mem/f/i:SI (reg/f:SI 53
virtual-incoming-args) [0 S4 A32]))
        (expr_list:REG_DEP_TRUE (use (mem/f/i:SI (plus:SI (reg/f:SI 53
virtual-incoming-args)
                        (const_int 4 [0x4])) [0 S4 A32]))
            (expr_list:REG_DEP_TRUE (use (mem/f/i:SI (plus:SI (reg/f:SI 53
virtual-incoming-args)
                            (const_int 8 [0x8])) [0 S4 A32]))
                (expr_list:REG_DEP_TRUE (use (mem/f/i:SI (plus:SI (reg/f:SI 53
virtual-incoming-args)
                                (const_int 12 [0xc])) [0 S4 A32]))
                    (expr_list:REG_DEP_TRUE (use (mem/f/i:SI (plus:SI (reg/f:SI
53 virtual-incoming-args)
                                    (const_int 16 [0x10])) [0 S4 A32]))
                        (nil)))))))


You can see that the address of the 4-th param is calculated in a different
way. We calculate a sum, store it in a register, load memory from that address
and then put it on the new stack.
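
To make the hazard concrete, here is a minimal sketch that replays the loads
and stores of insns 19-27 on a plain array standing in for the memory at
virtual-incoming-args. The byte offsets are taken from the dump above; the
helper names (load, store, replay_dump_order, replay_loads_first) and the
sample values are assumptions made for illustration only, not anything from
GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { WORD = 4, NWORDS = 8 };  /* models byte offsets 0..28 from the arg pointer */

/* Read the 32-bit word at byte offset OFF in the modeled argument area. */
static uint32_t load(const uint32_t *frame, size_t off)
{
    assert(off % WORD == 0 && off / WORD < NWORDS);
    return frame[off / WORD];
}

/* Write VAL to the 32-bit word at byte offset OFF. */
static void store(uint32_t *frame, size_t off, uint32_t val)
{
    assert(off % WORD == 0 && off / WORD < NWORDS);
    frame[off / WORD] = val;
}

/* Same order as the dump: each load is immediately followed by its store. */
void replay_dump_order(uint32_t *frame)
{
    store(frame, 16, load(frame, 28));  /* insns 19-20 */
    store(frame, 12, load(frame, 24));  /* insns 21-22 */
    store(frame,  8, load(frame, 20));  /* insns 23-24 */
    store(frame,  4, load(frame, 16));  /* insns 25-27: offset 16 was already
                                           overwritten by insn 20 */
}

/* Hypothetical safe order for comparison: read everything, then write. */
void replay_loads_first(uint32_t *frame)
{
    uint32_t a = load(frame, 28);
    uint32_t b = load(frame, 24);
    uint32_t c = load(frame, 20);
    uint32_t d = load(frame, 16);

    store(frame, 16, a);
    store(frame, 12, b);
    store(frame,  8, c);
    store(frame,  4, d);
}
```

With distinct values at offsets 16..28, replay_dump_order leaves the slot at
offset 4 holding the value that originally sat at offset 28, while
replay_loads_first leaves it holding the original value from offset 16.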

BUT. The predicate that checks for memory overlap looks like this:
 static bool
  mem_overlaps_already_clobbered_arg_p (rtx addr, unsigned HOST_WIDE_INT size)
  {
    HOST_WIDE_INT i;

    if (addr == crtl->args.internal_arg_pointer)
      i = 0;
    else if (GET_CODE (addr) == PLUS
             && XEXP (addr, 0) == crtl->args.internal_arg_pointer
             && CONST_INT_P (XEXP (addr, 1)))
      i = INTVAL (XEXP (addr, 1));
    /* Return true for arg pointer based indexed addressing.  */
    else if (GET_CODE (addr) == PLUS
             && (XEXP (addr, 0) == crtl->args.internal_arg_pointer
                 || XEXP (addr, 1) == crtl->args.internal_arg_pointer))
      return true;
    else
      return false;
.....

You can see that if we have a load whose address does not look like (esp+*),
the routine always states that there is no overlap.

That is why the tail call is applied when it must not be.
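
A rough model of just the address classification at the top of that predicate
may help. This is only a sketch: struct addr, enum verdict and classify are
hypothetical stand-ins, not GCC types or functions, and the elided remainder
of the real routine is deliberately not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an RTL address; none of this is GCC's own API. */
enum addr_code { ADDR_REG, ADDR_CONST, ADDR_PLUS };

struct addr {
    enum addr_code code;
    const struct addr *op0, *op1;  /* operands, used when code == ADDR_PLUS */
    long value;                    /* constant, used when code == ADDR_CONST */
};

enum verdict {
    CHECK_OFFSET,      /* offset is known; the real routine goes on to test it */
    ASSUME_OVERLAP,    /* arg-pointer based indexed addressing: return true */
    ASSUME_NO_OVERLAP  /* anything else: return false */
};

/* Mirrors the if/else chain quoted above, in the same order. */
enum verdict classify(const struct addr *a, const struct addr *arg_ptr,
                      long *offset)
{
    if (a == arg_ptr) {
        *offset = 0;
        return CHECK_OFFSET;
    }
    if (a->code == ADDR_PLUS && a->op0 == arg_ptr
        && a->op1->code == ADDR_CONST) {
        *offset = a->op1->value;
        return CHECK_OFFSET;
    }
    if (a->code == ADDR_PLUS && (a->op0 == arg_ptr || a->op1 == arg_ptr))
        return ASSUME_OVERLAP;
    return ASSUME_NO_OVERLAP;
}
```

In this model, an address written as (plus arg_ptr 16) yields CHECK_OFFSET with
offset 16, whereas a bare register that merely holds the same sum, like reg 83
in insn 26, yields ASSUME_NO_OVERLAP.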
