https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122686
--- Comment #22 from Haochen Jiang <haochen.jiang at intel dot com> ---
(In reply to Andrew Macleod from comment #19)
> (In reply to Haochen Jiang from comment #18)
> > I forgot to attach the gimple it stuck at:
> >
> > It stuck here:
> >
> > (gdb) p *(gimple*)(0x7fe446e45320)
> > $4 = {code = GIMPLE_CALL, no_warning = 0, visited = 1, nontemporal_move = 0,
> > plf = 1, modified = 0, has_volatile_ops = 0, ilf = 0, subcode = 0, uid = 2,
> > num_ops = 6, location = 4611686018427388158, bb = 0x7fe4470e4240,
> > next = 0x7fe4473fe660, prev = 0x7fe44739e660}
>
> huh. That seems very weird. The tracebacks are all in range_of_range_op,
> meaning there is a handler.. but calls are handled by range_of_call ().
> And we don't process arguments of calls either.
>
> OH!... but we do handle calls that are builtins in
> gimple_range_op_handler::maybe_builtin_call by creating custom handlers for
> specific calls. So this is most likely a builtin call, so we'll want to
> check the handler.
>
> Its also odd that its appears to be in an infinite loop, yet does not blow
> the stack. It must be cycling between the same set of statements. What
> is the CALL statement? its also trying to get ranges on an edge, so
> presumably at least one of the parameters to this builtin are coming from a
> different basic block. perhaps the cache propagator is getting confused by
> the CFG state?
>
> Can you attach both the .optimized and what there is of .expand listings?
> and print_gimple_stmt (stderr, stmt, 0, 0) on that statement that we are not
> getting past... I want to know which statement in the listing we are stuck
> on I'll look to see if there is anything odd, and then probably have more
> questions :-p
With -march=znver5, the build does not hang. (BTW, with -march=znver3 it hangs
as well, just like icelake-server.) The expand pass output is nearly the same
for -march=znver5 and -march=icelake-server.
For znver5, the dump continues past the point where icelake-server got stuck:
;; Generating RTL for gimple basic block 17
Swap operands in stmt:
ivtmp.953_2709 = _2644 + _2648;
Cost left opnd=0, right opnd=1
Registering value_relation (_712 pe64 _139) (bb10) at _712 = (unsigned long)
_139;
;; ivtmp.953_2709 = _2648 + _2644;
(insn 423 422 424 (parallel [
(set (reg:DI 1454 [ _2648 ])
(ashift:DI (reg:DI 148 [ ivtmp.961 ])
(const_int 2 [0x2])))
(clobber (reg:CC 17 flags))
]) -1
(nil))
(insn 424 423 0 (parallel [
(set (reg:DI 1203 [ ivtmp.953 ])
(plus:DI (reg:DI 1454 [ _2648 ])
(reg:DI 1184 [ _2644 ])))
(clobber (reg:CC 17 flags))
]) -1
(nil))
;; _88 = (int) _2059;
(insn 425 424 0 (parallel [
(set (reg:SI 149 [ _88 ])
(plus:SI (reg:SI 115 [ _18 ]) <--- The icelake-server expand dump stops here.
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
]) -1
(nil))
;; Generating RTL for gimple basic block 18
Selected stringop expansion strategy: libcall
;; __builtin_memcpy (_1815, _524, _1809);
(insn 427 426 428 (set (reg:DI 1455)
(reg:DI 1269 [ ivtmp.951 ])) "module_polarfft.fppized.f90":390:44
discrim 1 -1
(nil))
(insn 428 427 429 (set (reg:DI 1456)
(reg:DI 1203 [ ivtmp.953 ])) "module_polarfft.fppized.f90":390:44
discrim 1 -1
(nil))
(insn 429 428 430 (set (reg:DI 1457)
(reg:DI 797 [ _1809 ])) "module_polarfft.fppized.f90":390:44 discrim 1
-1
(nil))
(insn 430 429 431 (set (reg:DI 1 dx)
(reg:DI 1457)) "module_polarfft.fppized.f90":390:44 discrim 1 -1
(nil))
(insn 431 430 432 (set (reg:DI 4 si)
(reg:DI 1456)) "module_polarfft.fppized.f90":390:44 discrim 1 -1
(nil))
(insn 432 431 433 (set (reg:DI 5 di)
(reg:DI 1455)) "module_polarfft.fppized.f90":390:44 discrim 1 -1
(nil))
(call_insn 433 432 434 (set (reg:DI 0 ax)
(call (mem:QI (symbol_ref:DI ("memcpy") [flags 0x41] <function_decl
0x727127cd2700 __builtin_memcpy>) [0 __builtin_memcpy S1 A8])
(const_int 0 [0]))) "module_polarfft.fppized.f90":390:44 discrim 1
-1
(expr_list:REG_CALL_DECL (symbol_ref:DI ("memcpy") [flags 0x41]
<function_decl 0x727127cd2700 __builtin_memcpy>)
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil)))
(expr_list:DI (set (reg:DI 0 ax)
(reg:DI 5 di))
(expr_list:DI (use (reg:DI 5 di))
(expr_list:DI (use (reg:DI 4 si))
(expr_list:DI (use (reg:DI 1 dx))
(nil))))))
The basic blocks are nearly the same in the .optimized dumps:
For -march=znver5:
;; basic block 17, loop depth 1
;; pred: 16
_2648 = ivtmp.961_87 * 4;
ivtmp.953_2709 = _2644 + _2648;
_2170 = (unsigned int) _18;
_2059 = _2170 + 1;
_88 = (int) _2059;
;; succ: 18
;; basic block 18, loop depth 2
;; pred: 18
;; 17
# k_304 = PHI <k_192(18), _19(17)>
# ivtmp.951_2871 = PHI <ivtmp.951_2870(18), _2906(17)>
# ivtmp.953_2769 = PHI <ivtmp.953_2710(18), ivtmp.953_2709(17)>
_1815 = (float *) ivtmp.951_2871;
_524 = (float *) ivtmp.953_2769;
__builtin_memcpy (_1815, _524, _1809);
k_192 = k_304 + 1;
ivtmp.951_2870 = _1947 + ivtmp.951_2871;
ivtmp.953_2710 = _881 + ivtmp.953_2769;
if (_88 == k_192)
goto <bb 19>; [11.00%]
else
goto <bb 18>; [89.00%]
;; succ: 19
;; 18
For -march=icelake-server:
;; basic block 17, loop depth 1
;; pred: 16
_2609 = ivtmp.906_2541 * 4;
ivtmp.896_2506 = _2608 + _2609;
_2515 = (unsigned int) _18;
_2516 = _2515 + 1;
_2517 = (int) _2516;
;; succ: 18
;; basic block 18, loop depth 2
;; pred: 18
;; 17
# k_304 = PHI <k_192(18), _19(17)>
# ivtmp.894_2493 = PHI <ivtmp.894_2494(18), _2500(17)>
# ivtmp.896_2504 = PHI <ivtmp.896_2505(18), ivtmp.896_2506(17)>
_1815 = (float *) ivtmp.894_2493;
_524 = (float *) ivtmp.896_2504;
__builtin_memcpy (_1815, _524, _1809);
k_192 = k_304 + 1;
ivtmp.894_2494 = _2492 + ivtmp.894_2493;
ivtmp.896_2505 = ivtmp.896_2504 + _2519;
if (k_192 == _2517)
goto <bb 19>; [11.00%]
else
goto <bb 18>; [89.00%]
;; succ: 19
;; 18
Could we be stuck at that __builtin_memcpy () call?