Sorry, my previous answer was cut off.

The motivation for prepass global code motion is indeed that, after register allocation, inter-block scheduling is even more restricted because of anti-dependencies, including those caused by values that are live-out on side-exit branches. Global code motion is a key performance enabler, especially for non-temporal loads (i.e. loads that bypass the L1 cache), which have an exposed latency close to 20 cycles on the current kvx cores.
The dataflow issue encountered with SEL_SCHED in prepass with control speculation enabled was inconsistent liveness reported by the compiler. I am running a test suite to reproduce it (I saw it 3 months ago).

Here is again the motivating example, where I expect the scheduler to speculate the loads from the second block into the first block of the loop. The first block dominates the second, so in principle SCHED_RGN should do it:

    typedef struct list_cell_ {
      struct list_cell_ *next;
      float payload;
    } list_cell_, *list_cell;

    float list_sum(list_cell_ *list) {
      float result = 0.0;
      while (list->next) {
        list = list->next;
        result += 1.0f/list->payload;
        if (!list->next) break;
        list = list->next;
        result += 1.0f/list->payload;
      }
      return result;
    }

Here is the TARGET_SCHED_SET_SCHED_FLAGS hook, with comments that reflect my understanding of what to do. The commented-out condition would prevent SEL_SCHED with control speculation unless in postpass (as in ia64):

    static void
    kvx_sched_set_sched_flags (struct spec_info_def *spec_info)
    {
      unsigned int *flags = &(current_sched_info->flags);
      // Speculative scheduling is enabled by non-zero spec_info->mask.
      spec_info->mask = 0;
      if (*flags & (SEL_SCHED | SCHED_RGN)) {
        //if (!sel_sched_p () || reload_completed) {
          // Must do this in case of speculation.
          *flags |= USE_DEPS_LIST | DO_SPECULATION;
          // Do control speculation only.
          spec_info->mask = BEGIN_CONTROL;
          // Speculative scheduling without CHECK.
          spec_info->flags = SEL_SCHED_SPEC_DONT_CHECK_CONTROL;
          // Dump into the sched_dump.
          spec_info->dump = sched_dump;
        //}
      }
    }

The TARGET_SCHED_SPECULATE_INSN hook is implemented as follows (it should memoize and return 0 if the insn was already speculated with the same ts; I assume that is not relevant here):

    static int
    kvx_sched_speculate_insn (rtx_insn *insn, ds_t ts, rtx *new_pat)
    {
      rtx pattern = PATTERN (insn);
      if (GET_CODE (pattern) == SET) {
        rtx src = SET_SRC (pattern);
        if (GET_CODE (src) == MEM) {
          // A load: speculate it with an unchanged pattern.
          *new_pat = pattern;
          return 1;
        }
      }
      // Anything else cannot be speculated.
      return -1;
    }

And TARGET_SCHED_NEEDS_BLOCK_P always returns false.
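To make the expected motion concrete, here is a source-level sketch of the loop with the loads of the second block hoisted above the side exit. It only illustrates the intended schedule and is not compiler output. Plain C has no dismissible load, so the sketch assumes a hypothetical dummy cell that is read when the speculated address would be NULL; the names list_sum_spec, dummy_cell and versions_agree are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list_cell_ {
  struct list_cell_ *next;
  float payload;
} list_cell_;

/* Reference version: same loop as the motivating example. */
float list_sum(list_cell_ *list) {
  float result = 0.0f;
  while (list->next) {
    list = list->next;
    result += 1.0f / list->payload;
    if (!list->next) break;
    list = list->next;
    result += 1.0f / list->payload;
  }
  return result;
}

/* ASSUMPTION: stand-in for a non-faulting load. The dummy cell is read
   instead of dereferencing NULL; its contents are never used. */
list_cell_ dummy_cell = { NULL, 1.0f };

/* Hand-speculated version: both loads of the second block are issued
   in the first block, before the side exit is tested. */
float list_sum_spec(list_cell_ *list) {
  float result = 0.0f;
  list_cell_ *first = list->next;
  while (first) {
    list_cell_ *second = first->next;
    list_cell_ *safe = second ? second : &dummy_cell;
    float payload2 = safe->payload;   /* speculated load */
    list_cell_ *next2 = safe->next;   /* speculated load */
    result += 1.0f / first->payload;
    if (!second) break;               /* side exit */
    result += 1.0f / payload2;
    first = next2;
  }
  return result;
}

enum { MAX_CELLS = 16 };

/* Builds a head cell followed by n cells and returns 1 when both
   versions compute the same sum (within a small tolerance). */
int versions_agree(int n) {
  list_cell_ cells[MAX_CELLS];
  assert(n >= 0 && n < MAX_CELLS);
  for (int i = 0; i <= n; i++) {
    cells[i].next = (i < n) ? &cells[i + 1] : NULL;
    cells[i].payload = (float)(i + 1);
  }
  float d = list_sum(cells) - list_sum_spec(cells);
  return d > -1e-6f && d < 1e-6f;
}
```

Calling versions_agree with both odd and even values of n exercises the side exit as well as the normal loop exit.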
When I compile the motivating example above for the KVX, kvx_sched_speculate_insn() is indeed called with reload_completed==0 (prepass) for the two loads of the second block, but no code motion to the first block happens. The generated code is the same for SCHED_RGN (default) and SEL_SCHED (-fselective-scheduling), up to a renaming of the registers, although SEL_SCHED calls kvx_sched_speculate_insn() several times for each load.

For the ia64 on the motivating example, it seems there is no prepass control speculation either:

    ./gcc/ia64/gcc/cc1 -fpreprocessed list_sum2.i -quiet -dumpbase list_sum2.c -dp -auxbase list_sum2 -O3 -version -ffast-math -o list_sum2.s -da -dp -msched-control-spec -msched-in-control-spec
    grep _speculative list_sum2.c.*
    list_sum2.c.298r.mach: ] UNSPEC_LDS)) 24 {movsf_speculative}
    ...

I noticed that the ia64 target uses the undocumented target hooks TARGET_SCHED_GET_INSN_SPEC_DS and TARGET_SCHED_GET_INSN_CHECKED_DS, whose code is actually executed on this example.

Any recommendation on how to get load control speculation in prepass for any of the GCC 7.5 targets?

Best,
Benoît