Re: IA64 control speculation of loads

Benoît De Dinechin Tue, 09 Feb 2021 02:34:02 -0800

Sorry, my previous answer was cut off.

The motivation for prepass global code motion is indeed that after register 
allocation, inter-block scheduling is even more restricted due to 
anti-dependencies, including those due to live-out on side exit branches. 
Global code motion is a key performance enabler especially for the non-temporal 
loads (i.e. L1 cache bypass loads), which have an exposed latency close to 20 
cycles on the current kvx cores.


The dataflow issue I encountered with SEL_SCHED in prepass with control 
speculation enabled was inconsistent liveness reported by the compiler. I am 
running a test suite to reproduce it (I last saw it 3 months ago).

Here again is a motivating example where I expect the scheduler to speculate 
loads from the second block of the loop into the first block, which dominates 
it, so in principle SCHED_RGN should do it:

  typedef struct list_cell_ {
    struct list_cell_ *next;
    float payload;
  } list_cell_, *list_cell;

  float
  list_sum(list_cell_ *list)
  {
    float result = 0.0;
    while (list->next) {
      list = list->next;
      result += 1.0f/list->payload;
      if (!list->next) break;
      list = list->next;
      result += 1.0f/list->payload;
    }
    return result;
  }
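
To make the expected motion concrete, here is a rough source-level sketch of 
the loop after the loads of the second block have been hoisted above the side 
exit. It is not compiler output. The names list_sum_speculated, safe_cell and 
dummy_cell are hypothetical, and redirecting a NULL pointer to a dummy cell is 
only a stand-in for a dismissible (non-faulting) load, which plain C cannot 
express:

```c
#include <stddef.h>

typedef struct list_cell_ {
  struct list_cell_ *next;
  float payload;
} list_cell_, *list_cell;

/* Hypothetical stand-in for a non-faulting load: a NULL pointer is
   redirected to a dummy cell, so the early load never traps.  */
static list_cell_ dummy_cell = { NULL, 1.0f };

static inline list_cell_ *
safe_cell (list_cell_ *cell)
{
  return cell ? cell : &dummy_cell;
}

float
list_sum_speculated (list_cell_ *list)
{
  float result = 0.0f;
  while (list->next) {
    list = list->next;
    /* Loads that belong to the second block, issued in the first block
       before the side exit is resolved.  */
    list_cell_ *spec_cell = safe_cell (list->next);
    float spec_payload = spec_cell->payload;
    result += 1.0f/list->payload;
    if (!list->next) break;
    /* Second block: only consumes the values loaded speculatively.  */
    list = spec_cell;
    result += 1.0f/spec_payload;
  }
  return result;
}
```

The additions happen in the same order as in list_sum, so the result is 
unchanged; only the point where the second payload load is issued moves.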

Here is the TARGET_SCHED_SET_SCHED_FLAGS implementation, with comments that 
reflect my understanding of what to do. The commented-out line prevents 
SEL_SCHED with control speculation unless postpass (as in ia64):

  static void
  kvx_sched_set_sched_flags (struct spec_info_def *spec_info)
  {
    unsigned int *flags = &(current_sched_info->flags);
    // Speculative scheduling is enabled by non-zero spec_info->mask.
    spec_info->mask = 0;
    if (*flags & (SEL_SCHED | SCHED_RGN))
      {
        //if (!sel_sched_p () || reload_completed)
          {
            // Must do this in case of speculation.
            *flags |= USE_DEPS_LIST | DO_SPECULATION;
            // Do control speculation only.
            spec_info->mask = BEGIN_CONTROL;
            // Speculative scheduling without CHECK.
            spec_info->flags = SEL_SCHED_SPEC_DONT_CHECK_CONTROL;
            // Dump into the sched_dump.
            spec_info->dump = sched_dump;
          }
      }
  }

TARGET_SCHED_SPECULATE_INSN is implemented by the following (it should memoize 
and return 0 if the insn was already speculated with the same ts; I assume 
that is not relevant here):

  static int
  kvx_sched_speculate_insn (rtx_insn *insn, ds_t ts, rtx *new_pat)
  {
    rtx pattern = PATTERN (insn);
    if (GET_CODE (pattern) == SET)
      {
        rtx src = SET_SRC (pattern);
        if (GET_CODE (src) == MEM)
          {
            *new_pat = pattern;
            return 1;
          }
      }
    return -1;
  }

And TARGET_SCHED_NEEDS_BLOCK_P always returns false.
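
For reference, a minimal sketch of such a hook, assuming the GCC 7 signature 
that takes the dependence status of the speculative insn. The function name 
kvx_sched_needs_block_p is hypothetical, and the ds_t typedef is a stand-in so 
that the fragment compiles outside the GCC tree:

```c
#include <stdbool.h>

/* Stand-in for GCC's ds_t so the sketch compiles on its own.  */
typedef unsigned int ds_t;

/* Never ask the scheduler for a branchy recovery block: with
   SEL_SCHED_SPEC_DONT_CHECK_CONTROL no check insn is emitted.  */
static bool
kvx_sched_needs_block_p (ds_t ts)
{
  (void) ts;  /* The dependence status is deliberately ignored.  */
  return false;
}

#undef TARGET_SCHED_NEEDS_BLOCK_P
#define TARGET_SCHED_NEEDS_BLOCK_P kvx_sched_needs_block_p
```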

When I compile the motivating example above for the KVX, 
kvx_sched_speculate_insn() is indeed called with reload_completed==0 (prepass) 
for the two loads of the second block, but no code motion to the first block 
happens. Generated code is the same for SCHED_RGN (default) or SEL_SCHED 
(-fselective-scheduling), up to a renaming of the registers, although SEL_SCHED 
calls kvx_sched_speculate_insn() several times for each load.

For ia64 on the same motivating example, there seems to be no prepass control 
speculation either:

  ./gcc/ia64/gcc/cc1 -fpreprocessed list_sum2.i -quiet -dumpbase list_sum2.c 
-dp -auxbase list_sum2 -O3 -version -ffast-math -o list_sum2.s -da -dp 
-msched-control-spec -msched-in-control-spec
  grep _speculative list_sum2.c.*
  list_sum2.c.298r.mach:            ] UNSPEC_LDS)) 24 {movsf_speculative}
  ...

I noticed that the ia64 target uses the undocumented target hooks 
TARGET_SCHED_GET_INSN_SPEC_DS and TARGET_SCHED_GET_INSN_CHECKED_DS whose code 
is actually executed on this example.

Any recommendation on how to get load control speculation in prepass for any of 
the GCC 7.5 targets?

Best,

Benoît
