fs: generalize SIMD16 interference workaround

Connor Abbott Tue, 03 May 2016 23:03:08 -0700

On Tue, May 3, 2016 at 6:16 PM, Francisco Jerez <[email protected]> wrote:
> Samuel Iglesias Gonsálvez <[email protected]> writes:
>
>> From: Connor Abbott <[email protected]>
>>
>> Work based on registers read/written instead of dispatch_width, so that
>> the interferences are added for 64-bit sources/destinations as well.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 67 
>> ++++++++++++++++-------
>>  1 file changed, 48 insertions(+), 19 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
>> index 2347cd5..f0e96f9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
>> @@ -541,6 +541,34 @@ setup_mrf_hack_interference(fs_visitor *v, struct 
>> ra_graph *g,
>>     }
>>  }
>>
>> +static unsigned
>> +get_reg_node(const fs_reg& reg, unsigned first_payload_node)
>> +{
>> +   switch (reg.file) {
>> +   case VGRF:
>> +      return reg.nr;
>> +   case ARF:
>> +   case FIXED_GRF:
>> +      return first_payload_node + reg.nr;
>> +   case MRF:
>> +   default:
>> +      unreachable("unhandled register file");
>> +   }
>> +
>> +   return 0;
>> +}
>> +
>> +static void
>> +add_reg_interference(struct ra_graph *g, const fs_reg& reg1,
>> +                     const fs_reg& reg2, unsigned first_payload_node)
>> +{
>> +   if ((reg1.file == VGRF || reg1.file == ARF || reg1.file == FIXED_GRF) &&
>> +       (reg2.file == VGRF || reg2.file == ARF || reg2.file == FIXED_GRF)) {
>> +      ra_add_node_interference(g, get_reg_node(reg1, first_payload_node),
>> +                               get_reg_node(reg2, first_payload_node));
>> +   }
>> +}
>> +
>>  bool
>>  fs_visitor::assign_regs(bool allow_spilling)
>>  {
>> @@ -643,26 +671,27 @@ fs_visitor::assign_regs(bool allow_spilling)
>>        }
>>     }
>>
>> -   if (dispatch_width > 8) {
>> -      /* In 16-wide dispatch we have an issue where a compressed
>> -       * instruction is actually two instructions executed simultaneiously.
>> -       * It's actually ok to have the source and destination registers be
>> -       * the same.  In this case, each instruction over-writes its own
>> -       * source and there's no problem.  The real problem here is if the
>> -       * source and destination registers are off by one.  Then you can end
>> -       * up in a scenario where the first instruction over-writes the
>> -       * source of the second instruction.  Since the compiler doesn't know
>> -       * about this level of granularity, we simply make the source and
>> -       * destination interfere.
>> -       */
>> -      foreach_block_and_inst(block, fs_inst, inst, cfg) {
>> -         if (inst->dst.file != VGRF)
>> -            continue;
>> +   /* When instructions both read/write more than a single SIMD8 register, 
>> we
>> +    * have an issue where an instruction is actually two instructions 
>> executed
>> +    * simultaneiously.  It's actually ok to have the source and destination
>> +    * registers be the same.  In this case, each instruction over-writes its
>> +    * own source and there's no problem.  The real problem here is if the
>> +    * source and destination registers are off by one.  Then you can end up 
>> in
>> +    * a scenario where the first instruction over-writes the source of the
>> +    * second instruction.  Since the compiler doesn't know about this level 
>> of
>> +    * granularity, we simply make the source and destination interfere.
>> +    */
>> +   foreach_block_and_inst(block, fs_inst, inst, cfg) {
>> +      if (inst->dst.file != VGRF &&
>> +          inst->dst.file != ARF && inst->dst.file != FIXED_GRF)
>> +         continue;
>>
>> -         for (int i = 0; i < inst->sources; ++i) {
>> -            if (inst->src[i].file == VGRF) {
>> -               ra_add_node_interference(g, inst->dst.nr, inst->src[i].nr);
>> -            }
>> +      if (inst->regs_written <= 1)
>> +         continue;
>> +
>> +      for (int i = 0; i < inst->sources; ++i) {
>> +         if (inst->regs_read(i) > 1) {
>
> What's the point of checking that regs_read(i) > 1?  Isn't this
> supposedly a problem in particular when a source register is fully
> contained inside the first (one register) half of the destination?


No, it's not. It's a problem when the first half of the destination
overlaps with the second register read by source. For example,
something like:

  add(8) g11:DF g10:DF g20:DF

will be broken into:

   add(4) g11:DF g10:DF g20:DF
   add(4) g12:DF g11:DF g21:DF

which obviously doesn't do what you want. Note that something with a
source stride of 2 and a destination stride of 1, like:

   mov(8) g11:UD g10<2>:UD

has the same problem, so checking that regs_read(i) > 1 is both a
necessary and sufficient condition for the problem to hold for some
choice of physical registers.

>
> Either way I have some evidence suggesting that the EU works around this
> problem in hardware on at least some generations by buffering the result
> From the first half of any compressed instruction while the second half
> executes -- Can you come up with a test case where this fixes a problem?
> On what hardware?

I'd be very surprised if that were the case, since to the part of the
pipeline that handles register fetching and writeback, an instruction
like

  add(8) g11:DF g10:DF g20:DF

isn't really that different from

  add(16) g11:F g10:F g20:F

and both the existing workaround in this patch and a separate
workaround in liveness analysis for uniforms already exist to prevent
something like the second instruction from getting generated. Also, I
wouldn't have written this patch if it hadn't fixed something :) I can
try running piglit with this patch reverted tomorrow, if no one beats
me to it.

>
>> +            add_reg_interference(g, inst->dst, inst->src[i], 
>> first_payload_node);
>>           }
>>        }
>>     }
>> --
>> 2.5.0
>>
>> _______________________________________________
>> mesa-dev mailing list
>> [email protected]
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 37/59] i965/fs: generalize SIMD16 interference workaround

Reply via email to