On Mon, Mar 14, 2016 at 5:35 PM, Francisco Jerez <curroje...@riseup.net> wrote: > Matt Turner <matts...@gmail.com> writes: > >> On Mon, Mar 14, 2016 at 5:10 PM, Francisco Jerez <curroje...@riseup.net> >> wrote: >>> Matt Turner <matts...@gmail.com> writes: >>> >>>> Missing this causes an assertion failure in the scheduler with the next >>>> patch. >>>> --- >>>> src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h >>>> b/src/mesa/drivers/dri/i965/brw_ir_vec4.h >>>> index 660beca..7cedf8e 100644 >>>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h >>>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h >>>> @@ -201,7 +201,8 @@ public: >>>> { >>>> return (conditional_mod && (opcode != BRW_OPCODE_SEL && >>>> opcode != BRW_OPCODE_IF && >>>> - opcode != BRW_OPCODE_WHILE)); >>>> + opcode != BRW_OPCODE_WHILE)) || >>>> + opcode == TCS_OPCODE_SRC0_010_IS_ZERO; >>> >>> Meh... Any reason this weird instruction doesn't have the >>> conditional_mod set on creation? Having the generator set it implicitly >>> makes the instruction *less* useful and is the only reason we need to >>> introduce these hacks. >> >> None that I know of. I'd be happy with that fix as a replacement for this >> patch. >> >>> AFAICT the TCS_OPCODE_SRC0_010_IS_ZERO opcode is actually redundant and >>> equivalent to: >>> >>> broadcast.nz.f0 null.xyzw, src.xxxx, 0UD >> >> I think there's a difference -- that broadcast instruction treats the >> two vec4 halves independently, reading a component from each of them. >> The SRC0_010_IS_ZERO reads a single component of the whole vec8 (I >> don't see calls that would set it, but that region implies it's an >> align1 instruction). Does that seem right? >> > broadcast is supposed to be about taking the value from a given channel > (which will be a vec4 in Align16 and a scalar in Align1) and > "broadcasting" it to all channels of the destination. In Align16 mode > and if the second source is an immediate it basically boils down to:
Yep, you're right. I had to look at what broadcast did again, and it looks like it generates equivalent code (though in two instructions). >> brw_MOV(p, dst, stride(suboffset(src, 4 * i), 0, 4, 1)); > > (from brw_eu_emit.c:3365), which seems pretty close to what TCS want. > Anyway I'm fine with this patch landing as-is, this can always be > cleaned up later. Okay. I do like the suggestion to just set condmod at a higher level. I'll send a patch that does that to replace this one. Thanks! _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev