On Monday, March 14, 2016 5:25:58 PM PDT Matt Turner wrote: > On Mon, Mar 14, 2016 at 5:22 PM, Matt Turner <matts...@gmail.com> wrote: > > On Mon, Mar 14, 2016 at 5:10 PM, Francisco Jerez <curroje...@riseup.net> wrote: > >> Matt Turner <matts...@gmail.com> writes: > >> > >>> Missing this causes an assertion failure in the scheduler with the next > >>> patch. > >>> --- > >>> src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++- > >>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/ dri/i965/brw_ir_vec4.h > >>> index 660beca..7cedf8e 100644 > >>> --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h > >>> +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h > >>> @@ -201,7 +201,8 @@ public: > >>> { > >>> return (conditional_mod && (opcode != BRW_OPCODE_SEL && > >>> opcode != BRW_OPCODE_IF && > >>> - opcode != BRW_OPCODE_WHILE)); > >>> + opcode != BRW_OPCODE_WHILE)) || > >>> + opcode == TCS_OPCODE_SRC0_010_IS_ZERO; > >> > >> Meh... Any reason this weird instruction doesn't have the > >> conditional_mod set on creation? Having the generator set it implicitly > >> makes the instruction *less* useful and is the only reason we need to > >> introduce these hacks. > > > > None that I know of. I'd be happy with that fix as a replacement for this patch.
Setting it in the visitor seems totally reasonable. I like that better. > >> AFAICT the TCS_OPCODE_SRC0_010_IS_ZERO opcode is actually redundant and > >> equivalent to: > >> > >> broadcast.nz.f0 null.xyzw, src.xxxx, 0UD > > > > I think there's a difference -- that broadcast instruction treats the > > two vec4 halves independently, reading a component from each of them. > > The SRC0_010_IS_ZERO reads a single component of the whole vec8 (I > > don't see calls that would set it, but that region implies it's an > > align1 instruction). Does that seem right? Yeah, that's the point - we want to make both SIMD4x2 halves read the same 32-bit scalar value. I didn't realize that broadcast existed in the vec4 backend. I don't think it does that, though...? > > Uh, this seems weird. That apparently generates: > > mov.z.f0(8) null<1>F g5<0,4,1>.xUD { align16 1Q }; > > in arb_tessellation_shader/execution/nop.shader_test. > > I wonder if Ken intended that instruction to be align1? I don't recall. Does it matter? Aren't mov.z.f0(8) null<1>F g5<0,4,1>.xUD { align16 1Q }; and mov.z.f0(8) null<1>F g5<0,1,0>UD { align1 1Q }; both equivalent, anyway? I think I named the opcode "010" because that's effectively what it does, even if I did it in align16 mode. I never wanted to make this opcode - it's weird and stupid - but we don't have great regioning controls in the vec4 backend today, and it does the job. I don't see much point in trying to pretty it (other than fixing the conditional_mod thing!). --Ken
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev