On 7 October 2015 at 18:04, Connor Abbott <[email protected]> wrote: > On Wed, Oct 7, 2015 at 7:51 AM, Emil Velikov <[email protected]> wrote: >> XXX: commit message, comment in nir_intrinsics.h >> >> Signed-off-by: Emil Velikov <[email protected]> >> --- >> src/glsl/nir/glsl_to_nir.cpp | 6 ++++++ >> src/glsl/nir/nir_intrinsics.h | 2 ++ >> 2 files changed, 8 insertions(+) >> >> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp >> index efaa73e..231bdbf 100644 >> --- a/src/glsl/nir/glsl_to_nir.cpp >> +++ b/src/glsl/nir/glsl_to_nir.cpp >> @@ -698,6 +698,8 @@ nir_visitor::visit(ir_call *ir) >> op = nir_intrinsic_ssbo_atomic_exchange; >> } else if (strcmp(ir->callee_name(), >> "__intrinsic_ssbo_atomic_comp_swap_internal") == 0) { >> op = nir_intrinsic_ssbo_atomic_comp_swap; >> + } else if (strcmp(ir->callee_name(), "__intrinsic_shader_clock") == >> 0) { >> + op = nir_intrinsic_shader_clock; >> } else { >> unreachable("not reached"); >> } >> @@ -802,6 +804,10 @@ nir_visitor::visit(ir_call *ir) >> case nir_intrinsic_memory_barrier: >> nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr); >> break; >> + case nir_intrinsic_shader_clock: >> + nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL); >> + nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr); >> + break; >> case nir_intrinsic_store_ssbo: { >> exec_node *param = ir->actual_parameters.get_head(); >> ir_rvalue *block = ((ir_instruction *)param)->as_rvalue(); >> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h >> index 263d8c1..4b32215 100644 >> --- a/src/glsl/nir/nir_intrinsics.h >> +++ b/src/glsl/nir/nir_intrinsics.h >> @@ -83,6 +83,8 @@ BARRIER(discard) >> */ >> BARRIER(memory_barrier) >> >> +INTRINSIC(shader_clock, 0, ARR(), true, 1, 1, 0, 0 /* flags ? */) > > This should have NIR_INTRINSIC_CAN_DELETE, since if the result is > unused we can safely delete it (i.e. it has no side effects), but we > can't safely reorder it. > Thanks. Will do.
> Side note: NIR's current model, as well as any more flexible memory > model we might adopt in the future, assumes that intrinsics which are > marked as reorderable, as well as ALU operations which are implicitly > reorderable, can be freely reordered with respect to *any* other > operation, even one that's explicitly not reorderable. So, for > example, if you do: > > ... = clock(); > a = b + c; > ... = clock(); > > then there are no guarantees that the addition won't get moved outside > the clock() calls. Currently, this will only happen if the addition > becomes involved in some algebraic optimization or CSE, but in the > future with passes like GCM that move code around indiscriminately > it's going to be much more of a problem. I don't think we could really > solve this problem in a useful and general way without making the rest > of NIR significantly more complicated and slower, which I definitely > don't want. I think the best answer is to say "really these tools are > unreliable and meant mainly for driver developers and people who know > what they're doing, and if you use them you have to be prepared to > look at the assembly source and see if it matches what you expected." > I haven't looked at the optimisations closely and I assumed that all intrinsics act as motion barriers. Seems like I was mistaking. Can we call it a "where sub-group is implementation dependent" and be done with it ;-) Thanks Emil _______________________________________________ mesa-dev mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/mesa-dev
