On Sat, Mar 26, 2016 at 2:02 PM, Rob Clark <[email protected]> wrote:
> From: Rob Clark <[email protected]> > > A pass to lower complex (struct/array/mat) inputs/outputs to primitive > types. This allows, for example, linking that removes unused components > of a larger type which is not indirectly accessed. > > In the near term, it is needed for gallium (mesa/st) support for NIR, > since only used components of a type are assigned VBO slots, and we > otherwise have no way to represent that to the driver backend. But it > should be useful for doing shader linking in NIR. > > Signed-off-by: Rob Clark <[email protected]> > --- > src/compiler/Makefile.sources | 1 + > src/compiler/nir/nir.h | 1 + > src/compiler/nir/nir_lower_io_types.c | 178 > ++++++++++++++++++++++++++++++++++ > 3 files changed, 180 insertions(+) > create mode 100644 src/compiler/nir/nir_lower_io_types.c > > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources > index ae7efbf..8f1ffdc 100644 > --- a/src/compiler/Makefile.sources > +++ b/src/compiler/Makefile.sources > @@ -196,6 +196,7 @@ NIR_FILES = \ > nir/nir_lower_idiv.c \ > nir/nir_lower_indirect_derefs.c \ > nir/nir_lower_io.c \ > + nir/nir_lower_io_types.c \ > nir/nir_lower_outputs_to_temporaries.c \ > nir/nir_lower_passthrough_edgeflags.c \ > nir/nir_lower_phis_to_scalar.c \ > diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h > index 65f75bd..7f7c459 100644 > --- a/src/compiler/nir/nir.h > +++ b/src/compiler/nir/nir.h > @@ -2172,6 +2172,7 @@ void nir_lower_io(nir_shader *shader, > nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr); > nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr); > > +void nir_lower_io_types(nir_shader *shader, int (*type_size)(const struct > glsl_type *)); > void nir_lower_vars_to_ssa(nir_shader *shader); > > bool nir_remove_dead_variables(nir_shader *shader); > diff --git a/src/compiler/nir/nir_lower_io_types.c > b/src/compiler/nir/nir_lower_io_types.c > new file mode 100644 > index 0000000..b7d9ebe > --- /dev/null > +++ b/src/compiler/nir/nir_lower_io_types.c > @@ -0,0 +1,178 @@ > +/* > + * Copyright © 2016 Red Hat > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without > limitation > + * the rights to use, copy, modify, merge, publish, distribute, > sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the > next > + * paragraph) shall be included in all copies or substantial portions of > the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT > SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > ARISING FROM, > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS > IN THE > + * SOFTWARE. > + */ > + > +#include "nir.h" > +#include "nir_builder.h" > + > +/* Lower complex (struct/array/mat) input and output vars to primitive > types > + * (vec4) for linking. All indirect input/output access should already be > + * lowered (ie. nir_lower_io_to_temporaries). > + */ > + > +struct lower_io_types_state { > + nir_shader *shader; > + struct exec_list new_ins; > + struct exec_list new_outs; > + int (*type_size)(const struct glsl_type *); > +}; > + > +static nir_variable * > +get_new_var(struct lower_io_types_state *state, nir_variable *var, > unsigned off) > +{ > + struct exec_list *list; > + > + if (var->data.mode == nir_var_shader_in) { > + list = &state->new_ins; > + } else { > + assert(var->data.mode == nir_var_shader_out); > + list = &state->new_outs; > + } > + > + nir_foreach_variable(nvar, list) { > + if (nvar->data.location == (var->data.location + off)) > + return nvar; > + } > Doing a linear search here could get expensive. Why not just have an array that maps locations to variables? I think you have a maximum of 64 possible locations (if you disregard tess) so it's not that memory-intensive. > + > + /* doesn't already exist, so we need to create a new one: */ > + /* TODO figure out if scalar vs vec, and if float/int/uint/(double?) > + * do we need to fixup interpolation mode for int vs float components > + * of a struct, etc.. > + */ > + const struct glsl_type *ntype = glsl_vec4_type(); > + nir_variable *nvar = nir_variable_create(state->shader, var->data.mode, > + ntype, NULL); > Do you want to create a new one or clone? Cloning seems better because that would ensure that interp qualifiers etc. get copied over. Also, I seem to recall it being possible for outputs to have constant initializers coming out of GLSL IR. This pass doesn't handle that. > + > + nvar->name = ralloc_asprintf(nvar, "%s@%u", var->name, off); > + nvar->data = var->data; > + nvar->data.location += off; > + > + /* nir_variable_create is too clever for it's own good: */ > + exec_node_remove(&nvar->node); > + exec_node_self_link(&nvar->node); /* no delinit() :-( */ > This isn't needed. Re-insertion just stomps whatever was there before. Incidentally, if you had an array, you could just let nir_variable_create be clever. > + > + exec_list_push_tail(list, &nvar->node); > + > + /* remove existing var from input/output list: */ > + exec_node_remove(&var->node); > + exec_node_self_link(&var->node); > This isn't needed either. > + > + return nvar; > +} > + > +static unsigned > +get_deref_offset(struct lower_io_types_state *state, nir_deref *tail) > +{ > + unsigned offset = 0; > + > + while (tail->child != NULL) { > + const struct glsl_type *parent_type = tail->type; > + tail = tail->child; > + > + if (tail->deref_type == nir_deref_type_array) { > + nir_deref_array *deref_array = nir_deref_as_array(tail); > + > + /* indirect inputs/outputs should already be lowered! */ > + assert(deref_array->deref_array_type == > nir_deref_array_type_direct); > + > + unsigned size = state->type_size(tail->type); > + > + offset += size * deref_array->base_offset; > + } else if (tail->deref_type == nir_deref_type_struct) { > + nir_deref_struct *deref_struct = nir_deref_as_struct(tail); > + > + for (unsigned i = 0; i < deref_struct->index; i++) { > + offset += state->type_size(glsl_get_struct_field(parent_type, > i)); > + } > + } > + } > + > + return offset; > +} > + > +static bool > +lower_io_types_block(nir_block *block, void *void_state) > +{ > + struct lower_io_types_state *state = void_state; > + > + nir_foreach_instr(block, instr) { > + if (instr->type != nir_instr_type_intrinsic) > + continue; > + > + nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr); > + > + if ((intr->intrinsic != nir_intrinsic_load_var) && > + (intr->intrinsic != nir_intrinsic_store_var)) > + continue; > You should say somewhere in the top comment that copies need to also be lowered prior to this pass. > + > + nir_variable *var = intr->variables[0]->var; > + > + if ((var->data.mode != nir_var_shader_in) && > + (var->data.mode != nir_var_shader_out)) > + continue; > + > + /* no need to split up if already primitive */ > + if (state->type_size(var->type) == 1) > + continue; > + > + unsigned off = get_deref_offset(state, &intr->variables[0]->deref); > + nir_variable *nvar = get_new_var(state, var, off); > + > + /* and then re-write the load/store_var deref: */ > + intr->variables[0] = nir_deref_var_create(intr, nvar); > + } > + > + return true; > +} > + > +static void > +lower_io_types_impl(nir_function_impl *impl, struct lower_io_types_state > *state) > +{ > + nir_foreach_block(impl, lower_io_types_block, state); > + > + nir_metadata_preserve(impl, nir_metadata_block_index | > + nir_metadata_dominance); > +} > + > + > +void > +nir_lower_io_types(nir_shader *shader, int (*type_size)(const struct > glsl_type *)) > Just a side-note (not a request) but we really should have a "glsl_type_size_func" typedef somewhere. Also, what happened to just using nir_type_size_vec4? Incidentally, I think you could also use the newly exposed component_slots function from my series. > +{ > + struct lower_io_types_state state; > + > + /* NOTE: only operates on units of vec4 slots: */ > + assert(type_size(glsl_vec4_type()) == 1); > + > + state.shader = shader; > + exec_list_make_empty(&state.new_ins); > + exec_list_make_empty(&state.new_outs); > + state.type_size = type_size; > + > + nir_foreach_function(shader, function) { > + if (function->impl) > + lower_io_types_impl(function->impl, &state); > + } > + > + /* move new in/out vars to shader's lists: */ > + exec_list_append(&shader->inputs, &state.new_ins); > + exec_list_append(&shader->outputs, &state.new_outs); > +} > -- > 2.5.5 > > _______________________________________________ > mesa-dev mailing list > [email protected] > https://lists.freedesktop.org/mailman/listinfo/mesa-dev >
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
