On Tue, May 24, 2016 at 3:41 PM, Rob Clark <[email protected]> wrote: > On Mon, May 16, 2016 at 3:16 PM, Jason Ekstrand <[email protected]> wrote: >> On Mon, May 16, 2016 at 9:09 AM, Rob Clark <[email protected]> wrote: >>> >>> On Mon, May 16, 2016 at 10:45 AM, Jason Ekstrand <[email protected]> >>> wrote: >>> > >>> > On May 16, 2016 7:29 AM, "Rob Clark" <[email protected]> wrote: >>> >> >>> >> On Sat, May 14, 2016 at 4:03 PM, Jason Ekstrand <[email protected]> >>> >> wrote: >>> >> > >>> >> > >>> >> > On Sat, May 14, 2016 at 12:20 PM, Rob Clark <[email protected]> >>> >> > wrote: >>> >> >> >>> >> >> On Thu, May 12, 2016 at 10:55 PM, Jason Ekstrand >>> >> >> <[email protected]> >>> >> >> wrote: >>> >> >> > >>> >> >> > >>> >> >> > On Tue, May 10, 2016 at 11:57 AM, Rob Clark <[email protected]> >>> >> >> > wrote: >>> >> >> >> >>> >> >> >> From: Rob Clark <[email protected]> >>> >> >> >> >>> >> >> >> Some optimizations, like converting integer multiply/divide into >>> >> >> >> left/ >>> >> >> >> right shifts, have additional constraints on the search >>> >> >> >> expression. >>> >> >> >> Like requiring that a variable is a constant power of two. >>> >> >> >> Support >>> >> >> >> these cases by allowing a fxn name to be appended to the search >>> >> >> >> var >>> >> >> >> expression (ie. "a#32(is_power_of_two)"). >>> >> >> >> >>> >> >> >> TODO update doc/comment explaining search var syntax >>> >> >> >> TODO the eagle-eyed viewer might have noticed that this could >>> >> >> >> also >>> >> >> >> replace the existing const syntax (ie. "#a"). Not sure if we >>> >> >> >> should >>> >> >> >> keep that.. we could make it syntactic sugar (ie '#' >>> >> >> >> automatically >>> >> >> >> sets >>> >> >> >> the cond fxn ptr to 'is_const') or just get rid of it entirely? >>> >> >> >> Maybe >>> >> >> >> that is a follow-on clean-up patch? >>> >> >> >> >>> >> >> >> Signed-off-by: Rob Clark <[email protected]> >>> >> >> >> --- >>> >> >> >> src/compiler/nir/nir_algebraic.py | 8 +++-- >>> >> >> >> src/compiler/nir/nir_opt_algebraic.py | 5 +++ >>> >> >> >> src/compiler/nir/nir_search.c | 3 ++ >>> >> >> >> src/compiler/nir/nir_search.h | 10 ++++++ >>> >> >> >> src/compiler/nir/nir_search_helpers.h | 66 >>> >> >> >> +++++++++++++++++++++++++++++++++++ >>> >> >> >> 5 files changed, 90 insertions(+), 2 deletions(-) >>> >> >> >> create mode 100644 src/compiler/nir/nir_search_helpers.h >>> >> >> >> >>> >> >> >> diff --git a/src/compiler/nir/nir_algebraic.py >>> >> >> >> b/src/compiler/nir/nir_algebraic.py >>> >> >> >> index 285f853..19ac6ee 100644 >>> >> >> >> --- a/src/compiler/nir/nir_algebraic.py >>> >> >> >> +++ b/src/compiler/nir/nir_algebraic.py >>> >> >> >> @@ -76,6 +76,7 @@ class Value(object): >>> >> >> >> return Constant(val, name_base) >>> >> >> >> >>> >> >> >> __template = mako.template.Template(""" >>> >> >> >> +#include "compiler/nir/nir_search_helpers.h" >>> >> >> >> static const ${val.c_type} ${val.name} = { >>> >> >> >> { ${val.type_enum}, ${val.bit_size} }, >>> >> >> >> % if isinstance(val, Constant): >>> >> >> >> @@ -84,6 +85,7 @@ static const ${val.c_type} ${val.name} = { >>> >> >> >> ${val.index}, /* ${val.var_name} */ >>> >> >> >> ${'true' if val.is_constant else 'false'}, >>> >> >> >> ${val.type() or 'nir_type_invalid' }, >>> >> >> >> + ${val.cond if val.cond else 'NULL'}, >>> >> >> >> % elif isinstance(val, Expression): >>> >> >> >> ${'true' if val.inexact else 'false'}, >>> >> >> >> nir_op_${val.opcode}, >>> >> >> >> @@ -113,7 +115,7 @@ static const ${val.c_type} ${val.name} = { >>> >> >> >> Variable=Variable, >>> >> >> >> Expression=Expression) >>> >> >> >> >>> >> >> >> -_constant_re = >>> >> >> >> re.compile(r"(?P<value>[^@]+)(?:@(?P<bits>\d+))?") >>> >> >> >> +_constant_re = >>> >> >> >> re.compile(r"(?P<value>[^@\(]+)(?:@(?P<bits>\d+))?") >>> >> >> > >>> >> >> > >>> >> >> > Spurious change? >>> >> >> > >>> >> >> >>> >> >> I thought it needed to avoid matching something like >>> >> >> a(is_power_of_two).. but it seems to work with that hunk reverted so >>> >> >> I >>> >> >> guess I can drop it.. >>> >> >> >>> >> >> >> >>> >> >> >> >>> >> >> >> class Constant(Value): >>> >> >> >> def __init__(self, val, name): >>> >> >> >> @@ -150,7 +152,8 @@ class Constant(Value): >>> >> >> >> return "nir_type_float" >>> >> >> >> >>> >> >> >> _var_name_re = re.compile(r"(?P<const>#)?(?P<name>\w+)" >>> >> >> >> - >>> >> >> >> r"(?:@(?P<type>int|uint|bool|float)?(?P<bits>\d+)?)?") >>> >> >> >> + >>> >> >> >> r"(?:@(?P<type>int|uint|bool|float)?(?P<bits>\d+)?)?" >>> >> >> >> + r"(?P<cond>\([^\)]+\))?") >>> >> >> >> >>> >> >> >> class Variable(Value): >>> >> >> >> def __init__(self, val, name, varset): >>> >> >> >> @@ -161,6 +164,7 @@ class Variable(Value): >>> >> >> >> >>> >> >> >> self.var_name = m.group('name') >>> >> >> >> self.is_constant = m.group('const') is not None >>> >> >> >> + self.cond = m.group('cond') >>> >> >> >> self.required_type = m.group('type') >>> >> >> >> self.bit_size = int(m.group('bits')) if m.group('bits') >>> >> >> >> else >>> >> >> >> 0 >>> >> >> >> >>> >> >> >> diff --git a/src/compiler/nir/nir_opt_algebraic.py >>> >> >> >> b/src/compiler/nir/nir_opt_algebraic.py >>> >> >> >> index 0a95725..952a91a 100644 >>> >> >> >> --- a/src/compiler/nir/nir_opt_algebraic.py >>> >> >> >> +++ b/src/compiler/nir/nir_opt_algebraic.py >>> >> >> >> @@ -62,6 +62,11 @@ d = 'd' >>> >> >> >> # constructed value should have that bit-size. >>> >> >> >> >>> >> >> >> optimizations = [ >>> >> >> >> + >>> >> >> >> + (('imul', a, '#b@32(is_power_of_two)'), ('ishl', a, >>> >> >> >> ('find_lsb', >>> >> >> >> b))), >>> >> >> >> + (('udiv', a, '#b@32(is_power_of_two)'), ('ushr', a, >>> >> >> >> ('find_lsb', >>> >> >> >> b))), >>> >> >> >> + (('umod', a, '#b(is_power_of_two)'), ('iand', a, ('isub', >>> >> >> >> b, >>> >> >> >> 1))), >>> >> >> >> + >>> >> >> >> (('fneg', ('fneg', a)), a), >>> >> >> >> (('ineg', ('ineg', a)), a), >>> >> >> >> (('fabs', ('fabs', a)), ('fabs', a)), >>> >> >> >> diff --git a/src/compiler/nir/nir_search.c >>> >> >> >> b/src/compiler/nir/nir_search.c >>> >> >> >> index 2c2fd92..b21fb2c 100644 >>> >> >> >> --- a/src/compiler/nir/nir_search.c >>> >> >> >> +++ b/src/compiler/nir/nir_search.c >>> >> >> >> @@ -127,6 +127,9 @@ match_value(const nir_search_value *value, >>> >> >> >> nir_alu_instr *instr, unsigned src, >>> >> >> >> instr->src[src].src.ssa->parent_instr->type != >>> >> >> >> nir_instr_type_load_const) >>> >> >> >> return false; >>> >> >> >> >>> >> >> >> + if (var->cond && !var->cond(instr, src, num_components, >>> >> >> >> new_swizzle)) >>> >> >> >> + return false; >>> >> >> >> + >>> >> >> >> if (var->type != nir_type_invalid) { >>> >> >> >> if (instr->src[src].src.ssa->parent_instr->type != >>> >> >> >> nir_instr_type_alu) >>> >> >> >> return false; >>> >> >> >> diff --git a/src/compiler/nir/nir_search.h >>> >> >> >> b/src/compiler/nir/nir_search.h >>> >> >> >> index a500feb..f55d797 100644 >>> >> >> >> --- a/src/compiler/nir/nir_search.h >>> >> >> >> +++ b/src/compiler/nir/nir_search.h >>> >> >> >> @@ -68,6 +68,16 @@ typedef struct { >>> >> >> >> * never match anything. >>> >> >> >> */ >>> >> >> >> nir_alu_type type; >>> >> >> >> + >>> >> >> >> + /** Optional condition fxn ptr >>> >> >> >> + * >>> >> >> >> + * This is only allowed in search expressions, and allows >>> >> >> >> additional >>> >> >> >> + * constraints to be placed on the match. Typically used for >>> >> >> >> 'is_constant' >>> >> >> >> + * variables to require, for example, power-of-two in order >>> >> >> >> for >>> >> >> >> the >>> >> >> >> search >>> >> >> >> + * to match. >>> >> >> >> + */ >>> >> >> >> + bool (*cond)(nir_alu_instr *instr, unsigned src, >>> >> >> >> + unsigned num_components, const uint8_t >>> >> >> >> *swizzle); >>> >> >> >> } nir_search_variable; >>> >> >> >> >>> >> >> >> typedef struct { >>> >> >> >> diff --git a/src/compiler/nir/nir_search_helpers.h >>> >> >> >> b/src/compiler/nir/nir_search_helpers.h >>> >> >> >> new file mode 100644 >>> >> >> >> index 0000000..8ed2ca0 >>> >> >> >> --- /dev/null >>> >> >> >> +++ b/src/compiler/nir/nir_search_helpers.h >>> >> >> >> @@ -0,0 +1,66 @@ >>> >> >> >> +/* >>> >> >> >> + * Copyright © 2016 Red Hat >>> >> >> >> + * >>> >> >> >> + * Permission is hereby granted, free of charge, to any person >>> >> >> >> obtaining >>> >> >> >> a >>> >> >> >> + * copy of this software and associated documentation files (the >>> >> >> >> "Software"), >>> >> >> >> + * to deal in the Software without restriction, including >>> >> >> >> without >>> >> >> >> limitation >>> >> >> >> + * the rights to use, copy, modify, merge, publish, distribute, >>> >> >> >> sublicense, >>> >> >> >> + * and/or sell copies of the Software, and to permit persons to >>> >> >> >> whom >>> >> >> >> the >>> >> >> >> + * Software is furnished to do so, subject to the following >>> >> >> >> conditions: >>> >> >> >> + * >>> >> >> >> + * The above copyright notice and this permission notice >>> >> >> >> (including >>> >> >> >> the >>> >> >> >> next >>> >> >> >> + * paragraph) shall be included in all copies or substantial >>> >> >> >> portions >>> >> >> >> of >>> >> >> >> the >>> >> >> >> + * Software. >>> >> >> >> + * >>> >> >> >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>> >> >> >> KIND, >>> >> >> >> EXPRESS OR >>> >> >> >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>> >> >> >> MERCHANTABILITY, >>> >> >> >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>> >> >> >> EVENT >>> >> >> >> SHALL >>> >> >> >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, >>> >> >> >> DAMAGES >>> >> >> >> OR >>> >> >> >> OTHER >>> >> >> >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>> >> >> >> OTHERWISE, >>> >> >> >> ARISING >>> >> >> >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >>> >> >> >> OTHER >>> >> >> >> DEALINGS >>> >> >> >> + * IN THE SOFTWARE. >>> >> >> >> + * >>> >> >> >> + * Authors: >>> >> >> >> + * Rob Clark <[email protected]> >>> >> >> >> + */ >>> >> >> >> + >>> >> >> >> +#ifndef _NIR_SEARCH_HELPERS_ >>> >> >> >> +#define _NIR_SEARCH_HELPERS_ >>> >> >> >> + >>> >> >> >> +#include "nir.h" >>> >> >> >> + >>> >> >> >> +static inline bool >>> >> >> >> +__is_power_of_two(unsigned int x) >>> >> >> >> +{ >>> >> >> >> + return ((x != 0) && !(x & (x - 1))); >>> >> >> >> +} >>> >> >> >> + >>> >> >> >> +static inline bool >>> >> >> >> +is_power_of_two(nir_alu_instr *instr, unsigned src, unsigned >>> >> >> >> num_components, >>> >> >> >> + const uint8_t *swizzle) >>> >> >> >> +{ >>> >> >> >> + nir_const_value *val = >>> >> >> >> nir_src_as_const_value(instr->src[src].src); >>> >> >> >> + >>> >> >> >> + /* only constant src's: */ >>> >> >> >> + if (!val) >>> >> >> >> + return false; >>> >> >> >> + >>> >> >> >> + for (unsigned i = 0; i < num_components; i++) { >>> >> >> >> + switch (nir_op_infos[instr->op].input_types[src]) { >>> >> >> >> + case nir_type_int: >>> >> >> >> + if (!__is_power_of_two(val->i32[swizzle[i]])) >>> >> >> >> + return false; >>> >> >> >> + break; >>> >> >> >> + case nir_type_uint: >>> >> >> >> + if (!__is_power_of_two(val->u32[swizzle[i]])) >>> >> >> >> + return false; >>> >> >> > >>> >> >> > >>> >> >> > Your implementation of __is_power_of_two takes an unsigned. >>> >> >> > There's >>> >> >> > no >>> >> >> > benefit to splitting it into two cases and it just creates a false >>> >> >> > distinction. If you do, you can probably inline the helper (I >>> >> >> > don't >>> >> >> > care >>> >> >> > much one way or the other on that). >>> >> >> >>> >> >> hmm, are you sure about negative PoT? Although possibly that should >>> >> >> be '__is_power_of_two(ABS(val->i32[..])'? >>> >> > >>> >> > >>> >> > I don't know off-hand whether or not it's correct for negative >>> >> > power-of-two. >>> >> > However, I do know that it does exactly the same thing in both cases >>> >> > regardless of whether or not the first one is correct :-) >>> >> > --Jason >>> >> >>> >> so, for the record, __is_power_of_two(ABS(val->i32[..])) seems to be >>> >> the correct thing to do. That would let us add a (presumably >>> >> optional) rule to lower idiv to ushr plus a multiply by the isign of >>> >> each of the args.. >>> > >>> > Unfortunately, idiv can never be lowered because -1 >> 1 == -1. Yeah, >>> > that's annoying. >>> > >>> > Also, I don't think any of the optimizations you've added here works for >>> > negative numbers without a sign-flip. We would need a pair of >>> > optimizations: one for non-negative powers of two and one for negative. >>> >>> so pretty sure this isn't an issue, since (udiv, a, >>> #b(is_power_of_two)) hits the nir_type_uint case in is_power_of_two() >>> and doesn't recognize -2 (for example) as a PoT.. >>> >>> possibly we could do a bit better, I think we'd have to split >>> is_power_of_two() into is_neg_power_of_two() and >>> is_pos_power_of_two().. >> >> >> I like that plan >> > > ok, for what it's worth, my current version of this patch has: > > (('imul', a, '#b@32(is_pos_power_of_two)'), ('ishl', a, ('find_lsb', b))), > (('imul', a, '#b@32(is_neg_power_of_two)'), ('ineg', ('ishl', a, > ('find_lsb', ('iabs', b))))), > (('udiv', a, '#b@32(is_pos_power_of_two)'), ('ushr', a, ('find_lsb', b))), > (('idiv', a, '#b@32(is_pos_power_of_two)'), ('imul', ('isign', a), > ('ushr', ('iabs', a), ('find_lsb', b))), 'options->lower_idiv'), > (('idiv', a, '#b@32(is_neg_power_of_two)'), ('ineg', ('imul', > ('isign', a), ('ushr', ('iabs', a), ('find_lsb', ('iabs', b))))), > 'options->lower_idiv'), > (('umod', a, '#b(is_pos_power_of_two)'), ('iand', a, ('isub', b, 1))), > > In theory this should be wrong, because (iabs MIN_INT) -> MIN_INT.. > > but things are a bit strange.. I hacked up a .shader_test: > > ---- > [require] > GLSL >= 1.30 > > [vertex shader] > void main() > { > gl_Position = gl_Vertex; > } > > [fragment shader] > uniform int arg0; > uniform int expected; > > void main() > { > int result = (arg0 / 2); > gl_FragColor = result == expected ? vec4(0.0, 1.0, 0.0, 1.0) : > vec4(1.0, 0.0, 0.0, 1.0); > } > > [test] > clear color 0.0 0.0 1.0 0.0 > clear > uniform int arg0 0x80000000 > uniform int expected 0xc0000000 > draw rect ortho 0 0 4 4 > probe rect rgba (0, 0, 4, 4) (0.0, 1.0, 0.0, 1.0) > ---- > > oddly (even w/out opt-algebraic rules), it fails for > llvmpipe/softpipe/freedreno on armv7. But passes for > llvmpipe/softpipe/i965 on x86. Not entirely sure what to make of > that.
Ilia realized this was to do w/ strtol() vs 32b ints (so I guess it would have passed on armv8..). I should have been using: uniform int arg0 -2147483648 uniform int expected -1073741824 and now I get the same result on freedreno now, as i965, with that lowering rule. (Ie. both passing.) So I guess it does work for the MIN_INT case BR, -R > Now the really odd thing is if I enable the opt rule (and I can see > from INTEL_DEBUG=fs.. http://hastebin.com/meyekajake.md), that the > test still passes. Not entirely sure how that is possible. > > I'll have to rig up some test program on arm to see what is really > going on, but since it effects softpipe and llvmpipe I guess it is > unrelated. > > BR, > -R _______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
