On 23/2/19 8:54 am, Jason Ekstrand wrote:
On Fri, Feb 22, 2019 at 2:47 PM Timothy Arceri <[email protected] <mailto:[email protected]>> wrote:



    On 23/2/19 6:31 am, Rob Clark wrote:
     > On Fri, Feb 22, 2019 at 12:39 PM Jason Ekstrand
    <[email protected] <mailto:[email protected]>> wrote:
     >>
     >> On Fri, Feb 22, 2019 at 9:51 AM Eric Anholt <[email protected]
    <mailto:[email protected]>> wrote:
     >>>
     >>> Timothy Arceri <[email protected]
    <mailto:[email protected]>> writes:
     >>>
     >>>> shader-db results i965 (SKL):
     >>>>
     >>>> total instructions in shared programs: 13219105 -> 13024761
    (-1.47%)
     >>>> instructions in affected programs: 1169457 -> 975113 (-16.62%)
     >>>> helped: 599
     >>>> HURT: 154
     >>>>
     >>>> total cycles in shared programs: 333968972 -> 324822073 (-2.74%)
     >>>> cycles in affected programs: 130032440 -> 120885541 (-7.03%)
     >>>> helped: 590
     >>>> HURT: 216
     >>>>
     >>>> total spills in shared programs: 57947 -> 29130 (-49.73%)
     >>>> spills in affected programs: 53364 -> 24547 (-54.00%)
     >>>> helped: 351
     >>>> HURT: 0
     >>>>
     >>>> total fills in shared programs: 51310 -> 25468 (-50.36%)
     >>>> fills in affected programs: 44882 -> 19040 (-57.58%)
     >>>> helped: 351
     >>>> HURT: 0
     >>>> ---
     >>>>   src/compiler/nir/nir_lower_phis_to_scalar.c | 1 +
     >>>>   1 file changed, 1 insertion(+)
     >>>>
     >>>> diff --git a/src/compiler/nir/nir_lower_phis_to_scalar.c
    b/src/compiler/nir/nir_lower_phis_to_scalar.c
     >>>> index 16001f73685..f6f702bca15 100644
     >>>> --- a/src/compiler/nir/nir_lower_phis_to_scalar.c
     >>>> +++ b/src/compiler/nir/nir_lower_phis_to_scalar.c
     >>>> @@ -74,6 +74,7 @@ is_phi_src_scalarizable(nir_phi_src *src,
     >>>>         /* A phi is scalarizable if we're going to lower it */
     >>>>         return should_lower_phi(nir_instr_as_phi(src_instr),
    state);
     >>>>
     >>>> +   case nir_instr_type_tex:
     >>>>      case nir_instr_type_load_const:
     >>>>      case nir_instr_type_ssa_undef:
     >>>>         /* These are trivially scalarizable */
     >>>
     >>> Sounds promising, but I would definitely not describe
    instr_type_tex as
     >>> "trivially scalarizable" -- could you explain what's going on
    with this
     >>> patch?

    Basically it just turns:

    if ssa0 {
        ...
        vec4 ssa1 = txf .....
    } else {
         ...
         vec4 ssa2 = ...
    }
    vec4 ss3 = phi ssa1, ssa2

    Into

    if ssa0 {
        ...
        vec4 ssa1 = txf .....
        vec1 ssa2 = imov ssa1.x
        vec1 ssa3 = imov ssa1.y
        vec1 ssa4 = imov ssa1.z
        vec1 ssa5 = imov ssa1.w
    } else {
         ...
         vec4 ssa6 = ...
         vec1 ssa7 = imov ssa6.x
         vec1 ssa8 = imov ssa6.y
         vec1 ssa9 = imov ssa6.z
         vec1 ssa10 = imov ssa6.w
    }
    vec1 ss11 = phi ssa2, ssa7
    vec1 ss12 = phi ssa3, ssa8
    vec1 ss13 = phi ssa4, ssa9
    vec1 ss14 = phi ssa5, ssa10


    This allows a whole bunch more optimisation to take place as often not
    all of the phi channels are actually used. If some cases large
    chunks of
    logic can be remove from the if branch that doesn't contain the texture
    access.

     >>
     >>
     >> I think I can for Intel though I'm not sure how this affects
    other drivers.
     >>
     >> On Intel hardware, we almost always have to combine all the
    texture sources into one big message.  Since having more than one
    source is very common, this means that we have to make a temporary
    copy of the sources anyway.  Because we're copying them, having them
    contiguous (a vector in NIR terms) doesn't actually gain us
    anything.  We may as well let NIR scalarize them and give more
    freedom to the register allocator and other NIR passes which may
    need to clean things up.  We don't want to make the same choice for
    destinations as they are required to be contiguous.
     >>
     >
     > hmm, but this is abut the phi src (ie. the tex dest), not the tex
    src, isn't it?


Well, that's a pickle...  When I wrote this pass I did so to try and explicitly keep from breaking up things that are known to be vectors in the back-end such as textures.  The idea was to try and not break up things like this:

  if (...) {
     ssa_1 = tex()
} else {
     ssa_2 = tex()
}
ssa_3 = phi (ssa_1, ssa_2)

in the hopes that it would turn into

if (...) {
     r1 = tex();
} else {
     r1 = tex();
}

Clearly, that notion was mis-placed.  At this point, I really wonder what the complexity is saving us.  Maybe it's not worth it at all? Maybe we need to be more agressive and require all sources to not be vectorizable or something?

Maybe checking if all components of the phi are used before converting to scalar would be useful? Not sure but I've sent a more conservative v2 of this patch where a bunch of hurt is gone.

The remaining hurt is very small and comes from shaders like this:

 if (...) {
      r1 = tex();
 } else {
    if (...) {
      r1 = tex();
    } else {
      r1 = undefined;
    }
 }
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to