On Tue, Jan 16, 2018 at 10:21:44PM -0800, Jason Ekstrand wrote: > On Tue, Jan 16, 2018 at 4:30 PM, Nanley Chery <[email protected]> wrote: > > > On Tue, Jan 16, 2018 at 12:02:43PM -0800, Nanley Chery wrote: > > > On Sat, Jan 13, 2018 at 11:11:35AM -0800, Jason Ekstrand wrote: > > > > Sorry for all the list spam, but I'm sort of thinking out-loud and > > writing > > > > it on the list for all to read. > > > > > > > > I'm thinking that what we want this list to return is not a bool but > > an enum > > > > > > > > /* The ordering of this enum is important */ > > > > enum anv_fast_clear_support { > > > > ANV_FAST_CLEAR_NONE = 0, > > > > ANV_FAST_CLEAR_ZERO_ONLY = 1, > > > > ANV_FAST_CLEAR_ANY = 2, > > > > }; > > > > > > > > And then the predicate for whether or not to do the resolve becomes > > > > (has_compression && !supports_compression) || (image_fast_clear > > > > > supported_fast_clear). I still haven't quite figured out what to do > > with > > > > MI_PREDICATE to make this work out. I'm sure it's possible with MI > > math > > > > but maybe I can come up with something clever that doesn't require > > that. > > > > In the worst case, we only have to deal with this complexity on gen9+ > > where > > > > we have MI_MATH so it'll be ok. > > > > > > > > Does this seem like a reasonable plan? > > > > > > > > > > Thanks for asking about benchmark numbers. I checked the Sascha Willems > > > demos and an old build of Dota 2 and neither app used this optimization. > > > I haven't checked any other Vulkan application, but given that it's the > > > GENERAL layout I think we'd do well to drop the optimization and avoid > > > the additional complexity. Is that cool with you? > > > > > > > I just realized that this doesn't solve all the issues (see below). > > > > > I checked the git logs to learn more about why the driver does this. It > > > looks like fast-clearing with any color was supported in the GENERAL > > > layout back when we resolved at the end of the render pass. When porting > > > the code to use layout transitions, I tried to mitigate the possible > > > regressions with dcff5ab9f164afbc29c051b18990a377bb46e4bc. > > > > > > -Nanley > > > > > > > On Fri, Jan 12, 2018 at 5:23 PM, Jason Ekstrand <[email protected]> > > > > wrote: > > > > > > > > > I made a table to help visualize all the different cases: > > > > > > > > > > Layout | compression | zero clear | non-zero > > clear > > > > > ============================+=============+============+==== > > ============ > > > > > GENERAL | N | N | N > > > > > ----------------------------+-------------+------------+---- > > ------------ > > > > > COLOR_ATTACHMENT | Y | Y | Y > > > > > ----------------------------+-------------+------------+---- > > ------------ > > > > > SHADER_READ_ONLY | Y | Y | N > > > > > ----------------------------+-------------+------------+---- > > ------------ > > > > > GENERAL (RT/tex/blit only) | Y | Y | N > > > > > ----------------------------+-------------+------------+---- > > ------------ > > > > > PRESENT_SRC (Y_TILED) | N | N | N > > > > > ----------------------------+-------------+------------+---- > > ------------ > > > > > PRESENT_SRC (Y_TILED_CCS) | Y | N | N > > > > > ----------------------------+-------------+------------+---- > > ------------ > > > > > > > > > This looks good to me. Could the last entry support zero clear? > > > > I think we could probably define it to as I think that's what scanout does > when it sees a clear block. However, there's no guarantee that mesa > texturing from it will give you a zero clear color. I think it will but > that's more an artifact of history than intent. The GL driver assumes that > if something is in the COMPRESSED_NO_CLEAR state that there are exactly > zero clear blocks. We could adjust Y_TILED_CCS to mean COMPRESSED_CLEAR > with a clear color of 0. > >
Thanks for testing this. It's too bad the scanout hardware doesn't return the zero clear color. > > > > > We will need to think about what to do with this. Having non-zero > > clears > > > > > be different from zero clears is tricky. We may need to have even > > more > > > > > bits in our aux tracking. :-/ > > > > > > > > > > On January 12, 2018 16:05:12 Jason Ekstrand <[email protected]> > > wrote: > > > > > > > > > >> On Wed, Dec 13, 2017 at 11:26 AM, Nanley Chery < > > [email protected]> > > > > >> wrote: > > > > >> > > > > >>> On Mon, Nov 27, 2017 at 07:05:56PM -0800, Jason Ekstrand wrote: > > > > >>> > --- > > > > >>> > src/intel/vulkan/anv_image.c | 58 > > ++++++++++++++++++++++++++++++ > > > > >>> ++++++++++++ > > > > >>> > src/intel/vulkan/anv_private.h | 5 ++++ > > > > >>> > 2 files changed, 63 insertions(+) > > > > >>> > > > > > >>> > diff --git a/src/intel/vulkan/anv_image.c > > > > >>> b/src/intel/vulkan/anv_image..c > > > > >>> > > > > >>> > index a872149..561da28 100644 > > > > >>> > --- a/src/intel/vulkan/anv_image.c > > > > >>> > +++ b/src/intel/vulkan/anv_image.c > > > > >>> > @@ -837,6 +837,64 @@ anv_layout_to_aux_usage(const struct > > > > >>> gen_device_info * const devinfo, > > > > >>> > unreachable("layout is not a VkImageLayout enumeration > > member."); > > > > >>> > } > > > > >>> > > > > > >>> > +/** > > > > >>> > + * This function returns true if the given image in the given > > > > >>> VkImageLayout > > > > >>> > + * supports unresolved fast-clears. > > > > >>> > + * > > > > >>> > + * @param devinfo The device information of the Intel GPU. > > > > >>> > + * @param image The image that may contain a collection of > > buffers. > > > > >>> > + * @param aspect The aspect of the image to be accessed. > > > > >>> > + * @param layout The current layout of the image aspect(s). > > > > >>> > + */ > > > > >>> > +bool > > > > >>> > +anv_layout_supports_fast_clear(const struct gen_device_info * > > const > > > > >>> devinfo, > > > > >>> > + const struct anv_image * const > > image, > > > > >>> > + const VkImageAspectFlagBits > > aspect, > > > > >>> > + const VkImageLayout layout) > > > > >>> > +{ > > > > >>> > + /* The aspect must be exactly one of the image aspects. */ > > > > >>> > + assert(_mesa_bitcount(aspect) == 1 && (aspect & > > image->aspects)); > > > > >>> > + > > > > >>> > + uint32_t plane = anv_image_aspect_to_plane(image->aspects, > > > > >>> aspect); > > > > >>> > + > > > > >>> > + /* If there is no auxiliary surface allocated, there are no > > > > >>> fast-clears */ > > > > >>> > + if (image->planes[plane].aux_surface.isl.size == 0) > > > > >>> > + return false; > > > > >>> > + > > > > >>> > + /* All images that use an auxiliary surface are required to > > be > > > > >>> tiled. */ > > > > >>> > + assert(image->tiling == VK_IMAGE_TILING_OPTIMAL); > > > > >>> > + > > > > >>> > + /* Stencil has no aux */ > > > > >>> > + assert(aspect != VK_IMAGE_ASPECT_STENCIL_BIT); > > > > >>> > + > > > > >>> > + if (aspect == VK_IMAGE_ASPECT_DEPTH_BIT) { > > > > >>> > + /* For depth images (with HiZ), the layout supports > > fast-clears > > > > >>> if and > > > > >>> > + * only if it supports HiZ. > > > > >>> > + */ > > > > >>> > + enum isl_aux_usage aux_usage = > > > > >>> > + anv_layout_to_aux_usage(devinfo, image, aspect, > > layout); > > > > >>> > + return aux_usage == ISL_AUX_USAGE_HIZ; > > > > >>> > + } > > > > >>> > + > > > > >>> > + assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV); > > > > >>> > + > > > > >>> > + /* Multisample fast-clear is not yet supported. */ > > > > >>> > + if (image->samples > 1) > > > > >>> > + return false; > > > > >>> > + > > > > >>> > + /* The only layout which actually supports fast-clears today > > is > > > > >>> > + * VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. Some day in > > the > > > > >>> future > > > > >>> > + * this may change if our ability to track clear colors > > improves. > > > > >>> > + */ > > > > >>> > + switch (layout) { > > > > >>> > + case VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL: > > > > >>> > + return true; > > > > >>> > + > > > > >>> > > > > >>> This is kind of tricky. We actually allow fast-clears for CCS_E > > textures > > > > >>> in the GENERAL layout if the clear color is all zeros. When this > > > > >>> happens, we set the needs_resolve predicate to false. This means > > that > > > > >>> unresolved fast-clears is potentially in use for CCS_E images in > > any > > > > >>> layout. > > > > >>> > > > > >> > > > > >> Hrm... This is going to get interesting. This works but only if we > > > > >> assume that the thing using it in the general layout is rendering or > > > > >> texturing. If it's storage, this doesn't work. That said, storage > > images > > > > >> are CCS_D-only so that's not a problem now. The bigger problem is > > that > > > > >> this means the resolve won't happen for window-system images and > > that's > > > > >> bad. I'm not sure what the right path forward is; I'll have to > > think on it. > > > > >> > > > > >> Do we have any benchmark numbers to justify this clever trick? > > > > >> > > > > >> > > > > I just realized that we have the same problem with CCS_E-enabled images > > in the COLOR_ATTACHMENT_OPTIMAL layout. We set the needs_resolve bit > > to false when such images are fast-cleared transparent black. Thus any > > layout of a CCS_E-enabled image may have an unresolved fast-clear. > > > > I think those are effectively the same issue but maybe I'm misunderstanding > something. > > Yes, they're the same issue. I may have been overly verbose. Sorry about that. > > Quick testing shows that this gives an ~13% FPS increase in the Sascha > > Willems deferred demo. The following other demos use this but provide > > no visible change in performance: debugmarker, deferredshadows, hdr, > > offscreen, radialblur, subpasses. One image in Dota 2 makes use of this, > > but I didn't measure the performance impact (could be nil). > > > > That's unfortunate... I'm not seeing a lot of other options off-hand than > to just track zero vs. non-zero clear colors. > Yeah, neither am I. > On another note, does this image in the deferred demo happen to ever use > STORE_OP_DONT_CARE? If so, we may be able to whack the needs_resolve bit > to off. > > I looked at the source and the image doesn't seem to use STORE_OP_DONT_CARE. _______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
