On Tue, May 17, 2016 at 1:52 AM, Bas Nieuwenhuizen <[email protected]> wrote:
> On Mon, May 16, 2016 at 10:15 PM, Marek Olšák <[email protected]> wrote:
>> On Fri, May 13, 2016 at 3:37 AM, Bas Nieuwenhuizen
>> <[email protected]> wrote:
>>> Using more than 1 wave per threadgroup does increase performance
>>> generally. Not using too many patches per threadgroup also
>>> increases performance. Both catalyst and amdgpu-pro seem to
>>> use 40 patches as their maximum, but I haven't really seen
>>> any performance increase from limiting the number of patches
>>> to 40 instead of 64.
>>
>> 40 may be optimal for existing OpenGL apps on some chips.
>>
>> Vulkan doesn't set more than 16.
>>
>> Let's set either 40 or 16 with a comment where the value comes from.
>
> IIRC heaven was more performant with multiple waves per threadgroup,
> which means >16 patches, as it uses 3 CP's per patch. Not sure about
> 40 and I'm away from my dev machine at the moment.
OK. Maybe Vulkan sets more than 16 using external settings not
specified by its code.

>
>>
>>>
>>> Note that the trick where we overlap the input and output LDS
>>> does not work anymore as the insertion of the tess factors
>>> changes the patch stride.
>>
>> I don't understand this. Can you explain it more?
>
> When we didn't have a TCS, we would just use TCS input as TCS output
> and let the fixed function TCS add the per patch outputs (tess
> factors) at the end.
>
> This works fine when you have a single patch, but not with multiple.
> To see why we have to look at the input/output format in LDS. This is
>
> Attributes for patch 0 vertex 0.
> Attributes for patch 0 vertex 1.
> ...
> Per patch attributes for patch 0.
> Attributes for patch 1 vertex 0.
> ...
>
> So the number of per patch attributes changes the stride between
> patches. As the LS output has 0 per patch attributes, and TCS output
> has at least the tess factors this differs. Therefore the second and
> later patches start at different offset in TCS input and output, so we
> need to copy or move them.
>
> I hope this makes things a bit more clear.

Thanks for the explanation.

Marek
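
For anyone reading this in the archive, the stride argument above can be
checked with a few lines of arithmetic. This is only a sketch of the layout
as Bas describes it, not the driver code: the function names and all of the
sizes (3 vertices per patch, 16 bytes per vertex, 16 bytes of tess factors)
are assumptions picked for illustration.

```python
def patch_stride(vertices_per_patch, bytes_per_vertex, per_patch_bytes):
    """Bytes from the start of one patch to the start of the next in LDS."""
    return vertices_per_patch * bytes_per_vertex + per_patch_bytes


def patch_offset(patch_index, stride):
    """Byte offset of a patch when patches are packed back to back."""
    return patch_index * stride


# Hypothetical sizes, not taken from the driver.
VERTICES_PER_PATCH = 3   # assumed: 3 control points per patch
BYTES_PER_VERTEX = 16    # assumed: one vec4 attribute per vertex
TESS_FACTOR_BYTES = 16   # assumed: space reserved for per-patch tess factors

# LS output (TCS input) has no per-patch attributes.
input_stride = patch_stride(VERTICES_PER_PATCH, BYTES_PER_VERTEX, 0)
# TCS output carries at least the tess factors after the vertices.
output_stride = patch_stride(VERTICES_PER_PATCH, BYTES_PER_VERTEX,
                             TESS_FACTOR_BYTES)

for patch in range(3):
    src = patch_offset(patch, input_stride)
    dst = patch_offset(patch, output_stride)
    print(f"patch {patch}: input offset {src}, output offset {dst}")
```

With these numbers only patch 0 starts at the same offset in both layouts.
Every later patch starts further along in the output layout, which is why
the input and output regions cannot simply be overlapped once there is
more than one patch.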
