Hi,

When the Ryzen CPUs were launched, they didn't perform very well in games, and it took a while before games were patched. Guess what, Mesa drivers have suffered from the same inefficiencies until now.
The AMD Zen architecture has multiple core complexes (CCX), where each CCX has e.g. 4C/8T and exactly one L3 cache. If application and driver threads don't run on the same CCX, communication between threads is slow, because multiple L3 caches must maintain coherency between them. Atomic operations seem to suffer the most, almost as if they were uncached. (are they?)

This series pins the application thread and all driver execution threads to a single L3 cache (1 CCX). If the application thread is already pinned to a hw thread or core(s), all driver threads are pinned to the same L3 cache (CCX) as the application thread. Shader compiler threads are left unpinned, as they are not critical.

The piglit/drawoverhead microbenchmark shows that this increases performance by 32% for DrawElements and 25% for DrawArrays on 1st-gen Ryzen CPUs. The gain will probably be much smaller with real apps.

Please review.

Thanks,
Marek
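
For readers who want to try the pinning idea in isolation, here is a rough sketch. It is not the code from this series. It assumes Linux with glibc, and it assumes that the logical CPUs sharing one L3 cache have consecutive ids (e.g. 0-7 for the first CCX), which is not guaranteed on every system. The helper names l3_mask_for_cpu and pin_thread_to_current_l3 are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: fill `mask` with the logical CPUs that share an L3
 * cache with `cpu`. ASSUMPTION: each L3 group is `threads_per_l3`
 * consecutive CPU ids; real code should query the actual topology. */
void
l3_mask_for_cpu(int cpu, int threads_per_l3, int num_cpus, cpu_set_t *mask)
{
   assert(threads_per_l3 > 0);

   /* Round down to the first CPU id of the group containing `cpu`. */
   int first = (cpu / threads_per_l3) * threads_per_l3;

   CPU_ZERO(mask);
   for (int i = first; i < first + threads_per_l3 && i < num_cpus; i++)
      CPU_SET(i, mask);
}

/* Hypothetical helper: restrict `thread` to the L3 group of the CPU the
 * calling thread is currently running on. Returns 0 on success, -1 if the
 * current CPU cannot be determined, or an errno value from
 * pthread_setaffinity_np. */
int
pin_thread_to_current_l3(pthread_t thread, int threads_per_l3, int num_cpus)
{
   int cpu = sched_getcpu();
   if (cpu < 0)
      return -1;

   cpu_set_t mask;
   l3_mask_for_cpu(cpu, threads_per_l3, num_cpus, &mask);
   return pthread_setaffinity_np(thread, sizeof(mask), &mask);
}
```

An application could call pin_thread_to_current_l3(pthread_self(), 8, num_cpus) once from its main thread and then again for each worker thread it creates. On Linux, the real sharing layout can be read from /sys/devices/system/cpu/cpuN/cache/index3/shared_cpu_list instead of relying on the consecutive-id assumption.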
