On 21/04/16 13:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <[email protected]>
i915_gem_obj_to_vma is one of the most expensive functions in
our profiles. Could avoiding some branching by replacing it
with arithmetic be beneficial? Some benchmarks suggest a slight
improvement.
Signed-off-by: Tvrtko Ursulin <[email protected]>
---
drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0549dea683e1..243bfb922eb3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
struct i915_address_space *vm)
{
struct i915_vma *vma;
+
+ BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
+
list_for_each_entry(vma, &obj->vma_list, obj_link) {
- if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
- vma->vm == vm)
+ /*
+ * Below is just a branch-avoiding way of saying:
+ * vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
+ * which relies on the fact I915_GGTT_VIEW_NORMAL has to be
+ * zero.
+ */
+ if (!((unsigned long)vma->ggtt_view.type |
+ ((unsigned long)vma->vm ^ (unsigned long)vm)))
return vma;
}
+
return NULL;
}
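For reference, the equivalence the patch relies on can be checked in
isolation. This is only a userspace sketch: the enum, struct and function
names are hypothetical stand-ins, not the i915 definitions, and whether the
compiler actually emits fewer branches for the second form depends on the
compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the i915 types; only the fields used here. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED, VIEW_PARTIAL };

struct vm { int id; };

struct vma {
	enum view_type type;
	struct vm *vm;
};

/* Same role as the BUILD_BUG_ON: the trick only works if NORMAL is zero. */
static_assert(VIEW_NORMAL == 0, "arithmetic match needs VIEW_NORMAL == 0");

/* Original form: two comparisons joined by a short-circuit &&. */
static inline int match_branching(const struct vma *vma, const struct vm *vm)
{
	return vma->type == VIEW_NORMAL && vma->vm == vm;
}

/*
 * Patched form: the XOR is zero only when the pointers are equal, and
 * the OR is zero only when the type is also zero (i.e. NORMAL).
 */
static inline int match_arithmetic(const struct vma *vma, const struct vm *vm)
{
	return !((uintptr_t)vma->type | ((uintptr_t)vma->vm ^ (uintptr_t)vm));
}
```

Both helpers return 1 only for a NORMAL view in the requested address space,
so they can be swapped for each other as long as the static assertion holds.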
Other alternatives might include splitting the vma_list, so that we have
one list for the most frequently searched-for entries (GGTT view NORMAL)
and another for everything else. The lookup above would then need only a
single equality test.
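A rough userspace sketch of the split-list idea, to show the shape of it.
Everything here is hypothetical (plain singly linked lists instead of
struct list_head, made-up field and function names), so treat it as an
illustration rather than a proposed patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the i915 types; only the fields used here. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED, VIEW_PARTIAL };

struct vm { int id; };

struct vma {
	enum view_type type;
	struct vm *vm;
	struct vma *next;
};

struct obj {
	struct vma *normal_vmas;	/* only VIEW_NORMAL entries */
	struct vma *other_vmas;		/* every other view type */
};

/* The view type is tested once, at insertion, to pick the list. */
static inline void obj_add_vma(struct obj *obj, struct vma *vma)
{
	struct vma **head = vma->type == VIEW_NORMAL ?
			    &obj->normal_vmas : &obj->other_vmas;

	vma->next = *head;
	*head = vma;
}

/* Lookup walks only the NORMAL list, so one pointer compare per entry. */
static inline struct vma *obj_to_vma(const struct obj *obj,
				     const struct vm *vm)
{
	struct vma *vma;

	for (vma = obj->normal_vmas; vma; vma = vma->next)
		if (vma->vm == vm)
			return vma;

	return NULL;
}
```

The cost moves to insertion and to any code that currently walks every
VMA of an object, which would have to visit both lists.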
Or, slightly less effectively, add GGTT/NORMAL entries at the head of
the list and others at the tail (and search backwards if you *don't*
want a GGTT/NORMAL entry). That would still need the comparisons, but
would likely hit an early match.
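The head/tail variant could look something like the following. Again this
is a hypothetical userspace sketch with a hand-rolled doubly linked list
and invented names, only meant to show the insertion rule and the two
search directions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the i915 types; only the fields used here. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED, VIEW_PARTIAL };

struct vm { int id; };

struct vma {
	enum view_type type;
	struct vm *vm;
	struct vma *prev, *next;
};

struct obj {
	struct vma *head;
	struct vma *tail;
};

/* NORMAL views go to the head of the list, all other views to the tail. */
static inline void obj_add_vma(struct obj *obj, struct vma *vma)
{
	if (!obj->head) {
		vma->prev = vma->next = NULL;
		obj->head = obj->tail = vma;
	} else if (vma->type == VIEW_NORMAL) {
		vma->prev = NULL;
		vma->next = obj->head;
		obj->head->prev = vma;
		obj->head = vma;
	} else {
		vma->next = NULL;
		vma->prev = obj->tail;
		obj->tail->next = vma;
		obj->tail = vma;
	}
}

/* Forward search: NORMAL entries are clustered at the front. */
static inline struct vma *obj_to_vma(const struct obj *obj,
				     const struct vm *vm)
{
	struct vma *vma;

	for (vma = obj->head; vma; vma = vma->next)
		if (vma->type == VIEW_NORMAL && vma->vm == vm)
			return vma;

	return NULL;
}

/* Backward search: non-NORMAL entries are clustered at the back. */
static inline struct vma *obj_to_view(const struct obj *obj,
				      const struct vm *vm,
				      enum view_type type)
{
	struct vma *vma;

	for (vma = obj->tail; vma; vma = vma->prev)
		if (vma->type == type && vma->vm == vm)
			return vma;

	return NULL;
}
```

Both comparisons stay in the loop body; the gain, if any, comes from the
wanted entry usually being among the first few visited.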
.Dave.