bug#17049: [PATCH] Make reverse! forego the cost of SCM_VALIDATE_LIST

David Kastrup Thu, 20 Mar 2014 15:01:36 -0700

Andy Wingo writes:

> Thanks for the patch.  What is its performance impact for your use
> case?

The use case is a bit hard to benchmark from scratch, as it basically
conses a list together and then reverses it.  In a fresh session the
consing dominates the run time, while in a real-life session cons cells
are usually available from a free list.

One can of course just call scm_reverse_x in a loop without any consing
at all, but that is probably too much of a best case.  I could do this
in order to give an upper bound on the improvement that may be
expected.
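
A rough sketch of what such a best-case loop could look like outside of
Guile.  Everything here is an assumption for illustration: `struct cell`
is a hypothetical stand-in for a pair, `reverse_in_place` stands in for
an unvalidated scm_reverse_x, and timing uses plain clock().  It only
shows the shape of the measurement; I am not claiming any numbers from
it.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a Guile pair; only the cdr matters here. */
struct cell { struct cell *cdr; };

/* Link cells[0..n-1] into a NULL-terminated list and return its head. */
static struct cell *link_cells(struct cell *cells, size_t n)
{
    assert(cells != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        cells[i].cdr = (i + 1 < n) ? &cells[i + 1] : NULL;
    return n ? &cells[0] : NULL;
}

/* Destructive reversal without any validation pass. */
static struct cell *reverse_in_place(struct cell *lst)
{
    struct cell *prev = NULL;
    while (lst != NULL) {
        struct cell *next = lst->cdr;
        lst->cdr = prev;
        prev = lst;
        lst = next;
    }
    return prev;
}

/* Reverse the same list `rounds` times, so that no allocation happens
   inside the timed region.  Elapsed CPU seconds go to *seconds; the
   return value is the head of the list after the last reversal. */
static struct cell *time_reversals(struct cell *lst, int rounds,
                                   double *seconds)
{
    clock_t start = clock();
    for (int i = 0; i < rounds; i++)
        lst = reverse_in_place(lst);
    *seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
    return lst;
}
```

With an even number of rounds the list ends up in its original order, so
the same cells can be reused for a second run with the validating
variant.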

In that case, if we are talking about large lists, I'd expect a factor
of about 7:4: per 2 elements, the reversal takes 2 read-modify-writes
(4 memory accesses), and tortoise and hare take 1 and 2 read cycles
(3 more accesses).  The loops should be tight enough to fit into
registers and the instruction cache, so those memory accesses should by
far dominate the timing.  And with a large list, tortoise, hare and
reversal each reach a given cache line at a different time, so each of
them has to fetch it anew.
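
To make the 7:4 estimate concrete, here is a small sketch that just
counts cdr accesses instead of timing them.  The assumptions: `struct
cell` is a hypothetical stand-in for a pair, every cdr read and every
cdr write counts as one access (so a read-modify-write counts as two),
and cache effects are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Guile pair; only the cdr matters here. */
struct cell { struct cell *cdr; };

/* Link cells[0..n-1] into a NULL-terminated list and return its head. */
static struct cell *link_cells(struct cell *cells, size_t n)
{
    assert(cells != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        cells[i].cdr = (i + 1 < n) ? &cells[i + 1] : NULL;
    return n ? &cells[0] : NULL;
}

/* Tortoise-and-hare proper-list check.  Returns 1 for a NULL-terminated
   list and 0 for a circular one; *accesses is bumped per cdr read. */
static int is_proper_list(const struct cell *lst, long *accesses)
{
    const struct cell *tortoise = lst, *hare = lst;
    while (hare != NULL) {
        hare = hare->cdr;         ++*accesses;  /* hare, first step  */
        if (hare == NULL)
            break;
        hare = hare->cdr;         ++*accesses;  /* hare, second step */
        tortoise = tortoise->cdr; ++*accesses;  /* tortoise, one step */
        if (hare == tortoise)
            return 0;                           /* hare caught up: cycle */
    }
    return 1;
}

/* In-place reversal; one read and one write of the cdr per element. */
static struct cell *reverse_in_place(struct cell *lst, long *accesses)
{
    struct cell *prev = NULL;
    while (lst != NULL) {
        struct cell *next = lst->cdr; ++*accesses;  /* read  */
        lst->cdr = prev;              ++*accesses;  /* write */
        prev = lst;
        lst = next;
    }
    return prev;
}
```

For an even list length n, the validation pass counts 3n/2 accesses and
the reversal 2n, so validation plus reversal against reversal alone
comes out as 7n/2 against 2n, which is the 7:4 above.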

> On Thu 20 Mar 2014 12:23, David Kastrup writes:
>
>> +  /* We did not start with a proper list.  Undo the reversal. */
>
> I'm hesitant.  This is visible to other threads.  (Granted there has to
> be significant wrongness if we get here...)

Uh, what of it?  The other threads have to deal with the reversal in the
non-error case, and it does not have well-defined multithread semantics.
So what you are saying is that you want to have multithread-safe
semantics for the error case exclusively, if I understand correctly.

But the tortoise-hare algorithm used in SCM_VALIDATE_LIST is not
multithread-robust either as far as I can see.

At any rate, this additional reversal only happens in the error case.
Tortoise-hare is read-only, so it will be quite a bit faster than a
double reversal (which dirties the cache of the chain twice, and if we
are talking about a circular list, the cache of the non-circular part
even four times).  So if we want to have efficient error behavior, this
change is not going to be an improvement.

It seems pointless to add an extra unvalidated reversal when it would be
just as fast as _this_ validated reversal.
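
For readers without the patch at hand, the general idea of "reverse
first, undo on error" can be sketched as follows.  This is not the patch
itself: `struct obj`, `reverse_chain` and `reverse_or_restore` are
hypothetical names, NULL plays the role of the empty list, and the
sketch only covers a list ending in a non-pair (a dotted tail).  It does
not detect circular lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy object: is_pair != 0 marks a pair, anything else is
   an atom.  NULL plays the role of the empty list. */
struct obj { int is_pair; struct obj *cdr; };

/* Reverse the chain of pairs starting at lst onto `tail`, stopping at
   the first non-pair.  The object that stopped the walk is stored in
   *end (NULL for a proper list); the new head is returned. */
static struct obj *reverse_chain(struct obj *lst, struct obj *tail,
                                 struct obj **end)
{
    while (lst != NULL && lst->is_pair) {
        struct obj *next = lst->cdr;
        lst->cdr = tail;
        tail = lst;
        lst = next;
    }
    *end = lst;
    return tail;
}

/* Reverse without validating first.  On success, return 0 and store the
   reversed list in *out.  If the chain turns out to end in a non-pair,
   reverse it back onto that terminator, store the restored original
   head in *out and return -1. */
static int reverse_or_restore(struct obj *lst, struct obj **out)
{
    struct obj *end, *ignored;
    struct obj *rev;

    assert(out != NULL);
    rev = reverse_chain(lst, NULL, &end);
    if (end == NULL) {
        *out = rev;
        return 0;
    }
    /* We did not start with a proper list.  Undo the reversal. */
    *out = reverse_chain(rev, end, &ignored);
    return -1;
}
```

The proper-list path touches each pair once; only the error path pays
for the second pass over the chain.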

-- 
David Kastrup
