Andy Wingo writes:

> Thanks for the patch.  What is its performance impact for your use
> case?

Use case is a bit hard to benchmark from scratch as it's basically
consing a list together and then reversing it.  In a fresh session the
consing dominates the run time, while in a real-life session cons cells
are usually available from a free list.

Of course, one can just call scm_reverse_x in a loop without any consing
at all, but that's probably a bit too much of a best case.  I could do
this in order to give an upper limit on what improvement may be
expected.

In that case, if we are talking about large lists, I'd expect a factor
of about 7:4, since the reversal takes 2 read-modify-writes per 2
elements, and tortoise and hare take 1 and 2 read cycles per 2 elements
(counting each read-modify-write as two accesses, that is 4+3 accesses
with the separate validation pass against 4 without).  The loops should
be tight enough to fit into registers and instruction cache, so those
memory accesses should by far dominate the timing.  And with a large
list, all of those data accesses need to get their cache line at a
different time.

> On Thu 20 Mar 2014 12:23, David Kastrup writes:
>
>> + /* We did not start with a proper list. Undo the reversal. */
>
> I'm hesitant.  This is visible to other threads.  (Granted there has to
> be significant wrongness if we get here...)

Uh, what of it?  The other threads have to deal with the reversal in the
non-error case, and it does not have well-defined multithread semantics.
So what you are saying is that you want to have multithread-safe
semantics for the error case exclusively, if I understand correctly.

But the tortoise-hare algorithm used in SCM_VALIDATE_LIST is not
multithread-robust either, as far as I can see.

At any rate, this additional reversal only happens in the error case.
Tortoise-hare is read-only, so it will be quite a bit faster than a
double reversal (which dirties the cache of the chain twice, and if we
are talking about a circular list, the cache of the non-circular part
even four times).  So if we want to have an efficient error behavior,
this change is not going to be an improvement.
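
For reference, here is a rough sketch in plain C of the two approaches
being compared.  It is not the patch and not Guile code: `struct cell`
and all three function names are hypothetical stand-ins, and because a
NULL-terminated struct cannot represent a dotted tail, the only error
modelled is a circular list.  It relies on the fact that an in-place
reversal of a list containing a cycle walks back along the already
reversed part and stops at the original head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Scheme pair; not Guile's SCM type. */
struct cell {
    int value;
    struct cell *next;          /* NULL marks the end of a proper list */
};

/* Read-only tortoise-and-hare check: 1 reads for the tortoise and
   2 for the hare per 2 elements.  Returns 1 for a proper list,
   0 for a circular one. */
static int is_proper_list(const struct cell *list)
{
    const struct cell *tortoise = list, *hare = list;
    while (hare != NULL) {
        hare = hare->next;
        if (hare == NULL)
            break;
        hare = hare->next;
        tortoise = tortoise->next;
        if (hare == tortoise)
            return 0;
    }
    return 1;
}

/* Unvalidated in-place reversal: one read-modify-write per element. */
static struct cell *reverse_in_place(struct cell *list)
{
    struct cell *prev = NULL;
    while (list != NULL) {
        struct cell *next = list->next;
        list->next = prev;
        prev = list;
        list = next;
    }
    return prev;
}

/* Reversal that validates as a side effect.  On success stores the
   new head in *listp and returns 0.  If the list is circular, the
   walk ends at the original head; a second reversal then restores
   the original links, and -1 is returned with *listp unchanged. */
static int reverse_validated(struct cell **listp)
{
    struct cell *head, *result;
    assert(listp != NULL);
    head = *listp;
    if (head == NULL || head->next == NULL)
        return 0;               /* zero or one element: nothing to do */
    result = reverse_in_place(head);
    if (result != head) {
        *listp = result;
        return 0;
    }
    reverse_in_place(head);     /* error case only: undo the reversal */
    return -1;
}
```

The "validate first" variant corresponds to calling `is_proper_list`
and then `reverse_in_place`; the variant with undo is
`reverse_validated` alone, which only pays for the second pass when the
input was bad.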

It seems pointless to add an extra unvalidated reversal when it would be
just as fast as _this_ validated reversal.

-- 
David Kastrup
