original-brownbear commented on PR #13472: URL: https://github.com/apache/lucene/pull/13472#issuecomment-2174188876
> May be I am missing something, but the implementation basically introduces "double queuing": task executor has one and very likely the supplied executor would have one, both are competing over taskId (directly or indirectly) to get something done.

You're right, this is one of the remaining areas of contention that could be fixed for even better performance. Using a `ForkJoinPool` also has some potential for reducing the impact of memory barriers.

I wonder if this is the right area to optimize, though. Looking at the results of concurrency vs. no concurrency in https://github.com/apache/lucene/pull/13472#issuecomment-2173609575, I'm inclined to think this is not the code we should optimize further. Even with this change we're at times 2x-3x slower (vs. 8x-9x without my changes) than `main` without concurrency for some parts of the benchmark. I don't think we can eliminate all the overhead of the memory barriers needed to synchronize tasks. So maybe the problem ultimately is just that the tasks are too small?
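
To make the "double queuing" point concrete, here is a minimal sketch of the pattern being discussed. It is not the code from this PR or Lucene's `TaskExecutor`; the class `DoubleQueueSketch`, the method `runAll`, and the `helpers` parameter are hypothetical names chosen for illustration. It assumes the executor holds "runner" jobs in its own work queue while the runners, together with the calling thread, claim individual tasks from a shared `AtomicInteger`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not Lucene's TaskExecutor.
final class DoubleQueueSketch {

  /** Runs all tasks on up to {@code helpers} pool threads plus the calling thread. */
  @SuppressWarnings("unchecked")
  static <T> List<T> runAll(ExecutorService pool, int helpers, List<Callable<T>> tasks)
      throws Exception {
    final int n = tasks.size();
    final Object[] results = new Object[n];
    // Queue #2: a shared counter from which every runner claims the next task index.
    // Each claim is an atomic read-modify-write, so all runners contend on it.
    final AtomicInteger nextTaskId = new AtomicInteger();

    Callable<Void> runner = () -> {
      int id;
      while ((id = nextTaskId.getAndIncrement()) < n) {
        results[id] = tasks.get(id).call();
      }
      return null;
    };

    // Queue #1: the executor's own work queue, which holds the runner jobs.
    List<Future<Void>> pending = new ArrayList<>();
    for (int i = 0; i < helpers; i++) {
      pending.add(pool.submit(runner));
    }

    // The calling thread also claims tasks instead of blocking right away.
    // (A real implementation would still wait for the helpers if this throws.)
    runner.call();

    // Future.get() provides the happens-before edge that makes the helpers'
    // writes to `results` visible to the calling thread.
    for (Future<Void> f : pending) {
      f.get();
    }

    List<T> out = new ArrayList<>(n);
    for (Object r : results) {
      out.add((T) r);
    }
    return out;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Callable<Integer>> tasks = new ArrayList<>();
      for (int i = 0; i < 1000; i++) {
        final int v = i;
        tasks.add(() -> v * 2); // deliberately tiny task
      }
      List<Integer> doubled = runAll(pool, 3, tasks);
      System.out.println(doubled.size() + " results computed");
    } finally {
      pool.shutdown();
    }
  }
}
```

With tasks as small as `v * 2`, the cost of each atomic claim and of the final `Future.get()` synchronization is likely to be large relative to the work itself, which is the "tasks that are too small" concern in a nutshell.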