mikemccand commented on issue #12203: URL: https://github.com/apache/lucene/issues/12203#issuecomment-1513800971
> the rewriting of doc values from the original segment (10 millions docs) took 318 seconds, which is comparable to the time it takes to merge posting lists. The fully parallel writing (w/o a final metadata update) took 23 seconds!

Wow, this is an amazing speedup! Can you provide some details? Where is the prototype implementation, how many cores/threads did you use, etc.?

@rmuir's concern about concurrently enumerating multiple `OrdinalMap`s is a real risk: this data structure can consume a lot of memory, especially for high-cardinality fields (many unique strings) and for indices where each segment has a very different set of strings for that field. But if we limited the concurrency of such costly fields in some way, this might be OK.
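
To make the "limit the concurrency of costly fields" idea concrete, here is a minimal sketch using only `java.util.concurrent`. It is not taken from the prototype discussed in this issue and does not use any Lucene API: `CostlyFieldLimiter`, `run`, `demo`, and the `costly` flag are hypothetical names, and how a field is classified as costly (for example, by a cardinality threshold) is left as an assumption. Cheap fields run with the full parallelism of the pool, while costly fields must first acquire a permit from a small semaphore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: none of these names exist in Lucene. */
final class CostlyFieldLimiter {
  // Bounds how many costly (memory-hungry) per-field tasks may run at once.
  private final Semaphore costlyPermits;
  private final AtomicInteger activeCostly = new AtomicInteger();
  private final AtomicInteger peakCostly = new AtomicInteger();

  CostlyFieldLimiter(int maxConcurrentCostly) {
    this.costlyPermits = new Semaphore(maxConcurrentCostly);
  }

  /** Runs work directly for cheap fields; costly fields wait for a permit first. */
  void run(boolean costly, Runnable work) throws InterruptedException {
    if (!costly) {
      work.run();
      return;
    }
    costlyPermits.acquire();
    try {
      int now = activeCostly.incrementAndGet();
      peakCostly.accumulateAndGet(now, Math::max); // record the observed peak
      work.run();
    } finally {
      activeCostly.decrementAndGet();
      costlyPermits.release();
    }
  }

  int peakCostly() {
    return peakCostly.get();
  }

  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis); // stand-in for the real per-field writing work
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Simulates writing many fields in parallel; returns the peak number of concurrent costly tasks. */
  static int demo(int threads, int maxCostly, int costlyFields, int cheapFields) {
    CostlyFieldLimiter limiter = new CostlyFieldLimiter(maxCostly);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Void>> futures = new ArrayList<>();
      for (int i = 0; i < costlyFields + cheapFields; i++) {
        boolean costly = i < costlyFields;
        futures.add(pool.submit(() -> {
          limiter.run(costly, () -> sleepQuietly(20));
          return null;
        }));
      }
      for (Future<Void> f : futures) {
        f.get(); // propagate any task failure
      }
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return limiter.peakCostly();
  }

  public static void main(String[] args) {
    // 8 worker threads, at most 2 costly fields in flight at any time.
    System.out.println("peak concurrent costly tasks: " + demo(8, 2, 6, 10));
  }
}
```

One trade-off in this sketch: a costly task that is waiting for a permit still occupies a pool thread. A real implementation might instead submit costly fields to a separate, smaller executor so that cheap fields are never starved.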