vigyasharma commented on issue #12203: URL: https://github.com/apache/lucene/issues/12203#issuecomment-1502180974
> My basic idea is to write each field in parallel to a separate file and then perform a low-level merge of the binary data (just appending bytes to the final file). After that, I can rewrite only the metadata to update the offsets.

This is an interesting approach, I'd like to explore it more. There has been some discussion on this problem in #9626, that you might find useful.

I think it makes sense to chip away at this problem with one format type at a time. Let's start with the approach you have in mind for doc-values. If you want to raise a PR for this, I'm happy to iterate with you on it. We can start with a draft PR that demonstrates the idea first, and benchmark it for viability. And eventually refine it to consolidate across the general segment merging framework.

I was playing around with doing this for the postings format some time back. It has been on the shelf for some time, but I guess it's time to dust it off and try again. Hopefully, we can collaborate on some common learnings.
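
For illustration, the quoted idea (per-field files written in parallel, a byte-level append into the final file, then offsets recorded for the metadata) could be sketched roughly as below. This is only a sketch under assumptions: it uses plain `java.nio.file` rather than Lucene's codec or `Directory` APIs, and the names `ParallelFieldMergeSketch`, `FieldExtent`, and `writeAndConcatenate` are hypothetical and do not exist in Lucene.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not Lucene code: plain files stand in for index files.
final class ParallelFieldMergeSketch {

  /** Hypothetical metadata entry: where one field's bytes sit in the final file. */
  static final class FieldExtent {
    final long offset;
    final long length;

    FieldExtent(long offset, long length) {
      this.offset = offset;
      this.length = length;
    }
  }

  static Map<String, FieldExtent> writeAndConcatenate(
      Map<String, byte[]> fieldData, Path finalFile, int threads)
      throws IOException, InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Step 1: write each field to its own temporary file, in parallel.
      Map<String, Future<Path>> pending = new LinkedHashMap<>();
      for (Map.Entry<String, byte[]> e : fieldData.entrySet()) {
        byte[] bytes = e.getValue();
        pending.put(e.getKey(), pool.submit(() -> {
          Path tmp = Files.createTempFile("field-", ".tmp");
          Files.write(tmp, bytes);
          return tmp;
        }));
      }

      // Step 2: append the per-field files to the final file in a fixed order.
      // Step 3: record each field's offset and length, which is what the
      // rewritten metadata would need to point at.
      Map<String, FieldExtent> extents = new LinkedHashMap<>();
      long offset = 0;
      try (OutputStream out = Files.newOutputStream(finalFile)) {
        for (Map.Entry<String, Future<Path>> e : pending.entrySet()) {
          Path tmp = e.getValue().get(); // wait for this field's writer
          long length = Files.copy(tmp, out); // raw byte append, no decoding
          Files.delete(tmp);
          extents.put(e.getKey(), new FieldExtent(offset, length));
          offset += length;
        }
      }
      return extents;
    } finally {
      pool.shutdown();
    }
  }
}
```

The append order is fixed by the insertion order of the input map, so the offsets come out the same regardless of which writer thread finishes first. In a real codec, any offsets stored inside the per-field data itself would also need to be relative to the field's start for a pure append to be valid; the sketch does not address that.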