mikemccand commented on issue #12986: URL: https://github.com/apache/lucene/issues/12986#issuecomment-1873933126
I agree with @rmuir -- promising backwards compatibility (API or index format) is a huge burden on Lucene developers, and it's hard enough with the default Codec today. Given that the bloom postings format is still very much in flux, let's wait on removing the experimental tag. E.g. we are also [pursuing another experimental codec (inspired by Tantivy)](https://github.com/apache/lucene/pull/12688) that also seems to speed up the primary-key lookup use case.

Note that OpenSearch devs could also choose to offer this backwards compatibility to their users. The promise need not be implemented only in Lucene.

Thank you for sharing those benchmark results. That is indeed quite a sizable impact on indexing throughput / long-pole latencies, especially as you greatly increase the bloom filter size to lower the false-positive rate. It looks like that test was 100% `updateDocument` calls, with 25% of them being true updates rather than appends?

+1 to pursuing the linked improvements (off-heap option) -- the [linked PR](https://github.com/opensearch-project/OpenSearch/pull/11027) looks interesting -- maybe open a PR here for that? Or is that change somehow OpenSearch-specific?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org