RS146BIJAY commented on issue #13387: URL: https://github.com/apache/lucene/issues/13387#issuecomment-2360649893
## Approach 3: Combining group level IndexWriter with addIndexes

![approach3](https://github.com/user-attachments/assets/32ea3baa-0ae6-4a60-84e9-352a0e1e6a5e)

In this approach, to make multiple group-level IndexWriters function as a unified entity, we use Lucene's addIndexes API to combine them. This ensures that the top-level IndexWriter exposes a single common segments_N, SegmentCommitInfos, generation and version. During an indexing or update request, the client (such as OpenSearch) will continue to route the request to the appropriate group-level IndexWriter based on the document's criteria evaluation. During flush, in addition to flushing the segments of the group-level IndexWriters, we will merge/move them into a single parent IndexWriter using the addIndexes API call. For read (or replication) operations, the client (like OpenSearch) will now open a Reader on the parent IndexWriter.

### Pros

1. Having a common IndexWriter with a single SegmentCommitInfos, generation, etc. ensures that the client (OpenSearch) still interacts with Lucene through a single entity.

### Cons

1. When segments of different groups are combined into a single index, we must ensure that only segments within the same group are merged together. This will require a new merge policy for the top-level IndexWriter.
2. The Lucene addIndexes API seems to acquire a write lock on each directory associated with the group-level IndexWriters, preventing active writes while the indexes are being combined. This can cause downtime on the client (OpenSearch) side during this period. However, this issue could be mitigated if Lucene provided an API to combine these group-level indexes as a soft reference, without copying the segment files or locking the group-level IndexWriters.
3. Additionally, combining the indexes involves copying files from the group-level IndexWriters' directories to the parent IndexWriter's directory. This is a resource-intensive operation, consuming disk IO and CPU cycles.
   Moreover, since we open a Reader on the parent IndexWriter (combined IndexWriter from group level IndexWriters), slow index merging may impact reader refresh times, delaying visibility of changes for search.
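
To make the first con more concrete, here is a minimal sketch of the grouping constraint a group-aware merge policy would have to enforce. It is a plain-Java toy model and does not use any Lucene classes: `Segment`, `planMerges`, the group labels and the `maxMergeAtOnce` parameter are all hypothetical names introduced for illustration. A real implementation would presumably subclass Lucene's `MergePolicy` and would need some way to recover each segment's group, which is not specified here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupedMergeSketch {

    /** Hypothetical stand-in for a segment tagged with the group that wrote it. */
    static final class Segment {
        final String name;
        final String group;
        final long sizeBytes;

        Segment(String name, String group, long sizeBytes) {
            this.name = name;
            this.group = group;
            this.sizeBytes = sizeBytes;
        }
    }

    /**
     * Buckets segments by group, then proposes merges of at most
     * maxMergeAtOnce segments each, never mixing segments of different groups.
     * Returns group -> list of merges, where each merge is a list of segment names.
     */
    static Map<String, List<List<String>>> planMerges(List<Segment> segments, int maxMergeAtOnce) {
        if (maxMergeAtOnce < 2) {
            throw new IllegalArgumentException("maxMergeAtOnce must be at least 2");
        }

        // Step 1: bucket segments by group, preserving first-seen group order.
        Map<String, List<Segment>> byGroup = new LinkedHashMap<>();
        for (Segment s : segments) {
            byGroup.computeIfAbsent(s.group, g -> new ArrayList<>()).add(s);
        }

        // Step 2: within each group, merge the smallest segments first.
        Map<String, List<List<String>>> plan = new LinkedHashMap<>();
        for (Map.Entry<String, List<Segment>> entry : byGroup.entrySet()) {
            List<Segment> sorted = new ArrayList<>(entry.getValue());
            sorted.sort(Comparator.comparingLong(s -> s.sizeBytes));

            List<List<String>> merges = new ArrayList<>();
            for (int i = 0; i < sorted.size(); i += maxMergeAtOnce) {
                int end = Math.min(i + maxMergeAtOnce, sorted.size());
                if (end - i < 2) {
                    break; // a single leftover segment has nothing to merge with
                }
                List<String> merge = new ArrayList<>();
                for (Segment s : sorted.subList(i, end)) {
                    merge.add(s.name);
                }
                merges.add(merge);
            }
            plan.put(entry.getKey(), merges);
        }
        return plan;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
                new Segment("_0", "groupA", 10),
                new Segment("_1", "groupA", 5),
                new Segment("_2", "groupB", 7),
                new Segment("_3", "groupA", 20),
                new Segment("_4", "groupB", 1));

        // prints {groupA=[[_1, _0]], groupB=[[_4, _2]]}
        System.out.println(planMerges(segments, 2));
    }
}
```

In this toy run, `_3` is left unmerged because no other groupA segment remains to pair it with, and no proposed merge ever spans both groups. The size-ordered pairing is only one possible selection heuristic. The part that matters for this approach is the per-group bucketing done before any merge is chosen.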