RS146BIJAY commented on issue #13387:
URL: https://github.com/apache/lucene/issues/13387#issuecomment-2360649893

   ## Approach 3: Combining group-level IndexWriters with addIndexes
   
   
![approach3](https://github.com/user-attachments/assets/32ea3baa-0ae6-4a60-84e9-352a0e1e6a5e)
   
   In this approach, to make multiple group-level IndexWriters function as a
unified entity, we use Lucene's addIndexes API to combine them into a single
parent IndexWriter. This ensures that the top-level (parent) IndexWriter has a
common segments_N file, SegmentCommitInfos, generation and version. During an
indexing or update request, the client (such as OpenSearch) continues to route
the request to the appropriate group-level IndexWriter based on the document's
criteria evaluation. During flush, in addition to flushing the segments of the
group-level IndexWriters, we merge/move them into the parent IndexWriter using
the addIndexes API call. For read (or replication) operations, the client
(like OpenSearch) opens a Reader on the parent IndexWriter.
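
   The flow described above can be sketched roughly as follows. This is a
minimal illustration rather than the proposed implementation: the class name,
the `groupOf` routing rule and the `status` field are hypothetical, the
directories are in-memory, and it assumes a recent lucene-core on the
classpath. The group-level writers are closed before the call to addIndexes
because the Directory variant of that API needs the write lock on each source
directory.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class GroupedAddIndexesSketch {

  /** Hypothetical grouping criterion: 5xx status codes go to their own group. */
  static String groupOf(String status) {
    return status.startsWith("5") ? "errors" : "ok";
  }

  /** Indexes one document per status, combines the groups, returns the parent's doc count. */
  static int indexAndCombine(String... statuses) throws IOException {
    Map<String, Directory> groupDirs = new LinkedHashMap<>();
    Map<String, IndexWriter> groupWriters = new LinkedHashMap<>();

    // Indexing: route each document to the IndexWriter of its group.
    for (String status : statuses) {
      String group = groupOf(status);
      IndexWriter writer = groupWriters.get(group);
      if (writer == null) {
        Directory dir = new ByteBuffersDirectory();
        groupDirs.put(group, dir);
        writer = new IndexWriter(dir, new IndexWriterConfig());
        groupWriters.put(group, writer);
      }
      Document doc = new Document();
      doc.add(new StringField("status", status, Field.Store.YES));
      writer.addDocument(doc);
    }

    // Flush: close() commits pending documents and releases each write lock,
    // which addIndexes(Directory...) must be able to obtain.
    for (IndexWriter writer : groupWriters.values()) {
      writer.close();
    }

    try (Directory parentDir = new ByteBuffersDirectory();
         IndexWriter parent = new IndexWriter(parentDir, new IndexWriterConfig())) {
      // Copy the group-level segments into the parent index.
      parent.addIndexes(groupDirs.values().toArray(new Directory[0]));
      parent.commit();
      // Reads go through a reader opened on the parent writer.
      try (DirectoryReader reader = DirectoryReader.open(parent)) {
        return reader.numDocs();
      }
    } finally {
      for (Directory dir : groupDirs.values()) {
        dir.close();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(indexAndCombine("200", "500", "404", "503"));
  }
}
```

   In a real deployment the group-level writers would have to be reopened
after each combine step, which is where the write pause discussed under Cons
comes from.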
   
   ### Pros
   
   1. A common IndexWriter with a single SegmentCommitInfos, generation, etc.
ensures that the client (OpenSearch) still interacts with Lucene through a
single entity.
   
   ### Cons
   
   1. When segments of different groups are combined into a single index, we
must ensure that only segments within the same group are merged together. This
will require a new merge policy for the top-level IndexWriter.
   2. The Lucene addIndexes API seems to acquire a write lock on each
directory associated with the group-level IndexWriters, preventing active
writes while the indexes are being combined. This can cause downtime on the
client (OpenSearch) side during this period. However, this issue could be
mitigated if Lucene provided an API to combine these group-level indexes as a
soft reference, without copying the segment files or locking the group-level
IndexWriters.
   3. Additionally, combining the indexes involves copying files from the
group-level IndexWriters' directories to the parent IndexWriter's directory.
This is a resource-intensive operation that consumes disk I/O and CPU cycles.
Moreover, since we open a Reader on the parent IndexWriter (the IndexWriter
combined from the group-level IndexWriters), slow index merging may increase
reader refresh times, delaying the visibility of changes for search.
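
   To make the first point concrete, the core of a group-aware merge policy is
a partitioning step: candidate merges are built per group and never across
groups. The sketch below shows only that step in plain Java, without Lucene.
`SegmentRef`, its `group` tag and `minSegments` are hypothetical names. In
Lucene this logic would presumably live in a custom MergePolicy (for example
in `findMerges`), and how a segment would carry its group is left open here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupAwareMergeSketch {

  /** Hypothetical stand-in for a segment together with the group it was written for. */
  static final class SegmentRef {
    final String name;
    final String group;

    SegmentRef(String name, String group) {
      this.name = name;
      this.group = group;
    }
  }

  /**
   * Returns one candidate merge (a list of segment names) for every group that
   * has at least minSegments segments. Segments of different groups never end
   * up in the same candidate.
   */
  static List<List<String>> candidateMerges(List<SegmentRef> segments, int minSegments) {
    // Partition segment names by group, preserving first-seen group order.
    Map<String, List<String>> byGroup = new LinkedHashMap<>();
    for (SegmentRef segment : segments) {
      byGroup.computeIfAbsent(segment.group, g -> new ArrayList<>()).add(segment.name);
    }
    // Keep only groups with enough segments to be worth merging.
    List<List<String>> merges = new ArrayList<>();
    for (List<String> names : byGroup.values()) {
      if (names.size() >= minSegments) {
        merges.add(names);
      }
    }
    return merges;
  }

  public static void main(String[] args) {
    List<SegmentRef> segments = List.of(
        new SegmentRef("_0", "ok"),
        new SegmentRef("_1", "errors"),
        new SegmentRef("_2", "ok"),
        new SegmentRef("_3", "ok"));
    // Only the "ok" group has two or more segments, so only it yields a candidate.
    System.out.println(candidateMerges(segments, 2));
  }
}
```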
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

