RS146BIJAY commented on issue #13387: URL: https://github.com/apache/lucene/issues/13387#issuecomment-2492296015
> Does the OpenSearch client directly work with 'n' different log-group specific IndexWriters?

While writing logs, OpenSearch will interact with 'n' different log-group-specific IndexWriters. For example, if logs are grouped by status code, a 5xx log entry will be written using a 5xx-specific IndexWriter. Conversely, for read flows such as creating a reader or retrieving the latest commit (or SegmentInfos state) associated with a directory or IndexWriter (for uploading to a snapshot, syncing the state of a replica from the primary during a checkpoint in SegRep, etc.), OpenSearch will interact with Lucene via the combined view (the parent IndexWriter). This parent IndexWriter internally references the segments of the group-level IndexWriters (200_0, 300_0, etc.).

Having separate IndexWriters for different groups ensures that logs from different groups are kept in different segments. Meanwhile, the combined view over the group-level segments of a Lucene index, in the form of a parent IndexWriter, provides a common view for operations such as opening readers, syncing replicas, and uploading the SegmentInfos of an index to a remote snapshot.

> When a new log group is discovered, does the client create a new IndexWriter and add it to parent?

The number of groups (IndexWriters) will be fixed and will be determined via a setting during index creation.

> Do we really need a parent "IndexWriter" with this approach? Would a Multi-Reader on all the child log-group directories work?

Having a Multi-Reader on all the child log-group directories still won't provide a unified view of all the group-level segments associated with a Lucene index.
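To make the split between group-level writers and the combined view concrete, here is a minimal toy model in plain Java. It is only a sketch: `GroupWriter`, `ParentView`, `index`, `sync`, and `combinedSegments` are hypothetical names invented for illustration, not Lucene or OpenSearch APIs, and "segments" are modeled as plain strings. It assumes grouping by status-code class and a fixed set of groups chosen at index creation, as described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Lucene or OpenSearch APIs.
public class GroupedIndexSketch {

    /** Stand-in for a group-specific IndexWriter; it only tracks segment names. */
    static final class GroupWriter {
        private final String group;
        private final List<String> segments = new ArrayList<>();
        private final List<String> buffered = new ArrayList<>();

        GroupWriter(String group) {
            this.group = group;
        }

        void addDocument(String doc) {
            buffered.add(doc);
        }

        /** Turns buffered docs into one new segment named <group>_<n>. */
        void flush() {
            if (buffered.isEmpty()) {
                return;
            }
            segments.add(group + "_" + segments.size());
            buffered.clear();
        }

        List<String> segments() {
            return List.copyOf(segments);
        }
    }

    /** Stand-in for the parent combined view over all group-level segments. */
    static final class ParentView {
        private final Map<String, GroupWriter> writers = new LinkedHashMap<>();
        private List<String> combined = List.of();

        /** The set of groups is fixed when the index is created. */
        ParentView(List<String> groups) {
            for (String g : groups) {
                writers.put(g, new GroupWriter(g));
            }
        }

        /** Write path: route a log entry to the writer for its status class. */
        void index(int statusCode, String doc) {
            String group = String.valueOf(statusCode / 100 * 100);
            GroupWriter writer = writers.get(group);
            if (writer == null) {
                throw new IllegalArgumentException("no writer for group " + group);
            }
            writer.addDocument(doc);
        }

        /** Periodic sync: flush each group and rebuild the combined segment list. */
        void sync() {
            List<String> all = new ArrayList<>();
            for (GroupWriter writer : writers.values()) {
                writer.flush();
                all.addAll(writer.segments());
            }
            combined = List.copyOf(all);
        }

        /** Read path: readers, snapshots, and replicas would consult only this view. */
        List<String> combinedSegments() {
            return combined;
        }
    }

    public static void main(String[] args) {
        ParentView index = new ParentView(List.of("200", "300", "500"));
        index.index(200, "GET /home");
        index.index(204, "DELETE /item/1");
        index.index(503, "GET /api");
        index.sync();
        System.out.println(index.combinedSegments()); // prints [200_0, 500_0]
    }
}
```

In this model, writes touch only one group-level writer, while every read-side consumer sees a single segment list. That single list is the property a Multi-Reader alone would not give to commit-level operations.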
Even now, OpenSearch interacts with a Lucene index not only to index documents or to open a reader over those documents, but also to retrieve the SegmentInfos associated with the latest commit of an IndexWriter directory (e.g., for storing snapshots of an index on a remote store) or to obtain the file list associated with a past commit (for deleting unreferenced files inside the commit deletion policy). Exposing the multiple group-level segments through a single parent IndexWriter associated with one Lucene index ensures that the index still behaves as a single entity (the parent IndexWriter can be used to get a common commit across the group-level IndexWriters).

Another approach is to use a SegmentInfos instance instead of an IndexWriter to maintain the common view over the group-level IndexWriters. Since, in the approach above, the parent IndexWriter only periodically syncs and combines the SegmentInfos of the group-level IndexWriters, we can replace the parent IndexWriter with a SegmentInfos as the combined view. This parent SegmentInfos will reference the segments of the group-level IndexWriters, just as a parent IndexWriter would.

Let me know if this makes sense.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@lucene.apache.org