Re: [I] Support for criteria based DWPT selection inside DocumentWriter [lucene]

via GitHub Mon, 03 Jun 2024 06:08:31 -0700


mikemccand commented on issue #13387:
URL: https://github.com/apache/lucene/issues/13387#issuecomment-2145162839


   I like @jpountz's idea of just using separate `IndexWriter`s for this 
use-case, instead of adding custom routing logic to the separate DWPTs inside a 
single `IndexWriter` and then also needing a custom `MergePolicy` that ensures 
that only like segments are merged together.  Separate `IndexWriter`s would 
cleanly achieve both of these.
   
   The idea of using a single underlying multi-tenant `Directory` with multiple 
`FilterDirectory` wrappers (one per `IndexWriter`) is interesting -- do we have 
such a class already (that would distinguish the tenants via filename prefix or 
so)?  That's a nice idea all by itself (separate from this use case) -- maybe 
open a spinoff issue to explore that?
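
   As a rough illustration of the filename-prefix idea, here is a minimal sketch 
of just the name mapping, in plain Java with no Lucene dependency.  The class 
`TenantFileNames` and its methods are hypothetical, not an existing Lucene API.  
A real wrapper would presumably extend `FilterDirectory` and apply this mapping 
in every method that takes or returns a file name (`listAll`, `openInput`, 
`createOutput`, `createTempOutput`, `deleteFile`, `fileLength`, `rename`, 
`sync`, `obtainLock`), which this sketch does not attempt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper: maps one tenant's logical file names onto a shared
 * physical namespace by prepending "<tenantId>.".
 */
final class TenantFileNames {
  private final String prefix;

  TenantFileNames(String tenantId) {
    // Forbidding '.' in the id keeps prefixes unambiguous: no tenant's
    // prefix can then be a prefix of another tenant's prefix.
    if (tenantId.isEmpty() || tenantId.indexOf('.') >= 0) {
      throw new IllegalArgumentException("tenant id must be non-empty and contain no '.'");
    }
    this.prefix = tenantId + ".";
  }

  /** Name to use in the shared directory for this tenant's file. */
  String toPhysical(String logicalName) {
    return prefix + logicalName;
  }

  /** True if the physical file belongs to this tenant. */
  boolean owns(String physicalName) {
    return physicalName.startsWith(prefix);
  }

  /** Strips the tenant prefix from a physical name owned by this tenant. */
  String toLogical(String physicalName) {
    if (!owns(physicalName)) {
      throw new IllegalArgumentException("not owned by this tenant: " + physicalName);
    }
    return physicalName.substring(prefix.length());
  }

  /** Filters a shared listing down to this tenant's files, sorted by logical name. */
  String[] listLogical(String[] physicalNames) {
    List<String> mine = new ArrayList<>();
    for (String name : physicalNames) {
      if (owns(name)) {
        mine.add(toLogical(name));
      }
    }
    String[] result = mine.toArray(new String[0]);
    Arrays.sort(result);
    return result;
  }
}
```

   The listing is sorted because each tenant's view should look like an 
ordinary directory listing to the `IndexWriter` on top of it.  Note that the 
lock file would need the same prefixing, so that the N writers do not contend 
for one shared lock.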
   
   You would also need a clean-ish way to manage a single total RAM budget (in 
bytes) across the N `IndexWriter`s.  I think `IndexWriter`'s flushing policy or 
RAM accounting was already generalized to allow for this use case, but I don't 
remember the details.
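
   One possible shape for such a shared budget, sketched in plain Java so it 
runs without Lucene: the interface `RamBufferedWriter` is a hypothetical 
stand-in for the two `IndexWriter` calls I believe are relevant here, 
`ramBytesUsed()` and `flushNextBuffer()`, and `SharedRamBudget` and 
`FakeWriter` are illustrative names only.  The policy shown (flush one buffer 
at a time from whichever writer holds the most RAM) is an assumption, not 
something this thread settles.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

/** Hypothetical view of a writer; modeled on IndexWriter's ramBytesUsed/flushNextBuffer. */
interface RamBufferedWriter {
  long ramBytesUsed();

  void flushNextBuffer() throws IOException;
}

/** Keeps the summed RAM of several writers under one limit. */
final class SharedRamBudget {
  private final long maxBytes;
  private final List<? extends RamBufferedWriter> writers;

  SharedRamBudget(long maxBytes, List<? extends RamBufferedWriter> writers) {
    this.maxBytes = maxBytes;
    this.writers = writers;
  }

  /** Flushes buffers from the heaviest writer until under budget; returns the flush count. */
  int enforce() {
    int flushes = 0;
    while (true) {
      long total = 0;
      long heaviestBytes = 0;
      RamBufferedWriter heaviest = null;
      for (RamBufferedWriter w : writers) {
        long bytes = w.ramBytesUsed();
        total += bytes;
        if (bytes > heaviestBytes) {
          heaviestBytes = bytes;
          heaviest = w;
        }
      }
      if (total <= maxBytes || heaviest == null) {
        return flushes;
      }
      try {
        heaviest.flushNextBuffer();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      flushes++;
      if (heaviest.ramBytesUsed() >= heaviestBytes) {
        return flushes; // no progress; stop rather than spin
      }
    }
  }
}

/** Test stand-in: each flush releases one fixed-size buffer. */
final class FakeWriter implements RamBufferedWriter {
  private long bytes;
  private final long bufferSize;

  FakeWriter(long bytes, long bufferSize) {
    this.bytes = bytes;
    this.bufferSize = bufferSize;
  }

  @Override
  public long ramBytesUsed() {
    return bytes;
  }

  @Override
  public void flushNextBuffer() {
    bytes = Math.max(0, bytes - bufferSize);
  }
}
```

   With real writers, something like `enforce()` would be called after each 
document is added, or periodically from a background thread, and each 
`IndexWriter`'s own RAM buffer limit would be set high enough that it does not 
trigger first.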
   
   Searching across the N separate shards as if they were a single index is 
also possible via `MultiReader`, though I'm not sure how well intra-query 
concurrency works -- maybe it works just fine because the search-time 
leaves/slices are all union'd across the N shards?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

