Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-24 Thread via GitHub
mikemccand commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2828061328
> > I don't know if we are already doing this -- is this TieredMergePolicy's default behavior (1 -> 1) for forceMergeDeletes? I don't think so?
> > It's not the default ind

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-21 Thread via GitHub
jpountz commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2819513522
> I don't know if we are already doing this -- is this TieredMergePolicy's default behavior (1 -> 1) for forceMergeDeletes? I don't think so?
It's not the default indeed. Tier

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-21 Thread via GitHub
mikemccand commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2819320548 If we do add this timeout, I don't think the still-running merges kicked off during `forceMergeDeletes` should abort -- they should ideally run to completion, just in the backgro

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-17 Thread via GitHub
vigyasharma commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2811827103
> Suppose forceMergeDeletes() returned the MergeSpec
This could be a _"good first issue"_, I'll create a spin-off issue for the same. We can close it if others disagree wi

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-15 Thread via GitHub
vigyasharma commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2807330038 For some Amazon Product Search context, we do a searcher switch-over to the newly built index once it is declared healthy and ready to use. The idea here is to first build the i

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-14 Thread via GitHub
houserjohn commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2803254782 After looking into the suggestions you mentioned, I still believe there is a valid need for a timeout for `forceMergeDeletes`. In the first suggestion, you recommended using two

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-06 Thread via GitHub
msokolov commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2781449497 We're operating in a setup where we have an initial phase that builds an index while it is offline, not accepting query traffic. Once that is complete, we enable the index to take

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-05 Thread via GitHub
jpountz commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2780641860
> and some deletes being addressed is better than none.
This part of your message suggests that deletes get reclaimed progressively over time, which is often not true. So wait

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-04 Thread via GitHub
jpountz commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2778108878 For my understanding, what is the benefit of waiting until the timeout is reached rather than not waiting at all?

Re: [I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-04 Thread via GitHub
houserjohn commented on issue #14431: URL: https://github.com/apache/lucene/issues/14431#issuecomment-2779350159 Apologies if I am misunderstanding your question, but the example that it is great for is right after the full indexing of your documents. The indexing likely created many delete

[I] Add a timeout for forceMergeDeletes in IndexWriter [lucene]

2025-04-02 Thread via GitHub
houserjohn opened a new issue, #14431: URL: https://github.com/apache/lucene/issues/14431
### Description
Using IndexWriter's `forceMergeDeletes` to eliminate merge debt is a very useful feature -- especially during initial indexing. However, larger indexes can require 20+ minutes to