Re: [PR] Use a hint to specify READONCE IOContext [lucene]
uschindler commented on PR #14509: URL: https://github.com/apache/lucene/pull/14509#issuecomment-2808928300 Could we get some context/issue what this is about? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] OptimisticKnnVectorQuery [lucene]
benwtrent commented on PR #14226: URL: https://github.com/apache/lucene/pull/14226#issuecomment-2809670383

> I'm in favor of fixing it as a magic number that replicates something close to the current behavior (or better recall if we can and retain the same latency as we found with lambda=16) and letting users tune further using fanout. I think these are roughly equivalent and I don't think we should be exposing a lot of knobs.

I am for this as well. Exposing more and more knobs makes things way too complicated.
[PR] Change uses of withReadAdvice to use hints instead [lucene]
thecoop opened a new pull request, #14510: URL: https://github.com/apache/lucene/pull/14510 (no comment)
Re: [PR] OptimisticKnnVectorQuery [lucene]
msokolov commented on PR #14226: URL: https://github.com/apache/lucene/pull/14226#issuecomment-2810290056 As far as making this the default, that sounds OK to me, but let's not backport until we've had a chance to verify no harm for a while in some pre-production environments.
[PR] [Bug] Fix for stored fields force merge regression [lucene]
bharath-techie opened a new pull request, #14512: URL: https://github.com/apache/lucene/pull/14512

### Description

Resolves https://github.com/apache/lucene/issues/14463

I have made changes similar to https://github.com/apache/lucene/pull/13985 to update read advice to sequential during merge.
[PR] Use a hint to specify READONCE IOContext [lucene]
thecoop opened a new pull request, #14509: URL: https://github.com/apache/lucene/pull/14509 (no comment)
Re: [PR] Use a hint to specify READONCE IOContext [lucene]
thecoop commented on PR #14509: URL: https://github.com/apache/lucene/pull/14509#issuecomment-2808933733 This follows on from https://github.com/apache/lucene/pull/14482. It is in draft because I wanted to check that this refactoring works cleanly on top of the changes in #14482, but that PR needs to be merged first before this one can go in.
Re: [I] Tone down TestIndexWriterDelete.testDeleteAllRepeated (OOMs sometimes) [lucene]
uschindler commented on issue #14508: URL: https://github.com/apache/lucene/issues/14508#issuecomment-2809265915

> [@msfroh](https://github.com/msfroh) would you know how to tackle this better than forcing FSDirectory in this test?

Policeman Jenkins' poor NVMe disks!!! The new server has only been online for a few weeks and is already eating into the "percentage_used" SMART counter:

```
root@serv1 ~ # nvme smart-log /dev/nvme0
Smart Log for NVME device:nvme0 namespace-id:
critical_warning: 0
temperature : 32 °C (305 K)
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 4%
endurance group critical warning summary: 0
Data Units Read : 15428184 (7.90 TB)
Data Units Written : 46860669 (23.99 TB)
host_read_commands : 251666556
host_write_commands : 787078812
controller_busy_time: 3115
power_cycles: 11
power_on_hours : 423
unsafe_shutdowns: 4
media_errors: 0
num_err_log_entries : 0
Warning Temperature Time: 0
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 32 °C (305 K)
Temperature Sensor 2 : 34 °C (307 K)
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time: 0
Thermal Management T2 Total Time: 0

root@serv1 ~ # nvme smart-log /dev/nvme1
Smart Log for NVME device:nvme1 namespace-id:
critical_warning: 0
temperature : 34 °C (307 K)
available_spare : 100%
available_spare_threshold : 5%
percentage_used : 3%
endurance group critical warning summary: 0
Data Units Read : 10776282 (5.52 TB)
Data Units Written : 45766086 (23.43 TB)
host_read_commands : 345725745
host_write_commands : 865033828
controller_busy_time: 495
power_cycles: 28
power_on_hours : 310
unsafe_shutdowns: 12
media_errors: 0
num_err_log_entries : 69
Warning Temperature Time: 0
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 34 °C (307 K)
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time: 0
Thermal Management T2 Total Time: 0
```
Re: [PR] deps(java): bump com.carrotsearch.randomizedtesting:randomizedtesting-runner from 2.8.1 to 2.8.3 [lucene]
dweiss merged PR #14504: URL: https://github.com/apache/lucene/pull/14504
Re: [PR] deps(java): bump xerces:xercesImpl from 2.12.0 to 2.12.2 [lucene]
dweiss merged PR #14502: URL: https://github.com/apache/lucene/pull/14502
[PR] Ensuring skip list is read for fields indexed with only DOCS [lucene]
expani opened a new pull request, #14511: URL: https://github.com/apache/lucene/pull/14511

### Description

Fix for https://github.com/apache/lucene/issues/14445

This falls back to returning a `SlowImpactsEnum` for all default cases, but ensures that skip data is still read when a field is indexed with `IndexOptions.DOCS`, by returning a non-competitive impact. This is required because we stopped storing a default impact for such cases starting with the 912 postings format: https://github.com/apache/lucene/blob/main/lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene99/Lucene99PostingsWriter.java#L275
Re: [PR] Ensuring skip list is read for fields indexed with only DOCS [lucene]
expani commented on code in PR #14511: URL: https://github.com/apache/lucene/pull/14511#discussion_r2047829308

## lucene/core/src/java/org/apache/lucene/codecs/lucene103/Lucene103PostingsReader.java: ##

@@ -1310,7 +1317,7 @@ public List getImpacts(int level) {
           return readImpacts(level1SerializedImpacts, level1Impacts);
         }
       }
-      return DUMMY_IMPACTS;
+      return NON_COMPETITIVE_IMPACTS;

Review Comment: Good catch. `DUMMY_IMPACTS` is unused after this change, so I can remove it.
Re: [PR] Ensuring skip list is read for fields indexed with only DOCS [lucene]
msfroh commented on code in PR #14511: URL: https://github.com/apache/lucene/pull/14511#discussion_r2047793102

## lucene/core/src/java/org/apache/lucene/codecs/lucene103/Lucene103PostingsReader.java: ##

@@ -1310,7 +1317,7 @@ public List getImpacts(int level) {
           return readImpacts(level1SerializedImpacts, level1Impacts);
         }
       }
-      return DUMMY_IMPACTS;
+      return NON_COMPETITIVE_IMPACTS;

Review Comment: This was the only reference to `DUMMY_IMPACTS`, right? Can we remove it?

## lucene/core/src/java/org/apache/lucene/codecs/lucene103/Lucene103PostingsReader.java: ##

@@ -282,6 +288,10 @@ public PostingsEnum postings(
   @Override
   public ImpactsEnum impacts(FieldInfo fieldInfo, BlockTermState state, int flags)
       throws IOException {
+    if (state.docFreq <= BLOCK_SIZE) {
+      // no skip data
+      return new SlowImpactsEnum(postings(fieldInfo, state, null, flags));
+    }

Review Comment: This is essentially taking the place of `DUMMY_IMPACTS`, right? It's the thing that kicks in on tail blocks, which is what `DUMMY_IMPACTS` was there for. (I'm trying to make sure I understand the change.)
[I] IndexWriter forceMergeDeletes should return its MergeSpec [lucene]
vigyasharma opened a new issue, #14515: URL: https://github.com/apache/lucene/issues/14515

IndexWriter provides a `forceMergeDeletes` API which triggers force merging of all segments that have deleted documents, allowing users to expunge deletes up to a configurable delete percentage (set via `setForceMergeDeletesPctAllowed()`). The API provides a blocking variant, which waits until the merges complete, and a non-blocking variant, which starts the merges in background threads and returns.

For the non-blocking version, it would be nice to be able to monitor whether the merges have completed. It turns out that all we need for this is to return the `MergeSpecification` that defines the merges triggered by the API. Indeed, the blocking variant of this API itself uses this spec to wait until all merges have completed. This is what would happen if you were using the `ConcurrentMergeScheduler`, which starts merges in the background, but invoked the API with `doWait=true`. However, there are benefits to being able to monitor from outside the API, like waiting only up to a max timeout, or reporting metrics on the progress of these merges.

The proposal is to change this API's return type from void to `MergePolicy.MergeSpecification` and return the `spec` object.
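To make the monitoring use case in the issue above concrete, here is a minimal sketch of what a caller could do once the non-blocking call hands back a handle to the triggered merges. It deliberately does not use Lucene classes: `MergeSpecSketch`, `completedCount()`, `totalCount()` and `await(...)` are hypothetical names invented for this illustration, and each merge is modeled as a plain `CompletableFuture`. How the real `MergePolicy.MergeSpecification` exposes completion is not shown here.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/**
 * Hypothetical stand-in for a returned merge spec: one future per triggered merge.
 * Not a Lucene class; names and shape are assumptions for illustration only.
 */
final class MergeSpecSketch {
  private final List<CompletableFuture<Void>> merges;

  MergeSpecSketch(List<CompletableFuture<Void>> merges) {
    this.merges = List.copyOf(merges);
  }

  /** Total number of merges that were triggered. */
  int totalCount() {
    return merges.size();
  }

  /** Number of merges that have finished so far, successfully or not (for progress metrics). */
  long completedCount() {
    return merges.stream().filter(CompletableFuture::isDone).count();
  }

  /**
   * Waits for all merges, but no longer than the given timeout.
   * Returns true if every merge finished in time, false on timeout or interruption.
   */
  boolean await(long timeout, TimeUnit unit) {
    CompletableFuture<Void> all =
        CompletableFuture.allOf(merges.toArray(new CompletableFuture<?>[0]));
    try {
      all.get(timeout, unit);
      return true;
    } catch (TimeoutException e) {
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    } catch (ExecutionException e) {
      throw new IllegalStateException("a merge failed", e.getCause());
    }
  }
}
```

With something of this shape, a caller could start the merges without waiting, report `completedCount()` out of `totalCount()` as a progress metric, and give up after `await(30, TimeUnit.SECONDS)` returns false instead of blocking indefinitely.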
Re: [I] Backport Bot [lucene]
jainankitk commented on issue #14496: URL: https://github.com/apache/lucene/issues/14496#issuecomment-2811880891 We have a similar backport workflow in OpenSearch that might be useful - https://github.com/opensearch-project/OpenSearch/blob/main/.github/workflows/backport.yml. You just need to add the backport- label, and it creates a backport PR once the labeled PR is merged.