Re: [PR] Use IOContext#RANDOM when appropriate. [lucene]

2024-04-04 Thread via GitHub
uschindler commented on code in PR #13267: URL: https://github.com/apache/lucene/pull/13267#discussion_r1552492150 ## lucene/CHANGES.txt: ## @@ -78,6 +78,12 @@ Optimizations * GITHUB#13149: Made PointRangeQuery faster, for some segment sizes, by reducing the amount of virtual

Re: [PR] Use IOContext#RANDOM when appropriate. [lucene]

2024-04-04 Thread via GitHub
uschindler commented on code in PR #13267: URL: https://github.com/apache/lucene/pull/13267#discussion_r1552487587 ## lucene/CHANGES.txt: ## @@ -78,6 +78,12 @@ Optimizations * GITHUB#13149: Made PointRangeQuery faster, for some segment sizes, by reducing the amount of virtual

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038269435 I'll debug it some more... just need a break. Mainly I wanted to make sure I didn't introduce this with the ICU upgrade... the two shinglefilters are suspect. -- This is an a

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038258715 I narrowed the fail down so far a bit to just this chain: ``` new Analyzer() { @Override protected TokenStreamComponents createComponents(Str

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038244230 Attached is a reproducer in TestBugInSomething that seems to work. It is ugly due to bugs in `javac`, but we don't care as we won't keep it and will just whittle it down to a proper t

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038169885 i'll try to dig into it to at least find the offending component. If we can narrow it down to the problematic charfilter, tokenizer, or tokenfilter, we can make an easier-to-reproduce

Re: [PR] Use IOContext#RANDOM when appropriate. [lucene]

2024-04-04 Thread via GitHub
uschindler commented on code in PR #13267: URL: https://github.com/apache/lucene/pull/13267#discussion_r1552390907 ## lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java: ## @@ -315,10 +316,13 @@ public CloseableRandomVectorScorerSupplier merge

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038151821 OK, apologies for the noise, this test keeps failing weirdly for me. Git bisect has failed me :) -- This is an automated message from the Apache Git Service. To respond to the m

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
original-brownbear commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038144433 ++ to @rmuir I can reproduce this after reverting my changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Gi

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038141226 I don't understand what that change has to do with analysis chain... inconsistent offsets has to do with what TokenStream is doing not the index. Be sure, that you aren't getting conf

Re: [I] TestKnnByteVectorQuery.testTimeout fails [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on issue #13272: URL: https://github.com/apache/lucene/issues/13272#issuecomment-2038142167 OK, this is muted in main now. The original change doesn't seem backported to 9x. So, nothing to mute there. -- This is an automated message from the Apache Git Service. To resp

Re: [PR] Test mute for issue #13272 [lucene]

2024-04-04 Thread via GitHub
benwtrent merged PR #13273: URL: https://github.com/apache/lucene/pull/13273 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.a

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038136325 So, if I switch to just return ` new Meta(numValues, blockShift)` when its all zero, the test passes :/. So it might be the single block meta deal... -- This is an automated mes

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
original-brownbear commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038127980 Looking into this now. Maybe it's the fact that for the all zeros case we now always have a single block meta ... let me check -- This is an automated message from the

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038124499 OK, I simply commented out @original-brownbear 's change and always returned the meta and the test passes with the seed and settings. When I return the cached instance, it fails.

Re: [I] TestKnnByteVectorQuery.testTimeout fails [lucene]

2024-04-04 Thread via GitHub
zacharymorn commented on issue #13272: URL: https://github.com/apache/lucene/issues/13272#issuecomment-2038123150 Another seed to reproduce this issue ``` gradlew test --tests TestKnnByteVectorQuery.testTimeout -Dtests.seed=51247A9C254BB9F -Dtests.locale=ff-Latn -Dtests.timezone=As

Re: [PR] GITHUB-13218: Add migrate entry for Collector to CollectorManager migration [lucene]

2024-04-04 Thread via GitHub
zacharymorn commented on PR #13238: URL: https://github.com/apache/lucene/pull/13238#issuecomment-2038118749 After merge, I saw a test run error and this issue is fixing it https://github.com/apache/lucene/issues/13272 -- This is an automated message from the Apache Git Service. To respon

[PR] Test mute for issue #13272 [lucene]

2024-04-04 Thread via GitHub
benwtrent opened a new pull request, #13273: URL: https://github.com/apache/lucene/pull/13273 These tests are failing fairly often, but aren't reliably reproduced. I am gonna mute to keep away the build noise for now. Related to: https://github.com/apache/lucene/issues/13272 -- Thi

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038092682 I just reproduced it on `main` too :( So I think the problem is that it doesn't always reproduce: this likely caused your git-bisect confusion. But at least it isn't NEWLY intro

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038088546 @rmuir > I can't reproduce it with the latest main branch which has the ICU upgrade. Oh dang, I might not have fetched latest 🤦 Let me try again. -- This is an aut

Re: [PR] GITHUB-13218: Add migrate entry for Collector to CollectorManager migration [lucene]

2024-04-04 Thread via GitHub
zacharymorn commented on code in PR #13238: URL: https://github.com/apache/lucene/pull/13238#discussion_r1552338688 ## lucene/MIGRATE.md: ## @@ -185,6 +185,34 @@ enum. `IOContext#LOAD` has been replaced with `IOContext#PRELOAD`. +### IndexSearch#search(Query, Collector) bei

Re: [PR] GITHUB-13218: Add migrate entry for Collector to CollectorManager migration [lucene]

2024-04-04 Thread via GitHub
zacharymorn merged PR #13238: URL: https://github.com/apache/lucene/pull/13238 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038087578 I can't find the specific bug, but we've reported several such issues to them in the past. I'm having trouble with their JIRA... Either way, fails on 342c814cafdaab341f8d86b80c7

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on PR #13202: URL: https://github.com/apache/lucene/pull/13202#issuecomment-2038084746 @kaivalnp I too have had this fail, but it is not deterministic. I created https://github.com/apache/lucene/issues/13272 to track it. I can mute the test while we wait for the fi

[I] TestKnnByteVectorQuery.testTimeout fails [lucene]

2024-04-04 Thread via GitHub
benwtrent opened a new issue, #13272: URL: https://github.com/apache/lucene/issues/13272 ### Description Periodic failures for testTimeout, its not locally repeatable. ### Gradle command to reproduce ``` ./gradlew :lucene:core:test --tests "org.apache.lucene.search.T

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038081951 @benwtrent how long ago did you see this test failure? I can't reproduce it with the latest `main` branch which has the ICU upgrade. If i go back one commit to before the ICU

Re: [I] TestKnnByteVectorQuery.testTimeout fails [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on issue #13272: URL: https://github.com/apache/lucene/issues/13272#issuecomment-2038087232 I am going to mute this on main & branch_9x as we wait for a fix. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Reduce duplication in taxonomy facets; always do counts [lucene]

2024-04-04 Thread via GitHub
stefanvodita commented on PR #12966: URL: https://github.com/apache/lucene/pull/12966#issuecomment-2038070448 Thank you for reviewing @mikemccand! I had to rebase after #12966. I'll push tomorrow maybe if there are no objections. -- This is an automated message from the Apache Git Service

Re: [I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on issue #13271: URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038069989 git bisect says: 6cba773318445d1ecd349d42ff85508b58be1752 It isn't immediately obvious why to me. @original-brownbear what do you think? -- This is an automated message f

[I] org.apache.lucene.analysis.tests.TestRandomChains.testRandomChains test failure [lucene]

2024-04-04 Thread via GitHub
benwtrent opened a new issue, #13271: URL: https://github.com/apache/lucene/issues/13271 ### Description Haven't finished bisection to figure out the origin of the bug. ``` org.apache.lucene.analysis.tests.TestRandomChains > testRandomChains FAILED java.lang.IllegalSt

Re: [PR] update commons-compress from 1.19 to 1.21 [lucene]

2024-04-04 Thread via GitHub
rmuir commented on PR #13270: URL: https://github.com/apache/lucene/pull/13270#issuecomment-2038041453 1.22+ starts dragging in more stuff such as `commons-lang3` and `commons-io` which is why I stopped at 1.21 -- This is an automated message from the Apache Git Service. To respond to the

[PR] update commons-compress from 1.19 to 1.21 [lucene]

2024-04-04 Thread via GitHub
rmuir opened a new pull request, #13270: URL: https://github.com/apache/lucene/pull/13270 Another pre-covid dependency version... I tried to upgrade to the latest version (both with and without the entangled commons-codec upgrade: https://github.com/apache/lucene/pull/13269) and ther

Re: [PR] upgrade commons codec from 1.13.0 to 1.16.0 [lucene]

2024-04-04 Thread via GitHub
rmuir commented on PR #13269: URL: https://github.com/apache/lucene/pull/13269#issuecomment-2038008126 I will open a separate issue for the commons-compress as there is more trouble there... -- This is an automated message from the Apache Git Service. To respond to the message, please log

Re: [PR] upgrade commons codec from 1.13.0 to 1.16.0 [lucene]

2024-04-04 Thread via GitHub
rmuir commented on PR #13269: URL: https://github.com/apache/lucene/pull/13269#issuecomment-2037993068 I think i understand the 1.16.1 issue. Looks like there is a bit of jar/maven-hell going on among the commons dependencies. I hit it when I tried to upgrade commons-compress... Bump

Re: [PR] Reduce duplication in taxonomy facets; always do counts [lucene]

2024-04-04 Thread via GitHub
stefanvodita commented on code in PR #12966: URL: https://github.com/apache/lucene/pull/12966#discussion_r1552265766 ## lucene/facet/src/java/org/apache/lucene/facet/TopOrdAndIntQueue.java: ## @@ -16,37 +16,42 @@ */ package org.apache.lucene.facet; -import org.apache.lucene

Re: [PR] Add timeout support to AbstractKnnVectorQuery [lucene]

2024-04-04 Thread via GitHub
kaivalnp commented on PR #13202: URL: https://github.com/apache/lucene/pull/13202#issuecomment-2037932211 I see a couple of recent failures related to the test added here: 1. https://github.com/apache/lucene/actions/runs/8557766498/job/23450804532 2. https://github.com/apache/lucene/act

[PR] upgrade commons codec from 1.13.0 to 1.16.0 [lucene]

2024-04-04 Thread via GitHub
rmuir opened a new pull request, #13269: URL: https://github.com/apache/lucene/pull/13269 Currently, we depend upon a pre-COVID version of commons-codec (like several other dependencies, I am working the issues). Upgrade to 1.16.0. NOTE: I'm unable to upgrade to 1.16.1: it caus

Re: [PR] Use `ReadAdvice#RANDOM` when appropriate. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13222: URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037833835 Backport PR: #13267. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[PR] Use IOContext#RANDOM when appropriate. [lucene]

2024-04-04 Thread via GitHub
jpountz opened a new pull request, #13267: URL: https://github.com/apache/lucene/pull/13267 This switches the following files to `IOContext#RANDOM`: - Stored fields data file. - Term vectors data file. - HNSW graph. - Temporary file storing vectors at merge time that we use to

Re: [PR] Add new pluggable vector similarity to field info [lucene]

2024-04-04 Thread via GitHub
jimczi commented on PR #13200: URL: https://github.com/apache/lucene/pull/13200#issuecomment-2037737097 Thanks for persevering on this @benwtrent ! A few late thoughts on my end: I am a bit concerned about the generalization here. The whole similarity is currently modeled around t

Re: [I] `FieldInfo`? [lucene]

2024-04-04 Thread via GitHub
uschindler closed issue #13266: `FieldInfo`? URL: https://github.com/apache/lucene/issues/13266 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-u

Re: [PR] upgrade icu4j to 74.2 [lucene]

2024-04-04 Thread via GitHub
uschindler commented on code in PR #13239: URL: https://github.com/apache/lucene/pull/13239#discussion_r1552071936 ## lucene/analysis/common/src/test/org/apache/lucene/analysis/email/TLDs.txt: ## @@ -1,6 +1,5 @@ # Generated from IANA TLD Database (gradlew generateTlds).aaa aar

Re: [PR] Use `ReadAdvice#RANDOM` when appropriate. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13222: URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037703765 This is exactly what I'm doing. :) All in the same PR, but it shouldn't be too big. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] Use `ReadAdvice#RANDOM` when appropriate. [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13222: URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037699875 > I'll look into porting this work to branch_9x via a separate PR. We should maybe also port some recent changes back, too. Not sure how to best do this. The new record classes

Re: [PR] Make the default ReadAdvice configurable by sysprop [lucene]

2024-04-04 Thread via GitHub
uschindler merged PR #13264: URL: https://github.com/apache/lucene/pull/13264 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.

Re: [PR] Add new pluggable vector similarity to field info [lucene]

2024-04-04 Thread via GitHub
uschindler commented on code in PR #13200: URL: https://github.com/apache/lucene/pull/13200#discussion_r1552040639 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/word2vec/Word2VecSynonymProvider.java: ## @@ -40,8 +40,8 @@ */ public class Word2VecSynonym

Re: [PR] Use `ReadAdvice#RANDOM` when appropriate. [lucene]

2024-04-04 Thread via GitHub
jpountz merged PR #13222: URL: https://github.com/apache/lucene/pull/13222 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Use `ReadAdvice#RANDOM` when appropriate. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13222: URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037674709 I'll look into porting this work to branch_9x via a separate PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] Propagate the flush IOContext to stored fields / term vectors writers when index sorting is enabled. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13265: URL: https://github.com/apache/lucene/pull/13265#issuecomment-2037661586 I tried to create a test for it but couldn't create a satisfying one. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[PR] Propagate the flush IOContext to stored fields / term vectors writers when index sorting is enabled. [lucene]

2024-04-04 Thread via GitHub
jpountz opened a new pull request, #13265: URL: https://github.com/apache/lucene/pull/13265 This fixes index sorting to pass the correct `IOContext` to stored fields and term vectors writers when index sorting is enabled. This is important for things like `NRTCachingDirectory`. -- Th

Re: [PR] Make the default ReadAdvice configurable by sysprop [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13264: URL: https://github.com/apache/lucene/pull/13264#issuecomment-2037631425 > This makes sense to me. At first I wondered if we needed this since it's possible to create a `FilterDirectory` that calls `ioContext.withReadAdvice(ReadAdvice.NORMAL)` on `Director

Re: [PR] Correct quantized HNSW validation to be in-line with HNSW [lucene]

2024-04-04 Thread via GitHub
benwtrent merged PR #13263: URL: https://github.com/apache/lucene/pull/13263 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.a

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
uschindler merged PR #13261: URL: https://github.com/apache/lucene/pull/13261 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.

Re: [PR] Make the default ReadAdvice configurable by sysprop [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13264: URL: https://github.com/apache/lucene/pull/13264#issuecomment-2037578866 P.S.: I moved the lookup cache to a new constant name and placed it next to the `withReadAdvice()` method. -- This is an automated message from the Apache Git Service. To respond to

Re: [PR] Use ReadAdvice.RANDOM by default. [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13244: URL: https://github.com/apache/lucene/pull/13244#issuecomment-2037575426 See #13264 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[PR] Make the default ReadAdvice configurable by sysprop [lucene]

2024-04-04 Thread via GitHub
uschindler opened a new pull request, #13264: URL: https://github.com/apache/lucene/pull/13264 Followup after #13244: This makes the default `ReadAdvice` for index files configurable via system property `org.apache.lucene.store.defaultReadAdvice`. I moved the corresponding constant to

Re: [PR] Use `ReadAdvice#RANDOM` when appropriate. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13222: URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037526473 @uschindler I'm curious what you think of this PR now that `ReadAdvice#RANDOM` is the default. Should we keep calling `state.context.withReadAdvice(ReadAdvice.RANDOM)` to be explicit abo

Re: [PR] Add new pluggable vector similarity to field info [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on PR #13200: URL: https://github.com/apache/lucene/pull/13200#issuecomment-2037464568 I still need to run with Lucene Util to ensure all these changes didn't add a weird overhead. This leads me to add back the deprecated apis for the vector fields. I know I woul

Re: [PR] Use ReadAdvice.RANDOM by default. [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13244: URL: https://github.com/apache/lucene/pull/13244#issuecomment-2037455008 Thanks! I will provide a PR to make it configurable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13261: URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037428907 The failing test is a known issue (see mailing list). Just ignore that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

Re: [PR] Use ReadAdvice.RANDOM by default. [lucene]

2024-04-04 Thread via GitHub
jpountz merged PR #13244: URL: https://github.com/apache/lucene/pull/13244 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] GITHUB-13218: Add migrate entry for Collector to CollectorManager migration [lucene]

2024-04-04 Thread via GitHub
mikemccand commented on code in PR #13238: URL: https://github.com/apache/lucene/pull/13238#discussion_r1551778848 ## lucene/MIGRATE.md: ## @@ -185,6 +185,34 @@ enum. `IOContext#LOAD` has been replaced with `IOContext#PRELOAD`. +### IndexSearch#search(Query, Collector) bein

[I] Remove deprecated code in `main` [lucene]

2024-04-04 Thread via GitHub
jpountz opened a new issue, #13262: URL: https://github.com/apache/lucene/issues/13262 ### Description We have some code that has been deprecated for a long time, see e.g. `IntField` constructors. Let's remove this deprecated code in the `main` branch ahead of 10.0. Note: we m

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
Pulkitg64 commented on code in PR #13261: URL: https://github.com/apache/lucene/pull/13261#discussion_r1551738126 ## lucene/core/src/java/org/apache/lucene/search/BooleanClause.java: ## @@ -67,25 +67,12 @@ public String toString() { }; } - /** The query whose matching

Re: [PR] Use ReadAdvice.RANDOM by default. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13244: URL: https://github.com/apache/lucene/pull/13244#issuecomment-2037277263 I hacked luceneutil to clear my page cache with `echo 1 > /proc/sys/vm/drop_caches` before each query: ```patch diff --git a/src/main/perf/TaskThreads.java b/src/main/perf/TaskT

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
Pulkitg64 commented on code in PR #13261: URL: https://github.com/apache/lucene/pull/13261#discussion_r1551703918 ## lucene/core/src/java/org/apache/lucene/search/BooleanClause.java: ## @@ -67,25 +67,12 @@ public String toString() { }; } - /** The query whose matching

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
uschindler commented on code in PR #13261: URL: https://github.com/apache/lucene/pull/13261#discussion_r1551701906 ## lucene/core/src/java/org/apache/lucene/search/BooleanClause.java: ## @@ -67,25 +67,12 @@ public String toString() { }; } - /** The query whose matchin

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
uschindler commented on code in PR #13261: URL: https://github.com/apache/lucene/pull/13261#discussion_r1551695502 ## lucene/core/src/java/org/apache/lucene/search/BooleanClause.java: ## @@ -67,25 +67,12 @@ public String toString() { }; } - /** The query whose matchin

Re: [PR] Add new pluggable vector similarity to field info [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on PR #13200: URL: https://github.com/apache/lucene/pull/13200#issuecomment-2037236344 I have removed the deprecated `VectorSimilarityFunction` methods from the kNN fields, I thought this was acceptable because those fields are flagged as `experimental`. However, I reali

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13261: URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037230661 > That's a good idea @jpountz. I can add this change under this section: https://github.com/apache/lucene/blob/main/lucene/MIGRATE.md#some-classes-converted-to-records-classes-github13

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
Pulkitg64 commented on PR #13261: URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037226302 That's a good idea @jpountz. I can add this change under this section: https://github.com/apache/lucene/blob/main/lucene/MIGRATE.md#some-classes-converted-to-records-classes-github13207

Re: [PR] BaseVectorSimilarityQueryTestCase assumes connected hnsw graph [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on PR #13260: URL: https://github.com/apache/lucene/pull/13260#issuecomment-2037214166 @msokolov I have had something similar fail in other code, so it isn't related to your change here. I think its a flaky test. It doesn't reliably reproduce for me. -- This is an aut

Re: [I] TestIndexFileDeleter.testExcInDecRef test failure [LUCENE-9839] [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #10878: URL: https://github.com/apache/lucene/issues/10878#issuecomment-2037179564 I wonder if "concurrency issue" is jumping to conclusions. My machine has almost no concurrency: 1 cpu with 2 cores... -- This is an automated message from the Apache Git Service. T

Re: [PR] BaseVectorSimilarityQueryTestCase assumes connected hnsw graph [lucene]

2024-04-04 Thread via GitHub
msokolov commented on PR #13260: URL: https://github.com/apache/lucene/pull/13260#issuecomment-2037168102 hm a failure popped up when I pushed: ``` org.apache.lucene.search.TestKnnByteVectorQuery > testTimeout FAILED java.lang.AssertionError: expected:<1> but was:<0>

Re: [PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13261: URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037159161 > Please add a note to lucene/MIGRATE.txt By the way, I wonder how we should handle all the classes that we are converting to records, should we have a single section in `lucene/MI

Re: [I] TestIndexFileDeleter.testExcInDecRef test failure [LUCENE-9839] [lucene]

2024-04-04 Thread via GitHub
rmuir commented on issue #10878: URL: https://github.com/apache/lucene/issues/10878#issuecomment-2037147651 I just popped this when pushing an unrelated commit. I see @mikemccand added more verbosity, but then reverted the verbosity (maybe because it caused the test to never fail anymore?)

[PR] Convert BooleanClause class to record class [lucene]

2024-04-04 Thread via GitHub
Pulkitg64 opened a new pull request, #13261: URL: https://github.com/apache/lucene/pull/13261 ### Description Convert ```BooleanClause``` class to record class which addresses #13207 partially. -- This is an automated message from the Apache Git Service. To respond to the message,

Re: [PR] Expand scalar quantization with adding half-byte (int4) quantization [lucene]

2024-04-04 Thread via GitHub
benwtrent commented on PR #13197: URL: https://github.com/apache/lucene/pull/13197#issuecomment-2037108804 @jpountz I can add the parameters today and fix the compilation. I think your change is the correct one. -- This is an automated message from the Apache Git Service. To respond to th

Re: [PR] upgrade icu4j to 74.2 [lucene]

2024-04-04 Thread via GitHub
rmuir merged PR #13239: URL: https://github.com/apache/lucene/pull/13239 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apach

Re: [PR] BaseVectorSimilarityQueryTestCase assumes connected hnsw graph [lucene]

2024-04-04 Thread via GitHub
msokolov commented on PR #13260: URL: https://github.com/apache/lucene/pull/13260#issuecomment-2037063850 thanks for the reviews! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

Re: [PR] BaseVectorSimilarityQueryTestCase assumes connected hnsw graph [lucene]

2024-04-04 Thread via GitHub
msokolov merged PR #13260: URL: https://github.com/apache/lucene/pull/13260 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.ap

Re: [PR] Use ReadAdvice.RANDOM by default. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13244: URL: https://github.com/apache/lucene/pull/13244#issuecomment-2036934137 Here's a luceneutil run on wikibigall and data hot in the page cache. No difference, which is what was expected I guess: ``` TaskQPS baseline St

Re: [PR] Expand scalar quantization with adding half-byte (int4) quantization [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13197: URL: https://github.com/apache/lucene/pull/13197#issuecomment-2036830008 @benwtrent I tried to fix the compilation on luceneutil at https://github.com/mikemccand/luceneutil/commit/027146b05c755c303cac5d95451cc458ed397c51. I could use your help to check if thi

Re: [PR] Use ReadAdvice.RANDOM by default. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13244: URL: https://github.com/apache/lucene/pull/13244#issuecomment-2036730400 That works for me @uschindler. I updated the code and the PR description. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Recommend lowering the default mmap readahead. [lucene]

2024-04-04 Thread via GitHub
jpountz commented on PR #13223: URL: https://github.com/apache/lucene/pull/13223#issuecomment-2036729679 Superseded by #13244. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

Re: [PR] Recommend lowering the default mmap readahead. [lucene]

2024-04-04 Thread via GitHub
jpountz closed pull request #13223: Recommend lowering the default mmap readahead. URL: https://github.com/apache/lucene/pull/13223 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

Re: [PR] BaseVectorSimilarityQueryTestCase assumes connected hnsw graph [lucene]

2024-04-04 Thread via GitHub
uschindler commented on PR #13260: URL: https://github.com/apache/lucene/pull/13260#issuecomment-2036532521 I think the Unwrappable for FilterReader should be added in a separate PR. For now this is fine. I remember vaguely that I left our FilterReader because of the static method co