Re: [I] Lucene99FlatVectorsReader.getFloatVectorValues(): NPE: Cannot read field "vectorEncoding" because "fieldEntry" is null [lucene]

2024-08-12 Thread via GitHub
david-sitsky commented on issue #13626: URL: https://github.com/apache/lucene/issues/13626#issuecomment-2285416022 @benwtrent - many thanks for your advice. Switching to PerFieldKnnVectorsFormat fixed the issue for us. -- This is an automated message from the Apache Git Service. To respo

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1714509772 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/ForUtil.java: ## @@ -300,13 +301,14 @@ int numBytes(int bitsPerValue) { return bitsPerValue << (BLOCK_

Re: [PR] Use Max WAND optimizations with ToParentBlockJoinQuery when using ScoreMode.Max [lucene]

2024-08-12 Thread via GitHub
github-actions[bot] commented on PR #13587: URL: https://github.com/apache/lucene/pull/13587#issuecomment-2285117122 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Compare the missing value with the top value even after the hit queue is full [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on PR #13644: URL: https://github.com/apache/lucene/pull/13644#issuecomment-2284968622 Thanks! I left you one comment in the corresponding issue you opened for this (sorry, meant to put it here in the PR).

Re: [I] Compare the missing value with the top value even after the hit queue is full [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on issue #13643: URL: https://github.com/apache/lucene/issues/13643#issuecomment-2284967452 Thanks for finding this (and for adding a test case)! Would it be possible to write your test such that it fails without your proposed change? As it's currently written, I believe

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-12 Thread via GitHub
jpountz commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1714276855 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-12 Thread via GitHub
jpountz commented on PR #13636: URL: https://github.com/apache/lucene/pull/13636#issuecomment-2284765651 @uschindler @rmuir Any objections to merging this PR?

Re: [I] `gradlew eclipse` no longer works [lucene]

2024-08-12 Thread via GitHub
dweiss commented on issue #13638: URL: https://github.com/apache/lucene/issues/13638#issuecomment-2284748381 Hi Uwe. Sorry, I was away on holidays. The cr-lf warning is caused by normalization in .gitattributes: ``` # Ignore all differences in line endings for the lock file. version

Re: [PR] Avoid rare random test failures in TestLongValueFacetCounts [lucene]

2024-08-12 Thread via GitHub
gsmiller closed pull request #13646: Avoid rare random test failures in TestLongValueFacetCounts URL: https://github.com/apache/lucene/pull/13646

Re: [PR] Backport #13568 [lucene]

2024-08-12 Thread via GitHub
gsmiller merged PR #13645: URL: https://github.com/apache/lucene/pull/13645

[PR] Avoid rare random test failures in TestLongValueFacetCounts [lucene]

2024-08-12 Thread via GitHub
gsmiller opened a new pull request, #13646: URL: https://github.com/apache/lucene/pull/13646 When backporting GH#13568 I realized we have some incorrect usages of RandomNumbers#randomIntBetween in TestLongValueFacetCounts. We use this in place of Random#nextInt for 9x code to work with jdk1

Re: [I] These attributes are better for the final state [lucene]

2024-08-12 Thread via GitHub
gsmiller closed issue #13628: These attributes are better for the final state URL: https://github.com/apache/lucene/issues/13628

Re: [I] TermsQuery as MultiTermQuery can dramatically overestimate its cost [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on issue #12483: URL: https://github.com/apache/lucene/issues/12483#issuecomment-2284479926 @mayya-sharipova do you have any additional information on this regression by chance? I know you did some work back in #13454 to help in this space, then @msfroh did work in #13201

Re: [PR] Backport #13568 [lucene]

2024-08-12 Thread via GitHub
epotyom commented on code in PR #13645: URL: https://github.com/apache/lucene/pull/13645#discussion_r1714080529 ## lucene/facet/src/java/org/apache/lucene/facet/DrillSideways.java: ## @@ -240,46 +242,43 @@ public DrillSidewaysResult search(DrillDownQuery query, Collector hitCol

Re: [PR] Backport #13568 [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on PR #13645: URL: https://github.com/apache/lucene/pull/13645#issuecomment-2284392144 @stefanvodita > I'm surprised at how large the [spotless commit](https://github.com/apache/lucene/pull/13645/commits/fb2c0c50705a6a62bd93d4a270ad113a2e82283e) is. I assume it was ju

Re: [PR] Backport #13568 [lucene]

2024-08-12 Thread via GitHub
stefanvodita commented on PR #13645: URL: https://github.com/apache/lucene/pull/13645#issuecomment-2284385447 I'm surprised at how large the [spotless commit](https://github.com/apache/lucene/pull/13645/commits/fb2c0c50705a6a62bd93d4a270ad113a2e82283e) is. I assume it was just `gradlew tidy

[PR] Backport #13568 [lucene]

2024-08-12 Thread via GitHub
gsmiller opened a new pull request, #13645: URL: https://github.com/apache/lucene/pull/13645 (no comment)

Re: [PR] Compute facets while collecting [lucene]

2024-08-12 Thread via GitHub
gsmiller merged PR #13568: URL: https://github.com/apache/lucene/pull/13568

Re: [PR] These attributes are better for the final state(#13628) [lucene]

2024-08-12 Thread via GitHub
gsmiller closed pull request #13630: These attributes are better for the final state(#13628) URL: https://github.com/apache/lucene/pull/13630

Re: [I] These attributes are better for the final state [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on issue #13628: URL: https://github.com/apache/lucene/issues/13628#issuecomment-2284113497 Addressed in #13639

Re: [PR] These attributes are better for the final state(#13628) [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on PR #13630: URL: https://github.com/apache/lucene/pull/13630#issuecomment-2284113028 Closing in favor of #13639

Re: [PR] These attributes are better for the final state(#13628)(#13630) [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on PR #13639: URL: https://github.com/apache/lucene/pull/13639#issuecomment-2284111753 @mrhbj merged, thanks!

Re: [PR] These attributes are better for the final state(#13628)(#13630) [lucene]

2024-08-12 Thread via GitHub
gsmiller merged PR #13639: URL: https://github.com/apache/lucene/pull/13639

Re: [I] TermsQuery as MultiTermQuery can dramatically overestimate its cost [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on issue #12483: URL: https://github.com/apache/lucene/issues/12483#issuecomment-2284079513 Got it, thanks @romseygeek

Re: [PR] Get better cost estimate on MultiTermQuery over few terms [lucene]

2024-08-12 Thread via GitHub
gsmiller commented on PR #13201: URL: https://github.com/apache/lucene/pull/13201#issuecomment-2284078553 Thanks @msfroh. Yeah, I think it's worth exploring as long as there isn't something expensive-ish hiding in the weight creation of `BooleanQuery`. Thanks again!

Re: [I] Try applying bipartite graph reordering to KNN graph node ids [lucene]

2024-08-12 Thread via GitHub
jpountz commented on issue #13565: URL: https://github.com/apache/lucene/issues/13565#issuecomment-2284000709 This sounds to me like you are considering renumbering node IDs in the HNSW graph without renumbering doc IDs. I'm curious if you considered renumbering doc IDs like BPIndexReordere

Re: [PR] These attributes are better for the final state(#13628) [lucene]

2024-08-12 Thread via GitHub
mrhbj commented on code in PR #13630: URL: https://github.com/apache/lucene/pull/13630#discussion_r1713395891 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene99/Lucene99SkipWriter.java: ## @@ -46,8 +46,8 @@ * uptos(position, payload). 4. start offset

Re: [I] TermsQuery as MultiTermQuery can dramatically overestimate its cost [lucene]

2024-08-12 Thread via GitHub
romseygeek commented on issue #12483: URL: https://github.com/apache/lucene/issues/12483#issuecomment-2283377281 Hey @gsmiller, thanks for the ping! I'm working elsewhere now, but IIRC @mayya-sharipova was looking at similar issues in the past so may be able to reproduce the issue and see

Re: [PR] Terminate automaton after matched the whole prefix for PrefixQuery. [lucene]

2024-08-12 Thread via GitHub
vsop-479 commented on PR #13072: URL: https://github.com/apache/lucene/pull/13072#issuecomment-2283359055 Conflicts resolved.

Re: [PR] Avoid SegmentTermsEnumFrame reload block. [lucene]

2024-08-12 Thread via GitHub
vsop-479 commented on PR #13253: URL: https://github.com/apache/lucene/pull/13253#issuecomment-2283279880 Conflicts resolved.