Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-09 Thread via GitHub
uschindler commented on PR #13636: URL: https://github.com/apache/lucene/pull/13636#issuecomment-2277393866 I cleaned up the constants a bit more. Some are still duplicated in VectorUtilSupport, but I'd change that in the follow-up PR. -- This is an automated message from the Apache

Re: [I] `gradlew eclipse` no longer works [lucene]

2024-08-09 Thread via GitHub
uschindler commented on issue #13638: URL: https://github.com/apache/lucene/issues/13638#issuecomment-2277400477 There's also a minor issue: since this commit, whenever I commit something to the repo, it complains about the line endings of `versions.toml`: > warning: in the working copy

Re: [PR] Knn(Float-->Byte)VectorField javadocs update in KnnByteVectorQuery [lucene]

2024-08-09 Thread via GitHub
cpoerschke merged PR #13637: URL: https://github.com/apache/lucene/pull/13637 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-09 Thread via GitHub
jpountz commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1711141270 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711163320 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ranges/LongRangeFacetCutter.java: ## @@ -0,0 +1,431 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711174092 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ranges/OverlappingLongRangeFacetCutter.java: ## @@ -0,0 +1,272 @@ +/* + * Licensed to the Apache Software Fo

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711194467 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ranges/ExclusiveLongRangeFacetCutter.java: ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711199728 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ranges/LongRangeFacetCutter.java: ## @@ -0,0 +1,431 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711205288 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ranges/RangeOrdLabelBiMap.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711213814 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ranges/RangeOrdLabelBiMap.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-09 Thread via GitHub
uschindler commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1711262142 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711338640 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/recorders/LongAggregationsFacetRecorder.java: ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711348674 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ordinal_iterators/CandidateSetOrdinalIterator.java: ## @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Softw

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711347682 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/misc/LongValueFacetCutter.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711353146 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/facet/SandboxFacetTestCase.java: ## @@ -0,0 +1,407 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
stefanvodita commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711375034 ## lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java: ## @@ -45,58 +45,26 @@ class DrillSidewaysQuery extends Query { final Query baseQue

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
stefanvodita commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711379107 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/recorders/CountFacetRecorder.java: ## @@ -0,0 +1,189 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
stefanvodita commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711380093 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ranges/OverlappingLongRangeFacetCutter.java: ## @@ -0,0 +1,272 @@ +/* + * Licensed to the Apache Softwa

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711382272 ## lucene/core/src/java/org/apache/lucene/search/CollectorOwner.java: ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more +

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
stefanvodita commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711388569 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/recorders/LongAggregationsFacetRecorder.java: ## @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Softw

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711547954 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ordinals/OrdinalGetter.java: ## @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1711646318 ## lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultPostingDecodingUtil.java: ## @@ -0,0 +1,41 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] These attributes are better for the final state(#13628) [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13630: URL: https://github.com/apache/lucene/pull/13630#discussion_r1711652689 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene99/Lucene99SkipWriter.java: ## @@ -46,8 +46,8 @@ * uptos(position, payload). 4. start off

Re: [PR] These attributes are better for the final state(#13628) [lucene]

2024-08-09 Thread via GitHub
mrhbj commented on PR #13630: URL: https://github.com/apache/lucene/pull/13630#issuecomment-2278184332 OK, I will do this.

[PR] These attributes are better for the final state(#13628)(#13630) [lucene]

2024-08-09 Thread via GitHub
mrhbj opened a new pull request, #13639: URL: https://github.com/apache/lucene/pull/13639 ### Description

Re: [PR] These attributes are better for the final state(#13628)(#13630) [lucene]

2024-08-09 Thread via GitHub
mrhbj commented on PR #13639: URL: https://github.com/apache/lucene/pull/13639#issuecomment-2278225033 @gsmiller thanks for your advice. I think you are right, so I made this change. Could you please review it?

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1711693361 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/FacetFieldCollectorManager.java: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
epotyom commented on PR #13568: URL: https://github.com/apache/lucene/pull/13568#issuecomment-2278257636 @stefanvodita, @gsmiller, @mikemccand just wanted to let you know that I think I addressed all existing comments, and I marked as resolved the ones that don't seem to need follow-ups.

Re: [PR] Get better cost estimate on MultiTermQuery over few terms [lucene]

2024-08-09 Thread via GitHub
gsmiller merged PR #13201: URL: https://github.com/apache/lucene/pull/13201

Re: [I] Improve AbstractMultiTermQueryConstantScoreWrapper#RewritingWeight ScorerSupplier cost estimation [lucene]

2024-08-09 Thread via GitHub
gsmiller closed issue #13029: Improve AbstractMultiTermQueryConstantScoreWrapper#RewritingWeight ScorerSupplier cost estimation URL: https://github.com/apache/lucene/issues/13029

Re: [PR] Get better cost estimate on MultiTermQuery over few terms [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on PR #13201: URL: https://github.com/apache/lucene/pull/13201#issuecomment-2278379745 @msfroh hope you don't mind, but since my PR feedback was pretty minor, I went ahead and made those small changes on your branch and merged. I'll work on getting this backported shortly

Re: [I] Deprecate `COSINE` before Lucene 10 release [lucene]

2024-08-09 Thread via GitHub
benwtrent closed issue #13281: Deprecate `COSINE` before Lucene 10 release URL: https://github.com/apache/lucene/issues/13281

Re: [PR] Get better cost estimate on MultiTermQuery over few terms [lucene]

2024-08-09 Thread via GitHub
msfroh commented on PR #13201: URL: https://github.com/apache/lucene/pull/13201#issuecomment-2278441860 Thanks, Greg! I can create a follow-up PR with your suggestion on delegating to the rewritten `BooleanQuery` for the cost estimate. I think if we go down that path (basically doing

Re: [I] Lucene99FlatVectorsReader.getFloatVectorValues(): NPE: Cannot read field "vectorEncoding" because "fieldEntry" is null [lucene]

2024-08-09 Thread via GitHub
benwtrent commented on issue #13626: URL: https://github.com/apache/lucene/issues/13626#issuecomment-2278487437 OK, I was able to replicate with the following test: ``` public void testTryToThrowNPE() throws Exception { try (var dir = newDirectory()) { IndexWriterConfig

Re: [I] Lucene99FlatVectorsReader.getFloatVectorValues(): NPE: Cannot read field "vectorEncoding" because "fieldEntry" is null [lucene]

2024-08-09 Thread via GitHub
benwtrent commented on issue #13626: URL: https://github.com/apache/lucene/issues/13626#issuecomment-2278489924 Looking at what PointValues does: ``` FieldInfo fieldInfo = readState.fieldInfos.fieldInfo(fieldName); if (fieldInfo == null) { throw new IllegalArgumentEx

[I] testMergeStability failing for Knn formats [lucene]

2024-08-09 Thread via GitHub
benwtrent opened a new issue, #13640: URL: https://github.com/apache/lucene/issues/13640 ### Description All KNN formats are periodically failing `testMergeStability`. I have verified it's due to https://github.com/apache/lucene/pull/13566 The stability failure is due to

Re: [I] testMergeStability failing for Knn formats [lucene]

2024-08-09 Thread via GitHub
benwtrent commented on issue #13640: URL: https://github.com/apache/lucene/issues/13640#issuecomment-2278701946 @msokolov ^ I haven't been able to look into fixing it yet. Just now noticed it.

[PR] Unify how missing field entries are handle in knn formats [lucene]

2024-08-09 Thread via GitHub
benwtrent opened a new pull request, #13641: URL: https://github.com/apache/lucene/pull/13641 It is possible to inappropriately use the knn formats and attempt to merge segments with mismatched field names. None of the formats actually check for `null`, they just all assume that the
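
A minimal sketch of the kind of explicit check this describes, modeled on the `PointValues` snippet quoted in the #13626 thread: look the field up, and throw an `IllegalArgumentException` if it is missing instead of letting a later access fail with a `NullPointerException`. Everything here is hypothetical and self-contained; `FieldEntry`, `FieldEntryLookup`, `register`, and `getFieldEntry` are illustrative names, not Lucene classes or the code in the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for per-field metadata; not a Lucene class.
record FieldEntry(String vectorEncoding, int dimension) {}

// Hypothetical reader-side lookup table keyed by field name.
final class FieldEntryLookup {
  private final Map<String, FieldEntry> fields = new HashMap<>();

  void register(String fieldName, FieldEntry entry) {
    fields.put(fieldName, entry);
  }

  // Fails fast with a descriptive message when the field is unknown,
  // rather than returning null and failing later on a field access.
  FieldEntry getFieldEntry(String fieldName) {
    FieldEntry entry = fields.get(fieldName);
    if (entry == null) {
      throw new IllegalArgumentException("field=\"" + fieldName + "\" not found");
    }
    return entry;
  }
}
```

Centralizing the check in one accessor means every caller gets the same exception type and message, which is the "unify" part of the idea; whether the real change throws or returns an empty result is not visible in this preview.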

Re: [PR] Delegating the matches in PointRangeQuery weight to relate method [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on PR #13599: URL: https://github.com/apache/lucene/pull/13599#issuecomment-2278737360 @jainankitk thanks for the iterations! I'm fine with making this change as you currently have it. I'll get it merged. Thanks!

Re: [PR] Add AbstractKnnVectorQuery.seed for seeded HNSW [lucene]

2024-08-09 Thread via GitHub
benwtrent commented on code in PR #13635: URL: https://github.com/apache/lucene/pull/13635#discussion_r1712183885 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -70,6 +72,43 @@ public static void search( search(scorer, knnCollector, graph,

Re: [PR] Delegating the matches in PointRangeQuery weight to relate method [lucene]

2024-08-09 Thread via GitHub
gsmiller merged PR #13599: URL: https://github.com/apache/lucene/pull/13599

Re: [I] Remove redundant code in PointRangeQuery Weight [lucene]

2024-08-09 Thread via GitHub
gsmiller closed issue #13598: Remove redundant code in PointRangeQuery Weight URL: https://github.com/apache/lucene/issues/13598

Re: [I] TermsQuery as MultiTermQuery can dramatically overestimate its cost [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on issue #12483: URL: https://github.com/apache/lucene/issues/12483#issuecomment-2278767613 Hey @romseygeek - there's been a recent improvement to `AbstractMultiTermQueryConstantScoreWrapper` that may help the use-case you've described here (#13201). I'm curious if this s

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1712226750 ## lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java: ## @@ -45,58 +45,26 @@ class DrillSidewaysQuery extends Query { final Query baseQuery;

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1712245084 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/cutters/ranges/LongRangeFacetCutter.java: ## @@ -0,0 +1,413 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1712245609 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/cutters/ranges/DoubleRangeFacetCutter.java: ## @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1712247433 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/FacetFieldCollectorManager.java: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1712250405 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ordinals/CandidateSetOrdinalIterator.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1712251401 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/recorders/FacetRecorder.java: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Compute facets while collecting [lucene]

2024-08-09 Thread via GitHub
gsmiller commented on PR #13568: URL: https://github.com/apache/lucene/pull/13568#issuecomment-2278854633 @epotyom I looked over the recent changes and went through the unresolved comments. I don't think there's anything blocking at this point from my point of view. @mikemccand do you have

Re: [PR] expand TestSegmentToThreadMapping coverage w.r.t. (excess) documents per slice [lucene]

2024-08-09 Thread via GitHub
github-actions[bot] commented on PR #13508: URL: https://github.com/apache/lucene/pull/13508#issuecomment-2278903000 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Optimize decoding blocks of postings using the vector API. [lucene]

2024-08-09 Thread via GitHub
jpountz commented on code in PR #13636: URL: https://github.com/apache/lucene/pull/13636#discussion_r1712554332 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache Software