Re: [I] remove refs to people.apache.org/home.apache.org in build [lucene]

2025-01-15 Thread via GitHub
dweiss commented on issue #13647: URL: https://github.com/apache/lucene/issues/13647#issuecomment-2594748074 Fetched, thanks, David. I'm talking to infra about the possibilities of storing those benchmark files somewhere on Apache services. I don't feel comfortable uploading it to github/gi

Re: [PR] Implement IntersectVisitor#visit(IntsRef) whenever it makes sense [lucene]

2025-01-15 Thread via GitHub
iverase merged PR #14138: URL: https://github.com/apache/lucene/pull/14138 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Implement IntersectVisitor#visit(IntsRef) whenever it makes sense [lucene]

2025-01-15 Thread via GitHub
jpountz commented on PR #14138: URL: https://github.com/apache/lucene/pull/14138#issuecomment-2594042515 > Unrelated test error @jimczi This test is failing because we go through a slightly different code path on the first doc of a segment, which generates a slightly different except

Re: [PR] Specialize DisiPriorityQueue for the 2-clauses case. [lucene]

2025-01-15 Thread via GitHub
jpountz commented on PR #14070: URL: https://github.com/apache/lucene/pull/14070#issuecomment-2593919665 Not completely accidentally, I wanted to merge the more impactful changes I had on my plate before taking another look at the impact of this PR. I'll get back to it shortly.

Re: [PR] Add a HNSW collector that exits early when nearest neighbor queue saturates [lucene]

2025-01-15 Thread via GitHub
mayya-sharipova commented on code in PR #14094: URL: https://github.com/apache/lucene/pull/14094#discussion_r1917308259 ## lucene/core/src/test/org/apache/lucene/search/HnswQueueSaturationCollectorTest.java: ## @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation

Re: [I] remove refs to people.apache.org/home.apache.org in build [lucene]

2025-01-15 Thread via GitHub
dsmiley commented on issue #13647: URL: https://github.com/apache/lucene/issues/13647#issuecomment-2593762704 [geonames_20130921_randomOrder_allCountries.txt.bz2](http://gofile.me/5MFBZ/edVjck97c) 297.2MB If that works for you, I'll share the other. If it doesn't I'll share in another

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss commented on PR #14141: URL: https://github.com/apache/lucene/pull/14141#issuecomment-2593741803 Applied to 10x and main - thank you. Seems to work too.

Re: [PR] Implement IntersectVisitor#visit(IntsRef) whenever it makes sense [lucene]

2025-01-15 Thread via GitHub
iverase commented on PR #14138: URL: https://github.com/apache/lucene/pull/14138#issuecomment-2593620913 Actually we can make BulkAdder a sealed interface and the implementations are Java records.
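
As a rough illustration of the sealed-interface idea in the comment above, here is a minimal Java 17+ sketch. It is not the code from PR #14138: the `add(int)` method, the record components, and the use of `java.util.BitSet` in place of Lucene's `FixedBitSet` are all assumptions made for this example.

```java
import java.util.BitSet;

public class SealedBulkAdderSketch {

  // Hypothetical shape of the interface. Sealing it means the compiler
  // knows these two records are the only implementations.
  sealed interface BulkAdder permits BufferAdder, BitSetAdder {
    void add(int doc);
  }

  // Record components are final references, so the mutable state lives in
  // the referenced arrays: count[0] holds the number of buffered docs.
  record BufferAdder(int[] buffer, int[] count) implements BulkAdder {
    @Override
    public void add(int doc) {
      buffer[count[0]++] = doc;
    }
  }

  // java.util.BitSet stands in for Lucene's FixedBitSet (an assumption).
  record BitSetAdder(BitSet bits) implements BulkAdder {
    @Override
    public void add(int doc) {
      bits.set(doc);
    }
  }
}
```

Records are implicitly final, which satisfies the rule that every permitted subtype of a sealed interface must be `final`, `sealed`, or `non-sealed`.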

Re: [PR] Implement IntersectVisitor#visit(IntsRef) whenever it makes sense [lucene]

2025-01-15 Thread via GitHub
iverase commented on PR #14138: URL: https://github.com/apache/lucene/pull/14138#issuecomment-2593604117 I made BulkAdder sealed so we are sure there will always be just two implementations.

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss merged PR #14141: URL: https://github.com/apache/lucene/pull/14141

Re: [I] remove refs to people.apache.org/home.apache.org in build [lucene]

2025-01-15 Thread via GitHub
dweiss commented on issue #13647: URL: https://github.com/apache/lucene/issues/13647#issuecomment-2593500832 I filed https://issues.apache.org/jira/browse/INFRA-26434 and asked if apache.org can be of any help here. Some of those files are too large to host on github (even in a separate rep

Re: [I] remove refs to people.apache.org/home.apache.org in build [lucene]

2025-01-15 Thread via GitHub
dweiss commented on issue #13647: URL: https://github.com/apache/lucene/issues/13647#issuecomment-2593450811 @mikemccand would you be able to expose the files @dsmiley rescued on your server?

Re: [PR] Implement IntersectVisitor#visit(IntsRef) whenever it makes sense [lucene]

2025-01-15 Thread via GitHub
iverase commented on PR #14138: URL: https://github.com/apache/lucene/pull/14138#issuecomment-2593403998 Unrelated test error:

Re: [PR] Specialize DisiPriorityQueue for the 2-clauses case. [lucene]

2025-01-15 Thread via GitHub
mikemccand commented on PR #14070: URL: https://github.com/apache/lucene/pull/14070#issuecomment-2593357160 Hmm is this PR accidentally dying on the vine @jpountz?

Re: [PR] Implement IntersectVisitor#visit(IntsRef) whenever it makes sense [lucene]

2025-01-15 Thread via GitHub
iverase commented on PR #14138: URL: https://github.com/apache/lucene/pull/14138#issuecomment-2593332818 Thanks @jpountz. I added the method to the BulkAdder. The good thing is that for the buffer case we can use `System.arraycopy` which is even better.
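
To make the `System.arraycopy` remark concrete, here is a small sketch that contrasts appending a slice of doc IDs one at a time with copying it in bulk. The `IntSlice` record is a hypothetical stand-in for an (array, offset, length) view such as Lucene's `IntsRef`, and the method names are assumptions for illustration, not the PR's actual code.

```java
public class BufferBulkAddSketch {

  // Hypothetical stand-in for an int slice: backing array, start offset, length.
  record IntSlice(int[] ints, int offset, int length) {}

  // Appends the slice element by element; returns the new buffer size.
  static int addOneByOne(int[] buffer, int size, IntSlice docs) {
    for (int i = 0; i < docs.length(); i++) {
      buffer[size++] = docs.ints()[docs.offset() + i];
    }
    return size;
  }

  // Appends the whole slice with a single bulk copy; returns the new buffer size.
  static int addBulk(int[] buffer, int size, IntSlice docs) {
    System.arraycopy(docs.ints(), docs.offset(), buffer, size, docs.length());
    return size + docs.length();
  }
}
```

Both methods leave the buffer in the same state; the bulk variant simply hands the copy to the runtime instead of looping in Java code.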

[I] :lucene:benchmark:getGeoNames github job fails [lucene]

2025-01-15 Thread via GitHub
dweiss opened a new issue, #14144: URL: https://github.com/apache/lucene/issues/14144 ### Description this task uses home.apache.org https://home.apache.org/~dsmiley/data/${name}.bz2 Related to #13647 I'll take care of cleaning up. ### Version and environment d

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss commented on PR #14141: URL: https://github.com/apache/lucene/pull/14141#issuecomment-2593304941 Thank you. One of the jobs is failing due to #14144 - I'll fix that and return.

[I] TestBpVectorReorderer.testIndexReorderDense failure in CI [lucene]

2025-01-15 Thread via GitHub
benwtrent opened a new issue, #14143: URL: https://github.com/apache/lucene/issues/14143 ### Description Failed on github CI build: https://github.com/apache/lucene/actions/runs/1279008/job/35654980548 ``` TestBpVectorReorderer > testIndexReorderDense FAILED java.

[PR] Fix `BitSetIterator` to correctly honor the contract of `DocIdSetIterator#intoBitSet`. [lucene]

2025-01-15 Thread via GitHub
jpountz opened a new pull request, #14142: URL: https://github.com/apache/lucene/pull/14142 `BitSetIterator#intoBitSet` would currently fail if `upTo - offset` exceeds the length of the destination bit set. However, `DocIdSetIterator#intoBitSet` only requires matching docs to be set into th

Re: [PR] Add two new "Seeded" Knn queries for seeded vector search [lucene]

2025-01-15 Thread via GitHub
benwtrent merged PR #14084: URL: https://github.com/apache/lucene/pull/14084

Re: [PR] Implement IntersectVisitor#visit(IntsRef) whenever it makes sense [lucene]

2025-01-15 Thread via GitHub
jpountz commented on PR #14138: URL: https://github.com/apache/lucene/pull/14138#issuecomment-2592898984 This makes sense to me, if the `visit()` call is inlined, then feeding the IntsRef into either an int[] (`BufferAdder`) or a `FixedBitSet` (`BitSetAdder`) can be auto-vectorized. Otherwi

Re: [PR] Add a HNSW collector that exits early when nearest neighbor queue saturates [lucene]

2025-01-15 Thread via GitHub
tteofili commented on PR #14094: URL: https://github.com/apache/lucene/pull/14094#issuecomment-2592891629 ![Screenshot 2025-01-15 at 14 40 16](https://github.com/user-attachments/assets/1d32cdbc-9749-40e3-ab4b-20e4f3f7ece5) this sample graph (from Cohere-768) shows how the collection of n

Re: [PR] Add a HNSW collector that exits early when nearest neighbor queue saturates [lucene]

2025-01-15 Thread via GitHub
tteofili commented on PR #14094: URL: https://github.com/apache/lucene/pull/14094#issuecomment-2592891970 ![Screenshot 2025-01-15 at 14 40 16](https://github.com/user-attachments/assets/1d32cdbc-9749-40e3-ab4b-20e4f3f7ece5) this sample graph (from Cohere-768) shows how the collection of n

Re: [PR] Add a HNSW collector that exits early when nearest neighbor queue saturates [lucene]

2025-01-15 Thread via GitHub
tteofili commented on code in PR #14094: URL: https://github.com/apache/lucene/pull/14094#discussion_r1916667467 ## lucene/core/src/java/org/apache/lucene/search/HnswQueueSaturationCollector.java: ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
clayburn opened a new pull request, #14141: URL: https://github.com/apache/lucene/pull/14141 ### Description This PR migrates the Lucene project to publish Build Scans to the new Develocity instance at develocity.apache.org. Additionally, this PR migrates from the legacy Gr

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
clayburn commented on code in PR #14140: URL: https://github.com/apache/lucene/pull/14140#discussion_r1916614276 ## .github/workflows/run-checks-all.yml: ## @@ -13,7 +13,7 @@ on: - 'branch_10x' env: - GRADLE_ENTERPRISE_ACCESS_KEY: ${{ secrets.GE_ACCESS_TOKEN }} Revie

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss commented on code in PR #14140: URL: https://github.com/apache/lucene/pull/14140#discussion_r1916612362 ## .github/workflows/run-checks-all.yml: ## @@ -13,7 +13,7 @@ on: - 'branch_10x' env: - GRADLE_ENTERPRISE_ACCESS_KEY: ${{ secrets.GE_ACCESS_TOKEN }} Review

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss commented on PR #14140: URL: https://github.com/apache/lucene/pull/14140#issuecomment-2592792526 I had to revert this patch, unfortunately. Something doesn't work after changes have been merged - see the actions here: https://github.com/apache/lucene/actions/runs/12788491292

Re: [PR] Upgrade OpenNLP from 2.3.2 to 2.5.3 [lucene]

2025-01-15 Thread via GitHub
dweiss commented on PR #14130: URL: https://github.com/apache/lucene/pull/14130#issuecomment-2592736573 LGTM. I think this should go on main and branch_10x?

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss commented on PR #14140: URL: https://github.com/apache/lucene/pull/14140#issuecomment-2592733365 Applied to main and branch_10x.

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss commented on PR #14140: URL: https://github.com/apache/lucene/pull/14140#issuecomment-2592729062 Thanks @clayburn

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
dweiss merged PR #14140: URL: https://github.com/apache/lucene/pull/14140

Re: [PR] Publish build scans to develocity.apache.org [lucene]

2025-01-15 Thread via GitHub
clayburn commented on PR #14140: URL: https://github.com/apache/lucene/pull/14140#issuecomment-2592663091 No, the secret already exists. The build scan will not publish from this PR, since my fork cannot access the secret, but post merge the secret is accessible. You can see this from other

Re: [I] Incomplete Javadoc for DirectoryReader#indexExists [lucene]

2025-01-15 Thread via GitHub
dweiss closed issue #13583: Incomplete Javadoc for DirectoryReader#indexExists URL: https://github.com/apache/lucene/issues/13583

Re: [PR] Complete the javadoc for DirectoryReader#indexExists [lucene]

2025-01-15 Thread via GitHub
dweiss merged PR #14136: URL: https://github.com/apache/lucene/pull/14136