Re: [PR] Add target search concurrency to TieredMergePolicy [lucene]

2024-07-17 Thread via GitHub
jpountz commented on code in PR #13430: URL: https://github.com/apache/lucene/pull/13430#discussion_r1680544831 ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -1095,6 +1095,12 @@ public static TieredMergePolicy newTieredMergePolicy(Ran

Re: [PR] Ensure to use IOContext.READONCE when reading segment files [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty merged PR #13574: URL: https://github.com/apache/lucene/pull/13574 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucen

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1680705815 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/facet/TestRangeFacet.java: ## @@ -0,0 +1,1654 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] [9.x] Ensure to use IOContext.READONCE when reading segment files [lucene]

2024-07-17 Thread via GitHub
uschindler commented on code in PR #13578: URL: https://github.com/apache/lucene/pull/13578#discussion_r1680744068 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/MockDirectoryWrapper.java: ## @@ -814,6 +814,14 @@ public synchronized IndexInput openInput(String n

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1680751732 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java: ## @@ -1792,61 +1794,88 @@ public DocValuesSkipper getSkipper(FieldInfo field

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
stefanvodita commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1680778354 ## lucene/demo/src/java/org/apache/lucene/demo/facet/SandboxFacetsExample.java: ## @@ -0,0 +1,714 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] Add target search concurrency to TieredMergePolicy [lucene]

2024-07-17 Thread via GitHub
carlosdelest commented on code in PR #13430: URL: https://github.com/apache/lucene/pull/13430#discussion_r1680829569 ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -1095,6 +1095,12 @@ public static TieredMergePolicy newTieredMergePolic

Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

2024-07-17 Thread via GitHub
rmuir commented on code in PR #13572: URL: https://github.com/apache/lucene/pull/13572#discussion_r1680841649 ## lucene/core/build.gradle: ## @@ -14,10 +14,43 @@ * See the License for the specific language governing permissions and * limitations under the License. */ +plug

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-17 Thread via GitHub
benwtrent commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2233036266 > In any case, you have worked with the code far longer than I have, so if you are confident about it please go ahead and commit :) @naveentatikonda could you test my branch

Re: [PR] [9.x] Ensure to use IOContext.READONCE when reading segment files [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on code in PR #13578: URL: https://github.com/apache/lucene/pull/13578#discussion_r1680862628 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/MockDirectoryWrapper.java: ## @@ -814,6 +814,14 @@ public synchronized IndexInput openInput(String

Re: [PR] [9.x] Ensure to use IOContext.READONCE when reading segment files [lucene]

2024-07-17 Thread via GitHub
uschindler commented on code in PR #13578: URL: https://github.com/apache/lucene/pull/13578#discussion_r1680884653 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/MockDirectoryWrapper.java: ## @@ -814,6 +814,14 @@ public synchronized IndexInput openInput(String n

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233089190 > Can someone take a stab at summarizing this for CHANGES.txt, thus avoiding details a reader is unlikely to know about? Like me :-). How will a user of Lucene benefit? ```

Re: [PR] Add target search concurrency to TieredMergePolicy [lucene]

2024-07-17 Thread via GitHub
jpountz merged PR #13430: URL: https://github.com/apache/lucene/pull/13430

Re: [I] Add a MergePolicy wrapper that preserves search concurrency? [lucene]

2024-07-17 Thread via GitHub
jpountz closed issue #12877: Add a MergePolicy wrapper that preserves search concurrency? URL: https://github.com/apache/lucene/issues/12877

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
mikemccand commented on PR #13568: URL: https://github.com/apache/lucene/pull/13568#issuecomment-2233193543 > Computing facets during collection is more expensive, because we need to collect in each searcher slice, and then merge results in `CollectorManager#reduce`. At the same time, it re

Re: [PR] [9.x] Ensure to use IOContext.READONCE when reading segment files [lucene]

2024-07-17 Thread via GitHub
uschindler commented on code in PR #13578: URL: https://github.com/apache/lucene/pull/13578#discussion_r1681009450 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/MockDirectoryWrapper.java: ## @@ -814,6 +814,14 @@ public synchronized IndexInput openInput(String n

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
jpountz commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1680897686 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java: ## @@ -207,65 +210,133 @@ void accumulate(long value) { maxValue = Mat

Re: [PR] [9.x] Ensure to use IOContext.READONCE when reading segment files [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on code in PR #13578: URL: https://github.com/apache/lucene/pull/13578#discussion_r1681020288 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/MockDirectoryWrapper.java: ## @@ -814,6 +814,14 @@ public synchronized IndexInput openInput(String

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1681055574 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInputProvider.java: ## @@ -125,4 +134,31 @@ private final MemorySegment[] map( } ret

Re: [PR] [9.x] Ensure to use IOContext.READONCE when reading segment files [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty merged PR #13578: URL: https://github.com/apache/lucene/pull/13578

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681076840 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java: ## @@ -207,65 +210,133 @@ void accumulate(long value) { maxValue = Mat

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681079162 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java: ## @@ -207,65 +210,133 @@ void accumulate(long value) { maxValue = Mat

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681080290 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java: ## @@ -207,65 +210,133 @@ void accumulate(long value) { maxValue = Mat

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681080747 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java: ## @@ -207,65 +210,133 @@ void accumulate(long value) { maxValue = Mat

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681081567 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java: ## @@ -207,65 +210,133 @@ void accumulate(long value) { maxValue = Mat

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681082271 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java: ## @@ -1792,61 +1794,91 @@ public DocValuesSkipper getSkipper(FieldInfo field

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681083206 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/AssertingLeafReader.java: ## @@ -1194,24 +1194,27 @@ public int numLevels() { @Override publi

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681096883 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -3301,17 +3301,17 @@ private static void checkDocValueSkipper(FieldInfo fi, DocValuesSkipper sk

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
jpountz commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681097434 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/AssertingLeafReader.java: ## @@ -1194,24 +1194,27 @@ public int numLevels() { @Override publi

Re: [PR] Add levels to DocValues skipper index [lucene]

2024-07-17 Thread via GitHub
iverase commented on code in PR #13563: URL: https://github.com/apache/lucene/pull/13563#discussion_r1681112760 ## lucene/test-framework/src/java/org/apache/lucene/tests/index/AssertingLeafReader.java: ## @@ -1194,24 +1194,27 @@ public int numLevels() { @Override publi

[I] Are we properly accounting for `NeighborArray.rwlock`? [lucene]

2024-07-17 Thread via GitHub
msokolov opened a new issue, #13580: URL: https://github.com/apache/lucene/issues/13580 ### Description Really two issues: 1. `OnHeapHnswGraph.ramBytesUsed` has a complicated job - it's hard to tell whether it takes into account the probably quite significant RAM usage of the `Ree

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
magibney commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233474047 This is looking good. I think making it possible for MMapDirectory to customize/bypass grouping is still important -- something along the lines of what Uwe was thinking in [these](#issu

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233498077 I am fine with everything - BUT: we should still check how the grouping works when for an existing segment an additional `*.del` file is updated (same for softdeletes and docvalues up

Re: [I] Are we properly accounting for `NeighborArray.rwlock`? [lucene]

2024-07-17 Thread via GitHub
msokolov commented on issue #13580: URL: https://github.com/apache/lucene/issues/13580#issuecomment-2233498918 I also found we are acquiring these locks for writing in `HnswGraphBuilder.addDiverseNeighbors` even in the single-threaded case where it is not needed.

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1681197072 ## lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java: ## @@ -252,9 +213,9 @@ public int hashCode() { final int prime = 31; Review Comment:

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233534218 On Windows I got the following test error - actually what I expected (Windows is not allowed to delete files which are still open or mmapped): ``` TestIndexWriter > testDeleteU

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233544810 > I am fine with everything - BUT: we should still check how the grouping works when for an existing segment an additional `*.del` file is updated (same for softdeletes and docvalue

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233547375 It fails reproducibly. Let me look into it :-) Not sure what the issue is (if it is deletes, softdeletes or docvalues updates).

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233549425 > On Windows I got the following test error - actually what I expected (Windows is not allowed to delete files which are still open or mmapped Ha! you're too fast. I'm still s

Re: [I] Expose flat vectors in "user space" [lucene]

2024-07-17 Thread via GitHub
benwtrent commented on issue #13468: URL: https://github.com/apache/lucene/issues/13468#issuecomment-2233567273 Maybe this is done?

Re: [I] Are we properly accounting for `NeighborArray.rwlock`? [lucene]

2024-07-17 Thread via GitHub
benwtrent commented on issue #13580: URL: https://github.com/apache/lucene/issues/13580#issuecomment-2233568004 @msokolov related: https://github.com/apache/lucene/issues/12732

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233584894 > It fails reproducibly. Let me look into it :-) Not sure what the issue is (if it is deletes, softdeletes or docvalues updates). This test failure is "interesting". This test e

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233610476 The test also fails on the main branch, so this is not new. It must have to do with our previous changes. Interestingly, it does not fail on [Policeman Jenkins](https://jenkins.thetaphi.de/j

Re: [I] TestSnapshotDeletionPolicy#testMultiThreadedSnapshotting assertion failure [lucene]

2024-07-17 Thread via GitHub
aoli-al commented on issue #13571: URL: https://github.com/apache/lucene/issues/13571#issuecomment-2233656919 Hi @jpountz and @benwtrent, I saw your previous discussion about the test failure. Do you think this patch will help you to replay the failure? Also, let me know if you have any que

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233693234 Hi, this failure is unrelated. The difference is: Locally I use Windows 11, Jenkins uses Windows 10. Actually it looks like on Windows 11 it is in fact possible to delete memory mappe

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
mikemccand commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1681230468 ## lucene/core/src/java/org/apache/lucene/search/CollectorOwner.java: ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [I] Are we properly accounting for `NeighborArray.rwlock`? [lucene]

2024-07-17 Thread via GitHub
msokolov commented on issue #13580: URL: https://github.com/apache/lucene/issues/13580#issuecomment-2233756360 ooh thanks @benwtrent I had forgotten about that. I'm working up a new version based on (3) above that I hope will reduce this usage and only require it for concurrent mergers.

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2233938475 So somebody who has more knowledge about how the filenames of deletion and/or softdeletes/docvalues instant update files are generated should help us. I think deletion files s

Re: [I] Expose flat vectors in "user space" [lucene]

2024-07-17 Thread via GitHub
msokolov commented on issue #13468: URL: https://github.com/apache/lucene/issues/13468#issuecomment-2233960860 Yes, thanks - I'll resolve

Re: [I] Expose flat vectors in "user space" [lucene]

2024-07-17 Thread via GitHub
msokolov closed issue #13468: Expose flat vectors in "user space" URL: https://github.com/apache/lucene/issues/13468

Re: [I] TestSnapshotDeletionPolicy#testMultiThreadedSnapshotting assertion failure [lucene]

2024-07-17 Thread via GitHub
benwtrent commented on issue #13571: URL: https://github.com/apache/lucene/issues/13571#issuecomment-2233982449 This is indeed interesting @aoli-al I will have to dig deeper to see. This part of the codebase is...opaque to say the least :/

Re: [PR] Gradle build: cleanup of dependency resolution and consolidation of dependency versions [lucene]

2024-07-17 Thread via GitHub
dsmiley commented on code in PR #13484: URL: https://github.com/apache/lucene/pull/13484#discussion_r1681635416 ## versions.lock: ## Review Comment: Overall; thanks for doing this @dweiss ! What script generates versions.lock? The "because" sections look mysterious.

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2234125875 I added a grouping function to the API - see [4591f7f](https://github.com/apache/lucene/pull/13570/commits/4591f7f904f0f76d4ef3fa5b4accee184183ddf1)

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
ChrisHegarty commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2234130516 > So somebody who has more knowledge about how the filenames of deletion and/or softdeletes/docvalues instant update files are generated should help us. > > I think deletion

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1681693916 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/facet/SandboxFacetTestCase.java: ## @@ -0,0 +1,407 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1681713454 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/facet/SandboxFacetTestCase.java: ## @@ -0,0 +1,407 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
magibney commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1681711345 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -83,6 +86,19 @@ public class MMapDirectory extends FSDirectory { */ public static fin

[PR] HnswLock: access locks via hash and only use for concurrent indexing [lucene]

2024-07-17 Thread via GitHub
msokolov opened a new pull request, #13581: URL: https://github.com/apache/lucene/pull/13581 Addresses https://github.com/apache/lucene/issues/13580 by adding a locking wrapper for OnHeapHnswGraph's NeighborArrays, and supplying this when running concurrent merges. With this: 1. We n

Re: [PR] HnswLock: access locks via hash and only use for concurrent indexing [lucene]

2024-07-17 Thread via GitHub
msokolov commented on PR #13581: URL: https://github.com/apache/lucene/pull/13581#issuecomment-2234263814 Tested with 1M 256-dim docs ... at least it doesn't seem to make anything worse? # M=16,width=50 Tests with 1M 256-d vectors, M=16, beam-width-index=50 | condition | merge t

[PR] Stop bounding outer window. [lucene]

2024-07-17 Thread via GitHub
jpountz opened a new pull request, #13582: URL: https://github.com/apache/lucene/pull/13582 Currently `MaxScoreBulkScorer` requires its "outer" window to be at least `WINDOW_SIZE`. The intuition there was that we should make sure to use the whole range of the bit set that we are usin

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1681768843 ## lucene/core/src/java/org/apache/lucene/search/CollectorOwner.java: ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more +

Re: [PR] Compute facets while collecting [lucene]

2024-07-17 Thread via GitHub
epotyom commented on PR #13568: URL: https://github.com/apache/lucene/pull/13568#issuecomment-2234352412 Thank you for reviewing @mikemccand ! I've made the changes and updated the branch.

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1681841607 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -167,6 +188,7 @@ public MMapDirectory(Path path, LockFactory lockFactory, long maxChunkSi

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-17 Thread via GitHub
uschindler commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1681847294 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -83,6 +86,19 @@ public class MMapDirectory extends FSDirectory { */ public static f

Re: [PR] HnswLock: access locks via hash and only use for concurrent indexing [lucene]

2024-07-17 Thread via GitHub
benwtrent commented on PR #13581: URL: https://github.com/apache/lucene/pull/13581#issuecomment-2234506747 It's weird that the merge threads have no impact on index build time. Did you create a bunch of segments and then force merge to exercise this code path? Or was the buffe

Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

2024-07-17 Thread via GitHub
goankur commented on code in PR #13572: URL: https://github.com/apache/lucene/pull/13572#discussion_r1682009005 ## lucene/core/build.gradle: ## @@ -14,10 +14,43 @@ * See the License for the specific language governing permissions and * limitations under the License. */ +pl

Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

2024-07-17 Thread via GitHub
goankur commented on PR #13572: URL: https://github.com/apache/lucene/pull/13572#issuecomment-2235177271 > Do we even need to use intrinsics? function is so simple that the compiler seems to do the right thing, e.g. use `SDOT` dot product instruction, given the correct flags: > >

Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

2024-07-17 Thread via GitHub
rmuir commented on PR #13572: URL: https://github.com/apache/lucene/pull/13572#issuecomment-2235209531 > With the updated compile flags, the performance of auto-vectorized code is slightly better than explicitly vectorized code (see results). Interesting thing to note is that both C-based i

Re: [PR] Early terminate visit BKD leaf when current value greater than upper point in sorted dim. [lucene]

2024-07-17 Thread via GitHub
vsop-479 commented on PR #12528: URL: https://github.com/apache/lucene/pull/12528#issuecomment-2235220539 > if there are slowdowns due to the extra check for each visited point. Maybe this extra check can be weakened by [Branch predictor ](https://en.wikipedia.org/wiki/Branch_predic

Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

2024-07-17 Thread via GitHub
rmuir commented on code in PR #13572: URL: https://github.com/apache/lucene/pull/13572#discussion_r1682041365 ## lucene/core/build.gradle: ## @@ -14,12 +14,59 @@ * See the License for the specific language governing permissions and * limitations under the License. */ +plug

Re: [PR] Early terminate visit BKD leaf when current value greater than upper point in sorted dim. [lucene]

2024-07-17 Thread via GitHub
vsop-479 commented on PR #12528: URL: https://github.com/apache/lucene/pull/12528#issuecomment-2235705078 > if there are slowdowns due to the extra check for each visited point. Maybe this extra check can be weakened by [Branch predictor](https://en.wikipedia.org/wiki/Branch_predictor