Re: [PR] [KNN] Add comment and remove duplicate code [lucene]

2024-07-24 Thread via GitHub
dungba88 commented on PR #13594: URL: https://github.com/apache/lucene/pull/13594#issuecomment-2249529412 There were some recent commits I need to rebase first as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

Re: [PR] HnswLock: access locks via hash and only use for concurrent indexing [lucene]

2024-07-24 Thread via GitHub
zhaih commented on PR #13581: URL: https://github.com/apache/lucene/pull/13581#issuecomment-2249439265 I have run the benchmark and got: ``` baseline: reindex takes 416602ms Force merge done in: 275695 ms candidate: reindex takes 410387 ms Force merge done in: 278062

Re: [PR] [KNN] Add comment and remove duplicate code [lucene]

2024-07-24 Thread via GitHub
dungba88 commented on PR #13594: URL: https://github.com/apache/lucene/pull/13594#issuecomment-2249206205 I think common utility makes sense. I'll move both createFilterWeights and createBitSet to a utility class.

Re: [PR] Fix testAddDocumentOnDiskFull to handle IllegalStateException from IndexWriter#close [lucene]

2024-07-24 Thread via GitHub
github-actions[bot] commented on PR #13558: URL: https://github.com/apache/lucene/pull/13558#issuecomment-2249103861 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] SparseFixedBitSet#firstDoc: reduce number of `indices` iterations for a bit set that is not fully built yet. [lucene]

2024-07-24 Thread via GitHub
gsmiller commented on PR #13559: URL: https://github.com/apache/lucene/pull/13559#issuecomment-2249072451 > Another idea -- would it help your use case? -- would be to support nextSetBit(start, end) . We could do this without adding any additional tracking in existing SparseBitSet methods.
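
A rough sketch of the bounded lookup suggested above, assuming `java.util.BitSet` as a stand-in for Lucene's sparse bit set. The class and method names (`BoundedBitSetScan`, `nextSetBit(bits, start, end)`) are hypothetical and do not come from the PR. This naive version delegates to the unbounded scan and filters the result; a real implementation would presumably stop scanning once it reaches `end`.

```java
import java.util.BitSet;

/** Hypothetical helper, not part of Lucene: illustrates nextSetBit(start, end) semantics. */
final class BoundedBitSetScan {
  private BoundedBitSetScan() {}

  /**
   * Returns the index of the first set bit in [start, end), or -1 if there is none.
   * Assumption: start is non-negative, as required by BitSet#nextSetBit.
   */
  static int nextSetBit(BitSet bits, int start, int end) {
    if (start >= end) {
      return -1; // empty range
    }
    int next = bits.nextSetBit(start); // unbounded scan; returns -1 when no bit is set
    return (next != -1 && next < end) ? next : -1;
  }
}
```

No extra tracking state is needed for this: the upper bound is only applied to the result of the existing scan.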

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-24 Thread via GitHub
naveentatikonda commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249022240 > @naveentatikonda AH, I see what I did, I pushed one of my experiments to that branch not an actual good change. Sorry for the false alarm. i will correct asap. No w

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-24 Thread via GitHub
naveentatikonda commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249021465 > I just noticed that I might not have pushed up my branch. But I will rerun my tests to verify: > > [main...benwtrent:lucene:fix-8-bit](https://github.com/apache/luc

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-24 Thread via GitHub
benwtrent commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249007016 @naveentatikonda AH, I see what I did, I pushed one of my experiments to that branch not an actual good change. Sorry for the false alarm. i will correct asap.

Re: [PR] Compute facets while collecting [lucene]

2024-07-24 Thread via GitHub
gsmiller commented on PR #13568: URL: https://github.com/apache/lucene/pull/13568#issuecomment-2249005915 I've spent some time wrapping my head around the proposed change but haven't looked at everything in detail yet. I wanted to provide some of my early questions and feedback though to se

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-24 Thread via GitHub
benwtrent commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2248995924 I just noticed that I might not have pushed up my branch. But I will rerun my tests to verify: https://github.com/apache/lucene/compare/main...benwtrent:lucene:fix-8-bit

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-24 Thread via GitHub
naveentatikonda commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2248954471 @benwtrent I ran some tests with changes in your branch for 8 bits and the recall for L2 is almost the same as what you got. But, recall for innerproduct and cosinesimilarity sp

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
dsmiley commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1690396863 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory { */ public static fina

Re: [PR] Save allocating some zero length byte arrays [lucene]

2024-07-24 Thread via GitHub
original-brownbear commented on PR #13608: URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248806106 Thanks Uwe!

Re: [PR] Save allocating some zero length byte arrays [lucene]

2024-07-24 Thread via GitHub
uschindler commented on PR #13608: URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248795826 Backported.

Re: [PR] Save allocating some zero length byte arrays [lucene]

2024-07-24 Thread via GitHub
uschindler merged PR #13608: URL: https://github.com/apache/lucene/pull/13608

Re: [PR] Save allocating some zero length byte arrays [lucene]

2024-07-24 Thread via GitHub
uschindler commented on PR #13608: URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248788504 Sorry I forgot the changes text!

Re: [PR] Save allocating some zero length byte arrays [lucene]

2024-07-24 Thread via GitHub
uschindler commented on PR #13608: URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248785078 > Something I found in an ES heap dump. For large numbers of `FieldReader` where the minimum term is an empty string, we allocate MBs worth of empty `byte[]` for larger nodes. Worth a

Re: [PR] Add WrappedCandidateMatcher for composing matchers [lucene]

2024-07-24 Thread via GitHub
romseygeek commented on PR #13109: URL: https://github.com/apache/lucene/pull/13109#issuecomment-2248744950 Hi @bjacobowitz, thanks for the detailed update! I think this would be easier to reason about if we had some concrete examples. Do you think you could post some code of composite ma

[PR] Save allocating some zero length byte arrays [lucene]

2024-07-24 Thread via GitHub
original-brownbear opened a new pull request, #13608: URL: https://github.com/apache/lucene/pull/13608 Something I found in an ES heap dump. For large numbers of `FieldReader` where the minimum term is an empty string, we allocate MBs worth of empty `byte[]` for larger nodes. Worth adding t
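
A minimal sketch of the general idea, under the assumption that the fix amounts to reusing one shared zero-length array instead of allocating a new one each time. The names `EmptyBytes`, `EMPTY`, and `newBytes` are hypothetical and are not taken from the PR. Sharing is safe because a zero-length array has no elements that could be modified.

```java
/** Hypothetical helper, not Lucene's actual code: avoids allocating many empty byte[] instances. */
final class EmptyBytes {
  /** Single shared zero-length array; it has no contents, so it can never be mutated. */
  static final byte[] EMPTY = new byte[0];

  private EmptyBytes() {}

  /** Returns the shared instance for length 0 and a fresh array otherwise. */
  static byte[] newBytes(int length) {
    return length == 0 ? EMPTY : new byte[length];
  }
}
```

With many readers whose minimum term is empty, every call with length 0 returns the same instance, so the per-reader allocation disappears.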

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
uschindler commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1690264711 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory { */ public static f

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
dsmiley commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1690256783 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory { */ public static fina

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-24 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1690239008 ## lucene/core/src/test/org/apache/lucene/search/TestSortRandom.java: ## @@ -119,7 +119,8 @@ private void testRandomStringSort(SortField.Type type) throws Exception {

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-24 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1690241865 ## lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java: ## @@ -362,6 +362,9 @@ public long cost() { final IntersectVisitor visitor = getInt

Re: [PR] [WIP] Multi-Vector support for HNSW search [lucene]

2024-07-24 Thread via GitHub
vigyasharma commented on PR #13525: URL: https://github.com/apache/lucene/pull/13525#issuecomment-2248625418 I started adding support for ParentJoin benchmarks ([issue](https://github.com/mikemccand/luceneutil/issues/284)). Will raise it in multiple small PRs, here's the [first one](https:

Re: [PR] [9.x] Do not randomize readOnce contexts in tests [lucene]

2024-07-24 Thread via GitHub
ChrisHegarty merged PR #13607: URL: https://github.com/apache/lucene/pull/13607

Re: [PR] Add WrappedCandidateMatcher for composing matchers [lucene]

2024-07-24 Thread via GitHub
bjacobowitz commented on PR #13109: URL: https://github.com/apache/lucene/pull/13109#issuecomment-2248546608 @romseygeek I'm wondering if maybe we should make those functions `protected final` as you suggest, but also make some of the `CandidateMatcher` implementations public. Right

Re: [PR] Feature/vector io prefetch [lucene]

2024-07-24 Thread via GitHub
benwtrent commented on PR #13586: URL: https://github.com/apache/lucene/pull/13586#issuecomment-2248470540 @jpountz I built an index with ~1M CohereV3 floating point vectors (this requires ~4GB of RAM), force merged into a single segment, and benchmarked on `e2-medium` (4GB of RAM) wi

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
uschindler commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2248449684 > Otherwise, I plan to merge tomorrow. And then figure out how to backport! Code duplication with Arena vs. Session hell!

Re: [PR] KMeans clustering algorithm [lucene]

2024-07-24 Thread via GitHub
john-wagster commented on code in PR #13604: URL: https://github.com/apache/lucene/pull/13604#discussion_r1690004133 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/codecs/quantization/TestKMeans.java: ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Feature/vector io prefetch [lucene]

2024-07-24 Thread via GitHub
jpountz commented on PR #13586: URL: https://github.com/apache/lucene/pull/13586#issuecomment-2248069867 Thanks @benwtrent, not very enlightening indeed. I wonder what benchmark you ran in case I can reproduce it and play with it?

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
ChrisHegarty commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1689827722 ## lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java: ## @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] KMeans clustering algorithm [lucene]

2024-07-24 Thread via GitHub
tteofili commented on code in PR #13604: URL: https://github.com/apache/lucene/pull/13604#discussion_r1689826703 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java: ## @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
magibney commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1689819985 ## lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java: ## @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] KMeans clustering algorithm [lucene]

2024-07-24 Thread via GitHub
mayya-sharipova commented on code in PR #13604: URL: https://github.com/apache/lucene/pull/13604#discussion_r1689813774 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java: ## @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] KMeans clustering algorithm [lucene]

2024-07-24 Thread via GitHub
mayya-sharipova commented on code in PR #13604: URL: https://github.com/apache/lucene/pull/13604#discussion_r1689812957 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java: ## @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
ChrisHegarty commented on PR #13570: URL: https://github.com/apache/lucene/pull/13570#issuecomment-2247914051 This looks like it's in good shape. @magibney Any final comments? Otherwise, I plan to merge tomorrow. And then figure out how to backport!

[PR] [9.x] Do not randomize readOnce contexts in tests [lucene]

2024-07-24 Thread via GitHub
ChrisHegarty opened a new pull request, #13607: URL: https://github.com/apache/lucene/pull/13607 This is a follow on to #13578, where the backport generalised the test check to the `readOnce` value of the context, rather than the `READONCE` singleton. The randomisation should be updated too

Re: [PR] Further reduce the search concurrency overhead. [lucene]

2024-07-24 Thread via GitHub
jpountz merged PR #13606: URL: https://github.com/apache/lucene/pull/13606

Re: [PR] KMeans clustering algorithm [lucene]

2024-07-24 Thread via GitHub
tteofili commented on code in PR #13604: URL: https://github.com/apache/lucene/pull/13604#discussion_r1689745724 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java: ## @@ -0,0 +1,346 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] KMeans clustering algorithm [lucene]

2024-07-24 Thread via GitHub
benwtrent commented on code in PR #13604: URL: https://github.com/apache/lucene/pull/13604#discussion_r1689640380 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java: ## @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] KMeans clustering algorithm [lucene]

2024-07-24 Thread via GitHub
mayya-sharipova commented on PR #13604: URL: https://github.com/apache/lucene/pull/13604#issuecomment-2247694408 @mikemccand Here are some numbers on my mac M3: Doing 34 clusters with defaults (5 restarts, 10 iters each) on vectors of 1024 dims: | N docs | Performance in secon

Re: [PR] Add timeout support to AbstractVectorSimilarityQuery [lucene]

2024-07-24 Thread via GitHub
dungba88 commented on code in PR #13285: URL: https://github.com/apache/lucene/pull/13285#discussion_r1689608050 ## lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java: ## @@ -143,27 +156,23 @@ protected boolean match(int doc) { }

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
uschindler commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1689595022 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -83,6 +94,38 @@ public class MMapDirectory extends FSDirectory { */ public static f

Re: [PR] Add timeout support to AbstractVectorSimilarityQuery [lucene]

2024-07-24 Thread via GitHub
kaivalnp commented on code in PR #13285: URL: https://github.com/apache/lucene/pull/13285#discussion_r1689584585 ## lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java: ## @@ -103,16 +114,18 @@ public Explanation explain(LeafReaderContext context, in

Re: [PR] Update TestTopDocsMerge to not rely on search(Query, Collector) [lucene]

2024-07-24 Thread via GitHub
javanna merged PR #13601: URL: https://github.com/apache/lucene/pull/13601

Re: [PR] Update TestTopDocsCollector to no longer rely on search(Query, Collector) [lucene]

2024-07-24 Thread via GitHub
javanna merged PR #13600: URL: https://github.com/apache/lucene/pull/13600

Re: [PR] Add timeout support to AbstractVectorSimilarityQuery [lucene]

2024-07-24 Thread via GitHub
kaivalnp commented on code in PR #13285: URL: https://github.com/apache/lucene/pull/13285#discussion_r1689579251 ## lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java: ## @@ -143,27 +156,23 @@ protected boolean match(int doc) { }

Re: [PR] [DRAFT] Load vector data directly from the memory segment [lucene]

2024-07-24 Thread via GitHub
ChrisHegarty closed pull request #12703: [DRAFT] Load vector data directly from the memory segment URL: https://github.com/apache/lucene/pull/12703

Re: [PR] Inline skip data into postings lists [lucene]

2024-07-24 Thread via GitHub
jpountz commented on code in PR #13585: URL: https://github.com/apache/lucene/pull/13585#discussion_r1689561156 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java: ## @@ -0,0 +1,1998 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Inline skip data into postings lists [lucene]

2024-07-24 Thread via GitHub
mikemccand commented on code in PR #13585: URL: https://github.com/apache/lucene/pull/13585#discussion_r1689557959 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java: ## @@ -0,0 +1,1998 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
ChrisHegarty commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1689529488 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInputProvider.java: ## @@ -125,4 +135,77 @@ private final MemorySegment[] map( } ret

Re: [PR] Aggregate files from the same segment into a single Arena [lucene]

2024-07-24 Thread via GitHub
ChrisHegarty commented on code in PR #13570: URL: https://github.com/apache/lucene/pull/13570#discussion_r1689527370 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -83,6 +93,26 @@ public class MMapDirectory extends FSDirectory { */ public static

Re: [PR] Compute facets while collecting [lucene]

2024-07-24 Thread via GitHub
epotyom commented on PR #13568: URL: https://github.com/apache/lucene/pull/13568#issuecomment-2247513924 > I checked the new commits. Looks good! Thank you for the feedback @stefanvodita ! > A few points: > 1. Can you add CHANGES entries, please? I've added CHANGES.txt

Re: [PR] Further reduce the search concurrency overhead. [lucene]

2024-07-24 Thread via GitHub
original-brownbear commented on PR #13606: URL: https://github.com/apache/lucene/pull/13606#issuecomment-2247457206 LGTM :)

Re: [PR] Inline skip data into postings lists [lucene]

2024-07-24 Thread via GitHub
jpountz commented on code in PR #13585: URL: https://github.com/apache/lucene/pull/13585#discussion_r1689502270 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java: ## @@ -0,0 +1,1998 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] Inline skip data into postings lists [lucene]

2024-07-24 Thread via GitHub
mikemccand commented on code in PR #13585: URL: https://github.com/apache/lucene/pull/13585#discussion_r1689493102 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java: ## @@ -0,0 +1,1998 @@ +/* + * Licensed to the Apache Software Foundation (A

[PR] Further reduce the search concurrency overhead. [lucene]

2024-07-24 Thread via GitHub
jpountz opened a new pull request, #13606: URL: https://github.com/apache/lucene/pull/13606 This iterates on #13546 to further reduce the overhead of search concurrency by caching whether the hit count threshold has been reached: once the threshold has been reached, it cannot get "un-reache
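
The caching idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the class `CachedThresholdChecker`, its fields, and its method names are hypothetical. It relies only on the property stated in the description, namely that once the threshold has been reached, that fact cannot change, so each collector can remember it locally and skip the shared counter.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch, not Lucene's actual class: caches a monotonic "threshold reached" flag. */
final class CachedThresholdChecker {
  private final AtomicLong globalHitCount; // shared by all collectors, contended under concurrency
  private final long threshold;
  private boolean thresholdReached; // local cache; only ever flips from false to true

  CachedThresholdChecker(AtomicLong globalHitCount, long threshold) {
    this.globalHitCount = globalHitCount;
    this.threshold = threshold;
  }

  /** Counts a hit; once the threshold is known to be reached, the shared counter is left alone. */
  void incrementHitCount() {
    if (!thresholdReached) {
      globalHitCount.incrementAndGet();
    }
  }

  /** Reads the shared counter only until the threshold is first observed as exceeded. */
  boolean isThresholdReached() {
    if (thresholdReached) {
      return true; // cached answer, no shared-memory read needed
    }
    thresholdReached = globalHitCount.get() > threshold;
    return thresholdReached;
  }
}
```

Each instance is meant to be used by a single thread, which is why the cached flag is a plain field rather than a volatile one; only the counter is shared.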

Re: [PR] Compute facets while collecting [lucene]

2024-07-24 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1689382546 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ordinal_iterators/CandidateSetOrdinalIterator.java: ## @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Softw

Re: [PR] Compute facets while collecting [lucene]

2024-07-24 Thread via GitHub
epotyom commented on code in PR #13568: URL: https://github.com/apache/lucene/pull/13568#discussion_r1689378450 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/misc/LongValueFacetCutter.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF

Re: [PR] Add reopen method in PerThreadPKLookup [lucene]

2024-07-24 Thread via GitHub
vsop-479 commented on PR #13596: URL: https://github.com/apache/lucene/pull/13596#issuecomment-2247188517 @jpountz Please take a look when you get a chance.

Re: [PR] [KNN] Add comment and remove duplicate code [lucene]

2024-07-24 Thread via GitHub
kaivalnp commented on PR #13594: URL: https://github.com/apache/lucene/pull/13594#issuecomment-2247150383 +1 to share as much logic as possible (including `createFilterWeight`). The `FieldExistsQuery` proposal (to only collect pre-filtered docs which have vectors) seems promising too

Re: [PR] Add timeout support to AbstractVectorSimilarityQuery [lucene]

2024-07-24 Thread via GitHub
dungba88 commented on code in PR #13285: URL: https://github.com/apache/lucene/pull/13285#discussion_r1689296027 ## lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java: ## @@ -103,16 +114,18 @@ public Explanation explain(LeafReaderContext context, in

[PR] Bump the window size of disjunction from 2,048 to 4,096. [lucene]

2024-07-24 Thread via GitHub
jpountz opened a new pull request, #13605: URL: https://github.com/apache/lucene/pull/13605 It's been pointed out multiple times that a difference between Tantivy and Lucene is the fact that Tantivy uses windows of 4,096 docs when Lucene has a 2x smaller window size of 2,048 docs and that this