jpountz opened a new pull request, #13605:
URL: https://github.com/apache/lucene/pull/13605
It's been pointed out multiple times that a difference between Tantivy and
Lucene is the fact that Tantivy uses windows of 4,096 docs when Lucene has a 2x
smaller window size of 2,048 docs and that this
dungba88 commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689296027
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -103,16 +114,18 @@ public Explanation explain(LeafReaderContext context, in
kaivalnp commented on PR #13594:
URL: https://github.com/apache/lucene/pull/13594#issuecomment-2247150383
+1 to share as much logic as possible (including `createFilterWeight`). The
`FieldExistsQuery` proposal (to only collect pre-filtered docs which have
vectors) seems promising too
vsop-479 commented on PR #13596:
URL: https://github.com/apache/lucene/pull/13596#issuecomment-2247188517
@jpountz
Please take a look when you get a chance.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
U
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1689378450
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/misc/LongValueFacetCutter.java:
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1689382546
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ordinal_iterators/CandidateSetOrdinalIterator.java:
##
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Softw
jpountz opened a new pull request, #13606:
URL: https://github.com/apache/lucene/pull/13606
This iterates on #13546 to further reduce the overhead of search concurrency
by caching whether the hit count threshold has been reached: once the threshold
has been reached, it cannot get "un-reache
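The caching idea described in this PR can be illustrated with a minimal sketch. This is not the code from #13606: the class name `CachedThresholdChecker`, its fields, and the use of `LongAdder` are assumptions made for illustration only. The point it shows is that a hit count only grows, so once the threshold is crossed the answer can be cached and the shared counter no longer needs to be read.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; names and structure are assumptions, not Lucene's API. */
final class CachedThresholdChecker {
  // Counter shared by all search threads (assumed to be the costly part to read).
  private final LongAdder globalHitCount = new LongAdder();
  private final long threshold;
  // Monotonic flag: once set to true it is never reset, because hit counts only grow.
  private volatile boolean thresholdReached;

  CachedThresholdChecker(long threshold) {
    this.threshold = threshold;
  }

  void incrementHitCount() {
    // After the threshold is reached, further counting cannot change the answer.
    if (!thresholdReached) {
      globalHitCount.increment();
    }
  }

  boolean isThresholdReached() {
    if (thresholdReached) {
      return true; // cheap path: no read of the shared counter
    }
    if (globalHitCount.sum() > threshold) {
      thresholdReached = true; // cache the result for every later call
      return true;
    }
    return false;
  }
}
```

In this sketch the only cost on the hot path after the threshold is crossed is one volatile read, which is the kind of saving the description above appears to be after.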
mikemccand commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689493102
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (A
jpountz commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689502270
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
original-brownbear commented on PR #13606:
URL: https://github.com/apache/lucene/pull/13606#issuecomment-2247457206
LGTM :)
epotyom commented on PR #13568:
URL: https://github.com/apache/lucene/pull/13568#issuecomment-2247513924
> I checked the new commits. Looks good!
Thank you for the feedback @stefanvodita !
> A few points:
> 1. Can you add CHANGES entries, please?
I've added CHANGES.txt
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689527370
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +93,26 @@ public class MMapDirectory extends FSDirectory {
*/
public static
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689529488
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInputProvider.java:
##
@@ -125,4 +135,77 @@ private final MemorySegment[] map(
}
ret
mikemccand commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689557959
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (A
jpountz commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689561156
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
ChrisHegarty closed pull request #12703: [DRAFT] Load vector data directly from
the memory segment
URL: https://github.com/apache/lucene/pull/12703
kaivalnp commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689579251
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -143,27 +156,23 @@ protected boolean match(int doc) {
}
javanna merged PR #13600:
URL: https://github.com/apache/lucene/pull/13600
javanna merged PR #13601:
URL: https://github.com/apache/lucene/pull/13601
kaivalnp commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689584585
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -103,16 +114,18 @@ public Explanation explain(LeafReaderContext context, in
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689595022
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,38 @@ public class MMapDirectory extends FSDirectory {
*/
public static f
dungba88 commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689608050
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -143,27 +156,23 @@ protected boolean match(int doc) {
}
mayya-sharipova commented on PR #13604:
URL: https://github.com/apache/lucene/pull/13604#issuecomment-2247694408
@mikemccand Here are some numbers on my mac M3:
Doing 34 clusters with defaults (5 restarts, 10 iters each) on vectors of
1024 dims:
| N docs | Performance in secon
benwtrent commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689640380
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
tteofili commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689745724
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) un
jpountz merged PR #13606:
URL: https://github.com/apache/lucene/pull/13606
ChrisHegarty opened a new pull request, #13607:
URL: https://github.com/apache/lucene/pull/13607
This is a follow on to #13578, where the backport generalised the test check
to the `readOnce` value of the context, rather than the `READONCE` singleton.
The randomisation should be updated too
ChrisHegarty commented on PR #13570:
URL: https://github.com/apache/lucene/pull/13570#issuecomment-2247914051
This looks like it's in good shape. @magibney Any final comments? Otherwise,
I plan to merge tomorrow. And then figure out how to backport!
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689812957
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689813774
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (
magibney commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689819985
##
lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java:
##
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
tteofili commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689826703
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) un
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689827722
##
lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java:
##
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
jpountz commented on PR #13586:
URL: https://github.com/apache/lucene/pull/13586#issuecomment-2248069867
Thanks @benwtrent, not very enlightening indeed. I wonder what benchmark you
ran in case I can reproduce it and play with it?
john-wagster commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1690004133
##
lucene/sandbox/src/test/org/apache/lucene/sandbox/codecs/quantization/TestKMeans.java:
##
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (
uschindler commented on PR #13570:
URL: https://github.com/apache/lucene/pull/13570#issuecomment-2248449684
> Otherwise, I plan to merge tomorrow. And then figure out how to backport!
Code duplication with Arena vs. Session hell!
benwtrent commented on PR #13586:
URL: https://github.com/apache/lucene/pull/13586#issuecomment-2248470540
@jpountz I built an index with ~1M CohereV3 floating point vectors (this
requires about 4GB of RAM), force merged into a single segment, and
benchmarked on `e2-medium` (4GB of RAM) wi
bjacobowitz commented on PR #13109:
URL: https://github.com/apache/lucene/pull/13109#issuecomment-2248546608
@romseygeek I'm wondering if maybe we should make those functions `protected
final` as you suggest, but also make some of the `CandidateMatcher`
implementations public.
Right
ChrisHegarty merged PR #13607:
URL: https://github.com/apache/lucene/pull/13607
vigyasharma commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2248625418
I started adding support for ParentJoin benchmarks
([issue](https://github.com/mikemccand/luceneutil/issues/284)). Will raise it
in multiple small PRs, here's the [first
one](https:
javanna commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1690241865
##
lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java:
##
@@ -362,6 +362,9 @@ public long cost() {
final IntersectVisitor visitor = getInt
javanna commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1690239008
##
lucene/core/src/test/org/apache/lucene/search/TestSortRandom.java:
##
@@ -119,7 +119,8 @@ private void testRandomStringSort(SortField.Type type)
throws Exception {
dsmiley commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1690256783
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory {
*/
public static fina
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1690264711
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory {
*/
public static f
original-brownbear opened a new pull request, #13608:
URL: https://github.com/apache/lucene/pull/13608
Something I found in an ES heap dump. For large numbers of `FieldReader`
where the minimum term is an empty string, we allocate MBs worth of empty
`byte[]` for larger nodes. Worth adding t
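A minimal sketch of the general remedy for this kind of waste, assuming the fix is to hand out one shared zero-length array instead of allocating a new one each time. The class `EmptyArraySharing`, the constant `EMPTY_BYTES`, and the method `copyOf` are hypothetical names, not taken from the PR.

```java
/** Hypothetical sketch; not the actual change made in the PR. */
final class EmptyArraySharing {
  // A zero-length array has no mutable contents, so one instance can be shared safely.
  static final byte[] EMPTY_BYTES = new byte[0];

  /** Copies {@code length} bytes from {@code src}, reusing the shared array when empty. */
  static byte[] copyOf(byte[] src, int offset, int length) {
    if (length == 0) {
      return EMPTY_BYTES; // no allocation for empty values
    }
    byte[] copy = new byte[length];
    System.arraycopy(src, offset, copy, 0, length);
    return copy;
  }
}
```

With many readers whose minimum term is empty, every call in this sketch returns the same instance, so the per-reader cost drops to a single reference.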
romseygeek commented on PR #13109:
URL: https://github.com/apache/lucene/pull/13109#issuecomment-2248744950
Hi @bjacobowitz, thanks for the detailed update! I think this would be
easier to reason about if we had some concrete examples. Do you think you
could post some code of composite ma
uschindler commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248785078
> Something I found in an ES heap dump. For large numbers of `FieldReader`
where the minimum term is an empty string, we allocate MBs worth of empty
`byte[]` for larger nodes. Worth a
uschindler commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248788504
Sorry, I forgot the changes text!
uschindler merged PR #13608:
URL: https://github.com/apache/lucene/pull/13608
uschindler commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248795826
Backported.
original-brownbear commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248806106
Thanks Uwe!
dsmiley commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1690396863
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory {
*/
public static fina
naveentatikonda commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2248954471
@benwtrent I ran some tests with changes in your branch for 8 bits and the
recall for L2 is almost the same as what you got. But, recall for innerproduct and
cosinesimilarity sp
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2248995924
I just noticed that I might not have pushed up my branch. But I will rerun
my tests to verify:
https://github.com/apache/lucene/compare/main...benwtrent:lucene:fix-8-bit
gsmiller commented on PR #13568:
URL: https://github.com/apache/lucene/pull/13568#issuecomment-2249005915
I've spent some time wrapping my head around the proposed change but haven't
looked at everything in detail yet. I wanted to provide some of my early
questions and feedback though to se
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249007016
@naveentatikonda Ah, I see what I did: I pushed one of my experiments to
that branch, not an actual good change. Sorry for the false alarm. I will
correct ASAP.
naveentatikonda commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249021465
> I just noticed that I might not have pushed up my branch. But I will rerun
my tests to verify:
>
>
[main...benwtrent:lucene:fix-8-bit](https://github.com/apache/luc
naveentatikonda commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249022240
> @naveentatikonda AH, I see what I did, I pushed one of my experiments to
that branch not an actual good change. Sorry for the false alarm. i will
correct asap.
No w
gsmiller commented on PR #13559:
URL: https://github.com/apache/lucene/pull/13559#issuecomment-2249072451
> Another idea -- would it help your use case? -- would be to support
`nextSetBit(start, end)`. We could do this without adding any additional
tracking in existing SparseBitSet methods.
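To make the proposed semantics concrete, here is a small sketch of a bounded `nextSetBit(start, end)`. It uses `java.util.BitSet` as a stand-in rather than Lucene's bit set classes, and the exclusive upper bound and the `NO_MORE` sentinel are assumptions, since the comment does not spell them out.

```java
import java.util.BitSet;

/** Hypothetical sketch on java.util.BitSet; not Lucene's implementation. */
final class BoundedBitSetScan {
  // Sentinel returned when no bit is set in the range (an assumed convention).
  static final int NO_MORE = -1;

  /** Returns the first set bit in [start, end), or NO_MORE if there is none. */
  static int nextSetBit(BitSet bits, int start, int end) {
    int next = bits.nextSetBit(start);
    // BitSet.nextSetBit returns -1 when no further bit is set.
    return (next >= 0 && next < end) ? next : NO_MORE;
  }
}
```

This wrapper only filters the result. A native implementation could also stop scanning at `end`, which is presumably where the benefit for a sparse bit set would come from.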
github-actions[bot] commented on PR #13558:
URL: https://github.com/apache/lucene/pull/13558#issuecomment-2249103861
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
dungba88 commented on PR #13594:
URL: https://github.com/apache/lucene/pull/13594#issuecomment-2249206205
I think a common utility makes sense. I'll move both createFilterWeights and
createBitSet to a utility class.
zhaih commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2249439265
I have run the benchmark and got:
```
baseline:
reindex takes 416602ms
Force merge done in: 275695 ms
candidate:
reindex takes 410387 ms
Force merge done in: 278062
dungba88 commented on PR #13594:
URL: https://github.com/apache/lucene/pull/13594#issuecomment-2249529412
There were some recent commits I need to rebase first as well.