dungba88 commented on PR #13594:
URL: https://github.com/apache/lucene/pull/13594#issuecomment-2249529412
There were some recent commits I need to rebase first as well.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use t
zhaih commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2249439265
I have run the benchmark and got:
```
baseline:
reindex takes 416602 ms
Force merge done in: 275695 ms
candidate:
reindex takes 410387 ms
Force merge done in: 278062
```
dungba88 commented on PR #13594:
URL: https://github.com/apache/lucene/pull/13594#issuecomment-2249206205
I think a common utility makes sense. I'll move both createFilterWeights and
createBitSet to a utility class.
github-actions[bot] commented on PR #13558:
URL: https://github.com/apache/lucene/pull/13558#issuecomment-2249103861
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
gsmiller commented on PR #13559:
URL: https://github.com/apache/lucene/pull/13559#issuecomment-2249072451
> Another idea -- would it help your use case? -- would be to support
`nextSetBit(start, end)`. We could do this without adding any additional
tracking in existing SparseBitSet methods.
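To make the quoted idea concrete, here is a minimal sketch of a range-bounded `nextSetBit(start, end)`. It is written against `java.util.BitSet` rather than any Lucene class, and the class name `BoundedBitScan`, the `NO_MORE_BITS` sentinel, and the choice of an exclusive `end` are all assumptions made for illustration, not the API discussed in the PR.

```java
import java.util.BitSet;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class BoundedBitScan {
  /** Sentinel returned when no bit is set inside the requested range (assumed). */
  static final int NO_MORE_BITS = -1;

  /**
   * Returns the index of the first set bit in [start, end), or NO_MORE_BITS
   * if the range contains no set bit. The exclusive upper bound is an assumption.
   */
  static int nextSetBit(BitSet bits, int start, int end) {
    if (start < 0 || end < start) {
      throw new IllegalArgumentException("invalid range [" + start + ", " + end + ")");
    }
    // BitSet.nextSetBit returns -1 when no bit is set at or after start.
    int next = bits.nextSetBit(start);
    return (next >= 0 && next < end) ? next : NO_MORE_BITS;
  }
}
```

This wrapper may still scan past `end` before giving up. A native implementation inside a sparse bit set could presumably stop at the block containing `end`, which is likely where the benefit would come from.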
naveentatikonda commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249022240
> @naveentatikonda Ah, I see what I did: I pushed one of my experiments to
that branch, not an actual good change. Sorry for the false alarm. I will
correct ASAP.
No w
naveentatikonda commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249021465
> I just noticed that I might not have pushed up my branch. But I will rerun
my tests to verify:
>
>
[main...benwtrent:lucene:fix-8-bit](https://github.com/apache/luc
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2249007016
@naveentatikonda Ah, I see what I did: I pushed one of my experiments to
that branch, not an actual good change. Sorry for the false alarm. I will
correct ASAP.
gsmiller commented on PR #13568:
URL: https://github.com/apache/lucene/pull/13568#issuecomment-2249005915
I've spent some time wrapping my head around the proposed change but haven't
looked at everything in detail yet. I wanted to provide some of my early
questions and feedback though to se
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2248995924
I just noticed that I might not have pushed up my branch. But I will rerun
my tests to verify:
https://github.com/apache/lucene/compare/main...benwtrent:lucene:fix-8-bit
naveentatikonda commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2248954471
@benwtrent I ran some tests with the changes in your branch for 8 bits and the
recall for L2 is almost the same as what you got. But, recall for innerproduct and
cosinesimilarity sp
dsmiley commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1690396863
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory {
*/
public static fina
original-brownbear commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248806106
Thanks Uwe!
uschindler commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248795826
Backported.
uschindler merged PR #13608:
URL: https://github.com/apache/lucene/pull/13608
uschindler commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248788504
Sorry, I forgot the changes text!
uschindler commented on PR #13608:
URL: https://github.com/apache/lucene/pull/13608#issuecomment-2248785078
> Something I found in an ES heap dump. For large numbers of `FieldReader`
where the minimum term is an empty string, we allocate MBs worth of empty
`byte[]` for larger nodes. Worth a
romseygeek commented on PR #13109:
URL: https://github.com/apache/lucene/pull/13109#issuecomment-2248744950
Hi @bjacobowitz, thanks for the detailed update! I think this would be
easier to reason about if we had some concrete examples. Do you think you
could post some code of composite ma
original-brownbear opened a new pull request, #13608:
URL: https://github.com/apache/lucene/pull/13608
Something I found in an ES heap dump. For large numbers of `FieldReader`
where the minimum term is an empty string, we allocate MBs worth of empty
`byte[]` for larger nodes. Worth adding t
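The waste described here, many distinct zero-length `byte[]` instances, is commonly avoided by handing out one shared empty array. The sketch below illustrates that general technique only; the class `EmptyBytes` and the method `copyOrShare` are hypothetical names, and this is not claimed to be what the PR itself changes.

```java
/** Hypothetical helper for illustration only; not taken from the PR. */
final class EmptyBytes {
  /** Single shared zero-length array; safe to share because it has no mutable contents. */
  static final byte[] EMPTY = new byte[0];

  /**
   * Copies length bytes from src starting at offset, but returns the shared
   * EMPTY instance instead of allocating when length is zero.
   */
  static byte[] copyOrShare(byte[] src, int offset, int length) {
    if (length == 0) {
      return EMPTY; // no per-call allocation for empty terms
    }
    byte[] copy = new byte[length];
    System.arraycopy(src, offset, copy, 0, length);
    return copy;
  }
}
```

With this shape, any number of callers whose minimum term is empty hold a reference to the same object instead of each owning a separate array header.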
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1690264711
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory {
*/
public static f
dsmiley commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1690256783
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,41 @@ public class MMapDirectory extends FSDirectory {
*/
public static fina
javanna commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1690239008
##
lucene/core/src/test/org/apache/lucene/search/TestSortRandom.java:
##
@@ -119,7 +119,8 @@ private void testRandomStringSort(SortField.Type type)
throws Exception {
javanna commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1690241865
##
lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java:
##
@@ -362,6 +362,9 @@ public long cost() {
final IntersectVisitor visitor = getInt
vigyasharma commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2248625418
I started adding support for ParentJoin benchmarks
([issue](https://github.com/mikemccand/luceneutil/issues/284)). Will raise it
in multiple small PRs, here's the [first
one](https:
ChrisHegarty merged PR #13607:
URL: https://github.com/apache/lucene/pull/13607
bjacobowitz commented on PR #13109:
URL: https://github.com/apache/lucene/pull/13109#issuecomment-2248546608
@romseygeek I'm wondering if maybe we should make those functions `protected
final` as you suggest, but also make some of the `CandidateMatcher`
implementations public.
Right
benwtrent commented on PR #13586:
URL: https://github.com/apache/lucene/pull/13586#issuecomment-2248470540
@jpountz I built an index with ~1M CohereV3 floating point vectors (this
requires about 4GB of RAM), force merged into a single segment, and
benchmarked on `e2-medium` (4GB of RAM) wi
uschindler commented on PR #13570:
URL: https://github.com/apache/lucene/pull/13570#issuecomment-2248449684
> Otherwise, I plan to merge tomorrow. And then figure out how to backport!
Code duplication with Arena vs. Session hell!
john-wagster commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1690004133
##
lucene/sandbox/src/test/org/apache/lucene/sandbox/codecs/quantization/TestKMeans.java:
##
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (
jpountz commented on PR #13586:
URL: https://github.com/apache/lucene/pull/13586#issuecomment-2248069867
Thanks @benwtrent, not very enlightening indeed. I wonder what benchmark you
ran in case I can reproduce it and play with it?
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689827722
##
lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java:
##
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
tteofili commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689826703
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) un
magibney commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689819985
##
lucene/core/src/java21/org/apache/lucene/store/RefCountedSharedArena.java:
##
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689813774
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689812957
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (
ChrisHegarty commented on PR #13570:
URL: https://github.com/apache/lucene/pull/13570#issuecomment-2247914051
This looks like it's in good shape. @magibney Any final comments? Otherwise,
I plan to merge tomorrow. And then figure out how to backport!
ChrisHegarty opened a new pull request, #13607:
URL: https://github.com/apache/lucene/pull/13607
This is a follow on to #13578, where the backport generalised the test check
to the `readOnce` value of the context, rather than the `READONCE` singleton.
The randomisation should be updated too
jpountz merged PR #13606:
URL: https://github.com/apache/lucene/pull/13606
tteofili commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689745724
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) un
benwtrent commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1689640380
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
mayya-sharipova commented on PR #13604:
URL: https://github.com/apache/lucene/pull/13604#issuecomment-2247694408
@mikemccand Here are some numbers on my Mac M3:
Doing 34 clusters with defaults (5 restarts, 10 iters each) on vectors of
1024 dims:
| N docs | Performance in secon
dungba88 commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689608050
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -143,27 +156,23 @@ protected boolean match(int doc) {
}
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689595022
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +94,38 @@ public class MMapDirectory extends FSDirectory {
*/
public static f
kaivalnp commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689584585
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -103,16 +114,18 @@ public Explanation explain(LeafReaderContext context, in
javanna merged PR #13601:
URL: https://github.com/apache/lucene/pull/13601
javanna merged PR #13600:
URL: https://github.com/apache/lucene/pull/13600
kaivalnp commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689579251
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -143,27 +156,23 @@ protected boolean match(int doc) {
}
ChrisHegarty closed pull request #12703: [DRAFT] Load vector data directly from
the memory segment
URL: https://github.com/apache/lucene/pull/12703
jpountz commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689561156
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
mikemccand commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689557959
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (A
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689529488
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInputProvider.java:
##
@@ -125,4 +135,77 @@ private final MemorySegment[] map(
}
ret
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1689527370
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +93,26 @@ public class MMapDirectory extends FSDirectory {
*/
public static
epotyom commented on PR #13568:
URL: https://github.com/apache/lucene/pull/13568#issuecomment-2247513924
> I checked the new commits. Looks good!
Thank you for the feedback @stefanvodita !
> A few points:
> 1. Can you add CHANGES entries, please?
I've added CHANGES.txt
original-brownbear commented on PR #13606:
URL: https://github.com/apache/lucene/pull/13606#issuecomment-2247457206
LGTM :)
jpountz commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689502270
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
mikemccand commented on code in PR #13585:
URL: https://github.com/apache/lucene/pull/13585#discussion_r1689493102
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/Lucene912PostingsReader.java:
##
@@ -0,0 +1,1998 @@
+/*
+ * Licensed to the Apache Software Foundation (A
jpountz opened a new pull request, #13606:
URL: https://github.com/apache/lucene/pull/13606
This iterates on #13546 to further reduce the overhead of search concurrency
by caching whether the hit count threshold has been reached: once the threshold
has been reached, it cannot get "un-reache
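The caching idea described here can be sketched as a "sticky" flag in front of a shared counter: once the flag flips to true it never flips back, so later checks skip the counter entirely. This is a minimal illustration under stated assumptions, not the PR's code. The class name `StickyHitCountThreshold`, the use of `LongAdder`, and the strict `>` comparison are all choices made for this sketch.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of a cached threshold check; not Lucene's implementation. */
final class StickyHitCountThreshold {
  private final LongAdder hits = new LongAdder(); // shared counter across search threads
  private final long threshold;

  // Deliberately not volatile: the flag only ever goes false -> true, so a stale
  // read of false just costs one extra look at the counter.
  private boolean reached;

  StickyHitCountThreshold(long threshold) {
    this.threshold = threshold;
  }

  /** Records one collected hit. */
  void increment() {
    hits.increment();
  }

  /** Returns true once more than threshold hits were counted; the result is cached. */
  boolean isThresholdReached() {
    if (reached) {
      return true; // fast path: no counter access needed
    }
    if (hits.sum() > threshold) {
      reached = true; // monotonic: never reset
      return true;
    }
    return false;
  }
}
```

The design leans on monotonicity: because the hit count never decreases, caching a positive answer can never produce a wrong result, only save work on subsequent calls.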
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1689382546
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/ordinal_iterators/CandidateSetOrdinalIterator.java:
##
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Softw
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1689378450
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/misc/LongValueFacetCutter.java:
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
vsop-479 commented on PR #13596:
URL: https://github.com/apache/lucene/pull/13596#issuecomment-2247188517
@jpountz
Please take a look when you get a chance.
kaivalnp commented on PR #13594:
URL: https://github.com/apache/lucene/pull/13594#issuecomment-2247150383
+1 to share as much logic as possible (including `createFilterWeight`). The
`FieldExistsQuery` proposal (to only collect pre-filtered docs which have
vectors) seems promising too
dungba88 commented on code in PR #13285:
URL: https://github.com/apache/lucene/pull/13285#discussion_r1689296027
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -103,16 +114,18 @@ public Explanation explain(LeafReaderContext context, in
jpountz opened a new pull request, #13605:
URL: https://github.com/apache/lucene/pull/13605
It's been pointed out multiple times that a difference between Tantivy and
Lucene is the fact that Tantivy uses windows of 4,096 docs when Lucene has a 2x
smaller window size of 2,048 docs and that this