jpountz commented on PR #12334:
URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590631585
We just upgraded Elasticsearch to a Lucene snapshot that has this change,
and this triggered major speedups on some queries. In my opinion, the PR title
and description don't do justice
gashutos commented on PR #12334:
URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590638213
> We just upgraded Elasticsearch to a Lucene snapshot that has this change,
and this triggered major speedups on some queries. In my opinion, the PR title
and description don't do justi
jpountz commented on PR #12334:
URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590663201
@gashutos I think we should make users aware of this optimization, would you
be up for opening another PR that adds a CHANGES entry?
--
This is an automated message from the Apache Git
jpountz commented on PR #12334:
URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590666069
Let's also update the title/description of this PR?
gashutos opened a new pull request, #12367:
URL: https://github.com/apache/lucene/pull/12367
### Description
Adding a CHANGES.txt entry in the Improvements section.
gashutos closed pull request #12367: Add CHANGES.txt for #12334 Honor after
value for skipping documents even if queue is not full for PagingFieldCollector
URL: https://github.com/apache/lucene/pull/12367
gashutos opened a new pull request, #12368:
URL: https://github.com/apache/lucene/pull/12368
Adding CHANGES.txt for #12334
gashutos commented on PR #12334:
URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590701060
Sure, changed the title/description; LMK if it looks good.
CHANGES.txt PR https://github.com/apache/lucene/pull/12368
javanna commented on issue #12347:
URL: https://github.com/apache/lucene/issues/12347#issuecomment-1590702834
heya @sohami thanks a lot for sharing more context.
> With custom slice computation to control the max slices per request/index
the limiting factor in SliceExecutor will not
jpountz commented on PR #12334:
URL: https://github.com/apache/lucene/pull/12334#issuecomment-1590703139
Looks great, thanks!
jpountz merged PR #12368:
URL: https://github.com/apache/lucene/pull/12368
LuXugang commented on PR #12349:
URL: https://github.com/apache/lucene/pull/12349#issuecomment-1590716969
```java
public void test111() throws IOException{
Directory dir = newDirectory();
IndexWriterConfig iwc = new IndexWriterConfig(new
MockAnalyzer(random()));
alessandrobenedetti commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1229310619
##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ConstKnnFloatValueSource.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache
uschindler commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1229323827
##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ConstKnnFloatValueSource.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software
uschindler commented on code in PR #12253:
URL: https://github.com/apache/lucene/pull/12253#discussion_r1229329251
##
lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ConstKnnFloatValueSource.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software
jpountz commented on PR #12349:
URL: https://github.com/apache/lucene/pull/12349#issuecomment-1590865303
I agree that we should fix this comparator so that the last call to
`IndexSearcher.search` in your test only collects 2000 hits. This doesn't seem
to be what your PR does though?
jpountz merged PR #12366:
URL: https://github.com/apache/lucene/pull/12366
javanna opened a new pull request, #12369:
URL: https://github.com/apache/lucene/pull/12369
We have recently increased the likelihood of leveraging inter-segment search
concurrency in tests when newSearcher is used to create the index searcher (see
#959). When parallel execution is enabled
javanna commented on code in PR #12369:
URL: https://github.com/apache/lucene/pull/12369#discussion_r1229356704
##
lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java:
##
@@ -1965,9 +1966,9 @@ public static IndexSearcher newSearcher(
.add
javanna commented on code in PR #12369:
URL: https://github.com/apache/lucene/pull/12369#discussion_r1229359014
##
lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java:
##
@@ -1965,9 +1966,9 @@ public static IndexSearcher newSearcher(
.add
jpountz commented on code in PR #12369:
URL: https://github.com/apache/lucene/pull/12369#discussion_r1229481989
##
lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java:
##
@@ -1941,7 +1940,7 @@ public static IndexSearcher newSearcher(
} else {
javanna commented on code in PR #12369:
URL: https://github.com/apache/lucene/pull/12369#discussion_r1229508836
##
lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java:
##
@@ -1941,7 +1940,7 @@ public static IndexSearcher newSearcher(
} else {
LuXugang commented on PR #12349:
URL: https://github.com/apache/lucene/pull/12349#issuecomment-1591110183
> This doesn't seem to be what your PR does though?
It indeed has no relation to this PR.
> If search sort field does not exist, should we early terminate collection
after
jpountz commented on PR #12349:
URL: https://github.com/apache/lucene/pull/12349#issuecomment-159992
+1
uschindler commented on PR #12281:
URL: https://github.com/apache/lucene/pull/12281#issuecomment-1591112679
I did not see any slowdowns in last night's @mikemccand benchmark caused by
the check during indexing and on building the query.
uschindler commented on issue #12358:
URL: https://github.com/apache/lucene/issues/12358#issuecomment-1591117808
Hi, thanks for cross-checking. A 1-hour warmup therefore doesn't change
anything.
Anyways, I'd use a newer JDK like 20.
nreimers commented on issue #12342:
URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591318871
@msokolov The index / vector DB should return the dot product score as is.
No scaling, no truncation.
Using dot product is tremendously useful for embedding models, they perf
uschindler commented on issue #12342:
URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591331493
> @msokolov The index / vector DB should return the dot product score as is.
No scaling, no truncation.
>
> Using dot product is tremendously useful for embedding models, t
benwtrent commented on issue #12342:
URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591347442
I would think as long as more negative values are scored lower, we will
retrieve documents in a sane manner.
Scaling negatives to restrict them and then not scaling positiv
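
A minimal sketch of the idea in the comment above: negative dot products are compressed so that more negative values still score lower, while non-negative values are only shifted. The class name, method name, and the exact formula are assumptions made for illustration; the thread does not settle on a specific mapping, and this is not Lucene's API.

```java
// Hypothetical helper, not part of Lucene: maps a raw dot product to a
// non-negative score while preserving the original ordering.
final class DotProductScoreSketch {
  private DotProductScoreSketch() {}

  static float toScore(float dot) {
    if (dot < 0) {
      // Negative values are squeezed into (0, 1); the more negative the
      // dot product, the closer the score gets to 0.
      return 1f / (1f - dot);
    }
    // Non-negative values are shifted by 1, landing in [1, +inf), so any
    // non-negative dot product outranks every negative one.
    return dot + 1f;
  }
}
```

Because both branches are strictly increasing and meet at 1 when the dot product is 0, ranking by this score gives the same order as ranking by the raw dot product, and no score is ever negative.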
msokolov commented on issue #12342:
URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591355715
Yeah, after consideration, I think we could maybe argue for changing the
scaling of negative values given that they were documented as unsupported, even
though it would be breaking
msokolov commented on issue #12342:
URL: https://github.com/apache/lucene/issues/12342#issuecomment-1591379022
Yeah. Another thing we could consider is doing this scaling in
KnnVectorQuery and/or its Scorer. These have the ultimate responsibility of
complying with the Scorer contract. If we
alessandrobenedetti merged PR #12253:
URL: https://github.com/apache/lucene/pull/12253
alessandrobenedetti closed issue #12252: Add function queries for computing
vector similarity between knn vectors
URL: https://github.com/apache/lucene/issues/12252
sohami commented on issue #12347:
URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591452446
@javanna Thanks for your input.
> Another thought on my end: executing sometimes on the caller thread, and
sometimes on the executor makes things hard to reason about: how do y
Jackyrie2 opened a new pull request, #12371:
URL: https://github.com/apache/lucene/pull/12371
### Description
Per @zhaih's suggestion in #12236, this PR moves the computation of the
similarity score from `initalizedFromGraph` to a later time, when the
`NeighborArray` needs to be sorted and
benwtrent commented on PR #12371:
URL: https://github.com/apache/lucene/pull/12371#issuecomment-1591770109
Hey @Jackyrie2 this does add some extra memory overhead, 4 new object
references. It would be good if it was justified with a benchmark.
Could you share some benchmarking on ind
javanna commented on issue #12347:
URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591822866
> Take the LeafSlice[] in constructor to allow for custom slice computation.
Sounds good, I'll happily review that change.
> Discuss different options to customize Slice
atris commented on issue #12347:
URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591837122
> > Take the LeafSlice[] in constructor to allow for custom slice
computation.
>
> Sounds good, I'll happily review that change.
>
> > Discuss different options to custom
jbellis opened a new pull request, #12372:
URL: https://github.com/apache/lucene/pull/12372
This changes HnswGraphBuilder to re-use the same candidates queues for
adding nodes by allocating them in the Builder instance.
This saves about 2.5% of build time and takes memory allocations
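
The pattern described in this PR, allocating a scratch queue once in the builder instance and reusing it across calls, can be sketched roughly as follows. This is an illustrative stand-in only: `QueueReuseSketch`, `addNode`, and `scratch` are hypothetical names, and the loop is a placeholder for the real candidate search in `HnswGraphBuilder`.

```java
import java.util.ArrayDeque;

// Hypothetical sketch, not Lucene code: a builder that owns its scratch
// queue instead of allocating a new one on every addNode call.
final class QueueReuseSketch {
  // Allocated once, when the builder is created.
  private final ArrayDeque<Integer> candidates = new ArrayDeque<>();

  /** Adds a node and returns how many candidates were considered for it. */
  int addNode(int node) {
    candidates.clear(); // reset the shared queue rather than allocating
    for (int previous = 0; previous < node; previous++) {
      candidates.add(previous); // placeholder for the real candidate search
    }
    return candidates.size();
  }

  /** Exposes the scratch queue so callers can confirm it is reused. */
  ArrayDeque<Integer> scratch() {
    return candidates;
  }
}
```

The trade-off in this kind of design is that the builder is no longer safe to share between threads without external synchronization, since every call mutates the same queue.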
jbellis commented on PR #12372:
URL: https://github.com/apache/lucene/pull/12372#issuecomment-1591859337
Additionally, the original change only re-used the candidates queues within
a single addNode call, so this is improved in that respect as well.
sohami commented on issue #12347:
URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591876384
@atris To summarize, there are 2 separate pieces of functionality I am looking to add:
1) Custom slice computation which the extension can provide. For this we can
provide a constructor
jbellis opened a new pull request, #12373:
URL: https://github.com/apache/lucene/pull/12373
Following up to PR #12281
jbellis commented on PR #12373:
URL: https://github.com/apache/lucene/pull/12373#issuecomment-1591954514
cc @uschindler
sohami opened a new pull request, #12374:
URL: https://github.com/apache/lucene/pull/12374
### Description
Add a constructor which takes in the computed slices from extensions and
uses that for running the search concurrently on provided executor. This is
based on the discussion on the i
sohami commented on issue #12347:
URL: https://github.com/apache/lucene/issues/12347#issuecomment-1591995488
@javanna @atris I have created a PR (#12374) for item 1 above for now.