rmuir commented on code in PR #13636:
URL: https://github.com/apache/lucene/pull/13636#discussion_r1706278927
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java:
##
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Fou
rmuir commented on code in PR #13636:
URL: https://github.com/apache/lucene/pull/13636#discussion_r1706272955
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java:
##
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Fou
rmuir commented on code in PR #13636:
URL: https://github.com/apache/lucene/pull/13636#discussion_r1706268616
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java:
##
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Fou
rmuir commented on code in PR #13636:
URL: https://github.com/apache/lucene/pull/13636#discussion_r1706267079
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentPostingDecodingUtil.java:
##
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Fou
jpountz commented on code in PR #13627:
URL: https://github.com/apache/lucene/pull/13627#discussion_r1706129153
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java:
##
@@ -430,10 +430,16 @@ long updateDocuments(
}
flushingDWPT = flushControl.doAfte
jpountz commented on PR #13636:
URL: https://github.com/apache/lucene/pull/13636#issuecomment-2272150091
For what it's worth, this PR is quite different from
https://github.com/apache/lucene/pull/12412 in that it does not rewrite
`ForUtil.java` completely, only the bits where we read some l
jpountz commented on PR #13636:
URL: https://github.com/apache/lucene/pull/13636#issuecomment-2272145780
Thanks @uschindler, it was helpful. I refactored the PR a bit based on your
recommendation. It's now ready for review.
--
This is an automated message from the Apache Git Service.
To r
gsmiller commented on PR #13599:
URL: https://github.com/apache/lucene/pull/13599#issuecomment-2272107532
In general, I really appreciate that you're looking for opportunities to
clean up the codebase and find ways to avoid duplicated logic. Thanks
@jainankitk ! At the same time, I don't per
benwtrent commented on code in PR #13627:
URL: https://github.com/apache/lucene/pull/13627#discussion_r1705881356
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThreadPool.java:
##
@@ -138,6 +138,15 @@ private synchronized boolean
contains(DocumentsWriterPerT
benwtrent commented on code in PR #13627:
URL: https://github.com/apache/lucene/pull/13627#discussion_r1705880530
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterDeleteQueue.java:
##
@@ -636,7 +636,7 @@ long getMaxSeqNo() {
}
/** Returns true if it was adv
benwtrent commented on code in PR #13627:
URL: https://github.com/apache/lucene/pull/13627#discussion_r1705880002
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriter.java:
##
@@ -430,10 +430,16 @@ long updateDocuments(
}
flushingDWPT = flushControl.doAf
jainankitk closed pull request #13599: Delegating the matches in
PointRangeQuery weight to relate method
URL: https://github.com/apache/lucene/pull/13599
jainankitk commented on code in PR #13599:
URL: https://github.com/apache/lucene/pull/13599#discussion_r1705858403
##
lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java:
##
@@ -127,22 +127,8 @@ public final Weight createWeight(IndexSearcher searcher,
ScoreMode s
gsmiller commented on PR #13631:
URL: https://github.com/apache/lucene/pull/13631#issuecomment-2271718319
> which needs slow scalar code to properly decode the last values in a block
Got it, thanks. I assumed it had something to do with this, but my confusion
came from the fact that t
benwtrent commented on code in PR #13627:
URL: https://github.com/apache/lucene/pull/13627#discussion_r1705834154
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThreadPool.java:
##
@@ -140,6 +140,11 @@ void marksAsFreeAndUnlock(DocumentsWriterPerThread state)
msokolov commented on PR #13566:
URL: https://github.com/apache/lucene/pull/13566#issuecomment-2271669367
I plan to revisit this with a modified approach to address some gaps here:
1. Instead of computing the components rooted at node 0 and others, or
trying to compute strongly-conne
msokolov commented on PR #13577:
URL: https://github.com/apache/lucene/pull/13577#issuecomment-2271664069
This strongly-connected test is hard to make efficient and it's actually
more than we need, given the way we search the graphs hierarchically. I'll
follow up with a different approach
msokolov closed pull request #13577: WIP do not merge
URL: https://github.com/apache/lucene/pull/13577
msokolov commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2271660159
OK, @Edarke that sounds like a good idea. If you care to follow up with a
change to the hash function that avoids autoboxing and smears, I'd be happy to
review it.
benwtrent commented on code in PR #13627:
URL: https://github.com/apache/lucene/pull/13627#discussion_r1705753409
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThreadPool.java:
##
@@ -140,6 +140,11 @@ void marksAsFreeAndUnlock(DocumentsWriterPerThread state)
uschindler commented on PR #13636:
URL: https://github.com/apache/lucene/pull/13636#issuecomment-2271580963
> how to make it properly work out of the box (without enabling the vector
module and preview features)
The vector module must always be enabled, but that's also the case for
`
jpountz commented on PR #13636:
URL: https://github.com/apache/lucene/pull/13636#issuecomment-2271546834
Oh, I had forgotten about this other PR! Thanks Uwe, I'll look into it.
jpountz commented on code in PR #13627:
URL: https://github.com/apache/lucene/pull/13627#discussion_r1705717977
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThreadPool.java:
##
@@ -140,6 +140,11 @@ void marksAsFreeAndUnlock(DocumentsWriterPerThread state) {
uschindler commented on PR #13636:
URL: https://github.com/apache/lucene/pull/13636#issuecomment-2271539092
Hi,
your setup needs to be a bit different:
- Don't invent new providers, use the existing one, so add a new factory
method to the generic VectorizationProvider. Here you return
jpountz commented on PR #13636:
URL: https://github.com/apache/lucene/pull/13636#issuecomment-2271488324
@uschindler @ChrisHegarty I could use a bit of help with this change
regarding code organization and how to make it properly work out of the box
(without enabling the vector module and p
jpountz commented on PR #13631:
URL: https://github.com/apache/lucene/pull/13631#issuecomment-2271482402
@gsmiller FYI I have another change that speeds up decoding postings at
#13636 that seems to be a bit more impactful, so I'll try to figure that other
one out before coming back to this
jpountz opened a new pull request, #13636:
URL: https://github.com/apache/lucene/pull/13636
Our postings use a layout that helps take advantage of Java's
auto-vectorization to be reasonably fast to decode. But we can make it a bit
faster by using explicit vectorization on MemorySegment:
jpountz commented on PR #13631:
URL: https://github.com/apache/lucene/pull/13631#issuecomment-2271433649
Thanks for looking @gsmiller ! Regarding numbers of bits per value, some
numbers make the code on the JVM/CPU, you can look at the difference in the
generated code for `decode8`, which m
jpountz commented on code in PR #13631:
URL: https://github.com/apache/lucene/pull/13631#discussion_r1705595936
##
lucene/core/src/java/org/apache/lucene/codecs/lucene912/ForDeltaUtil.java:
##
@@ -41,10 +54,275 @@ private static void prefixSumOfOnes(long[] arr, long base)
{
seanmacavaney commented on PR #13635:
URL: https://github.com/apache/lucene/pull/13635#issuecomment-2271353524
> I could see it being very nice, or behaving poorly depending on the seed
query (which, I guess is expected).
We could probably predict whether a seed set is good or bad bas
seanmacavaney commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705579214
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,43 @@ public static void search(
search(scorer, knnCollector, gr
seanmacavaney commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705575607
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +189,44 @@ private TopDocs getLeafResults(
}
}
+ private Do
benwtrent commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705418758
##
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##
@@ -156,6 +189,44 @@ private TopDocs getLeafResults(
}
}
+ private DocIdS
benwtrent commented on code in PR #13635:
URL: https://github.com/apache/lucene/pull/13635#discussion_r1705409803
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -70,6 +72,43 @@ public static void search(
search(scorer, knnCollector, graph,
seanmacavaney commented on issue #13634:
URL: https://github.com/apache/lucene/issues/13634#issuecomment-2271094294
Thanks! I just opened a draft PR (#13635). To answer your questions:
> The API, this is always tricky to get correct
I've struggled a bit with this. The PR has an
seanmacavaney opened a new pull request, #13635:
URL: https://github.com/apache/lucene/pull/13635
### Description
This PR addresses #13634.
The main changes are in:
- `AbstractKnnVectorQuery`, which adds a `seed` field. It scores this query
if provided, and passes these see
benwtrent commented on issue #13634:
URL: https://github.com/apache/lucene/issues/13634#issuecomment-2271052423
@seanmacavaney I like this idea (I remember reading this paper a while back
and getting excited about it).
A couple of concerns I have are:
- The API, this is alway
seanmacavaney opened a new issue, #13634:
URL: https://github.com/apache/lucene/issues/13634
### Description
In some vector search cases, users may already know some documents that are
likely related to a query. Let's support seeding HNSW's scoring stage with
these documents, rather
romseygeek commented on code in PR #13632:
URL: https://github.com/apache/lucene/pull/13632#discussion_r1705166098
##
lucene/monitor/src/test/org/apache/lucene/monitor/outsidepackage/TestCandidateMatcherVisibility.java:
##
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Softwar
dungba88 commented on code in PR #13594:
URL: https://github.com/apache/lucene/pull/13594#discussion_r1705038897
##
lucene/core/src/java/org/apache/lucene/search/KnnQueryUtils.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+