dnhatn merged PR #13590:
URL: https://github.com/apache/lucene/pull/13590
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apac
dnhatn merged PR #13588:
URL: https://github.com/apache/lucene/pull/13588
vsop-479 opened a new pull request, #13589:
URL: https://github.com/apache/lucene/pull/13589
### Description
No difference, see https://github.com/apache/lucene/pull/13557.
vsop-479 commented on PR #13557:
URL: https://github.com/apache/lucene/pull/13557#issuecomment-2237974082
I also tested it by adding simulation code under `jmh` (just to ensure they
get the right optimization); there is no difference either:
Benchmark (size) Mo
github-actions[bot] commented on PR #13328:
URL: https://github.com/apache/lucene/pull/13328#issuecomment-2237834830
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
github-actions[bot] commented on PR #13536:
URL: https://github.com/apache/lucene/pull/13536#issuecomment-2237834572
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
dnhatn commented on PR #13588:
URL: https://github.com/apache/lucene/pull/13588#issuecomment-2237702315
@jpountz Thanks for reviewing! I was about to write the description and have
now updated it.
dnhatn commented on PR #13588:
URL: https://github.com/apache/lucene/pull/13588#issuecomment-2237700737
https://github.com/user-attachments/assets/3842b32e-e337-4a90-b76c-4f51b1ee9bfa
jpountz commented on PR #13586:
URL: https://github.com/apache/lucene/pull/13586#issuecomment-2237667492
Thanks for looking into this! It's disappointing that this small change
degrades performance so much indeed! I'm curious if you are able to run your
benchmark under a profiler to confirm
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1683534122
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/abstracts/GetOrd.java:
##
@@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under on
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1683523816
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/FacetFieldLeafCollector.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
jpountz commented on code in PR #13587:
URL: https://github.com/apache/lucene/pull/13587#discussion_r1683500985
##
lucene/join/src/java/org/apache/lucene/search/join/ToParentBlockJoinQuery.java:
##
@@ -101,12 +99,7 @@ public Weight createWeight(
.rewrite(new Const
Mikep86 opened a new pull request, #13587:
URL: https://github.com/apache/lucene/pull/13587
Updates `ToParentBlockJoinQuery` to propagate the min competitive score when
using `ScoreMode.Max`
jmazanec15 commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2237366786
> I am not sure what to do for users who quantize their own vectors & rely
on cosine.
I think I am on the same page as @msokolov. Users could "float_vector ->
norm_float_vecto
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1683349131
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/FacetFieldLeafCollector.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1683347154
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/FacetFieldLeafCollector.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1683321175
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/FacetFieldLeafCollector.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
benwtrent commented on PR #13586:
URL: https://github.com/apache/lucene/pull/13586#issuecomment-2237254527
🤔 my benchmarking is suspicious. I wonder if I am doing something wrong.
I have a 4GB index, on a 4GB machine, 1GB set aside for the JVM. So, QPS
should be about the same.
vigyasharma commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2237194980
> The pattern doesn't work well with ColBERT-esque models.
+1.. Good question, @navneet1v. I had the same doubts before starting this
effort. There is some discussion in
[1231
benwtrent opened a new pull request, #13586:
URL: https://github.com/apache/lucene/pull/13586
I am trying out some prefetching for vector search and HNSW.
This right now is a dead-simple version that simply prefetches the next
neighbor we will explore.
I will respond with some
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1683116076
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return fi
jmazanec15 commented on issue #13564:
URL: https://github.com/apache/lucene/issues/13564#issuecomment-2237010997
Makes sense, thanks @benwtrent. I'm working on a PoC and some experiments.
I didn't realize that the full-precision vectors for a quantized index are exposed
via getFloatVectorValues. Th
magibney commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1683132959
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return file
uschindler commented on issue #13583:
URL: https://github.com/apache/lucene/issues/13583#issuecomment-2236932133
Yeah, it looks like it's missing the second part of the sentence.
...or if an index in the process of committing the return value is not
reliable. Actually the code only che
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1683097237
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return fi
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1683081812
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/recorders/CountFacetRecorder.java:
##
@@ -0,0 +1,189 @@
+/*
+ * Licensed to the Apache Software Foundation (
magibney commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1683080897
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return file
epotyom commented on code in PR #13568:
URL: https://github.com/apache/lucene/pull/13568#discussion_r1683076964
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/recorders/CountFacetRecorder.java:
##
@@ -0,0 +1,189 @@
+/*
+ * Licensed to the Apache Software Foundation (
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236829406
I definitely want to play around more with @goankur's PR here and see what
performance looks like across machines, but will be out of town for a bit.
There is a script to run the be
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1683020894
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236809947
And I see, from playing around with compiler versions, the advantage of
the intrinsics approach, although I worry how many variants we'd maintain. It would
give stability across releasing lucen
benwtrent commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2236769005
@msokolov whenever I had to benchmark the original parallel merge change,
the way to isolate it was to reduce the KnnIndexer buffer size dramatically to create
multiple segments, then measur
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236739249
Here is my proposal visually: https://godbolt.org/z/a9T8YrroY
As you can see, by passing `-march=cascadelake` it generates VNNI
instructions.
IMO, no need for any intrinsics anyw
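The auto-vectorization idea in the comment above can be illustrated with a minimal sketch. This is not the code from the PR or from the godbolt link; the function name `dot8s` is hypothetical, and whether VNNI instructions are actually emitted depends on the compiler version and flags (for example `-O3 -march=cascadelake` with a modern gcc), so the generated assembly should be inspected rather than assumed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the PR): a plain scalar
 * signed 8-bit dot product with a 32-bit accumulator.
 *
 * There are no intrinsics here. The loop is written so that an
 * optimizing compiler is free to auto-vectorize it; which SIMD
 * instructions it picks is decided by the target flags passed
 * at compile time, not by this source code.
 */
int32_t dot8s(const int8_t *a, const int8_t *b, size_t n) {
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen before multiplying so the product cannot overflow int8. */
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

Compiling the same source with different `-march` values and comparing the assembly is one way to check what the compiler generates for a given target.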
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236647191
> I avoided it at the time given the toolchain that we were using, but it's
a good option which I'll reevaluate.
It should work well with any modern gcc (@goankur uses gcc 10 here).
msokolov commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2236632284
I agree it's weird we saw no impact -- I'll retry with -forceMerge --
probably there was not enough merge activity?
jpountz opened a new pull request, #13585:
URL: https://github.com/apache/lucene/pull/13585
This updates the postings format in order to inline skip data into postings.
This format is generally similar to the current `Lucene99PostingsFormat`, e.g.
it shares the same block encoding logic, bu
magibney commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1682897378
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return file
msokolov commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2236548966
It would be interesting to know how many actual users of COSINE there are. I
agree there may be no workaround, but that does not mean we need to continue to
support it, either. One qu
ChrisHegarty commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236540457
Could Lucene ever have this directly in one of its modules? We currently
use the `FlatVectorsScorer` to plug in the "native code optimized" alternative,
when scoring Scalar Quantize
ChrisHegarty commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236530280
> > With the updated compile flags, the performance of auto-vectorized code
is slightly better than explicitly vectorized code (see results). Interesting
thing to note is that both
benwtrent opened a new issue, #13584:
URL: https://github.com/apache/lucene/issues/13584
### Description
Back when knn vectors were introduced, we sort of kicked the can for
MemoryIndex support. While it may not make sense to add the approximate search
API (debatable), it should at le
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1682781560
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return fi
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1682680606
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return
benwtrent commented on issue #13281:
URL: https://github.com/apache/lucene/issues/13281#issuecomment-2236364825
I cannot think of an adequate workaround at all for `byte` folks. The
linear transformation of bytes will indeed cause potentially non-uniform
magnitudes and could break scoring
iverase commented on code in PR #13563:
URL: https://github.com/apache/lucene/pull/13563#discussion_r1682694746
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java:
##
@@ -207,65 +210,127 @@ void accumulate(long value) {
maxValue = Mat
iverase commented on code in PR #13563:
URL: https://github.com/apache/lucene/pull/13563#discussion_r1682671882
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java:
##
@@ -207,65 +210,127 @@ void accumulate(long value) {
maxValue = Mat
iverase commented on code in PR #13563:
URL: https://github.com/apache/lucene/pull/13563#discussion_r1682656013
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java:
##
@@ -1792,61 +1794,91 @@ public DocValuesSkipper getSkipper(FieldInfo field
iverase commented on code in PR #13563:
URL: https://github.com/apache/lucene/pull/13563#discussion_r1682654554
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesProducer.java:
##
@@ -1792,61 +1794,91 @@ public DocValuesSkipper getSkipper(FieldInfo field
uschindler commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1682619481
##
lucene/core/src/java/org/apache/lucene/index/IndexFileNames.java:
##
@@ -142,6 +143,26 @@ public static String stripSegmentName(String filename) {
return fi
ChrisHegarty commented on code in PR #13570:
URL: https://github.com/apache/lucene/pull/13570#discussion_r1682608032
##
lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java:
##
@@ -83,6 +86,19 @@ public class MMapDirectory extends FSDirectory {
*/
public static
jpountz commented on PR #13517:
URL: https://github.com/apache/lucene/pull/13517#issuecomment-2236096293
> improved the logic to not apply a threshold on the doc count for merges
below the min merge size
Whoops, I need to revert this for now. This made sense for TieredMergePolicy
but t
jpountz merged PR #13517:
URL: https://github.com/apache/lucene/pull/13517
jpountz merged PR #13582:
URL: https://github.com/apache/lucene/pull/13582
jpountz commented on code in PR #13563:
URL: https://github.com/apache/lucene/pull/13563#discussion_r1682502283
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90DocValuesConsumer.java:
##
@@ -207,65 +210,127 @@ void accumulate(long value) {
maxValue = Mat
jpountz commented on PR #13517:
URL: https://github.com/apache/lucene/pull/13517#issuecomment-2236001207
Thanks for looking @stefanvodita! I added a CHANGES entry and improved the
logic to not apply a threshold on the doc count for merges below the min merge
size. I will merge soon.
dweiss commented on code in PR #13484:
URL: https://github.com/apache/lucene/pull/13484#discussion_r1682394662
##
versions.lock:
##
Review Comment:
Oh - the "because" contains a hash key and all configurations/projects which
refer to a dependency. These hashes are used next
dweiss commented on code in PR #13484:
URL: https://github.com/apache/lucene/pull/13484#discussion_r1682387379
##
versions.lock:
##
Review Comment:
A plugin does this. It is similar in nature to palantir's but "passive" - it
only collects dependencies from selected configu