ChrisHegarty commented on PR #13535:
URL: https://github.com/apache/lucene/pull/13535#issuecomment-2205713035
>Merging uses an IOContext with a MergeInfo attached. This uses same
ReadAdvice SEQUENTIAL, but that is different to the Singleton READONCE
IOContext. Or do I miss something?
vsop-479 commented on PR #13536:
URL: https://github.com/apache/lucene/pull/13536#issuecomment-2205724663
> Can we extract softLiveDocs from liveDocs and hardLiveDocs, and use it
directly, even instead of FieldExistsQuery?
Maybe it won't get any speed up.
--
This is an automated me
uschindler commented on PR #13535:
URL: https://github.com/apache/lucene/pull/13535#issuecomment-2205763601
P.S.: Because of this I proposed that READONCE IndexInputs should throw
exceptions on clone/slice.
--
This is an automated message from the Apache Git Service.
To respond to the mes
cpoerschke commented on code in PR #13525:
URL: https://github.com/apache/lucene/pull/13525#discussion_r1664079407
##
lucene/core/src/java/org/apache/lucene/codecs/KnnTensorsFormat.java:
##
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or m
cpoerschke commented on code in PR #13525:
URL: https://github.com/apache/lucene/pull/13525#discussion_r1664091301
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java:
##
@@ -143,14 +143,22 @@ public int nextDoc() throws IOException {
public static final cl
cpoerschke commented on code in PR #13525:
URL: https://github.com/apache/lucene/pull/13525#discussion_r1664102266
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java:
##
@@ -209,17 +210,17 @@ public static VectorEncoding readVectorEncoding(D
mikemccand commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206113117
> My concern for 8 bit quantization is the algebraic expansion of
dot-product and the corrective terms.
>
> For scalar quantization, the score corrections for dotProduct ar
mikemccand commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206145047
Could someone summarize what is actually wrong with Lucene here? From my
`luceneutil` tests it looks like `int8` with `DOT_PRODUCT` is buggy / does not
work (.006 recall on `coh
javanna commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664308783
##
lucene/core/src/test/org/apache/lucene/search/TestTaskExecutor.java:
##
@@ -308,7 +306,7 @@ public void testInvokeAllCatchesMultipleExceptions() {
}
public
original-brownbear commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664333847
##
lucene/CHANGES.txt:
##
@@ -277,6 +277,15 @@ Optimizations
* GITHUB#12941: Don't preserve auxiliary buffer contents in LSBRadixSorter if
it grows. (St
original-brownbear commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664337927
##
lucene/core/src/test/org/apache/lucene/search/TestTaskExecutor.java:
##
@@ -251,14 +271,15 @@ public void testInvokeAllDoesNotLeaveTasksBehind() {
f
original-brownbear commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664338521
##
lucene/core/src/test/org/apache/lucene/search/TestTaskExecutor.java:
##
@@ -251,14 +271,15 @@ public void testInvokeAllDoesNotLeaveTasksBehind() {
f
benwtrent opened a new pull request, #13538:
URL: https://github.com/apache/lucene/pull/13538
This slightly reduces heap utilization for KnnIndex writers by:
- Only constructing one `DocsWithFieldSet` instead of 3 for Scalar
quantized HNSW formats
- Only having one `List` instead
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206500270
@mikemccand
> Does the Lucene implementation have per-dimension calculation of
quantiles/min/max? Or is it really a global min/max computed across all
dimensions and all v
original-brownbear commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664369069
##
lucene/CHANGES.txt:
##
@@ -277,6 +277,15 @@ Optimizations
* GITHUB#12941: Don't preserve auxiliary buffer contents in LSBRadixSorter if
it grows. (St
mikemccand commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206798896
> > I know we are discussing how to make the conflated quantization and
distance metric combinations math work out (i.e. how to fix the issue), but I'm
just trying to get the big
cpoerschke commented on code in PR #13525:
URL: https://github.com/apache/lucene/pull/13525#discussion_r1664549231
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java:
##
@@ -143,14 +143,22 @@ public int nextDoc() throws IOException {
public static final cl
cpoerschke commented on code in PR #13525:
URL: https://github.com/apache/lucene/pull/13525#discussion_r1664549231
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java:
##
@@ -143,14 +143,22 @@ public int nextDoc() throws IOException {
public static final cl
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206891266
@MilindShyani I am not 100% your equation is correct.
(NOTE: `scale` is just the inverse of alpha).
It seems to me that:
```
byte = (float - minQuantile) * alph
cpoerschke commented on code in PR #13539:
URL: https://github.com/apache/lucene/pull/13539#discussion_r1664572651
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java:
##
@@ -143,61 +145,81 @@ public int nextDoc() throws IOException {
public static final cl
javanna commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664608482
##
lucene/CHANGES.txt:
##
@@ -277,6 +277,15 @@ Optimizations
* GITHUB#12941: Don't preserve auxiliary buffer contents in LSBRadixSorter if
it grows. (Stefan Vodita
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2206982541
@naveentatikonda @mikemccand
https://github.com/apache/lucene/compare/main...benwtrent:fix-8-bit?expand=1
That is my current WIP
--
This is an automated message from the
original-brownbear commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664618492
##
lucene/CHANGES.txt:
##
@@ -277,6 +277,15 @@ Optimizations
* GITHUB#12941: Don't preserve auxiliary buffer contents in LSBRadixSorter if
it grows. (St
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2207015770
With this change, I get the following recall@100 over 500_000 coherev2
(using max-inner-product as designed for the model):
```
recall maxConn beamWidth
0.872 1
javanna commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664646220
##
lucene/core/src/test/org/apache/lucene/search/TestTaskExecutor.java:
##
@@ -251,14 +271,15 @@ public void testInvokeAllDoesNotLeaveTasksBehind() {
for (int i =
javanna commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664646838
##
lucene/CHANGES.txt:
##
@@ -277,6 +277,15 @@ Optimizations
* GITHUB#12941: Don't preserve auxiliary buffer contents in LSBRadixSorter if
it grows. (Stefan Vodita
jpountz commented on PR #13535:
URL: https://github.com/apache/lucene/pull/13535#issuecomment-2207164356
Thanks for checking, I misremembered.
@uschindler +1 to disallowing clones and slices. I' m contemplating making
`MockDirectoryWrapper` also fail if `IndexInput`s get used in a dif
original-brownbear commented on code in PR #13472:
URL: https://github.com/apache/lucene/pull/13472#discussion_r1664758046
##
lucene/core/src/test/org/apache/lucene/search/TestTaskExecutor.java:
##
@@ -251,14 +271,15 @@ public void testInvokeAllDoesNotLeaveTasksBehind() {
f
uschindler commented on PR #13535:
URL: https://github.com/apache/lucene/pull/13535#issuecomment-2207421958
bq. I'm contemplating making MockDirectoryWrapper also fail if IndexInputs
get used in a different thread from the one where they were created.
You mean this for READONCE ones,
gautamworah96 commented on PR #13538:
URL: https://github.com/apache/lucene/pull/13538#issuecomment-2207757150
Thanks for the quick turnaround on this Ben. I am not too familiar with this
code and am taking some time off this week. I'll try to review the code by
Friday.
In the meant
naveentatikonda commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2207802401
@benwtrent The new numbers looks good. But, I have few questions based on
your [code changes
](https://github.com/apache/lucene/compare/main...benwtrent:fix-8-bit?expand=1)
vigyasharma commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2207897576
Did some refactor to reuse the same format. Also renamed "tensor" to
"multi-vector"..
--
This is an automated message from the Apache Git Service.
To respond to the message, please
vigyasharma commented on PR #13525:
URL: https://github.com/apache/lucene/pull/13525#issuecomment-2207906861
Diff is down to 45 files now (from 75, yikes!), and most touch points are
still due to FieldInfo changes. These are the files with key changes:
```ruby
lucene/core/src/java/org
33 matches
Mail list logo