dungba88 commented on issue #12543:
URL: https://github.com/apache/lucene/issues/12543#issuecomment-1746403380
Thanks @mikemccand ! Let's continue the discussion in this issue instead.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to Git
iverase commented on code in PR #12600:
URL: https://github.com/apache/lucene/pull/12600#discussion_r1345475483
##
lucene/core/src/java19/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -168,6 +168,28 @@ private void readBytesBoundary(byte[] b, int offset, int
len)
uschindler commented on code in PR #12600:
URL: https://github.com/apache/lucene/pull/12600#discussion_r1345489213
##
lucene/core/src/java19/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -168,6 +168,28 @@ private void readBytesBoundary(byte[] b, int offset, int
le
iverase commented on code in PR #12600:
URL: https://github.com/apache/lucene/pull/12600#discussion_r1345506266
##
lucene/core/src/java19/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -168,6 +168,28 @@ private void readBytesBoundary(byte[] b, int offset, int
len)
benwtrent opened a new issue, #12621:
URL: https://github.com/apache/lucene/issues/12621
### Description
While testing and digging around, I noticed that our float comparisons are
way faster than byte on my Macbook (M1) and pretty much the same as our byte
comparisons on a GCP Intel
mikemccand commented on issue #12620:
URL: https://github.com/apache/lucene/issues/12620#issuecomment-1746831073
This might move the needle on the size of the FSTs created by block tree
for the terms index, since it encodes longs as `vLong` in its output. We should
only try this "reverse v
jpountz opened a new pull request, #12622:
URL: https://github.com/apache/lucene/pull/12622
This adds `BPReorderingMergePolicy`, a merge policy wrapper that reorders
doc IDs on merge using a `BPIndexReorderer`.
- Reordering always runs on forced merges.
- A `minNaturalMergeNumDocs` pa
risdenk opened a new pull request, #2677:
URL: https://github.com/apache/lucene-solr/pull/2677
Backport of https://github.com/apache/lucene/pull/12604
risdenk commented on issue #12598:
URL: https://github.com/apache/lucene/issues/12598#issuecomment-1746863472
FWIW I was looking into this a bit when I saw this issue come in.
Specifically on Solr 8.11, but as far as I can tell the changes in #12604 apply
to 8.x as well.
In a 30s asy
iverase merged PR #12600:
URL: https://github.com/apache/lucene/pull/12600
iverase closed issue #12599: Add readBytes method to RandomAccessInput
URL: https://github.com/apache/lucene/issues/12599
rmuir commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747002969
The type conversions are what make it slow. For the float case it is the
equivalent of:
```
float x = something;
float y = something;
float z = something;
// no conversions
f
rmuir commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747026111
Also their suggested replacement of 3 instructions for the `VPDPBUSD` is:
> Likewise, for 8-bit values, three instructions are needed - VPMADDUBSW
which is used to multiply two
jpountz commented on PR #12622:
URL: https://github.com/apache/lucene/pull/12622#issuecomment-1747029247
The diff is large because I had to introduce a new
`SlowCompositeCodecReaderWrapper`, which effectively does the merge (lazily)
and can be fed to the reordering logic prior to actually r
iverase commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1747031597
@uschindler I merged the change.
I tried to backport it, but that is not possible: `ByteBuffer#get(int, byte[],
int, int)` is not available in the Java version used on the 9.x line. I think it is
uschindler commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1747037797
Hi @iverase,
oh yeah. The absolute ByteBuffer gets are not available in older Java
versions.
If you want to backport, you could create a temporary ByteBuffer slice, but
if y
uschindler commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1747042486
P.S.: See
[docs](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/nio/ByteBuffer.html#get(int,byte%5B%5D,int,int))
here. The method came with Java 13.
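As a rough illustration of the two options discussed above, here is a minimal sketch that is not taken from the PR. The class and method names are hypothetical. `readAbsolute` uses the Java 13+ absolute bulk get, and `readViaDuplicate` shows one way to get the same result on older Java versions with a temporary view of the buffer (here `duplicate()` is assumed as the temporary view; a slice would work similarly).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch, not Lucene code.
final class AbsoluteBulkGetSketch {

  // Java 13+: absolute bulk get; the buffer's own position is left untouched.
  static void readAbsolute(ByteBuffer buffer, int pos, byte[] dst, int offset, int len) {
    buffer.get(pos, dst, offset, len);
  }

  // Older Java versions: position a temporary duplicate and use the relative bulk get.
  // The duplicate shares content with the original but has an independent position.
  static void readViaDuplicate(ByteBuffer buffer, int pos, byte[] dst, int offset, int len) {
    ByteBuffer tmp = buffer.duplicate();
    tmp.position(pos);
    tmp.get(dst, offset, len);
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.wrap(new byte[] {10, 20, 30, 40, 50});
    byte[] a = new byte[3];
    byte[] b = new byte[3];
    readAbsolute(buffer, 1, a, 0, 3);
    readViaDuplicate(buffer, 1, b, 0, 3);
    System.out.println(Arrays.toString(a) + " " + Arrays.toString(b));
    System.out.println("position after reads: " + buffer.position());
  }
}
```

The workaround allocates a small view object per call, which may matter on a hot path; that cost is the usual argument for preferring the absolute method where it is available.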
rmuir commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747044386
As far as the ARM goes, the fact it has only 128-bit SIMD is the limiting
factor.
For e.g. AVX-256, we use a 64-bit vector of 8 byte values -> a 128-bit vector of
8 short values ->
uschindler commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1747053947
@iverase, I think you have to move the changes entry to Lucene 10.
rmuir commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747066837
My recommendation: stop messing around with `byte` and start thinking about
the new 16-bit half-float support that is present in Java 21. Unfortunately the
half-float *vectorization*
iverase commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1747072284
>@iverase, I think you have to move the changes entry to Lucene 10.
I did it already in ba74da1
>I changed the Policeman Jenkins MMAP job back to Lucene Main branch. The
nex
uschindler commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747204954
Actually it is worse: Java 20 introduced conversion between short/float, but
we got neither a native `float16` datatype nor vector support. In short:
completely unusable. 🤮
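For readers unfamiliar with the conversion methods mentioned here, the following is a minimal sketch (hypothetical class and method names, requires Java 20 or later) of what the scalar API offers: `Float.floatToFloat16` and `Float.float16ToFloat` convert between `float` and a half-float bit pattern stored in a `short`. Every arithmetic step still happens in `float`, which illustrates the complaint that there is no native half-float type.

```java
// Hypothetical sketch, requires Java 20+; not Lucene code.
final class Float16RoundTripSketch {

  // Encodes a float as IEEE 754 binary16 bits and decodes it again.
  static float roundTrip(float value) {
    short half = Float.floatToFloat16(value);
    return Float.float16ToFloat(half);
  }

  // Scalar dot product over half-float bit patterns: each element is widened
  // to float before the multiply, since there is no half-float arithmetic.
  static float dotProduct(short[] a, short[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += Float.float16ToFloat(a[i]) * Float.float16ToFloat(b[i]);
    }
    return sum;
  }

  public static void main(String[] args) {
    short[] a = {Float.floatToFloat16(1f), Float.floatToFloat16(2f)};
    short[] b = {Float.floatToFloat16(3f), Float.floatToFloat16(4f)};
    System.out.println(dotProduct(a, b));
    System.out.println(roundTrip(0.1f));
  }
}
```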
uschindler commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747206287
See https://github.com/openjdk/jdk/pull/9422 (Java 20)
robertvanwinkle1138 commented on issue #12615:
URL: https://github.com/apache/lucene/issues/12615#issuecomment-1747228177
@benwtrent
For merges there is "FreshDiskANN: A Fast and Accurate Graph-Based
ANN Index for Streaming Similarity Search"
https://arxiv.org/pdf/2105.09613.pdf
benwtrent commented on issue #12615:
URL: https://github.com/apache/lucene/issues/12615#issuecomment-1747298348
> DiskANN is known to be slower at indexing than HNSW and the blog post does
not compare single threaded index times with Lucene.
@robertvanwinkle1138 this is just one of my
gf2121 commented on code in PR #12610:
URL: https://github.com/apache/lucene/pull/12610#discussion_r1346202698
##
lucene/core/src/java/org/apache/lucene/util/bkd/MutablePointTreeReaderUtils.java:
##
@@ -81,6 +86,40 @@ protected int byteAt(int i, int k) {
return (reade
gf2121 commented on code in PR #12610:
URL: https://github.com/apache/lucene/pull/12610#discussion_r1346210779
##
lucene/core/src/java/org/apache/lucene/util/bkd/MutablePointTreeReaderUtils.java:
##
@@ -81,6 +86,40 @@ protected int byteAt(int i, int k) {
return (reade
jmazanec15 commented on issue #12615:
URL: https://github.com/apache/lucene/issues/12615#issuecomment-1747329967
A hybrid disk-memory algorithm would have very strong benefits. I did run a
few tests recently that confirmed HNSW does not function very well when memory
gets constrained (which
benwtrent commented on issue #12615:
URL: https://github.com/apache/lucene/issues/12615#issuecomment-1747350135
@jmazanec15 I agree that SPANN seems more attractive. I would argue though
we don't need to do clustering (in the paper they do clustering, but with
minimal effectiveness), but co
gf2121 opened a new pull request, #12623:
URL: https://github.com/apache/lucene/pull/12623
### Description
Since `StableMSBRadixSorter` always requires `O(n)` extra memory, we can use
a `MergeSorter` that takes advantage of the extra memory instead of
`InPlaceMergeSorter`.
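To make the trade-off concrete, here is a generic sketch of a stable merge sort that merges through an `O(n)` scratch buffer. It is plain Java on `int[]` with hypothetical names and is not the `MergeSorter` from this PR; it only illustrates why having the extra memory already available lets the merge step run in linear time, which an in-place merge cannot do as simply.

```java
import java.util.Arrays;

// Hypothetical sketch of a scratch-buffer merge sort; not Lucene's implementation.
final class ScratchMergeSortSketch {

  static void sort(int[] a) {
    int[] scratch = new int[a.length]; // the O(n) extra memory
    sort(a, scratch, 0, a.length);
  }

  private static void sort(int[] a, int[] scratch, int from, int to) {
    if (to - from < 2) {
      return; // zero or one element is already sorted
    }
    int mid = (from + to) >>> 1;
    sort(a, scratch, from, mid);
    sort(a, scratch, mid, to);
    merge(a, scratch, from, mid, to);
  }

  // Linear-time merge: copy the range to scratch, then write back in order.
  private static void merge(int[] a, int[] scratch, int from, int mid, int to) {
    System.arraycopy(a, from, scratch, from, to - from);
    int i = from, j = mid, k = from;
    while (i < mid && j < to) {
      // Take from the right run only when strictly smaller, so ties keep
      // their original order (stability).
      a[k++] = scratch[j] < scratch[i] ? scratch[j++] : scratch[i++];
    }
    while (i < mid) {
      a[k++] = scratch[i++];
    }
    while (j < to) {
      a[k++] = scratch[j++];
    }
  }

  public static void main(String[] args) {
    int[] values = {5, 1, 4, 1, 3};
    sort(values);
    System.out.println(Arrays.toString(values));
  }
}
```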
### Benc
dungba88 opened a new pull request, #12624:
URL: https://github.com/apache/lucene/pull/12624
### Description
Refactor the method in `BytesStore` needed for FST construction to an
abstract class and allow it to be passed from `FSTCompiler.Builder`. The
Builder will still maintain `byt
risdenk merged PR #2677:
URL: https://github.com/apache/lucene-solr/pull/2677
rmuir commented on issue #12621:
URL: https://github.com/apache/lucene/issues/12621#issuecomment-1747775200
> Actually it is worse: Java 20 introduced conversion between short/float,
but we got neither a native `float16` datatype nor vector support. In short:
completely unusable.
We
risdenk opened a new pull request, #2678:
URL: https://github.com/apache/lucene-solr/pull/2678
Backport SOLR-17004
risdenk merged PR #2678:
URL: https://github.com/apache/lucene-solr/pull/2678
dungba88 commented on issue #12543:
URL: https://github.com/apache/lucene/issues/12543#issuecomment-1748001631
I put together a PR at https://github.com/apache/lucene/pull/12624.
I also verified with a custom dictionary (~1MB in size) that position does
not go backward to previously w