gf2121 merged PR #12604:
URL: https://github.com/apache/lucene/pull/12604
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apac
gf2121 closed issue #12598: FST#Compiler allocates too much memory
URL: https://github.com/apache/lucene/issues/12598
gf2121 opened a new issue, #12620:
URL: https://github.com/apache/lucene/issues/12620
### Description
> We also should really explore the TODO above to write vLong in opposite
byte order -- this might save quite a bit of storage in the FST since outputs
would share more prefixes. Aga
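The byte-order idea quoted above can be illustrated with a rough sketch. The issue text is cut off here, so this assumes that "opposite byte order" means writing the most significant 7-bit group first; that reading is an assumption. `VLongOrder`, `lowFirst`, `highFirst`, and `commonPrefix` are hypothetical names used only for illustration, not Lucene APIs. `lowFirst` mirrors the low-group-first layout that `DataOutput#writeVLong` produces.

```java
import java.util.Arrays;

/** Illustration only: a hypothetical helper, not part of Lucene. */
final class VLongOrder {

  /** Low-order 7-bit group first; every byte except the last has its high bit set. */
  static byte[] lowFirst(long v) {
    if (v < 0) throw new IllegalArgumentException("negative values not supported: " + v);
    byte[] buf = new byte[9]; // a non-negative long needs at most 9 groups of 7 bits
    int n = 0;
    while ((v & ~0x7FL) != 0) {
      buf[n++] = (byte) ((v & 0x7FL) | 0x80L);
      v >>>= 7;
    }
    buf[n++] = (byte) v;
    return Arrays.copyOf(buf, n);
  }

  /** Assumed "opposite" order: most significant 7-bit group first. */
  static byte[] highFirst(long v) {
    if (v < 0) throw new IllegalArgumentException("negative values not supported: " + v);
    int groups = 1;
    for (long t = v >>> 7; t != 0; t >>>= 7) groups++;
    byte[] out = new byte[groups];
    for (int i = 0; i < groups; i++) {
      int shift = 7 * (groups - 1 - i);
      int b = (int) ((v >>> shift) & 0x7FL);
      if (i < groups - 1) b |= 0x80; // continuation bit on all but the last byte
      out[i] = (byte) b;
    }
    return out;
  }

  /** Number of leading bytes that two encodings have in common. */
  static int commonPrefix(byte[] a, byte[] b) {
    int i = 0;
    while (i < a.length && i < b.length && a[i] == b[i]) i++;
    return i;
  }
}
```

For the neighbouring values 1_000_000 and 1_000_001, the low-first encodings already differ in their first byte, while the high-first encodings share their first two bytes. That is the kind of prefix sharing the quote hopes for; whether it saves storage in a real FST is not something this sketch measures.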
gf2121 opened a new issue, #12619:
URL: https://github.com/apache/lucene/issues/12619
### Description
> Too bad we don't have a writer that uses tiny (like 8 bytes) block at
first, but doubles size for each new block (16 bytes, 32 bytes next, etc.).
Then we would naturally use log(si
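A minimal sketch of the doubling-block idea quoted above, assuming an append-only store with single-byte writes and random-access reads. `DoublingBlockStore` and its methods are hypothetical names for illustration; this is not Lucene's `BytesStore` and not a proposed implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: blocks of 8, 16, 32, ... bytes, allocated on demand. */
final class DoublingBlockStore {
  private static final int FIRST_BLOCK_SIZE = 8; // the "tiny" first block from the quote

  private final List<byte[]> blocks = new ArrayList<>();
  private int upto; // write position inside the last block
  private long size; // total number of bytes written

  void writeByte(byte b) {
    if (blocks.isEmpty() || upto == blocks.get(blocks.size() - 1).length) {
      // Each new block is twice as large as the previous one.
      // multiplyExact throws instead of silently overflowing for huge blocks.
      int nextSize =
          blocks.isEmpty()
              ? FIRST_BLOCK_SIZE
              : Math.multiplyExact(blocks.get(blocks.size() - 1).length, 2);
      blocks.add(new byte[nextSize]);
      upto = 0;
    }
    blocks.get(blocks.size() - 1)[upto++] = b;
    size++;
  }

  byte readByte(long pos) {
    if (pos < 0 || pos >= size) throw new IndexOutOfBoundsException("pos=" + pos);
    // Block k holds 8 * 2^k bytes and starts at offset 8 * (2^k - 1),
    // so k is floor(log2(pos / 8 + 1)).
    long q = pos / FIRST_BLOCK_SIZE + 1;
    int block = 63 - Long.numberOfLeadingZeros(q);
    long blockStart = (long) FIRST_BLOCK_SIZE * ((1L << block) - 1);
    return blocks.get(block)[(int) (pos - blockStart)];
  }

  int blockCount() {
    return blocks.size();
  }

  long size() {
    return size;
  }
}
```

After 100 single-byte writes this store holds 4 blocks (8 + 16 + 32 + 64 = 120 bytes of capacity), so the block count grows logarithmically with the number of bytes written while the unused tail stays smaller than the data already stored.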
tveasey commented on PR #12582:
URL: https://github.com/apache/lucene/pull/12582#issuecomment-1745806230
I saw this go by since I’m mentioned on the PR. It seems like Java can’t lay out
byte vectors properly:
https://stackoverflow.com/questions/14531235/in-java-is-it-more-efficient-to-use-byte-o
benwtrent commented on PR #12582:
URL: https://github.com/apache/lucene/pull/12582#issuecomment-1745753817
I was doing some performance testing and was getting weird results.
Quantization search and indexing build were marginally better or exactly the
same.
Attached a zip of async-p
zhaih closed issue #11022: Stop sorting determinize powersets unnecessarily
[LUCENE-9983]
URL: https://github.com/apache/lucene/issues/11022
risdenk merged PR #12618:
URL: https://github.com/apache/lucene/pull/12618
mikemccand commented on code in PR #12604:
URL: https://github.com/apache/lucene/pull/12604#discussion_r1344628522
##
lucene/CHANGES.txt:
##
@@ -163,6 +163,9 @@ Optimizations
* GITHUB#12382: Faster top-level conjunctions on term queries when sorting by
descending score. (Adr
jimczi commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1344560786
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
risdenk opened a new pull request, #12618:
URL: https://github.com/apache/lucene/pull/12618
### Description
jar-checks.gradle can go into an infinite loop if there are dependencies
that could be circular. In Solr, grpc-utils has a compile dependency on
grpc-core and grpc-core has a r
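The general failure mode described above, a dependency walk that never terminates on a cycle, is usually avoided by remembering which nodes have already been visited. The Java below is only a sketch of that general technique. It is not the change made in this pull request (`jar-checks.gradle` is a Gradle script), and `DependencyWalk`, `reachable`, and the tiny graph in the usage note are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: cycle-safe traversal of a dependency graph. */
final class DependencyWalk {

  /** Returns every module reachable from root, including root itself. */
  static Set<String> reachable(String root, Map<String, List<String>> deps) {
    Set<String> visited = new LinkedHashSet<>();
    Deque<String> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
      String current = stack.pop();
      if (!visited.add(current)) {
        continue; // already processed: this check is what stops a cycle from looping forever
      }
      for (String dep : deps.getOrDefault(current, List.of())) {
        stack.push(dep);
      }
    }
    return visited;
  }
}
```

With a made-up graph in which `a` depends on `b` and `b` depends on both `a` and `c`, `DependencyWalk.reachable("a", deps)` terminates and returns the three modules `a`, `b`, and `c`.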
benwtrent merged PR #12616:
URL: https://github.com/apache/lucene/pull/12616
javanna commented on PR #12523:
URL: https://github.com/apache/lucene/pull/12523#issuecomment-1745484064
This looks great, thanks @quux00! Could you add an entry to the
lucene/CHANGES.txt file under Lucene 9.9 please?
benwtrent commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1344514851
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (A
benwtrent commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1344513453
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (A
mikemccand commented on issue #12543:
URL: https://github.com/apache/lucene/issues/12543#issuecomment-1745389156
@dungba88 raised a good point -- FST construction also needs to read prior
bytes it wrote even as it is appending new bytes to the end of the file.
Lucene's IndexInput/Outp
gf2121 commented on PR #12604:
URL: https://github.com/apache/lucene/pull/12604#issuecomment-1745354257
Thanks for all the reviews and suggestions here!
> @mikemccand maybe we can trade off here between segments we write the first
time, i.e. through IW, and segments we write caused by a merge?
quux00 commented on code in PR #12523:
URL: https://github.com/apache/lucene/pull/12523#discussion_r1344356526
##
lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java:
##
@@ -267,7 +265,132 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
return slice
yugushihuang opened a new issue, #12617:
URL: https://github.com/apache/lucene/issues/12617
### Description
When we build TermStates, we pass a flag `needStats` to determine whether we want to
front-load all term statistics. However, we do not have an easy way to know
if this flag has be
quux00 commented on code in PR #12523:
URL: https://github.com/apache/lucene/pull/12523#discussion_r1344264569
##
lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java:
##
@@ -267,7 +266,133 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
return slice
iverase commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1745118236
I am just trying to have an abstraction that can replace the BytesRef output
for binary doc values with something that does not impose the internal
representation of the bytes like Bytes
mikemccand commented on issue #12543:
URL: https://github.com/apache/lucene/issues/12543#issuecomment-1745049853
Copying another comment from #10520:
> Maybe we could first allow FSTCompiler to specify its own DataOutput even
when building the tree on-the-fly, instead of always relyin
romseygeek merged PR #12614:
URL: https://github.com/apache/lucene/pull/12614
mikemccand commented on issue #10520:
URL: https://github.com/apache/lucene/issues/10520#issuecomment-1745017914
> Maybe we could first allow FSTCompiler to specify its own DataOutput even
when building the tree on-the-fly, instead of always relying on BytesStore? And
we are free to choose
uschindler commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1744981658
In general, it is still questionable to me whether we really need a bulk random
access byte[] reader. I partly agree with this, but if somebody asks for
float[] or long[] bulk reads with
uschindler commented on PR #12600:
URL: https://github.com/apache/lucene/pull/12600#issuecomment-1744979771
I changed the Linux MMAP job
(https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/) to use your branch.
Please wait a bit until the checker is happy, as it tries all different Java ver
uschindler commented on code in PR #12600:
URL: https://github.com/apache/lucene/pull/12600#discussion_r1344093638
##
lucene/core/src/java19/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -168,6 +168,28 @@ private void readBytesBoundary(byte[] b, int offset, int
le
benwtrent opened a new pull request, #12616:
URL: https://github.com/apache/lucene/pull/12616
This is a minor refactor of HNSW graph merging logic.
Instead of directly checking the KnnVectorReader version, this commit
adjusts the logic to see if a specific interface is satisfied for r
mikemccand commented on issue #12543:
URL: https://github.com/apache/lucene/issues/12543#issuecomment-1744945237
Copying the comment from #10520 that's really about this issue:
> I'm planning to refactor the BytesStore into an interface that can be
chosen from the FST builder. And one
mikemccand commented on issue #10520:
URL: https://github.com/apache/lucene/issues/10520#issuecomment-1744944312
> I'm planning to refactor the BytesStore into an interface that can be
chosen from the FST builder. And one can decide whether on-heap or off-heap or
on-heap without blocks is b
benwtrent commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1344049288
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (A
benwtrent commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1344048515
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (A
jimczi commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1344026176
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
benwtrent commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1344010395
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (A
jimczi commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1343976622
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
benwtrent commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1343965288
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (A
dungba88 commented on issue #10520:
URL: https://github.com/apache/lucene/issues/10520#issuecomment-1744786263
I'm planning to refactor the BytesStore into an interface that can be chosen
from the FST builder. And one can decide whether on-heap or off-heap or on-heap
without blocks is best
jimczi commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1343896112
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -0,0 +1,1170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
benwtrent commented on issue #12615:
URL: https://github.com/apache/lucene/issues/12615#issuecomment-1744763221
I do think Lucene's read-only, segment-based architecture lends itself to
supporting quantization (required for DiskANN).
It would be an interesting experiment to see how index
gtroitskiy commented on PR #12614:
URL: https://github.com/apache/lucene/pull/12614#issuecomment-1744720246
Thanks for reviewing! I ran tidy and did some refactoring.
mikemccand opened a new issue, #12615:
URL: https://github.com/apache/lucene/issues/12615
### Description
I came across this compelling-sounding [JVector
project](https://foojay.io/today/jvector-1-0/) which looks to have awesome QPS
performance.
It uses
[DiskANN](https://www.
s1monw commented on PR #12604:
URL: https://github.com/apache/lucene/pull/12604#issuecomment-1744672419
@mikemccand maybe we can trade off here between segments we write the first
time, i.e. through IW, and segments we write caused by a merge? It might mitigate
your concerns.
javanna commented on code in PR #12523:
URL: https://github.com/apache/lucene/pull/12523#discussion_r1343870579
##
lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java:
##
@@ -267,7 +265,132 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
return slic
mikemccand commented on code in PR #12604:
URL: https://github.com/apache/lucene/pull/12604#discussion_r1343852045
##
lucene/CHANGES.txt:
##
@@ -163,6 +163,8 @@ Optimizations
* GITHUB#12382: Faster top-level conjunctions on term queries when sorting by
descending score. (Adr
mikemccand commented on issue #12598:
URL: https://github.com/apache/lucene/issues/12598#issuecomment-1744644275
Thanks @gf2121 -- this is a great discovery (and thank you
https://blunders.io for the [awesome integrated profiling in Lucene's nightly
benchmarks](https://blunders.io/posts/luc
jpountz commented on PR #12382:
URL: https://github.com/apache/lucene/pull/12382#issuecomment-174462
I just pushed a fix:
https://github.com/apache/lucene/commit/3f81f2f315745f86de3b516d53bf02fde61015a3.
romseygeek commented on PR #12614:
URL: https://github.com/apache/lucene/pull/12614#issuecomment-1744588907
Can you run `./gradlew tidy` at the root of the project to make sure the
formatting is all correct?
romseygeek commented on code in PR #12614:
URL: https://github.com/apache/lucene/pull/12614#discussion_r1343791876
##
lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java:
##
@@ -385,7 +385,9 @@ public void clearQuery(Query query) {
private void onEviction(Query
javanna commented on code in PR #12523:
URL: https://github.com/apache/lucene/pull/12523#discussion_r1343776898
##
lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java:
##
@@ -267,7 +266,133 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
return slic
javanna commented on code in PR #12523:
URL: https://github.com/apache/lucene/pull/12523#discussion_r1343749320
##
lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java:
##
@@ -267,7 +266,133 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
return slic
javanna merged PR #12606:
URL: https://github.com/apache/lucene/pull/12606