iverase commented on PR #13550:
URL: https://github.com/apache/lucene/pull/13550#issuecomment-2216740642
@jpountz I added a new test and changed the title and description of the
issue, as we don't need to add a new Codec.
--
This is an automated message from the Apache Git Service.
To respo
iverase commented on code in PR #13550:
URL: https://github.com/apache/lucene/pull/13550#discussion_r1669846808
##
lucene/test-framework/src/java/org/apache/lucene/tests/codecs/skipper/SkipperCodec.java:
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
iverase commented on code in PR #13550:
URL: https://github.com/apache/lucene/pull/13550#discussion_r1669846499
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseDocValuesFormatTestCase.java:
##
@@ -157,6 +158,7 @@ public void testNumberMergeAwayAllValuesWithSk
vsop-479 commented on PR #13192:
URL: https://github.com/apache/lucene/pull/13192#issuecomment-2216245652
Or we could just supply this (maybe only for `non-allEqual` leaf blocks) as an
option, so users can use it when their applications are not busy.
hossman commented on issue #12100:
URL: https://github.com/apache/lucene/issues/12100#issuecomment-2215985880
I realized today that I had been working on branch_9x, so I've updated the
patch to apply cleanly to main
[WordBreakSpellChecker.breadthfirst.GH-12100.patch.txt](https://gith
github-actions[bot] commented on PR #13359:
URL: https://github.com/apache/lucene/pull/13359#issuecomment-2215694348
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
benwtrent commented on PR #13553:
URL: https://github.com/apache/lucene/pull/13553#issuecomment-2215645063
@gautamworah96 @msokolov this might be part of the reason for the OOMs, the
estimates were completely ignoring the float[] vector sizes for fieldwriters 🤦.
I plan on iterating on this
benwtrent commented on PR #13538:
URL: https://github.com/apache/lucene/pull/13538#issuecomment-2215637537
Related: https://github.com/apache/lucene/pull/13553
benwtrent commented on code in PR #13553:
URL: https://github.com/apache/lucene/pull/13553#discussion_r1669465330
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsWriter.java:
##
@@ -299,9 +299,7 @@ public void finish() throws IOException {
benwtrent commented on code in PR #13553:
URL: https://github.com/apache/lucene/pull/13553#discussion_r1669464616
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -172,9 +172,6 @@ public void finish() throws IOException {
public
benwtrent opened a new pull request, #13553:
URL: https://github.com/apache/lucene/pull/13553
I still need to write a test, but wanted to open this PR early.
Scalar Quantized vector writer RAM usage estimates completely ignore the
raw float vectors. Meaning, if you have flush based o
uschindler commented on PR #13545:
URL: https://github.com/apache/lucene/pull/13545#issuecomment-2215482308
I reverted the addition of the file to 9.x branch:
86d080a4e0b4e53e0c9a3f2e2b120bff204c7276
MilindShyani commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2215327156
@benwtrent Apologies for the late response! I am traveling (and marveling
at 2000-year-old pyramids) right now. The transformation you wrote indeed matches
mine. Thinking about th
benwtrent commented on issue #12440:
URL: https://github.com/apache/lucene/issues/12440#issuecomment-2215292611
I had another idea, I wonder if we can initialize HNSW via coarse grained
clusters. Depending on the clustering algorithm used, we can use clusters built
from various segments to
zhaih merged PR #13548:
URL: https://github.com/apache/lucene/pull/13548
stefanvodita commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1669100332
##
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##
@@ -328,42 +336,65 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
/** Static method to
uschindler commented on issue #13551:
URL: https://github.com/apache/lucene/issues/13551#issuecomment-2214871854
> * it should be a one-liner using `setPreload` to preload "*.vec" if we
wanted to do it either from FSDirectory.open or by default in MMapDirectory
It is trivial:
```ja
uschindler commented on issue #13551:
URL: https://github.com/apache/lucene/issues/13551#issuecomment-2214868089
I don't think we should change anything here in MMapDirectory. It is all
available and easy to do for anyone who wants to do this. Elasticsearch is doing
this for some files, but w
uschindler commented on code in PR #13545:
URL: https://github.com/apache/lucene/pull/13545#discussion_r1669064660
##
lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:
##
@@ -212,6 +212,14 @@ public static int int4DotProductPacked(byte[] unpacked,
byte[] packed) {
uschindler commented on PR #13545:
URL: https://github.com/apache/lucene/pull/13545#issuecomment-2214846241
See:
https://github.com/apache/lucene/commit/c8b4a76ecc93a98c779364b18f62c9b67552c192#diff-dd8d7417893f9b2fecaef29491b94d5daeaae6d496c4b21bb9633b4f7b060e59
uschindler commented on PR #13545:
URL: https://github.com/apache/lucene/pull/13545#issuecomment-2214845553
Hi,
in the backport to 9.x the benchmark file was wrongly merged. It landed in
the test directory. In 9.x we have no benchmark-jmh module in Gradle, so the
file should have been le
aoli-al opened a new issue, #13552:
URL: https://github.com/apache/lucene/issues/13552
### Description
I saw a flaky test,
`TestIndexWriterWithThreads#testIOExceptionDuringWriteSegmentWithThreadsOnlyOnce`
caused by concurrency issues recently:
```
MockDirectoryWrapper: can
zhaih commented on code in PR #13548:
URL: https://github.com/apache/lucene/pull/13548#discussion_r1669056420
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java:
##
@@ -139,6 +142,54 @@ public int nextDoc() throws IOException {
}
}
+ /**
+ * Give
zhaih commented on code in PR #13548:
URL: https://github.com/apache/lucene/pull/13548#discussion_r1669054287
##
lucene/core/src/java/org/apache/lucene/index/FieldUpdatesBuffer.java:
##
@@ -356,7 +356,7 @@ BufferedUpdate next() throws IOException {
}
}
-BytesRe
mikemccand commented on issue #13551:
URL: https://github.com/apache/lucene/issues/13551#issuecomment-2214828022
Oh sorry I used the wrong term (thank you @rmuir for clarifying!): it's not
a swap storm I'm seeing, it's a page storm. The OS has plenty of free ram
(reported by `free`), and t
benwtrent commented on code in PR #13548:
URL: https://github.com/apache/lucene/pull/13548#discussion_r1669043121
##
lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java:
##
@@ -139,6 +142,54 @@ public int nextDoc() throws IOException {
}
}
+ /**
+ *
msokolov commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1668890994
##
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##
@@ -328,42 +336,65 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
/** Static method to seg
msokolov commented on PR #13542:
URL: https://github.com/apache/lucene/pull/13542#issuecomment-2214811101
I wonder if we should tackle the issue with caching / cloning scorers? We
have scorers/scorerSuppliers that do a lot of up-front work when created and we
don't want to duplicate that wo
jpountz commented on PR #13542:
URL: https://github.com/apache/lucene/pull/13542#issuecomment-2214643611
> The change in expectation should be reflected in the Collector API
semantics though (rather than CollectorManager?), is that what you meant?
I was referring to `CollectorManager
ChrisHegarty merged PR #13545:
URL: https://github.com/apache/lucene/pull/13545
benwtrent commented on issue #13519:
URL: https://github.com/apache/lucene/issues/13519#issuecomment-2214591883
@jbhateja could you unpack how this would actually work when using
dot-product with linear scale corrections?
I would imagine we could switch to an "unsigned byte" compariso
jpountz commented on issue #13551:
URL: https://github.com/apache/lucene/issues/13551#issuecomment-2214541782
It wouldn't solve the issue, only mitigate it, but hopefully cold start
performance gets better when we start leveraging `IndexInput#prefetch` to load
multiple vectors from disk con
rmuir commented on issue #13551:
URL: https://github.com/apache/lucene/issues/13551#issuecomment-2214496576
> It's not easy to do -- you wouldn't know up front that the application
will do KNN searching at all. And, maybe only certain vectors in the `.vec`
will ever be accessed and so you n
jpountz commented on code in PR #13550:
URL: https://github.com/apache/lucene/pull/13550#discussion_r1668890679
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseDocValuesFormatTestCase.java:
##
@@ -157,6 +158,7 @@ public void testNumberMergeAwayAllValuesWithSk
mikemccand opened a new issue, #13551:
URL: https://github.com/apache/lucene/issues/13551
### Description
This is really a "discussion" issue. I'm not sure at all that the idea is
feasible:
I've been testing `luceneutil` with heavy KNN indexing (Cohere wikipedia
`en` 768 dime
original-brownbear commented on PR #13472:
URL: https://github.com/apache/lucene/pull/13472#issuecomment-2213989662
Sure thing, on it! :) Sorry, could've done that right away; tired me just
didn't realise it this morning.
uschindler commented on PR #13535:
URL: https://github.com/apache/lucene/pull/13535#issuecomment-2213707284
There are some test failures due to strict thread checking. I think the mock
input should only do this when it's in confined mode.
ChrisHegarty commented on PR #13535:
URL: https://github.com/apache/lucene/pull/13535#issuecomment-2213644712
Thanks for the comments so far. I updated the PR to only check same-thread
semantics for MSII clone and slice. And also added some basic thread checks to
MockIndexInputWrapper. I
javanna commented on PR #13542:
URL: https://github.com/apache/lucene/pull/13542#issuecomment-2213602344
Bulk reply to some of the feedback I got:
hi @shubhamvishu ,
> I know it might be too early to ask (as changes are not yet consolidated),
but curious if we have any early be
jpountz merged PR #13543:
URL: https://github.com/apache/lucene/pull/13543
jpountz commented on code in PR #13543:
URL: https://github.com/apache/lucene/pull/13543#discussion_r1668261240
##
lucene/core/src/java/org/apache/lucene/store/OutputStreamIndexOutput.java:
##
@@ -135,5 +135,19 @@ void writeLong(long i) throws IOException {
BitUtil.VH_LE_
jpountz merged PR #13544:
URL: https://github.com/apache/lucene/pull/13544
jpountz merged PR #13546:
URL: https://github.com/apache/lucene/pull/13546
javanna commented on code in PR #13542:
URL: https://github.com/apache/lucene/pull/13542#discussion_r1668225258
##
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##
@@ -328,42 +336,65 @@ protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
/** Static method to segr
vsop-479 commented on PR #13192:
URL: https://github.com/apache/lucene/pull/13192#issuecomment-2213224505
> This many new allocations
Maybe we can share these allocations (`suffixes`, `positions`, `positions`)
from `searchers`, since they are just immutable and non-stateful data.
jpountz commented on PR #13472:
URL: https://github.com/apache/lucene/pull/13472#issuecomment-2213213760
I just pushed an annotation for this change:
https://github.com/mikemccand/luceneutil/commit/a64ac17a9d1a935649837990f2accbace0b93262.
Several queries got a bit faster with a low p
jpountz commented on PR #13472:
URL: https://github.com/apache/lucene/pull/13472#issuecomment-2213200920
@original-brownbear Would you like to work on a PR?