[I] TestKnnFloatVectorQuery.testScoreNegativeDotProduct failing seed [lucene]

2024-12-10 Thread via GitHub
msokolov opened a new issue, #14051: URL: https://github.com/apache/lucene/issues/14051 ### Description FAILED: org.apache.lucene.search.TestKnnFloatVectorQuery.testScoreNegativeDotProduct Error Message: java.lang.AssertionError: expected:<0.0> but was:<0.5> Stack Tr

Re: [PR] lucene-monitor: make abstract DocumentBatch public [lucene]

2024-12-10 Thread via GitHub
github-actions[bot] commented on PR #13993: URL: https://github.com/apache/lucene/pull/13993#issuecomment-257824 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] IndexInput.isLoaded seems to return false for mmap index inputs on Windows [lucene]

2024-12-10 Thread via GitHub
dweiss merged PR #14053: URL: https://github.com/apache/lucene/pull/14053 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apac

Re: [I] IndexInput.isLoaded seems to return false for mmap index inputs on Windows [lucene]

2024-12-10 Thread via GitHub
dweiss closed issue #14050: IndexInput.isLoaded seems to return false for mmap index inputs on Windows URL: https://github.com/apache/lucene/issues/14050

Re: [PR] Speed up advancing on the disjunction iterator. [lucene]

2024-12-10 Thread via GitHub
jpountz commented on PR #14052: URL: https://github.com/apache/lucene/pull/14052#issuecomment-2531882582 `luceneutil` suggests that this change gives a small slowdown when a `DisjunctionDISIApproximation` leads iteration (`AndHighOrMedMed`, `CombinedOrHighMed`, `CombinedAndHighMed`, `Combin

[PR] Speed up advancing on the disjunction iterator. [lucene]

2024-12-10 Thread via GitHub
jpountz opened a new pull request, #14052: URL: https://github.com/apache/lucene/pull/14052 Currently, the disjunction iterator puts all clauses in a heap in order to be able to merge doc IDs in a streaming fashion. This is a good approach for exhaustive evaluation, when only one clause mov

Re: [PR] Add github on-commit tests on MacOS and Windows [lucene]

2024-12-10 Thread via GitHub
dweiss commented on PR #14054: URL: https://github.com/apache/lucene/pull/14054#issuecomment-2532191067 The failing check on Windows is due to #14053 - shows it's working. ;)

[PR] Add github on-commit tests on MacOS and Windows [lucene]

2024-12-10 Thread via GitHub
dweiss opened a new pull request, #14054: URL: https://github.com/apache/lucene/pull/14054 There are some low-level APIs used now that may be surprising and behave differently on different platforms. I suggest we enable MacOS and Windows builds to have a broader coverage and earlier feedbac

Re: [PR] Add github on-commit tests on MacOS and Windows [lucene]

2024-12-10 Thread via GitHub
dweiss merged PR #14054: URL: https://github.com/apache/lucene/pull/14054

Re: [PR] Refactor dummy scorables. [lucene]

2024-12-10 Thread via GitHub
jpountz commented on PR #14046: URL: https://github.com/apache/lucene/pull/14046#issuecomment-2531960537 There was a regression on nightly benchmarks last night (https://benchmarks.mikemccandless.com/2024.12.09.18.05.31.html) that is either due to this PR or to newly added tasks (`CountOrMa

Re: [PR] Allow reading binary doc values as a RandomAccessInput [lucene]

2024-12-10 Thread via GitHub
iverase commented on code in PR #13948: URL: https://github.com/apache/lucene/pull/13948#discussion_r1877891198 ## lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextDocValuesReader.java: ## @@ -383,17 +392,31 @@ public long cost() { @Override p

Re: [PR] Allow reading binary doc values as a RandomAccessInput [lucene]

2024-12-10 Thread via GitHub
iverase commented on code in PR #13948: URL: https://github.com/apache/lucene/pull/13948#discussion_r1877892464 ## lucene/core/src/java/org/apache/lucene/store/RandomAccessInputDataInput.java: ## @@ -0,0 +1,103 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

Re: [PR] Allow reading binary doc values as a RandomAccessInput [lucene]

2024-12-10 Thread via GitHub
iverase commented on code in PR #13948: URL: https://github.com/apache/lucene/pull/13948#discussion_r1877894071 ## lucene/core/src/java/org/apache/lucene/util/UnicodeUtil.java: ## @@ -627,35 +629,58 @@ public static String toHexString(String s) { } /** - * Interprets t

Re: [PR] Allow reading binary doc values as a RandomAccessInput [lucene]

2024-12-10 Thread via GitHub
iverase commented on code in PR #13948: URL: https://github.com/apache/lucene/pull/13948#discussion_r1877904767 ## lucene/core/src/java/org/apache/lucene/search/FieldComparator.java: ## @@ -234,7 +235,8 @@ public TermValComparator(int numHits, String field, boolean sortMissingL

Re: [I] IndexInput.isLoaded seems to return false for mmap index inputs on Windows [lucene]

2024-12-10 Thread via GitHub
uschindler commented on issue #14050: URL: https://github.com/apache/lucene/issues/14050#issuecomment-2531096386 Hi, I would do it like the following on Windows: - if windows returns `true`, `return Optional.of(Boolean.TRUE)` - if windows returns `false`, `return Optional.empty()`
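The two branches quoted in the message above can be sketched roughly as follows. This is only an illustration of the mapping described there, not the code that was merged: the class name `WindowsIsLoadedSketch`, the method `mapWindowsResult`, and its `reportedLoaded` parameter are hypothetical names introduced here, and the rest of the (truncated) message is not reflected.

```java
import java.util.Optional;

// Hypothetical sketch of the mapping suggested in the comment above;
// names are illustrative and do not come from the Lucene code base.
final class WindowsIsLoadedSketch {

    // reportedLoaded: the raw answer obtained on Windows (assumed input).
    // A "true" answer is passed through; a "false" answer is treated as
    // "unknown" and mapped to an empty Optional instead of Boolean.FALSE.
    static Optional<Boolean> mapWindowsResult(boolean reportedLoaded) {
        if (reportedLoaded) {
            return Optional.of(Boolean.TRUE);
        }
        return Optional.empty();
    }
}
```

Under this sketch, callers would see either a definite "loaded" answer or no answer at all, never a definite "not loaded".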

Re: [I] TestSoftDeletesDirectoryReaderWrapper.testAvoidWrappingReadersWithoutSoftDeletes AssertionError: expected:<5> but was:<3> [lucene]

2024-12-10 Thread via GitHub
cwperks commented on issue #14020: URL: https://github.com/apache/lucene/issues/14020#issuecomment-2532987711 I'm interested in contributing a fix for this test. Any pointers?

Re: [PR] Add a Better Binary Quantizer format for dense vectors [lucene]

2024-12-10 Thread via GitHub
tanyaroosta commented on PR #13651: URL: https://github.com/apache/lucene/pull/13651#issuecomment-2532887197 FYI, a blog post on RaBitQ: https://dev.to/gaoj0017/quantization-in-the-counterintuitive-high-dimensional-space-4feg

[I] Is there any API which two segments can be manually mergeed? [lucene]

2024-12-10 Thread via GitHub
junneyang opened a new issue, #14055: URL: https://github.com/apache/lucene/issues/14055 ### Description 1. The documentation says that segments can be merged automatically by policy. 2. If I need to merge two segments manually, is this supported? Thanks.