rmuir commented on issue #14235:
URL: https://github.com/apache/lucene/issues/14235#issuecomment-2657956964
This one looks to me like another dictionary bug. Unfortunately the current
options we have to "tolerate" such bugs don't work in this case, but perhaps
they can be improved.
T
kaivalnp commented on code in PR #14178:
URL: https://github.com/apache/lucene/pull/14178#discussion_r1955595482
##
lucene/sandbox/src/java22/org/apache/lucene/sandbox/codecs/faiss/LibFaissC.java:
##
@@ -0,0 +1,457 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) unde
kaivalnp commented on code in PR #14178:
URL: https://github.com/apache/lucene/pull/14178#discussion_r1955603323
##
lucene/sandbox/src/java22/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsReader.java:
##
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundatio
kaivalnp commented on code in PR #14178:
URL: https://github.com/apache/lucene/pull/14178#discussion_r1955600494
##
lucene/sandbox/src/java22/org/apache/lucene/sandbox/codecs/faiss/LibFaissC.java:
##
@@ -0,0 +1,457 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) unde
rmuir commented on issue #14235:
URL: https://github.com/apache/lucene/issues/14235#issuecomment-2658111400
The last one was like this, too: https://github.com/apache/lucene/pull/14079
I think people just often have trouble counting and that's why we see errors
around the counts, even wit
houserjohn opened a new pull request, #14238:
URL: https://github.com/apache/lucene/pull/14238
### Description
Adds additional unit tests to increase coverage of Dynamic Range Faceting.
- Adds tests for varying TopN values
- Adds test for inputs with the same weights
- Adds te
gf2121 commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2658082076
Comparison of VectorAPI(Baseline) and InnerLoop(Candidate)
```
Task    QPS baseline    StdDev    QPS my_modified_version    StdDev    Pct diff
rmuir commented on PR #14239:
URL: https://github.com/apache/lucene/pull/14239#issuecomment-2658093518
Using `mark()`/`reset()` like this can be an invitation for trouble, but the
situation is contained: the parser will always make forward progress so it
can't go crazy or infinite. Also, the
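
To illustrate the forward-progress argument in the comment above, here is a minimal, self-contained sketch of a `mark()`/`reset()` lookahead. It is not the code from the PR: the class `MarkResetSketch`, the helpers `consumeIfPrefix` and `parse`, and the token `"ab"` are hypothetical names chosen for the example. It assumes only the standard `java.io.BufferedReader` API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not taken from the Lucene code base.
final class MarkResetSketch {

  /** Consumes {@code prefix} if the input starts with it; otherwise rewinds to the mark. */
  static boolean consumeIfPrefix(BufferedReader in, String prefix) throws IOException {
    in.mark(prefix.length()); // remember the current position
    char[] buf = new char[prefix.length()];
    int n = 0;
    while (n < buf.length) {
      int r = in.read(buf, n, buf.length - n);
      if (r < 0) {
        break; // end of input before the whole prefix was read
      }
      n += r;
    }
    if (n == buf.length && prefix.equals(new String(buf))) {
      return true; // match: the prefix stays consumed
    }
    in.reset(); // no match: rewind, nothing is consumed
    return false;
  }

  /** Replaces every "ab" with 'X'; each loop iteration consumes at least one char or stops. */
  static String parse(String text) {
    try (BufferedReader in = new BufferedReader(new StringReader(text))) {
      StringBuilder out = new StringBuilder();
      while (true) {
        if (consumeIfPrefix(in, "ab")) {
          out.append('X');
          continue;
        }
        int c = in.read(); // lookahead failed, so consume exactly one character
        if (c < 0) {
          break;
        }
        out.append((char) c);
      }
      return out.toString();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("cabab")); // prints "cXX"
  }
}
```

Because a failed lookahead is always followed by an unconditional single-character read, the loop cannot rewind to the same position indefinitely, which is the sense in which such a situation is "contained".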
rmuir opened a new issue, #14230:
URL: https://github.com/apache/lucene/issues/14230
### Description
From CI when pushing:
```
TestByteVectorSimilarityQuery > testFallbackToExact FAILED
junit.framework.AssertionFailedError: Expected exception
UnsupportedOperationException
tteofili commented on code in PR #14094:
URL: https://github.com/apache/lucene/pull/14094#discussion_r1954405953
##
lucene/core/src/java/org/apache/lucene/search/HnswKnnCollector.java:
##
@@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
tteofili commented on PR #14094:
URL: https://github.com/apache/lucene/pull/14094#issuecomment-2656443339
updated results (Cohere 768 200k docs)
baseline
```
recall  latency (ms)  nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index s  index docs/s  num segments
tteofili commented on PR #14094:
URL: https://github.com/apache/lucene/pull/14094#issuecomment-2656446320
reference
[paper](https://cs.uwaterloo.ca/~jimmylin/publications/Teofili_Lin_ECIR2025.pdf)
--
This is an automated message from the Apache Git Service.
To respond to the message, plea
rmuir merged PR #14227:
URL: https://github.com/apache/lucene/pull/14227
rmuir closed issue #14224: TestOperations.testGetRandomAcceptedString failing
URL: https://github.com/apache/lucene/issues/14224
iverase commented on PR #14213:
URL: https://github.com/apache/lucene/pull/14213#issuecomment-2656318509
@Tim-Brooks Could you add an entry in CHANGES.txt? It should be under the
10.2 version, thanks!
benwtrent commented on issue #14230:
URL: https://github.com/apache/lucene/issues/14230#issuecomment-2656795520
Running with thousands of seeds, it will fail eventually on linux/macbook as
well.
stefanvodita commented on code in PR #14204:
URL: https://github.com/apache/lucene/pull/14204#discussion_r1954612647
##
lucene/facet/src/java/org/apache/lucene/facet/histogram/HistogramCollector.java:
##
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF)
benwtrent opened a new pull request, #14231:
URL: https://github.com/apache/lucene/pull/14231
Periodically, the similarity requested according to the desired matched docs
actually doesn't explore enough docs to fall back to exact.
Since the purpose of this test is to verify that falli
benwtrent merged PR #14223:
URL: https://github.com/apache/lucene/pull/14223
ChrisHegarty opened a new issue, #14229:
URL: https://github.com/apache/lucene/issues/14229
This issue has been filed to help facilitate and track a discussion relating
to bumping the minimum compile Java version.
Currently the minimum compile version is Java 21, for both active
deve
ChrisHegarty commented on PR #14131:
URL: https://github.com/apache/lucene/pull/14131#issuecomment-2656061680
> I think bumping main only for each non LTS release would be cool. Then we
keep it at the next LTS (Java 25)?
I filed the following issue to help facilitate the discussion re
rmuir merged PR #14218:
URL: https://github.com/apache/lucene/pull/14218
rmuir commented on PR #14228:
URL: https://github.com/apache/lucene/pull/14228#issuecomment-2656373080
The hunspell test failure is likely an upstream issue with LibreOffice
dictionaries. It happened to me recently with the Mongolian dictionaries.
Problems:
* the test doesn't execute e
msokolov commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657365313
You can tune it by changing the magic number 3 to a bigger number. I found
that with 15 I get slightly better recall and slightly lower latencies than the
baseline for my test case
--
benwtrent commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657364706
Ah, one other thought is that we are definitely scoring every seeded entry
point twice. Once when they are gathered during initial query phase, then later
through the seeded provisionin
jpountz commented on issue #14225:
URL: https://github.com/apache/lucene/issues/14225#issuecomment-2657354802
For what it's worth, I'm not a fan of throttling merges based on memory
usage. Merge throttling is already complicated the way it is, so I'm not too
excited about adding more constr
benwtrent opened a new pull request, #14232:
URL: https://github.com/apache/lucene/pull/14232
Ever since: https://github.com/apache/lucene/pull/14165
This test has been flaky. It fails as the number of clone calls during
indexing exceeds 500.
I tried only updating the merge sc
benwtrent closed issue #14175:
org.apache.lucene.search.TestKnnFloatVectorQuery.testFindFewer
ComparisonFailure: expected: but was:
URL: https://github.com/apache/lucene/issues/14175
benwtrent commented on PR #14232:
URL: https://github.com/apache/lucene/pull/14232#issuecomment-2657320964
@jpountz I tried that while reverting my other changes and it was still over
500 over many runs. I will just merge what I got. Thanks!
benwtrent merged PR #14232:
URL: https://github.com/apache/lucene/pull/14232
benwtrent closed issue #14220: TestForTooMuchCloning.test fails
URL: https://github.com/apache/lucene/issues/14220
jpountz commented on issue #14229:
URL: https://github.com/apache/lucene/issues/14229#issuecomment-2657325084
This sounds like a safe bet.
Vikasht34 commented on issue #14214:
URL: https://github.com/apache/lucene/issues/14214#issuecomment-2657428175
Interesting, let me quickly run those tests myself as well to see what the
impact would be. Thanks for the logs.
rmuir commented on issue #14235:
URL: https://github.com/apache/lucene/issues/14235#issuecomment-2657783923
@dweiss didn't mean for it to come as a complaint, i honestly have no ideas
how to improve it.
personally i would LIKE to see the failures on upstream dictionary updates
and ge
jpountz opened a new pull request, #14236:
URL: https://github.com/apache/lucene/pull/14236
`CombinedFieldQuery` is Lucene's most robust way of scoring across multiple
fields, let's move it to core and recommend using it to query multiple fields.
While moving the class, I modified the
dweiss commented on PR #14228:
URL: https://github.com/apache/lucene/pull/14228#issuecomment-2657770674
I'll create a separate issue for hunspell tests and take care of that
tomorrow, no worries.
benwtrent commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657612038
I am not 100% sure what's up with the behavior. However, I switched to `16`
(also happens to be the graph conn) instead of `3`.
It's interesting how visited is lower, but recall is
benwtrent merged PR #14160:
URL: https://github.com/apache/lucene/pull/14160
benwtrent closed issue #13940: Look into ACORN-1, or another algorithm to aid
in filtered HNSW search
URL: https://github.com/apache/lucene/issues/13940
jpountz commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2657293043
These results look even better than the results that you had previously
reported for the vector API, is my understanding correct that it performs even
better?
jpountz commented on code in PR #14204:
URL: https://github.com/apache/lucene/pull/14204#discussion_r1954925602
##
lucene/facet/src/java/org/apache/lucene/facet/histogram/HistogramCollector.java:
##
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
msokolov commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657373529
I think maybe what happens here is that the K controls not only how many
hits are returned from each segment, but also what the "beam width" is during
search, so we could have gotten be
risdenk commented on PR #14216:
URL: https://github.com/apache/lucene/pull/14216#issuecomment-2657288304
thanks for merging @dweiss
epotyom opened a new pull request, #14237:
URL: https://github.com/apache/lucene/pull/14237
In the initial sandbox facet module PR @gsmiller
[suggested](https://github.com/apache/lucene/pull/13568#issuecomment-2249005915)
adding helpers to make common tasks easier.
This implementatio
rmuir commented on issue #14235:
URL: https://github.com/apache/lucene/issues/14235#issuecomment-2657795678
I think to fix it, we have to look at `checkoutHunspellRegressionRepos()`.
It currently clones the default branch; I think we'd just want to pin to a hash
for now.
We could jus
jpountz commented on PR #14232:
URL: https://github.com/apache/lucene/pull/14232#issuecomment-2657728449
Thanks @benwtrent
benwtrent commented on issue #14220:
URL: https://github.com/apache/lucene/issues/14220#issuecomment-2657135532
This is failing pretty often. I tried upping the clone limit to 600, but it
still fails periodically with more than 600 merges (604 in a local run).
@jpountz what do you thi
jpountz commented on code in PR #14204:
URL: https://github.com/apache/lucene/pull/14204#discussion_r1954905919
##
lucene/facet/src/java/org/apache/lucene/facet/histogram/HistogramCollector.java:
##
@@ -0,0 +1,252 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
benwtrent merged PR #14231:
URL: https://github.com/apache/lucene/pull/14231
benwtrent closed issue #14230: TestByteVectorSimilaryQuery failure on windows
URL: https://github.com/apache/lucene/issues/14230
navneet1v commented on PR #14223:
URL: https://github.com/apache/lucene/pull/14223#issuecomment-2657079232
Thanks @benwtrent for approval and merging the code
msokolov commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657368730
also - we are not really tracking "visited" properly I think
benwtrent opened a new issue, #14233:
URL: https://github.com/apache/lucene/issues/14233
### Description
After a6a96cde1c6 Bugfix/fix hnsw search termination check (#14215) HNSW
format recall tests started failing. Need to investigate.
```
TestLucene94HnswVectorsFormat > tes
benwtrent opened a new pull request, #14234:
URL: https://github.com/apache/lucene/pull/14234
We had many duplicates within the hnsw recall test index. This tripped over
our duplicate score change where we don't explore further unless scores are
strictly better: https://github.com/apache/l
jpountz commented on PR #14232:
URL: https://github.com/apache/lucene/pull/14232#issuecomment-2657233927
Thanks for looking into this and sorry for missing the build failures. The
fact that this test has failures makes sense to me since merging is a bit more
aggressive now, though I don't e
msokolov commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657317490
Well, I ran some tests and, surprisingly, I saw a significant difference in
both recall and latency (decreases in both). This surprised me: I expected to
see more-or-less similar results,
gf2121 commented on PR #14203:
URL: https://github.com/apache/lucene/pull/14203#issuecomment-2657334178
> is my understanding correct that it performs even better?
Yeah!
benwtrent commented on PR #14226:
URL: https://github.com/apache/lucene/pull/14226#issuecomment-2657348683
OK, I ran it on an 8M data set with 128 segments.
Indeed it visits way fewer vectors (seemingly), and is consistent across
multiple threads.
```
recall latency(ms)
rmuir commented on issue #14235:
URL: https://github.com/apache/lucene/issues/14235#issuecomment-2657937921
I think this is the one triggering current failure: will dig into it
https://github.com/LibreOffice/dictionaries/commit/762abe74008b94b2ff06db6f4024b59a8254c467
navneet1v commented on code in PR #14178:
URL: https://github.com/apache/lucene/pull/14178#discussion_r1955514351
##
lucene/sandbox/src/java22/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsFormat.java:
##
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundatio
navneet1v commented on code in PR #14178:
URL: https://github.com/apache/lucene/pull/14178#discussion_r1955523940
##
lucene/sandbox/src/java22/org/apache/lucene/sandbox/codecs/faiss/LibFaissC.java:
##
@@ -0,0 +1,457 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) und
houserjohn commented on PR #13914:
URL: https://github.com/apache/lucene/pull/13914#issuecomment-2658101119
Hey @HoustonPutman, I just published
[GH#14238](https://github.com/apache/lucene/pull/14238) which contains all of
the unit tests that I've created so far. Note that there was a sligh