benwtrent merged PR #14170:
URL: https://github.com/apache/lucene/pull/14170
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.a
tteofili opened a new pull request, #14191:
URL: https://github.com/apache/lucene/pull/14191
This is a first attempt at fixing
https://github.com/apache/lucene/issues/14180.
It's based on @jpountz's idea mentioned
[here](https://github.com/apache/lucene/pull/14167#issuecomment-2616408185).
benwtrent commented on PR #14160:
URL: https://github.com/apache/lucene/pull/14160#issuecomment-2631840332
> I think this 'correlation' is important to test as I imagine many real
world filters involve some correlation, rather than the random filters we get
in luceneutil benchmarks.
github-actions[bot] commented on PR #13747:
URL: https://github.com/apache/lucene/pull/13747#issuecomment-2632478843
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
benwtrent commented on code in PR #14154:
URL: https://github.com/apache/lucene/pull/14154#discussion_r1939910069
##
lucene/core/src/java/org/apache/lucene/analysis/AnalyzerWrapper.java:
##
@@ -151,4 +157,78 @@ protected final Reader initReaderForNormalization(String fieldName,
john-wagster opened a new pull request, #14192:
URL: https://github.com/apache/lucene/pull/14192
About four years ago ASCII-only case insensitive matching
(https://github.com/apache/lucene-solr/pull/1541) was added to Lucene. In the
past couple of years a couple of requests have been mad
john-wagster commented on PR #14192:
URL: https://github.com/apache/lucene/pull/14192#issuecomment-2631833927
@jpountz, @jimczi, @mayya-sharipova y'all may be interested in this PR so
just tagging you here in case you are interested.
john-wagster commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1939889352
##
lucene/core/src/test/org/apache/lucene/util/automaton/TestRegExp.java:
##
@@ -35,6 +43,320 @@ public void testSmoke() {
assertFalse(run.run("ad"));
}
rmuir commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1939974637
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -436,6 +478,160 @@ public enum Kind {
*/
@Deprecated public static final int DEPRECATE
benchaplin commented on PR #14160:
URL: https://github.com/apache/lucene/pull/14160#issuecomment-2632762867
Baseline:
```
recall latency (ms) nDoc topK fanout maxConn beamWidth visited selectivity correlation filterType
1.000 9.020 100 100 100
```
dsmiley commented on PR #13949:
URL: https://github.com/apache/lucene/pull/13949#issuecomment-2632388607
Couldn't S3 and other file storage be implemented as an NIO FileSystem
instead? AKA JSR-203. Would the Lucene Directory abstraction level have
certain advantages (what)? Ideally we'd
dsmiley commented on PR #13949:
URL: https://github.com/apache/lucene/pull/13949#issuecomment-2632403618
By the way, the Apache Solr project has an impressive
"[BlockCache](https://github.com/apache/solr/blob/07943e87fb762b69a66932f777d56eb14cc72e78/solr/modules/hdfs/src/java/org/apache/solr
rmuir commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1940264096
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -436,6 +478,160 @@ public enum Kind {
*/
@Deprecated public static final int DEPRECATE
rmuir commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1939946041
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -436,6 +478,160 @@ public enum Kind {
*/
@Deprecated public static final int DEPRECATE
rmuir commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1939944485
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -424,6 +426,46 @@ public enum Kind {
/** Allows case insensitive matching of ASCII charact
rmuir commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1940371117
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -696,17 +896,52 @@ private Automaton toAutomaton(
return a;
}
- private Automaton
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2632590044
also `makeCharUnion()` comes to mind as a compelling alternative name, since
there is already a `makeStringUnion()`. Naming is hard. just want to get the
idea out there, since caseless reg
rmuir opened a new pull request, #14193:
URL: https://github.com/apache/lucene/pull/14193
Previously caseless matching was implemented via code such as this:
```java
Operations.union(Automata.makeChar('x'), Automata.makeChar('X'))
```
Proposed unicode caseless matching (
john-wagster commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1940023844
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -424,6 +426,46 @@ public enum Kind {
/** Allows case insensitive matching of ASCII
john-wagster commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1940026321
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -436,6 +478,160 @@ public enum Kind {
*/
@Deprecated public static final int DE
rmuir commented on code in PR #14192:
URL: https://github.com/apache/lucene/pull/14192#discussion_r1940403543
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -436,6 +478,160 @@ public enum Kind {
*/
@Deprecated public static final int DEPRECATE
kaivalnp commented on code in PR #14178:
URL: https://github.com/apache/lucene/pull/14178#discussion_r1935407529
##
lucene/sandbox/src/java22/org/apache/lucene/sandbox/codecs/faiss/LibFaissC.java:
##
@@ -0,0 +1,268 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) unde
mikemccand commented on issue #14190:
URL: https://github.com/apache/lucene/issues/14190#issuecomment-2630817713
It might in theory check the `lucene/CHANGES.txt` to look for an entry (with
the PR/issue number) summarizing the PR? Then it could see which Lucene
release the issue is under.
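A minimal sketch of the lookup suggested above, in plain Java. It assumes release sections in `lucene/CHANGES.txt` start with a header line shaped like `======= Lucene X.Y.Z =======` and that entries carry a `GITHUB#<number>` tag; the class name, the method name, and the sample lines are hypothetical and not part of any Lucene tooling.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds which release section mentions a PR/issue number.
class ChangesLookup {
  // Assumed header shape: "======================= Lucene 9.9.0 ======================="
  private static final Pattern RELEASE =
      Pattern.compile("^=+\\s*Lucene\\s+(\\S+)\\s*=+$");

  static Optional<String> releaseFor(List<String> changesLines, int number) {
    // Word boundaries keep 1407 from matching inside GITHUB#14075.
    Pattern entry = Pattern.compile("\\bGITHUB#" + number + "\\b");
    String currentRelease = null;
    for (String line : changesLines) {
      Matcher header = RELEASE.matcher(line.trim());
      if (header.matches()) {
        currentRelease = header.group(1); // entering a new release section
      } else if (currentRelease != null && entry.matcher(line).find()) {
        return Optional.of(currentRelease); // first section that lists the number
      }
    }
    return Optional.empty(); // no entry found for this number
  }

  public static void main(String[] args) {
    // Made-up sample content, for illustration only.
    List<String> sample = List.of(
        "======================= Lucene 9.9.0 =======================",
        "Bug Fixes",
        "* GITHUB#22222: Hypothetical entry. (Somebody)",
        "======================= Lucene 9.8.0 =======================",
        "* GITHUB#11111: Another hypothetical entry. (Somebody Else)");
    System.out.println(releaseFor(sample, 11111).orElse("not found"));
  }
}
```

A real check would read the file with `java.nio.file.Files.readAllLines` and would also need to decide what to do when a PR has no entry at all.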
original-brownbear commented on PR #13622:
URL: https://github.com/apache/lucene/pull/13622#issuecomment-2631740301
I agree @dsmiley , I actually did continue to work on this on the ES side
lately in https://github.com/elastic/elasticsearch/pull/120024.
What I did there was introduce log
mayya-sharipova commented on code in PR #14191:
URL: https://github.com/apache/lucene/pull/14191#discussion_r1939834681
##
lucene/core/src/java/org/apache/lucene/search/knn/MultiLeafKnnCollector.java:
##
@@ -89,6 +91,24 @@ public MultiLeafKnnCollector(
this.nonCompetitiveQu
dsmiley commented on PR #13622:
URL: https://github.com/apache/lucene/pull/13622#issuecomment-2631616345
Looking back at this, might it have been better to instead wrap
`TaskExecutor.invokeAll`'s call of `executor.execute` in a loop to catch
`RejectedExecutionException` and then don't both
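The snippet above is cut off, so what should happen after the catch is not known. The following is only a rough sketch of the general pattern it names: submit each task in a loop and catch `RejectedExecutionException`. Running the rejected task on the calling thread is an assumption made here for illustration, and `RejectionTolerantInvoker` is a hypothetical class, not Lucene's `TaskExecutor`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-in for an invokeAll-style helper; not Lucene code.
class RejectionTolerantInvoker {
  static <T> List<T> invokeAll(Executor executor, List<Callable<T>> callables) {
    List<FutureTask<T>> tasks = new ArrayList<>();
    for (Callable<T> callable : callables) {
      tasks.add(new FutureTask<>(callable));
    }
    for (FutureTask<T> task : tasks) {
      try {
        executor.execute(task);
      } catch (RejectedExecutionException e) {
        // Assumption: fall back to the calling thread when the pool refuses work.
        task.run();
      }
    }
    List<T> results = new ArrayList<>();
    for (FutureTask<T> task : tasks) {
      try {
        results.add(task.get()); // waits for forked tasks, returns at once for local ones
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new RuntimeException(e);
      } catch (ExecutionException e) {
        throw new RuntimeException(e.getCause());
      }
    }
    return results;
  }
}
```

Passing an executor that has already been shut down exercises the fallback path, since a `ThreadPoolExecutor` with the default handler rejects every task after `shutdown()`.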
jpountz commented on PR #14189:
URL: https://github.com/apache/lucene/pull/14189#issuecomment-2631514489
Thanks for the feedback, I was hesitating. Let's pull this in 10.2 then.
jpountz commented on PR #14189:
URL: https://github.com/apache/lucene/pull/14189#issuecomment-2631518128
For reference, this is roughly a 10x increase of the floor segment size, so
given that `TieredMergePolicy` defaults to 10 segments per tier, indexes should
have about 10 fewer segments a
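The arithmetic behind that estimate can be sketched with a simplified tier model. This is not Lucene's implementation: it assumes each tier holds up to `segmentsPerTier` segments, that tier sizes grow by a factor of `segmentsPerTier` starting at the floor size, and it ignores the maximum merged segment size. The floor values and index size below are illustrative, not the actual defaults.

```java
// Simplified model of a tiered segment budget; names and numbers are hypothetical.
class SegmentBudgetSketch {
  static int allowedSegments(double indexMB, double floorMB, int segmentsPerTier) {
    if (floorMB <= 0 || segmentsPerTier < 2) {
      throw new IllegalArgumentException("need floorMB > 0 and segmentsPerTier >= 2");
    }
    double sizeLeft = indexMB;  // index data not yet assigned to a tier
    double levelSize = floorMB; // segment size of the current tier
    int allowed = 0;
    while (true) {
      double segmentsAtLevel = sizeLeft / levelSize;
      if (segmentsAtLevel < segmentsPerTier) {
        // Last, partially filled tier.
        return allowed + (int) Math.ceil(segmentsAtLevel);
      }
      allowed += segmentsPerTier;              // a full tier
      sizeLeft -= segmentsPerTier * levelSize; // data consumed by that tier
      levelSize *= segmentsPerTier;            // next tier holds larger segments
    }
  }

  public static void main(String[] args) {
    int before = allowedSegments(10_240, 2, 10); // 10 GB index, 2 MB floor
    int after = allowedSegments(10_240, 20, 10); // same index, 10x larger floor
    System.out.println(before + " -> " + after);
  }
}
```

Under these assumptions a 10 GB index goes from 35 allowed segments to 25: raising the floor tenfold removes exactly one tier, which is where the "about 10 fewer segments" figure comes from.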
mikemccand commented on code in PR #14187:
URL: https://github.com/apache/lucene/pull/14187#discussion_r1939310871
##
lucene/CHANGES.txt:
##
@@ -30,6 +30,10 @@ Bug Fixes
* GITHUB#14075: Remove duplicate and add missing entry on brazilian portuguese
stopwords list. (Arthur Ca
tteofili commented on PR #14191:
URL: https://github.com/apache/lucene/pull/14191#issuecomment-2631453289
preliminary tests with _luceneutil_ on Cohere-768.
**with force-merge=true**
_baseline_
```
recall latency (ms) nDoc topK fanout maxConn beamWidth quantized vis
```