uschindler commented on code in PR #13267:
URL: https://github.com/apache/lucene/pull/13267#discussion_r1552492150
##
lucene/CHANGES.txt:
##
@@ -78,6 +78,12 @@ Optimizations
* GITHUB#13149: Made PointRangeQuery faster, for some segment sizes, by
reducing the amount of virtual
uschindler commented on code in PR #13267:
URL: https://github.com/apache/lucene/pull/13267#discussion_r1552487587
##
lucene/CHANGES.txt:
##
@@ -78,6 +78,12 @@ Optimizations
* GITHUB#13149: Made PointRangeQuery faster, for some segment sizes, by
reducing the amount of virtual
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038269435
I'll debug it some more... just need a break. Mainly I wanted to make sure I
didn't introduce this with the ICU upgrade...
the two shinglefilters are suspect.
--
This is an a
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038258715
I narrowed the fail down so far a bit to just this chain:
```
new Analyzer() {
@Override
protected TokenStreamComponents createComponents(Str
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038244230
Attached is a reproducer in TestBugInSomething that seems to work. It is
ugly due to bugs in `javac`, but we don't care as we won't keep it and will
just whittle it down to a proper t
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038169885
i'll try to dig into it to at least find the offending component. If we can
narrow it down to the problematic charfilter, tokenizer, or tokenfilter, we can
make an easier-to-reproduce
uschindler commented on code in PR #13267:
URL: https://github.com/apache/lucene/pull/13267#discussion_r1552390907
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java:
##
@@ -315,10 +316,13 @@ public CloseableRandomVectorScorerSupplier
merge
benwtrent commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038151821
OK, apologies for the noise, this test keeps failing weirdly for me. Git
bisect has failed me :)
--
This is an automated message from the Apache Git Service.
To respond to the m
original-brownbear commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038144433
++ to @rmuir I can reproduce this after reverting my changes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to Gi
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038141226
I don't understand what that change has to do with analysis chain...
inconsistent offsets has to do with what TokenStream is doing not the index. Be
sure, that you aren't getting conf
benwtrent commented on issue #13272:
URL: https://github.com/apache/lucene/issues/13272#issuecomment-2038142167
OK, this is muted in main now. The original change doesn't seem backported
to 9x. So, nothing to mute there.
--
This is an automated message from the Apache Git Service.
To resp
benwtrent merged PR #13273:
URL: https://github.com/apache/lucene/pull/13273
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.a
benwtrent commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038136325
So, if I switch to just return ` new Meta(numValues, blockShift)` when its
all zero, the test passes :/. So it might be the single block meta deal...
--
This is an automated mes
original-brownbear commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038127980
Looking into this now. Maybe it's the fact that for the all zeros case we
now always have a single block meta ... let me check
--
This is an automated message from the
benwtrent commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038124499
OK, I simply commented out @original-brownbear 's change and always returned
the meta and the test passes with the seed and settings. When I return the
cached instance, it fails.
zacharymorn commented on issue #13272:
URL: https://github.com/apache/lucene/issues/13272#issuecomment-2038123150
Another seed to reproduce this issue
```
gradlew test --tests TestKnnByteVectorQuery.testTimeout
-Dtests.seed=51247A9C254BB9F -Dtests.locale=ff-Latn -Dtests.timezone=As
zacharymorn commented on PR #13238:
URL: https://github.com/apache/lucene/pull/13238#issuecomment-2038118749
After merge, I saw a test run error and this issue is fixing it
https://github.com/apache/lucene/issues/13272
--
This is an automated message from the Apache Git Service.
To respon
benwtrent opened a new pull request, #13273:
URL: https://github.com/apache/lucene/pull/13273
These tests are failing fairly often, but aren't reliably reproduced. I am
gonna mute to keep away the build noise for now.
Related to: https://github.com/apache/lucene/issues/13272
--
Thi
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038092682
I just reproduced it on `main` too :(
So I think the problem is that it doesn't always reproduce: this likely
caused your git-bisect confusion. But at least it isn't NEWLY intro
benwtrent commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038088546
@rmuir
> I can't reproduce it with the latest main branch which has the ICU upgrade.
Oh dang, I might not have fetched latest 🤦 Let me try again.
--
This is an aut
zacharymorn commented on code in PR #13238:
URL: https://github.com/apache/lucene/pull/13238#discussion_r1552338688
##
lucene/MIGRATE.md:
##
@@ -185,6 +185,34 @@ enum.
`IOContext#LOAD` has been replaced with `IOContext#PRELOAD`.
+### IndexSearch#search(Query, Collector) bei
zacharymorn merged PR #13238:
URL: https://github.com/apache/lucene/pull/13238
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038087578
I can't find the specific bug, but we've reported several such issues to
them in the past. I'm having trouble with their JIRA...
Either way, fails on 342c814cafdaab341f8d86b80c7
benwtrent commented on PR #13202:
URL: https://github.com/apache/lucene/pull/13202#issuecomment-2038084746
@kaivalnp I too have had this fail, but it is not deterministic. I created
https://github.com/apache/lucene/issues/13272 to track it.
I can mute the test while we wait for the fi
benwtrent opened a new issue, #13272:
URL: https://github.com/apache/lucene/issues/13272
### Description
Periodic failures for testTimeout, its not locally repeatable.
### Gradle command to reproduce
```
./gradlew :lucene:core:test --tests
"org.apache.lucene.search.T
rmuir commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038081951
@benwtrent how long ago did you see this test failure?
I can't reproduce it with the latest `main` branch which has the ICU upgrade.
If i go back one commit to before the ICU
benwtrent commented on issue #13272:
URL: https://github.com/apache/lucene/issues/13272#issuecomment-2038087232
I am going to mute this on main & branch_9x as we wait for a fix.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub
stefanvodita commented on PR #12966:
URL: https://github.com/apache/lucene/pull/12966#issuecomment-2038070448
Thank you for reviewing @mikemccand! I had to rebase after #12966. I'll push
tomorrow maybe if there are no objections.
--
This is an automated message from the Apache Git Service
benwtrent commented on issue #13271:
URL: https://github.com/apache/lucene/issues/13271#issuecomment-2038069989
git bisect says: 6cba773318445d1ecd349d42ff85508b58be1752
It isn't immediately obvious why to me. @original-brownbear what do you
think?
--
This is an automated message f
benwtrent opened a new issue, #13271:
URL: https://github.com/apache/lucene/issues/13271
### Description
Haven't finished bisection to figure out the origin of the bug.
```
org.apache.lucene.analysis.tests.TestRandomChains > testRandomChains FAILED
java.lang.IllegalSt
rmuir commented on PR #13270:
URL: https://github.com/apache/lucene/pull/13270#issuecomment-2038041453
1.22+ starts dragging in more stuff such as `commons-lang3` and `commons-io`
which is why I stopped at 1.21
--
This is an automated message from the Apache Git Service.
To respond to the
rmuir opened a new pull request, #13270:
URL: https://github.com/apache/lucene/pull/13270
Another pre-covid dependency version...
I tried to upgrade to the latest version (both with and without the
entangled commons-codec upgrade: https://github.com/apache/lucene/pull/13269)
and ther
rmuir commented on PR #13269:
URL: https://github.com/apache/lucene/pull/13269#issuecomment-2038008126
I will open a separate issue for the commons-compress as there is more
trouble there...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log
rmuir commented on PR #13269:
URL: https://github.com/apache/lucene/pull/13269#issuecomment-2037993068
I think i understand the 1.16.1 issue. Looks like there is a bit of
jar/maven-hell going on among the commons dependencies. I hit it when I tried
to upgrade commons-compress...
Bump
stefanvodita commented on code in PR #12966:
URL: https://github.com/apache/lucene/pull/12966#discussion_r1552265766
##
lucene/facet/src/java/org/apache/lucene/facet/TopOrdAndIntQueue.java:
##
@@ -16,37 +16,42 @@
*/
package org.apache.lucene.facet;
-import org.apache.lucene
kaivalnp commented on PR #13202:
URL: https://github.com/apache/lucene/pull/13202#issuecomment-2037932211
I see a couple of recent failures related to the test added here:
1. https://github.com/apache/lucene/actions/runs/8557766498/job/23450804532
2. https://github.com/apache/lucene/act
rmuir opened a new pull request, #13269:
URL: https://github.com/apache/lucene/pull/13269
Currently, we depend upon a pre-COVID version of commons-codec (like several
other dependencies, I am working the issues).
Upgrade to 1.16.0.
NOTE: I'm unable to upgrade to 1.16.1: it caus
jpountz commented on PR #13222:
URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037833835
Backport PR: #13267.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
jpountz opened a new pull request, #13267:
URL: https://github.com/apache/lucene/pull/13267
This switches the following files to `IOContext#RANDOM`:
- Stored fields data file.
- Term vectors data file.
- HNSW graph.
- Temporary file storing vectors at merge time that we use to
jimczi commented on PR #13200:
URL: https://github.com/apache/lucene/pull/13200#issuecomment-2037737097
Thanks for persevering on this @benwtrent !
A few late thoughts on my end:
I am a bit concerned about the generalization here. The whole similarity is
currently modeled around t
uschindler closed issue #13266: `FieldInfo`?
URL: https://github.com/apache/lucene/issues/13266
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-u
uschindler commented on code in PR #13239:
URL: https://github.com/apache/lucene/pull/13239#discussion_r1552071936
##
lucene/analysis/common/src/test/org/apache/lucene/analysis/email/TLDs.txt:
##
@@ -1,6 +1,5 @@
# Generated from IANA TLD Database (gradlew generateTlds).aaa
aar
jpountz commented on PR #13222:
URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037703765
This is exactly what I'm doing. :) All in the same PR, but it shouldn't be
too big.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log o
uschindler commented on PR #13222:
URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037699875
> I'll look into porting this work to branch_9x via a separate PR.
We should maybe also port some recent changes back, too. Not sure how to
best do this. The new record classes
uschindler merged PR #13264:
URL: https://github.com/apache/lucene/pull/13264
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.
uschindler commented on code in PR #13200:
URL: https://github.com/apache/lucene/pull/13200#discussion_r1552040639
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/word2vec/Word2VecSynonymProvider.java:
##
@@ -40,8 +40,8 @@
*/
public class Word2VecSynonym
jpountz merged PR #13222:
URL: https://github.com/apache/lucene/pull/13222
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
jpountz commented on PR #13222:
URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037674709
I'll look into porting this work to branch_9x via a separate PR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
jpountz commented on PR #13265:
URL: https://github.com/apache/lucene/pull/13265#issuecomment-2037661586
I tried to create a test for it but couldn't create a satisfying one.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and
jpountz opened a new pull request, #13265:
URL: https://github.com/apache/lucene/pull/13265
This fixes index sorting to pass the correct `IOContext` to stored fields
and term vectors writers when index sorting is enabled. This is important for
things like `NRTCachingDirectory`.
--
Th
uschindler commented on PR #13264:
URL: https://github.com/apache/lucene/pull/13264#issuecomment-2037631425
> This makes sense to me. At first I wondered if we needed this since it's
possible to create a `FilterDirectory` that calls
`ioContext.withReadAdvice(ReadAdvice.NORMAL)` on `Director
benwtrent merged PR #13263:
URL: https://github.com/apache/lucene/pull/13263
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.a
uschindler merged PR #13261:
URL: https://github.com/apache/lucene/pull/13261
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.
uschindler commented on PR #13264:
URL: https://github.com/apache/lucene/pull/13264#issuecomment-2037578866
P.S.: I moved the lookup cache to a new constant name and placed it next to
the `withReadAdvice()` method.
--
This is an automated message from the Apache Git Service.
To respond to
uschindler commented on PR #13244:
URL: https://github.com/apache/lucene/pull/13244#issuecomment-2037575426
See #13264
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To un
uschindler opened a new pull request, #13264:
URL: https://github.com/apache/lucene/pull/13264
Followup after #13244: This makes the default `ReadAdvice` for index files
configurable via system property `org.apache.lucene.store.defaultReadAdvice`.
I moved the corresponding constant to
jpountz commented on PR #13222:
URL: https://github.com/apache/lucene/pull/13222#issuecomment-2037526473
@uschindler I'm curious what you think of this PR now that
`ReadAdvice#RANDOM` is the default. Should we keep calling
`state.context.withReadAdvice(ReadAdvice.RANDOM)` to be explicit abo
benwtrent commented on PR #13200:
URL: https://github.com/apache/lucene/pull/13200#issuecomment-2037464568
I still need to run with Lucene Util to ensure all these changes didn't add
a weird overhead.
This leads me to add back the deprecated apis for the vector fields. I know
I woul
uschindler commented on PR #13244:
URL: https://github.com/apache/lucene/pull/13244#issuecomment-2037455008
Thanks! I will provide a PR to make it configurable.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL a
uschindler commented on PR #13261:
URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037428907
The failing test is a known issue (see mailing list). Just ignore that.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHu
jpountz merged PR #13244:
URL: https://github.com/apache/lucene/pull/13244
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
mikemccand commented on code in PR #13238:
URL: https://github.com/apache/lucene/pull/13238#discussion_r1551778848
##
lucene/MIGRATE.md:
##
@@ -185,6 +185,34 @@ enum.
`IOContext#LOAD` has been replaced with `IOContext#PRELOAD`.
+### IndexSearch#search(Query, Collector) bein
jpountz opened a new issue, #13262:
URL: https://github.com/apache/lucene/issues/13262
### Description
We have some code that has been deprecated for a long time, see e.g.
`IntField` constructors. Let's remove this deprecated code in the `main` branch
ahead of 10.0.
Note: we m
Pulkitg64 commented on code in PR #13261:
URL: https://github.com/apache/lucene/pull/13261#discussion_r1551738126
##
lucene/core/src/java/org/apache/lucene/search/BooleanClause.java:
##
@@ -67,25 +67,12 @@ public String toString() {
};
}
- /** The query whose matching
jpountz commented on PR #13244:
URL: https://github.com/apache/lucene/pull/13244#issuecomment-2037277263
I hacked luceneutil to clear my page cache with `echo 1 >
/proc/sys/vm/drop_caches` before each query:
```patch
diff --git a/src/main/perf/TaskThreads.java b/src/main/perf/TaskT
Pulkitg64 commented on code in PR #13261:
URL: https://github.com/apache/lucene/pull/13261#discussion_r1551703918
##
lucene/core/src/java/org/apache/lucene/search/BooleanClause.java:
##
@@ -67,25 +67,12 @@ public String toString() {
};
}
- /** The query whose matching
uschindler commented on code in PR #13261:
URL: https://github.com/apache/lucene/pull/13261#discussion_r1551701906
##
lucene/core/src/java/org/apache/lucene/search/BooleanClause.java:
##
@@ -67,25 +67,12 @@ public String toString() {
};
}
- /** The query whose matchin
uschindler commented on code in PR #13261:
URL: https://github.com/apache/lucene/pull/13261#discussion_r1551695502
##
lucene/core/src/java/org/apache/lucene/search/BooleanClause.java:
##
@@ -67,25 +67,12 @@ public String toString() {
};
}
- /** The query whose matchin
benwtrent commented on PR #13200:
URL: https://github.com/apache/lucene/pull/13200#issuecomment-2037236344
I have removed the deprecated `VectorSimilarityFunction` methods from the
kNN fields, I thought this was acceptable because those fields are flagged as
`experimental`. However, I reali
uschindler commented on PR #13261:
URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037230661
> That's a good idea @jpountz. I can add this change under this section:
https://github.com/apache/lucene/blob/main/lucene/MIGRATE.md#some-classes-converted-to-records-classes-github13
Pulkitg64 commented on PR #13261:
URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037226302
That's a good idea @jpountz. I can add this change under this section:
https://github.com/apache/lucene/blob/main/lucene/MIGRATE.md#some-classes-converted-to-records-classes-github13207
benwtrent commented on PR #13260:
URL: https://github.com/apache/lucene/pull/13260#issuecomment-2037214166
@msokolov I have had something similar fail in other code, so it isn't
related to your change here. I think its a flaky test. It doesn't reliably
reproduce for me.
--
This is an aut
rmuir commented on issue #10878:
URL: https://github.com/apache/lucene/issues/10878#issuecomment-2037179564
I wonder if "concurrency issue" is jumping to conclusions. My machine has
almost no concurrency: 1 cpu with 2 cores...
--
This is an automated message from the Apache Git Service.
T
msokolov commented on PR #13260:
URL: https://github.com/apache/lucene/pull/13260#issuecomment-2037168102
hm a failure popped up when I pushed:
```
org.apache.lucene.search.TestKnnByteVectorQuery > testTimeout FAILED
java.lang.AssertionError: expected:<1> but was:<0>
jpountz commented on PR #13261:
URL: https://github.com/apache/lucene/pull/13261#issuecomment-2037159161
> Please add a note to lucene/MIGRATE.txt
By the way, I wonder how we should handle all the classes that we are
converting to records, should we have a single section in `lucene/MI
rmuir commented on issue #10878:
URL: https://github.com/apache/lucene/issues/10878#issuecomment-2037147651
I just popped this when pushing an unrelated commit. I see @mikemccand added
more verbosity, but then reverted the verbosity (maybe because it caused the
test to never fail anymore?)
Pulkitg64 opened a new pull request, #13261:
URL: https://github.com/apache/lucene/pull/13261
### Description
Convert ```BooleanClause``` class to record class which addresses #13207
partially.
--
This is an automated message from the Apache Git Service.
To respond to the message,
benwtrent commented on PR #13197:
URL: https://github.com/apache/lucene/pull/13197#issuecomment-2037108804
@jpountz I can add the parameters today and fix the compilation. I think
your change is the correct one.
--
This is an automated message from the Apache Git Service.
To respond to th
rmuir merged PR #13239:
URL: https://github.com/apache/lucene/pull/13239
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apach
msokolov commented on PR #13260:
URL: https://github.com/apache/lucene/pull/13260#issuecomment-2037063850
thanks for the reviews!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comm
msokolov merged PR #13260:
URL: https://github.com/apache/lucene/pull/13260
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.ap
jpountz commented on PR #13244:
URL: https://github.com/apache/lucene/pull/13244#issuecomment-2036934137
Here's a luceneutil run on wikibigall and data hot in the page cache. No
difference, which is what was expected I guess:
```
TaskQPS baseline St
jpountz commented on PR #13197:
URL: https://github.com/apache/lucene/pull/13197#issuecomment-2036830008
@benwtrent I tried to fix the compilation on luceneutil at
https://github.com/mikemccand/luceneutil/commit/027146b05c755c303cac5d95451cc458ed397c51.
I could use your help to check if thi
jpountz commented on PR #13244:
URL: https://github.com/apache/lucene/pull/13244#issuecomment-2036730400
That works for me @uschindler. I updated the code and the PR description.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub
jpountz commented on PR #13223:
URL: https://github.com/apache/lucene/pull/13223#issuecomment-2036729679
Superseded by #13244.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment
jpountz closed pull request #13223: Recommend lowering the default mmap
readahead.
URL: https://github.com/apache/lucene/pull/13223
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comme
uschindler commented on PR #13260:
URL: https://github.com/apache/lucene/pull/13260#issuecomment-2036532521
I think the Unwrappable for FilterReader should be added in a separate PR.
For now this is fine.
I remember vaguely that I left our FilterReader because of the static method
co
87 matches
Mail list logo