msfroh commented on PR #14389:
URL: https://github.com/apache/lucene/pull/14389#issuecomment-2744397587
Awesome! Can I go ahead and use this for
https://github.com/apache/lucene/pull/14350 once it's merged?
--
This is an automated message from the Apache Git Service.
To respond to the mes
rmuir opened a new pull request, #14381:
URL: https://github.com/apache/lucene/pull/14381
Add optional flag to support case-insensitive ranges. A minimal DFA is
always created. This works with Unicode but may have a performance cost.
Each codepoint in the range must be iterated, and a
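To illustrate why a case-insensitive range has a cost proportional to its size, here is a minimal sketch of visiting each codepoint and collecting its case variants. It is not the code from the PR: the class and method names are hypothetical, and it uses the JDK's simple `Character.toLowerCase`/`Character.toUpperCase` mappings as a stand-in for real Unicode case folding, so it would miss variants such as long-s or the Kelvin sign when expanding `[a-z]`.

```java
import java.util.TreeSet;

/** Hypothetical sketch: expand a codepoint range with its simple case variants. */
final class CaseInsensitiveRangeSketch {

  /** Returns every codepoint in [from, to] plus its JDK upper/lower case mapping. */
  static TreeSet<Integer> expand(int from, int to) {
    TreeSet<Integer> out = new TreeSet<>();
    // Each codepoint in the range is visited once, so the work grows with the range size.
    for (int cp = from; cp <= to; cp++) {
      out.add(cp);
      out.add(Character.toLowerCase(cp)); // assumption: JDK mapping, not full case folding
      out.add(Character.toUpperCase(cp));
    }
    return out;
  }

  public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();
    for (int cp : expand('a', 'c')) {
      sb.appendCodePoint(cp);
    }
    System.out.println(sb); // the expanded set, in codepoint order
  }
}
```

A sorted set keeps the result deduplicated and ordered, which is convenient if the codepoints are later merged back into contiguous ranges.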
rmuir commented on code in PR #14381:
URL: https://github.com/apache/lucene/pull/14381#discussion_r2007006500
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -778,6 +786,53 @@ private int[] toCaseInsensitiveChar(int codepoint) {
}
}
+ /**
+
alessandrobenedetti commented on code in PR #14173:
URL: https://github.com/apache/lucene/pull/14173#discussion_r2007476642
##
lucene/core/src/java/org/apache/lucene/util/hnsw/UpdatableScoreHeap.java:
##
Review Comment:
For example, what are the benefits of this in comparis
benwtrent merged PR #14366:
URL: https://github.com/apache/lucene/pull/14366
benwtrent closed issue #14327: TestKnnGraph.testMultiThreadedSearch random test
failure
URL: https://github.com/apache/lucene/issues/14327
iverase opened a new issue, #14382:
URL: https://github.com/apache/lucene/issues/14382
When building BKD trees, we hold two arrays in memory whose sizes grow
linearly with the number of leaf nodes. One of the arrays contains the pointer
to the start of a leaf node, and the other containing
benwtrent commented on code in PR #14094:
URL: https://github.com/apache/lucene/pull/14094#discussion_r2007365351
##
lucene/core/src/java/org/apache/lucene/util/hnsw/OrdinalTranslatedKnnCollector.java:
##
@@ -50,4 +51,11 @@ public TopDocs topDocs() {
: TotalHits
rmuir commented on PR #14384:
URL: https://github.com/apache/lucene/pull/14384#issuecomment-2743662885
This one is pretty easy to understand, the `CaseFolding` class now just
gives you `UnicodeSet(ch).closeOver(UnicodeSet.SIMPLE_CASE_INSENSITIVE)`
without requiring that you have ICU.
gf2121 commented on code in PR #14333:
URL: https://github.com/apache/lucene/pull/14333#discussion_r2007802395
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/TrieBuilder.java:
##
@@ -0,0 +1,552 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) unde
rmuir commented on PR #14384:
URL: https://github.com/apache/lucene/pull/14384#issuecomment-2743717079
It was easy because @uschindler already created a similar groovy script
before.
rmuir closed issue #14378: Case insensitive regex query with character range
URL: https://github.com/apache/lucene/issues/14378
rmuir commented on PR #14350:
URL: https://github.com/apache/lucene/pull/14350#issuecomment-2744342290
Maybe this one helps the issue: https://github.com/apache/lucene/pull/14389
john-wagster commented on PR #14389:
URL: https://github.com/apache/lucene/pull/14389#issuecomment-2744360276
This is great; it helps me make progress on some of the regex work in ES,
which is why I started that CaseFolding work. Thanks for iterating on this @rmuir.
alessandrobenedetti commented on PR #14173:
URL: https://github.com/apache/lucene/pull/14173#issuecomment-2743148001
Catching up on this and trying to understand how far we are now from my
original idea and implementation:
https://github.com/apache/lucene/pull/12314
Obviously, my c
jpountz commented on issue #14375:
URL: https://github.com/apache/lucene/issues/14375#issuecomment-271819
Done!
benwtrent commented on issue #11787:
URL: https://github.com/apache/lucene/issues/11787#issuecomment-2743830018
I think this has been fixed with all our HNSW filtering fixes:
- we drop to brute force if we explore too much
- we bypass the graph if the filter passes <= `k` docs
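The two fixes listed above can be sketched as a pair of decisions, one made before the search and one during it. This is only an illustration under stated assumptions: the class, enum, and parameter names are hypothetical, and the visit budget (`visitLimit`) is a placeholder for whatever threshold the real implementation uses.

```java
/** Hypothetical sketch of choosing a strategy for filtered kNN search. */
final class FilteredKnnStrategySketch {

  enum Strategy { BRUTE_FORCE, GRAPH_SEARCH }

  /**
   * Before searching: if the filter accepts at most k docs, every accepted doc
   * is a result, so the graph is bypassed and the accepted docs are scored directly.
   */
  static Strategy choose(int k, int filterCardinality) {
    return filterCardinality <= k ? Strategy.BRUTE_FORCE : Strategy.GRAPH_SEARCH;
  }

  /**
   * During graph search: once more nodes have been visited than the budget allows,
   * abandon the graph and score the accepted docs by brute force instead.
   * The budget value is an assumption of this sketch.
   */
  static boolean shouldDropToBruteForce(int visitedNodes, int visitLimit) {
    return visitedNodes > visitLimit;
  }
}
```

With both checks in place, a search whose candidates are all filtered out cannot keep exploring the graph indefinitely, because it stops once the budget is exhausted.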
jpountz commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-273990
Hurray!
- https://benchmarks.mikemccandless.com/TermDayOfYearSort.html
- https://benchmarks.mikemccandless.com/TermDTSort.html
rmuir commented on PR #14384:
URL: https://github.com/apache/lucene/pull/14384#issuecomment-2743831323
There was something about gradle itself that was upset about dependencies
wrt generation tasks, if I recall... cycle detection or something was
complaining about it.
benwtrent closed issue #11787: Handle degenerate case where all HNSW search
candidates are filtered
URL: https://github.com/apache/lucene/issues/11787
rmuir commented on PR #14384:
URL: https://github.com/apache/lucene/pull/14384#issuecomment-2743736828
I will follow up with an ICU upgrade PR to this one. I don't expect that this
file will change except for the version in the comment though.
jpountz opened a new pull request, #14390:
URL: https://github.com/apache/lucene/pull/14390
This implements `BlockPostingsEnum#docIDRunEnd()` by comparing the delta
between doc IDs and between doc counts on the various skip levels.
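The comparison described here can be sketched as follows. This is a simplified illustration, not the PR's code: the class and parameter names are hypothetical, it looks at a single block rather than several skip levels, and it assumes `docIDRunEnd()` returns an exclusive upper bound for a run of consecutive matching doc IDs.

```java
/** Hypothetical sketch: detect a dense run of doc IDs from block-level deltas. */
final class DocIdRunSketch {

  /**
   * A block spanning [firstDoc, lastDoc] that holds docCount docs is dense when
   * the delta between doc IDs equals the delta between doc counts, i.e. no ID is skipped.
   */
  static boolean isDense(int firstDoc, int lastDoc, int docCount) {
    return lastDoc - firstDoc == docCount - 1;
  }

  /** Exclusive end of the run of consecutive doc IDs that contains currentDoc. */
  static int docIDRunEnd(int currentDoc, int firstDoc, int lastDoc, int docCount) {
    boolean inBlock = currentDoc >= firstDoc && currentDoc <= lastDoc;
    if (inBlock && isDense(firstDoc, lastDoc, docCount)) {
      return lastDoc + 1; // every doc up to lastDoc is known to match
    }
    return currentDoc + 1; // conservative answer: only the current doc is known to match
  }
}
```

The check needs no decoding of individual doc IDs, which is why it can be evaluated cheaply from metadata that is already available per block or skip level.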
vigyasharma commented on PR #14173:
URL: https://github.com/apache/lucene/pull/14173#issuecomment-2744562872
Thanks for looking into this PR @alessandrobenedetti, this is the latest
iteration on multi-vector support.
It does build on the same central idea of assigning a unique ordina
vigyasharma commented on code in PR #14173:
URL: https://github.com/apache/lucene/pull/14173#discussion_r2008411867
##
lucene/core/src/java/org/apache/lucene/util/hnsw/UpdatableScoreHeap.java:
##
Review Comment:
I'd like to keep the logic to update scores for already ingest
rmuir commented on issue #14327:
URL: https://github.com/apache/lucene/issues/14327#issuecomment-2743546449
thank you @benwtrent
dweiss commented on issue #14385:
URL: https://github.com/apache/lucene/issues/14385#issuecomment-2743927366
Ok, I've added gradle's "user home" tmp cleaning as well. Anything older
than 3 hours is removed. This folder may be shared across builds so the time
limit is there to prevent accide
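An age-based cleanup like the one described can be sketched in plain Java. This is not the build's actual cleanup logic: the class and method names are hypothetical, it only handles regular files directly inside one directory, and the three-hour threshold is passed in by the caller.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: delete files whose last-modified time is older than maxAge. */
final class StaleTempCleanerSketch {

  static List<Path> deleteOlderThan(Path dir, Duration maxAge, Instant now) throws IOException {
    Instant cutoff = now.minus(maxAge);
    List<Path> stale = new ArrayList<>();
    // Collect candidates first so nothing is deleted while the directory is being listed.
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      for (Path p : entries) {
        if (Files.isRegularFile(p)
            && Files.getLastModifiedTime(p).toInstant().isBefore(cutoff)) {
          stale.add(p);
        }
      }
    }
    for (Path p : stale) {
      // Another build may have removed the file already, so a missing file is not an error.
      Files.deleteIfExists(p);
    }
    return stale;
  }
}
```

Calling `deleteOlderThan(tmpDir, Duration.ofHours(3), Instant.now())` leaves recently touched files alone, which is the point of the time limit when the folder is shared across concurrent builds.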
rmuir opened a new pull request, #14388:
URL: https://github.com/apache/lucene/pull/14388
The dependency is outdated; the main changes to the generated code avoid
warnings in Java 21+.
This one didn't magically work like ICU, I simply force-regenerated. I tried
messing around with the gradle d
jpountz commented on PR #14365:
URL: https://github.com/apache/lucene/pull/14365#issuecomment-2744460409
I pushed an annotation
dweiss merged PR #14387:
URL: https://github.com/apache/lucene/pull/14387
dweiss opened a new issue, #14385:
URL: https://github.com/apache/lucene/issues/14385
### Description
Gradle creates temp files it never cleans up. Until this is resolved, let's
try to do some housekeeping ourselves.
Related issues:
* #10215
* #10510
* https://githu
dweiss commented on issue #14385:
URL: https://github.com/apache/lucene/issues/14385#issuecomment-2743998381
There are also *.log files to wipe clean.
dweiss commented on PR #14387:
URL: https://github.com/apache/lucene/pull/14387#issuecomment-2743985270
I'll merge this in. Low risk and we can always revert if needed.
dweiss closed issue #14385: Address gradle temp file pollution insanity
URL: https://github.com/apache/lucene/issues/14385
dweiss closed issue #10215: gradle build leaks tons of gradle-worker-classpath*
files in tmpdir [LUCENE-9175]
URL: https://github.com/apache/lucene/issues/10215
rmuir merged PR #14381:
URL: https://github.com/apache/lucene/pull/14381
dweiss commented on code in PR #14388:
URL: https://github.com/apache/lucene/pull/14388#discussion_r2008140343
##
lucene/expressions/src/generated/checksums/generateAntlr.json:
##
@@ -1,7 +1,8 @@
{
"lucene/expressions/src/java/org/apache/lucene/expressions/js/Javascript.g
dweiss commented on PR #14384:
URL: https://github.com/apache/lucene/pull/14384#issuecomment-2743758629
I mean the entire structure of tasks that are used in regenerate. It's
complex. I remember I couldn't do it in any easier way before - maybe something
has changed that would allow it to b
rmuir commented on PR #14381:
URL: https://github.com/apache/lucene/pull/14381#issuecomment-2743798573
After fixing the Turkish, here's the (correct) automaton for `/[a-z]/`. The
only special cases are long-s and the Kelvin sign, as you'd expect:
, the ability to "fix up" the
individual graphs that have deletions and THEN doin
rmuir commented on PR #14386:
URL: https://github.com/apache/lucene/pull/14386#issuecomment-2743954863
@dweiss I know you dislike the complexity, but `gradlew regenerate`
really saves a metric ton of human time and prevents mistakes for updates like
these.
rmuir merged PR #14384:
URL: https://github.com/apache/lucene/pull/14384
dweiss commented on issue #14385:
URL: https://github.com/apache/lucene/issues/14385#issuecomment-2743756362
It's this commit that moved the temp folder from java.io.tmpdir, which we
redirected and cleaned up.
https://github.com/gradle/gradle/commit/8c2f6b7db50ab071a289fb5c4cbb9b2125
jimczi closed pull request #14067: BlockJoinBulkScorer could check for parent
deletions (not children)
URL: https://github.com/apache/lucene/pull/14067
dweiss commented on PR #14386:
URL: https://github.com/apache/lucene/pull/14386#issuecomment-2743977198
I know, I know. I don't think we should remove it - I just hope it can be
implemented in a less hairy way.
tteofili commented on code in PR #14094:
URL: https://github.com/apache/lucene/pull/14094#discussion_r2007923461
##
lucene/core/src/java/org/apache/lucene/search/HnswQueueSaturationCollector.java:
##
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
rmuir commented on PR #14387:
URL: https://github.com/apache/lucene/pull/14387#issuecomment-2743932179
`./gradlew -XX:UseDweissTempFileGC`
jainankitk commented on issue #14375:
URL: https://github.com/apache/lucene/issues/14375#issuecomment-2744045551
@jpountz - Can you assign this issue to me? I don't have permissions to do
that myself
vigyasharma commented on code in PR #14373:
URL: https://github.com/apache/lucene/pull/14373#discussion_r2008678599
##
lucene/core/src/java/org/apache/lucene/index/ParallelLeafReader.java:
##
@@ -348,15 +348,24 @@ public void prefetch(int docID) throws IOException {
@Over
dweiss commented on PR #14388:
URL: https://github.com/apache/lucene/pull/14388#issuecomment-2744155828
> This one didn't magically work like ICU
I've pushed a commit that should do the trick. ICU version wasn't in the
inputs so the build didn't know it'd been updated.
dweiss commented on code in PR #14388:
URL: https://github.com/apache/lucene/pull/14388#discussion_r2008129712
##
lucene/expressions/src/generated/checksums/generateAntlr.json:
##
@@ -1,7 +1,13 @@
{
+
"../../../../../.gradle/caches/modules-2/files-2.1/com.ibm.icu/icu4j/72.1
rmuir commented on PR #14389:
URL: https://github.com/apache/lucene/pull/14389#issuecomment-2744704384
I will straighten out the build, this one is kinda draftish as it needs more
tests etc. just wanted to toss out the idea.
If it is autogenerated we can easily maintain some cohesive