dweiss commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2640330745
> Finally! The concatenate() issue was an easy fix, it neglected to clean up
its dead states. All of its partners in crime do this, but the fact we neglect
it for concatenate messes up to
jpountz commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1944819744
##
lucene/core/src/test/org/apache/lucene/util/automaton/TestAutomaton.java:
##
@@ -667,11 +667,14 @@ public void testConcatenatePreservesDet() throws Exception {
rmuir merged PR #14193:
URL: https://github.com/apache/lucene/pull/14193
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apach
rmuir commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1944626865
##
lucene/core/src/test/org/apache/lucene/util/automaton/TestAutomaton.java:
##
@@ -667,11 +667,14 @@ public void testConcatenatePreservesDet() throws Exception {
}
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2638840849
I'm feeling good about this one now. With the change, a lot of regexps
come out minimal from the start, which is a good thing.
We also eliminate the overhead of tons of nodes, which
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2638827849
Finally! The concatenate() issue was an easy fix, it neglected to clean up
its dead states. All of its partners in crime do this, but the fact we neglect
it for concatenate messes up too m
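
To make "dead states" concrete: a state is dead if it cannot be reached from the initial state or cannot reach any accept state. The following is a minimal, self-contained sketch on a toy graph, not Lucene's implementation; the class `DeadStateSketch` and its methods are hypothetical names chosen for illustration, and transition labels are ignored because they do not affect liveness.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Deque;
import java.util.List;

/** Toy model (hypothetical, not Lucene code): states are ints, state 0 is initial. */
public class DeadStateSketch {

  /** Returns the live states: reachable from state 0 AND able to reach an accept state. */
  public static BitSet liveStates(int numStates, int[][] edges, BitSet accept) {
    List<List<Integer>> fwd = new ArrayList<>();
    List<List<Integer>> bwd = new ArrayList<>();
    for (int i = 0; i < numStates; i++) {
      fwd.add(new ArrayList<>());
      bwd.add(new ArrayList<>());
    }
    for (int[] e : edges) {
      fwd.get(e[0]).add(e[1]); // forward edge source -> dest
      bwd.get(e[1]).add(e[0]); // reversed edge dest -> source
    }
    BitSet start = new BitSet(numStates);
    start.set(0);
    BitSet live = closure(start, fwd); // states reachable from the initial state
    live.and(closure(accept, bwd)); // intersect with states that can reach an accept state
    return live;
  }

  /** Graph closure of the seed set over the given adjacency lists. */
  private static BitSet closure(BitSet seeds, List<List<Integer>> adj) {
    BitSet seen = (BitSet) seeds.clone();
    Deque<Integer> work = new ArrayDeque<>();
    for (int s = seeds.nextSetBit(0); s >= 0; s = seeds.nextSetBit(s + 1)) {
      work.push(s);
    }
    while (!work.isEmpty()) {
      int s = work.pop();
      for (int t : adj.get(s)) {
        if (!seen.get(t)) {
          seen.set(t);
          work.push(t);
        }
      }
    }
    return seen;
  }

  public static void main(String[] args) {
    // 0 -> 1 -> 2 (accept); 1 -> 3 is a dead end; 4 is unreachable from 0.
    int[][] edges = {{0, 1}, {1, 2}, {1, 3}, {4, 2}};
    BitSet accept = new BitSet();
    accept.set(2);
    System.out.println(liveStates(5, edges, accept)); // prints {0, 1, 2}
  }
}
```

In this toy example states 3 and 4 are dead; leaving such states in after a concatenation inflates the automaton without changing the accepted language.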
rmuir commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1944084497
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -648,13 +645,16 @@ private Automaton toAutomaton(
break;
case REGEXP_CHAR:
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2638323042
OK, I tried out a List-based API as an alternative to the array-based API. It
isn't fully correct, which is part of my issue, but see it here:
https://github.com/apache/lucene/commit/8b535a1c2fb4
rmuir commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1943654993
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -1195,60 +1215,132 @@ final RegExp parseCharClassExp() throws IllegalArgumentException {
rmuir commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1943653718
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -1050,14 +1059,25 @@ static RegExp makeDeprecatedComplement(int flags, RegExp exp) {
}
rmuir commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1943652340
##
lucene/core/src/java/org/apache/lucene/util/automaton/Automata.java:
##
@@ -140,6 +141,32 @@ public static Automaton makeCharRange(int min, int max) {
return a;
rmuir commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1943651732
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -648,13 +645,16 @@ private Automaton toAutomaton(
break;
case REGEXP_CHAR:
mikemccand commented on code in PR #14193:
URL: https://github.com/apache/lucene/pull/14193#discussion_r1943595546
##
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##
@@ -648,13 +645,16 @@ private Automaton toAutomaton(
break;
case REGEXP_CHA
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2637630803
I opened https://github.com/apache/lucene/issues/14200 for the error-prone
situation
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2637016936
I tried to update error-prone to fix its bugs, but it is angry about the way
we do Gradle. I will YOLO my way through this stuff.
> The default --should-stop=ifError policy (INIT) is n
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2637219251
@dweiss the `List` is a good idea. I did the arrays only because
@john-wagster has arrays over on #14192, but in the parser lists are used. The
only useful thing about arrays is convenien
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2637079875
I would propose replacing this checker with ast-grep rules for whatever we
need. It is not a good one: the use of internal Java APIs is too crazy.
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2637077428
I fixed the error-prone
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2637021676
Like, I literally have no idea what this tool is trying to tell me there. But
I think error-prone is broken: it depends on too many internals of the Java
compiler APIs.
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2636033017
a few more notes:
* maybe we should deprecate `union(Automaton, Automaton)` and only leave
`union(List)`. I see the former approach has proven trappy, let's
guide developers to do it th
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2636006731
I will figure out what angers error-prone tomorrow. I am rusty with
Java, so this PR needs assistance LOL. But all the tests pass.
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2635997545
My example for this one, if you have something like `[^a-gklM-O\s]`, with
the case-insensitive flag maybe, it just calls the new
`makeCharClass(int[],int[])` method and you get minimal aut
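
As an illustration of what a ranges-based character class looks like, the sketch below encodes `[^a-gklM-O\s]` as parallel arrays of range starts and ends, merges them into sorted non-overlapping ranges, and then negates the result over the full code point space. It is a hedged sketch, not the PR's code: the class `CharClassRangesSketch` and its `normalize` and `complement` helpers are hypothetical, `\s` is assumed to mean space, tab, newline and carriage return, and the case-insensitive flag is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not Lucene code: character classes as inclusive code point ranges. */
public class CharClassRangesSketch {

  /** Sorts the [start, end] pairs and merges overlapping or adjacent ranges. */
  public static List<int[]> normalize(int[] starts, int[] ends) {
    if (starts.length != ends.length) {
      throw new IllegalArgumentException("starts and ends must have the same length");
    }
    int[][] ranges = new int[starts.length][];
    for (int i = 0; i < starts.length; i++) {
      if (starts[i] > ends[i]) {
        throw new IllegalArgumentException("range start must not exceed range end");
      }
      ranges[i] = new int[] {starts[i], ends[i]};
    }
    Arrays.sort(ranges, (a, b) -> Integer.compare(a[0], b[0]));
    List<int[]> merged = new ArrayList<>();
    for (int[] r : ranges) {
      int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
      if (last != null && r[0] <= (long) last[1] + 1) {
        last[1] = Math.max(last[1], r[1]); // extend the previous range
      } else {
        merged.add(r);
      }
    }
    return merged;
  }

  /** Complements normalized ranges within [0, Character.MAX_CODE_POINT]. */
  public static List<int[]> complement(List<int[]> normalized) {
    List<int[]> out = new ArrayList<>();
    int next = Character.MIN_CODE_POINT; // first code point not yet covered
    for (int[] r : normalized) {
      if (r[0] > next) {
        out.add(new int[] {next, r[0] - 1}); // gap before this range
      }
      next = Math.max(next, r[1] + 1);
    }
    if (next <= Character.MAX_CODE_POINT) {
      out.add(new int[] {next, Character.MAX_CODE_POINT});
    }
    return out;
  }

  public static void main(String[] args) {
    // a-g, k, l, M-O, plus the assumed expansion of \s.
    int[] starts = {'a', 'k', 'l', 'M', ' ', '\t', '\n', '\r'};
    int[] ends = {'g', 'k', 'l', 'O', ' ', '\t', '\n', '\r'};
    for (int[] r : complement(normalize(starts, ends))) {
      System.out.println(Arrays.toString(r));
    }
  }
}
```

Under these assumptions the eight input ranges merge into six, and the negated class consists of seven ranges. Each range could then become a single transition from the initial state to one accept state, rather than the result of unioning one automaton per range.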
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2635939281
Anyway, I think this is the right path: rather than fight with union(),
let's just get it out of our way. With this change, union() is only used for
the union operator (`|`) and not internally.
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2635936648
That's error-prone breaking while trying to do some null analysis :)
rmuir commented on PR #14193:
URL: https://github.com/apache/lucene/pull/14193#issuecomment-2635933607
I generalized this to `makeCharClass(int[],int[])`, added a "character
class" node that uses it instead of unioning many nodes, and replaced the
pre-built class functionality with it too.