zhaih commented on code in PR #12462: URL: https://github.com/apache/lucene/pull/12462#discussion_r1286391478
##########
lucene/core/src/java/org/apache/lucene/util/automaton/RegExp.java:
##########
@@ -1067,22 +1071,44 @@ private boolean check(int flag) {
   }
 
   final RegExp parseUnionExp() throws IllegalArgumentException {
-    RegExp e = parseInterExp();
-    if (match('|')) e = makeUnion(flags, e, parseUnionExp());
-    return e;
+    return iterativeParseExp(this::parseInterExp, () -> match('|'), RegExp::makeUnion);
   }
 
   final RegExp parseInterExp() throws IllegalArgumentException {
-    RegExp e = parseConcatExp();
-    if (check(INTERSECTION) && match('&')) e = makeIntersection(flags, e, parseInterExp());
-    return e;
+    return iterativeParseExp(
+        this::parseConcatExp, () -> check(INTERSECTION) && match('&'), RegExp::makeIntersection);
   }
 
   final RegExp parseConcatExp() throws IllegalArgumentException {
-    RegExp e = parseRepeatExp();
-    if (more() && !peek(")|") && (!check(INTERSECTION) || !peek("&")))
-      e = makeConcatenation(flags, e, parseConcatExp());
-    return e;
+    return iterativeParseExp(
+        this::parseRepeatExp,
+        () -> (more() && !peek(")|") && (!check(INTERSECTION) || !peek("&"))),
+        RegExp::makeConcatenation);
+  }
+
+  /**
+   * Custom Functional Interface for a Supplying methods with signature of RegExp(int int1, RegExp
+   * exp1, RegExp exp2)
+   */
+  @FunctionalInterface
+  private interface MakeRegexGroup {
+    RegExp get(int int1, RegExp exp1, RegExp exp2);
+  }
+
+  final RegExp iterativeParseExp(
+      Supplier<RegExp> gather, BooleanSupplier stop, MakeRegexGroup reduce)
+      throws IllegalArgumentException {
+    Deque<RegExp> regExpStack = new ArrayDeque<>();

Review Comment:
Why do we need a stack/deque at all? If we need to further reduce the call-stack depth, then I think we need a **stack that is shared across function calls** and some more rewriting. But here I don't think we need a stack or any LIFO/FIFO operations. It should be just:
1. parse all the subcomponents
2. reduce them

So why not:
```
RegExp res = null;
do {
  RegExp e = gather.get();
  if (res == null) {
    // first operand: nothing to combine with yet
    res = e;
  } else {
    // fold the new operand into the running result
    res = reduce.get(flags, res, e);
  }
} while (stop.getAsBoolean());
```
I think this may alter the result a bit by changing it from `a | (b | (c | d))` to `((a | b) | c) | d`, but for `union`, `intersect` and `concat` the associativity shouldn't affect correctness?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
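For anyone following the thread, here is a minimal, self-contained sketch of the loop suggested in the review comment. It is not Lucene code: it uses plain `String` operands in place of `RegExp`, and the names `LeftFoldParseSketch`, `Combiner`, `parseLeftFold` and `parseUnion` are hypothetical, introduced only for illustration. The point is that no stack or deque is involved and that the result is grouped to the left.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

public class LeftFoldParseSketch {

  /** Hypothetical stand-in for the PR's MakeRegexGroup, minus the flags argument. */
  @FunctionalInterface
  interface Combiner {
    String combine(String left, String right);
  }

  /**
   * Parses one operand, then keeps folding further operands into the running
   * result for as long as hasMore reports (and consumes) another operator.
   */
  static String parseLeftFold(
      Supplier<String> gather, BooleanSupplier hasMore, Combiner reduce) {
    String res = null;
    do {
      String e = gather.get();
      res = (res == null) ? e : reduce.combine(res, e);
    } while (hasMore.getAsBoolean());
    return res;
  }

  /** Toy grammar (an assumption of this sketch): single-character operands joined by '|'. */
  static String parseUnion(String input) {
    int[] pos = {0}; // mutable cursor shared by the two lambdas
    Supplier<String> gather = () -> String.valueOf(input.charAt(pos[0]++));
    BooleanSupplier hasMore =
        () -> {
          if (pos[0] < input.length() && input.charAt(pos[0]) == '|') {
            pos[0]++; // consume the operator
            return true;
          }
          return false;
        };
    return parseLeftFold(gather, hasMore, (l, r) -> "(" + l + "|" + r + ")");
  }

  public static void main(String[] args) {
    // Left-associative grouping, unlike the right-nested recursive version.
    System.out.println(parseUnion("a|b|c|d")); // prints (((a|b)|c)|d)
  }
}
```

The parenthesised output makes the changed grouping visible. Whether that is acceptable for `RegExp` depends on nothing else in Lucene relying on the right-nested tree shape, which is the open question in the comment above.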