tang-hi opened a new pull request, #12286:
URL: https://github.com/apache/lucene/pull/12286

   ### Description
   
   In Issue #12284, I observed that Lucene limits the recursion level in the 
`topoSortStatesRecurse` method to avoid a StackOverflow error during automaton 
topological sorting. I propose we could use an iterative approach instead of 
recursion.
   I've implemented an iterative version as shown below. 
   ````
   private static int topoSortStatesRecurse(
         Automaton a, BitSet visited, int[] states) {
         
       Stack<Integer> stack = new Stack<>();
       stack.push(0); // Assuming that the initial state is 0.
       int upto = 0;
       Transition t = new Transition();
   
       while (!stack.empty()) {
           int state = stack.pop();
   
           int count = a.initTransition(state, t);
           for (int i = 0; i < count; i++) {
               a.getNextTransition(t);
               if (!visited.get(t.dest)) {
                   visited.set(t.dest);
                   stack.push(t.dest);
               }
           }
           states[upto] = state;
           upto++;
       }
       return upto;
   }
   ````
   However, I noticed that the test 
[TestAutomaton.java](https://github.com/apache/lucene/blob/963ed7ce888724c2dd55fff8c13a08b81b20f535/lucene/core/src/test/org/apache/lucene/util/automaton/TestAutomaton.java#LL1205C7-L1205C7)
 depends on the order in which recursive calls are made or items are added to 
and removed from the stack. To maintain the same order as the recursive version 
in the iterative approach, so I've used a particular technique in the pull 
request.
   
   To further improve this change, here are my plans:
   1. Remove the trick and modify the test code accordingly.
   2. Perhaps we should check if the automaton contains cycles, and throw an 
`IllegalArgumentException` if it does?
   3. Should we continue to limit the size of the Automaton?
   I welcome any suggestions or feedback on this approach.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to