[ 
https://issues.apache.org/jira/browse/LUCENE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-10296.
----------------------------------
    Fix Version/s: 10.0 (main)
       Resolution: Fixed

> Stop minimizing regexps
> -----------------------
>
>                 Key: LUCENE-10296
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10296
>             Project: Lucene - Core
>          Issue Type: Task
>    Affects Versions: 10.0 (main)
>            Reporter: Robert Muir
>            Priority: Major
>             Fix For: 10.0 (main)
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> In current trunk, we let the caller (e.g. RegExpQuery) try to "reduce" the
> expression. Neither the parser nor the low-level executors implicitly call
> exponential-time algorithms anymore.
> But now that we have cleaned this up, we can see it is even worse than just
> calling {{determinize()}}. We still call {{minimize()}}, which is much
> crazier and does much more work.
> We stopped doing this for all other AutomatonQuery subclasses a long time
> ago, as we determined that it didn't help performance. Additionally,
> minimization vs. determinization matters even less than in the early days,
> when we ran into trouble: the representation has gotten a lot better. Today,
> when you call {{finishState}}, we do a lot of practical sorting/coalescing
> on the fly. We also added the fancy UTF32-to-UTF8 automaton converter, which
> makes the worst-case space per state significantly lower than it was before.
> So why {{minimize()}}?
> Let's just replace {{minimize()}} calls with {{determinize()}} calls. I've
> already swapped them out for all of {{src/test}}, to get Jenkins looking
> for issues ahead of time.
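
To illustrate why a determinized automaton is already enough to run a match, here is a minimal, self-contained sketch of subset construction. It is a toy under stated assumptions and not Lucene's automaton code: the names DeterminizeSketch, Nfa, Dfa, addTransition and newState are hypothetical, the NFA has no epsilon transitions, and there is no work limit of any kind.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy illustration only; this is NOT Lucene's Automaton API. */
final class DeterminizeSketch {

  /** NFA without epsilon transitions: state -> (char -> set of target states). */
  static final class Nfa {
    final Map<Integer, Map<Character, Set<Integer>>> delta = new HashMap<>();
    final Set<Integer> accept = new TreeSet<>();
    final int start;

    Nfa(int start) {
      this.start = start;
    }

    void addTransition(int from, char c, int to) {
      delta.computeIfAbsent(from, k -> new HashMap<>())
          .computeIfAbsent(c, k -> new TreeSet<>())
          .add(to);
    }
  }

  /** DFA: state 0 is the start state; a missing transition means reject. */
  static final class Dfa {
    final List<Map<Character, Integer>> delta = new ArrayList<>();
    final List<Boolean> accept = new ArrayList<>();

    boolean accepts(String s) {
      int state = 0;
      for (char c : s.toCharArray()) {
        Integer next = delta.get(state).get(c);
        if (next == null) {
          return false;
        }
        state = next;
      }
      return accept.get(state);
    }

    int numStates() {
      return delta.size();
    }
  }

  /** Subset construction: each DFA state stands for a set of NFA states. */
  static Dfa determinize(Nfa nfa) {
    Dfa dfa = new Dfa();
    Map<Set<Integer>, Integer> ids = new HashMap<>();
    Deque<Set<Integer>> work = new ArrayDeque<>();

    Set<Integer> startSet = new TreeSet<>(Set.of(nfa.start));
    ids.put(startSet, newState(dfa, nfa, startSet));
    work.add(startSet);

    while (!work.isEmpty()) {
      Set<Integer> current = work.remove();
      // Group the targets of all member states by input character.
      Map<Character, Set<Integer>> moves = new HashMap<>();
      for (int s : current) {
        nfa.delta.getOrDefault(s, Map.of()).forEach(
            (c, targets) -> moves.computeIfAbsent(c, k -> new TreeSet<>()).addAll(targets));
      }
      for (Map.Entry<Character, Set<Integer>> e : moves.entrySet()) {
        Integer target = ids.get(e.getValue());
        if (target == null) {
          // First time this subset is seen: allocate a DFA state and queue it.
          target = newState(dfa, nfa, e.getValue());
          ids.put(e.getValue(), target);
          work.add(e.getValue());
        }
        dfa.delta.get(ids.get(current)).put(e.getKey(), target);
      }
    }
    return dfa;
  }

  /** Appends a DFA state; it accepts if any NFA state in the subset accepts. */
  private static int newState(Dfa dfa, Nfa nfa, Set<Integer> subset) {
    dfa.delta.add(new HashMap<>());
    dfa.accept.add(subset.stream().anyMatch(nfa.accept::contains));
    return dfa.delta.size() - 1;
  }
}
```

For example, an NFA for "ab|ac" built with two separate branches (0 -a-> 1 -b-> 3 and 0 -a-> 2 -c-> 4, with 3 and 4 accepting) determinizes to four states in this sketch. A minimal DFA for the same language would merge the two accepting states, but the four-state result already matches exactly the same strings, which is the trade-off the description argues for.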



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

