Alan Woodward created LUCENE-9930:
-------------------------------------

             Summary: UkrainianMorfologikAnalyzer reloads its Dictionary for 
every new TokenStreamComponents instance
                 Key: LUCENE-9930
                 URL: https://issues.apache.org/jira/browse/LUCENE-9930
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Alan Woodward
            Assignee: Alan Woodward


Large static data structures should be loaded in the Analyzer constructor and 
shared between threads. UkrainianMorfologikAnalyzer instead loads its 
dictionary in `createComponents`, so the dictionary is reloaded and stored 
for every new analysis thread. With a large dictionary and highly concurrent 
indexing, this can exhaust memory, because each thread holds its own copy of 
the dictionary in a thread local.
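
To illustrate the difference, here is a minimal sketch in plain Java. It does 
not use the Lucene API: `Dictionary`, `Components`, `Analyzer` and both 
analyzer subclasses are hypothetical stand-ins, and the per-thread cache is 
only assumed to behave roughly like Lucene's per-thread component reuse. It 
counts how often the dictionary is loaded when several threads ask for 
components.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class SharedDictionaryDemo {

    /** Hypothetical stand-in for a large, immutable dictionary. */
    static final class Dictionary {
        static final AtomicInteger LOADS = new AtomicInteger();
        final Map<String, String> lemmas;

        private Dictionary(Map<String, String> lemmas) {
            this.lemmas = lemmas;
        }

        /** Pretends to do an expensive load and counts each call. */
        static Dictionary load() {
            LOADS.incrementAndGet();
            return new Dictionary(Map.of("examples", "example"));
        }
    }

    /** Hypothetical stand-in for the per-thread analysis components. */
    static final class Components {
        final Dictionary dictionary;

        Components(Dictionary dictionary) {
            this.dictionary = dictionary;
        }
    }

    /** Simplified analyzer: components are created once per thread and cached. */
    abstract static class Analyzer {
        private final ThreadLocal<Components> perThread =
            ThreadLocal.withInitial(this::createComponents);

        abstract Components createComponents();

        final Components components() {
            return perThread.get();
        }
    }

    /** Problem pattern: every new thread triggers another dictionary load. */
    static final class PerThreadLoadAnalyzer extends Analyzer {
        @Override
        Components createComponents() {
            return new Components(Dictionary.load());
        }
    }

    /** Fixed pattern: load once in the constructor, share the instance. */
    static final class SharedLoadAnalyzer extends Analyzer {
        private final Dictionary dictionary = Dictionary.load();

        @Override
        Components createComponents() {
            return new Components(dictionary);
        }
    }

    /** Builds one analyzer, uses it from `threads` threads, returns the load count. */
    static int countLoads(Supplier<Analyzer> factory, int threads) {
        Dictionary.LOADS.set(0);
        Analyzer analyzer = factory.get();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> analyzer.components());
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return Dictionary.LOADS.get();
    }

    public static void main(String[] args) {
        System.out.println("per-thread loads: " + countLoads(PerThreadLoadAnalyzer::new, 4));
        System.out.println("shared loads: " + countLoads(SharedLoadAnalyzer::new, 4));
    }
}
```

With four threads, the first variant loads the dictionary four times and the 
second loads it once. Sharing is only safe here because the stand-in 
dictionary is never modified after loading.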


