[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140590#comment-17140590 ]
Tomoko Uchida commented on LUCENE-9286:
---------------------------------------

Can we have a benchmark for the kuromoji (or nori) module in Lucene? The Kuromoji (and Nori) analysis modules rely heavily on FST to find candidate dictionary entries, so their analysis performance is directly influenced by FST changes. Kuromoji (or Nori) performance benchmarks could catch performance degradation in FST, although they are only an indirect means of detecting problems in FST.

> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-9286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9286
>             Project: Lucene - Core
>          Issue Type: Bug
>   Affects Versions: 8.5
>            Reporter: Dawid Weiss
>            Assignee: Bruno Roustant
>            Priority: Major
>             Fix For: 8.6
>
>         Attachments: screen-[1].png
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed
> for bit tables. I am pretty sure this didn't require so much memory before
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?