[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17073959#comment-17073959
 ] 

Dawid Weiss commented on LUCENE-9286:
-------------------------------------

Hi Bruno. Thank you for looking into it. The problem is not during construction 
of the FST but later on - when the FST is used. Our algorithms keep a 
significant number of arcs in memory. Previously these were cheap to hold, now 
they are not: arc.copyOf copies the entire underlying bit table:

bq. What was previously fairly cheap (copyOf) has become fairly heavy and blows 
up memory when you have data structures that require storing intermediate Arcs 
during processing

I didn't look into this, but if these bit tables are immutable once the FST is 
constructed, then copyOf could just copy the reference. As a side note, copyOf 
doesn't fully reset the state of the target arc (for example, it doesn't clear 
the bit table reference when the copied arc has no bit table).
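
To illustrate the difference, here is a minimal sketch. It does not use Lucene's 
actual FST.Arc code: SketchArc, copyOfDeep and copyOfShared are hypothetical 
names made up for this example, and it assumes the bit table is never mutated 
once the FST is built. It also shows the shared-reference copy clearing a stale 
bit table when the source arc has none.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an FST arc; this is NOT Lucene's FST.Arc. */
final class SketchArc {
    int label;
    long[] bitTable; // null when the arc has no bit table

    /** Deep copy: every copy allocates its own array (the expensive case). */
    SketchArc copyOfDeep(SketchArc other) {
        label = other.label;
        bitTable = other.bitTable == null
            ? null
            : Arrays.copyOf(other.bitTable, other.bitTable.length);
        return this;
    }

    /**
     * Shared copy: only the reference is copied. Safe only under the
     * assumption that the table is immutable after construction.
     * Assigning unconditionally also resets a stale reference to null
     * when the source arc has no bit table.
     */
    SketchArc copyOfShared(SketchArc other) {
        label = other.label;
        bitTable = other.bitTable;
        return this;
    }
}

final class ArcCopySketch {
    public static void main(String[] args) {
        SketchArc source = new SketchArc();
        source.label = 'a';
        source.bitTable = new long[1024];

        SketchArc deep = new SketchArc().copyOfDeep(source);
        SketchArc shared = new SketchArc().copyOfShared(source);

        // Deep copy owns a separate array; shared copy points at the same one.
        System.out.println("deep copy has its own table: " + (deep.bitTable != source.bitTable));
        System.out.println("shared copy reuses table:    " + (shared.bitTable == source.bitTable));

        // Copying from an arc without a bit table must clear the old reference.
        SketchArc plain = new SketchArc();
        plain.label = 'b';
        shared.copyOfShared(plain);
        System.out.println("stale table cleared:         " + (shared.bitTable == null));
    }
}
```

With many intermediate arcs held in memory, the deep variant costs one array 
per stored arc, while the shared variant costs one reference per stored arc.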


> FST construction explodes memory in BitTable
> --------------------------------------------
>
>                 Key: LUCENE-9286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9286
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 8.5
>            Reporter: Dawid Weiss
>            Assignee: Bruno Roustant
>            Priority: Major
>         Attachments: screen-[1].png
>
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?



