[ 
https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078075#comment-17078075
 ] 

Dawid Weiss commented on LUCENE-9286:
-------------------------------------

I love the patch and the idea, Bruno. But the cost of construction and 
traversal has gone up considerably for me. On a slowish Dell laptop I get the 
following on master (same code as in that tiny repo before):

{code}
FST construction (of=0.8)     2s 299ms   9.4%    11s
TermEnum scan (of=0.8)           391ms   1.6%    14s

FST construction (of=1.0)           5s  24.0%    14s
TermEnum scan (of=1.0)        3s 938ms  16.1%    20s
{code}

whereas after the patch, FST compilation and enumeration at of=1.0 each go up 
to a whopping 35+ seconds (the of=0.8 timings are essentially unchanged):
{code}
FST construction (of=0.8)     2s 357ms   2.6%    13s
TermEnum scan (of=0.8)           457ms   0.5%    16s

FST construction (of=1.0)          37s  41.8%    16s
TermEnum scan (of=1.0)             35s  39.6%    54s
{code}

This is a bit strange, isn't it? I don't think I've changed anything regarding 
FST parameters:
https://github.com/dweiss/lucene9286/commit/1fb899e018712e9637984d95937c50b4bc9ffa97#diff-32e7e4311056421eef9f3a87a4ec51f7R56-R66

I'm not sure where the difference comes from and why it's so slow -- I'll try 
to get an execution profile later today.

> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-9286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9286
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 8.5
>            Reporter: Dawid Weiss
>            Assignee: Bruno Roustant
>            Priority: Major
>         Attachments: screen-[1].png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

