[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078431#comment-17078431 ]

Bruno Roustant edited comment on LUCENE-9286 at 4/8/20, 4:10 PM:
-----------------------------------------------------------------

That's strange. In the PR I integrated your code to recompile and walk the FST. 
To run it, call TestFSTDirectAddressing.main() with "-recompileAndWalk" as the 
first argument and the path to an FST as the second. I used the gzipped FST you 
provided, "fst-17291407798783309064.fst.gz".

Before the patch, I got roughly the same performance numbers as the ones you 
shared previously. With the patch, I can verify that the performance is fixed:
{code:java}
Reading FST
time = 402 ms

FST construction (oversizingFactor=0.0)
time = 1302 ms
FST RAM = 56055936 B
FST enum
time = 322 ms

FST construction (oversizingFactor=1.0)
time = 1235 ms
FST RAM = 54945816 B
FST enum
time = 239 ms 
{code}
Can you run this TestFSTDirectAddressing.main()?

I ran it on the master branch. Should I run it on branch 8x to reproduce your 
environment?


> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-9286
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9286
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 8.5
>            Reporter: Dawid Weiss
>            Assignee: Bruno Roustant
>            Priority: Major
>         Attachments: screen-[1].png
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction 
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed 
> for bit tables. I am pretty sure this didn't require so much memory before 
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?


