[ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078431#comment-17078431 ]
Bruno Roustant edited comment on LUCENE-9286 at 4/8/20, 4:10 PM:
-----------------------------------------------------------------
That's strange. In the PR I integrated your code to recompile and walk the FST.
See TestFSTDirectAddressing.main(): pass "-recompileAndWalk" as the first
argument and the path to an FST as the second. I used the gzipped FST you
provided, "fst-17291407798783309064.fst.gz".
Before the patch, I got roughly the same perf as the numbers you shared
previously. With the patch, I can verify that the perf is fixed:
{code:java}
Reading FST
time = 402 ms
FST construction (oversizingFactor=0.0)
time = 1302 ms
FST RAM = 56055936 B
FST enum
time = 322 ms
FST construction (oversizingFactor=1.0)
time = 1235 ms
FST RAM = 54945816 B
FST enum
time = 239 ms
{code}
Can you run TestFSTDirectAddressing.main() this way?
I ran it on the master branch. Should I run it on branch 8x to reproduce your
environment?
> FST arc.copyOf clones BitTables and this can lead to excessive memory use
> -------------------------------------------------------------------------
>
> Key: LUCENE-9286
> URL: https://issues.apache.org/jira/browse/LUCENE-9286
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 8.5
> Reporter: Dawid Weiss
> Assignee: Bruno Roustant
> Priority: Major
> Attachments: screen-[1].png
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> I see a dramatic increase in the amount of memory required for construction
> of (arguably large) automata. It currently OOMs with 8GB of memory consumed
> for bit tables. I am pretty sure this didn't require so much memory before
> (the automaton is ~50MB after construction).
> Something bad happened in between. Thoughts, [~broustant], [~sokolov]?