PUBLIC
Hi Matt,
Thanks for your help so far!
One vacation later, I am back looking at this. Unfortunately, the latest
results I am seeing only confuse me more.
I have this small test load of a 4313-module forest that I am typechecking. The
baseline resource usage, i.e. before any tricks about rehydrating the
ModDetails in the HPT, is 1 GB maximum residency, 113s MUT time and 87s GC
time. My aim is to reduce the maximum residency with as little disruption as
possible to the total runtime.
My first test was the completely brute-force approach of rehydrating every
single ModDetails in the HPT after typechecking every single module. Of course,
this has catastrophic runtime performance, since I end up
re-re-re-re-rehydrating every ModDetails for a total of 8,443,380 times (not
counting the initial rehydration just after typechecking to put it in the HPT).
So I get 290s MUT time, 252s GC time. But, the max residency goes down to 490
MB, showing that the idea, at least in principle, has legs.
So far so good. But then my problem starts -- how do I get this max residency
improvement with acceptable runtime? My idea was that when typechecking a
module, it should only unfold parts of ModDetails that are its transitive
dependencies, so it should be enough to rehydrate only those ModDetails. Since
this still results in 3,603,206 rehydrations, I shouldn't be too optimistic
about its performance, but it should still cut the overhead in half. When I try
this out, I get MUT time of 257s, GC time of 186s. However, the max residency
is 883 MB! But how is it possible that max residency is not the same 490 MB?!?!
Does that mean typechecking a module can unfold parts of ModDetails that are
not transitive dependencies of it? How would I track this down?
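To make the "rehydrate only the transitive dependencies" selection concrete, here is a minimal sketch of the reachability query it amounts to. It is not GHC code: `ModName`, `DepGraph`, `transitiveDeps` and `exampleGraph` are hypothetical names over a toy import graph, and the sketch says nothing about why the residency differs.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-in for GHC's ModuleName.
type ModName = String

-- Toy import graph: each module mapped to its direct imports.
type DepGraph = Map.Map ModName [ModName]

-- All modules reachable from `root` through imports. `root` itself is
-- only included if it is reachable via a cycle, which an acyclic
-- module graph does not have.
transitiveDeps :: DepGraph -> ModName -> Set.Set ModName
transitiveDeps graph root = go Set.empty (directDeps root)
  where
    directDeps m = Map.findWithDefault [] m graph

    go seen [] = seen
    go seen (m : ms)
      | m `Set.member` seen = go seen ms
      | otherwise           = go (Set.insert m seen) (directDeps m ++ ms)

-- A imports B and C, both of which import D; E only imports D.
exampleGraph :: DepGraph
exampleGraph = Map.fromList
  [ ("A", ["B", "C"])
  , ("B", ["D"])
  , ("C", ["D"])
  , ("D", [])
  , ("E", ["D"])
  ]
```

With this graph, `transitiveDeps exampleGraph "A"` contains B, C and D but not E, so under this policy E's details would be left alone while typechecking A.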
For reference, here is how I do the rehydration of the HPT; let me know if it
seems fishy:
```
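recreateModDetailsInHpt :: HscEnv -> [ModuleName] -> IO ()
recreateModDetailsInHpt hsc_env mods = do
    -- Snapshot of the current table (shadows the where-bound `hpt`).
    hpt <- readIORef hptr
    -- Tie the knot: store the table being built in the IORef before its
    -- entries are recreated (the lambda needs BlockArguments).
    fixIO \hpt' -> do
        writeIORef hptr hpt'
        traverse recreate_hmi hpt
    pure ()
  where
    hpt@HPT{ table = hptr } = hsc_HPT hsc_env
    recreate_hmi hmi@(HomeModInfo iface _details linkable)
      | moduleName mod `elem` mods
      = do
          -- The bang forces the fresh ModDetails to WHNF only.
          !fresh_details <- genModDetails hsc_env iface
          pure $ HomeModInfo iface fresh_details linkable
      | otherwise
      = pure hmi
      where
        mod = mi_module iface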
```
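For readers unfamiliar with the `fixIO` pattern used above, here is a toy, self-contained sketch of the same knot-tying idea outside GHC. Everything in it (`Entry`, `Table`, `freshDetails`, `rehydrate`, `demo`) is hypothetical, and modelling lazily built details with `unsafeInterleaveIO` is an assumption made for illustration, not a description of how `genModDetails` is implemented.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.Map.Lazy as Map
import System.IO (fixIO)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Toy stand-in for a HomeModInfo: dependency names plus a lazily
-- computed "details" value (1 + the sum of the dependencies' details).
data Entry = Entry
  { entryDeps    :: [String]
  , entryDetails :: Int
  }

type Table = Map.Map String Entry

-- Build an unevaluated details value that reads the table only when it
-- is forced. (Map.!) is partial: every dependency must be in the table.
freshDetails :: IORef Table -> [String] -> IO Int
freshDetails ref deps = unsafeInterleaveIO $ do
  table <- readIORef ref
  pure (1 + sum [entryDetails (table Map.! d) | d <- deps])

-- Replace the details of the named entries with fresh thunks.
rehydrate :: IORef Table -> [String] -> IO ()
rehydrate ref names = do
  old <- readIORef ref
  _ <- fixIO $ \new -> do
    -- Publish the table that is still being built. This is safe only
    -- because nothing below forces `new` before fixIO returns.
    writeIORef ref new
    Map.traverseWithKey recreate old
  pure ()
  where
    recreate name e
      | name `elem` names = do
          d <- freshDetails ref (entryDeps e)
          pure e { entryDetails = d }
      | otherwise = pure e

-- A depends on B; both start with a stale placeholder value of 0.
demo :: IO (Int, Int)
demo = do
  ref <- newIORef (Map.fromList [("A", Entry ["B"] 0), ("B", Entry [] 0)])
  rehydrate ref ["A", "B"]
  table <- readIORef ref
  pure (entryDetails (table Map.! "A"), entryDetails (table Map.! "B"))
```

In this toy, forcing A's details after `rehydrate` reads the new table and therefore sees B's fresh details rather than the stale placeholder. If anything forced `new` inside the `fixIO` block, the knot would fail at runtime.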
In summary, my questions going forward are:
* How come rehydrating transitive dependencies doesn't help as much for max
residency as rehydrating all already-loaded modules?
* What exactly does GHC itself do to use this new mutable HPT feature to good
effect? I'm sure it doesn't suffer from the above-described quadratic slowdown.
Thanks for the tip on the other two memory usage improvement MRs -- I haven't
had time yet to backport them. !12582 in particular seems like it will need
quite a bit of work to be applied on 9.8.
Unfortunately, I couldn't get eventlog2html to work -- if I pass an .hp file
with the `-p` parameter, I get an HTML file that claims "This eventlog was
generated without heap profiling.".
Thanks,
Gergo
From: Matthew Pickering <[email protected]>
Sent: Thursday, January 23, 2025 5:51 PM
To: Erdi, Gergo <[email protected]>
Cc: ÉRDI Gergő <[email protected]>; Zubin Duggal <[email protected]>;
Montelatici, Raphael Laurent <[email protected]>; GHC Devs
<[email protected]>
Subject: [External] Re: GHC memory usage when typechecking from source vs.
loading ModIfaces
That's good news.
I don't think the first idea will do very much as there are other references to
the final "HomeModInfo" not stored in the HPT.
Have you constructed a time profile to determine why the runtime is higher?
With the second approach you are certainly trading space usage for repeating
work.
If you actually do have a forest, then ideally you would replace the ModDetails
after it will never be used again.
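One way to find the point after which a module's ModDetails "will never be used again" is to compute, for a fixed processing order, the last step that needs each module. The following is a minimal sketch under the assumptions that the order is known up front and that each module's transitive dependencies are given. `lastUse`, `evictionSchedule` and `exampleOrder` are hypothetical names, not GHC functions.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-in for GHC's ModuleName.
type ModName = String

-- Index of the last step that needs each module: a step needs the module
-- being typechecked and all of its transitive dependencies.
lastUse :: [(ModName, Set.Set ModName)] -> Map.Map ModName Int
lastUse order = Map.fromListWith max
  [ (m, i)
  | (i, (self, deps)) <- zip [0 ..] order
  , m <- self : Set.toList deps
  ]

-- For each step, the modules whose details could be reset (or evicted)
-- once that step has finished.
evictionSchedule :: [(ModName, Set.Set ModName)] -> [(ModName, [ModName])]
evictionSchedule order =
  [ (self, [m | (m, j) <- Map.toAscList uses, j == i])
  | (i, (self, _)) <- zip [0 ..] order
  ]
  where
    uses = lastUse order

-- Dependencies first: D, then B and C, then A (root of one tree),
-- then E (a second root that only needs D).
exampleOrder :: [(ModName, Set.Set ModName)]
exampleOrder =
  [ ("D", Set.empty)
  , ("B", Set.fromList ["D"])
  , ("C", Set.fromList ["D"])
  , ("A", Set.fromList ["B", "C", "D"])
  , ("E", Set.fromList ["D"])
  ]
```

In the example, A, B and C become droppable as soon as A is done, while the shared module D has to stay until E is finished. This also illustrates why the benefit depends on how tree-like the workload is.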
You are likely also missing other patches important for memory usage.
* https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12582
* https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12347
I can't comment on the 17 HPTs; what do the retainer stacks look like in
ghc-debug?
PS. Please use eventlog2html so the profiles are readable! You can use it on
.hp profiles.
Cheers,
Matt
On Thu, Jan 23, 2025 at 3:19 AM Erdi, Gergo <mailto:[email protected]> wrote:
PUBLIC
Hi Matt & Zubin,
Thanks for the help on this so far!
I managed to hack the linked MR onto 9.8.4 (see
https://gitlab.haskell.org/cactus/ghc/-/tree/cactus/backport-13675)
and basically it seems to do what it says on the tin on a small example (see
attached heap profile examples for typechecking 4313 modules), but I am unsure
how to actually use it.
So my understanding of the improvement here is that since now there is only one
single HPT [*], I should be able to avoid unnecessary ballooning by doing two
things:
• Evicting `HomeModInfo`s wholesale from the HPT that are not going to be
needed anymore, because I am done with all modules that would transitively
depend on them. This of course only makes sense when typechecking a forest.
• Replacing remaining `HomeModInfo`s with new ones that contain the same
ModIface but with the ModDetails replaced by a fresh one from
initModDetails.
The attached `-after` profile shows typechecking with both of these ideas
implemented. The first one doesn’t seem to help much on its own, but it’s
tricky to evaluate that because it is very dependent on the shape of the
workload (how tree-y it is). But the second one shows some serious promise in
curtailing memory usage. However, it is also very slow – even on this small
example, you can see its effect. On my full 35k+ module example, it more than
doubles the runtime.
What would be a good policy on when to replace ModDetails with thunks to avoid
both the space leak and excessive rehydration churn?
Also, perhaps unrelated, perhaps not – what’s with all those lists?!
Thanks,
Gergo
[*] BTW is it normal that I am still seeing several (17 in a small test case
involving a couple hundred modules) HPT constructors in the heap? (I hacked it
locally to be a datatype instead of a newtype just so I can see it in the
heap). I expected to see only one.
From: Matthew Pickering <mailto:[email protected]>
Sent: Tuesday, January 21, 2025 8:24 PM
To: ÉRDI Gergő <mailto:[email protected]>
Cc: Zubin Duggal <mailto:[email protected]>; Erdi, Gergo
<mailto:[email protected]>; Montelatici, Raphael Laurent
<mailto:[email protected]>; GHC Devs <mailto:[email protected]>
Subject: [External] Re: GHC memory usage when typechecking from source vs.
loading ModIfaces
Thanks Gergo, I think that unless we have access to your code base or a
realistic example then the before vs after snapshot will not be so helpful.
It's known that `ModDetails` will leak space like this.
Let us know how it goes for you.
Cheers,
Matt
On Fri, Jan 17, 2025 at 11:30 AM ÉRDI Gergő <mailto:[email protected]> wrote:
On Fri, 17 Jan 2025, Matthew Pickering wrote:
> 1. As Zubin points out we have recently been concerned with improving the
> memory usage of large module sessions (#25511, !13675, !13593)
>
> I imagine all these patches will greatly help the memory usage in your use
> case.
I'll try these out and report back.
> 2. You are absolutely right that ModDetails can get forced and is never reset.
>
> If you try !13675, it should be much more easily possible to reset the
> ModDetails by writing into the IORef which stores each home package.
Yes, that makes sense.
> 3. If you share your example or perhaps even a trace from ghc-debug then I
> will be happy to investigate further as it seems like a great test case for
> the work we have recently been doing.
Untangling just the parts that exercise the GHC API from all the other
in-house bits will be quite a lot of work. But if just a ghc-debug
snapshot of e.g. a small example from scratch vs. from existing ModIfaces
would be helpful (with e.g. the top HscEnv at the time of finishing all
typechecking as a saved closure), I can provide that no prob.
Thanks,
Gergo
________________________________________
This email and any attachments are confidential and may also be privileged. If
you are not the intended recipient, please delete all copies and notify the
sender immediately. You may wish to refer to the incorporation details of
Standard Chartered PLC, Standard Chartered Bank and their subsidiaries together
with Standard Chartered Bank’s Privacy Policy via our public website.
_______________________________________________
ghc-devs mailing list
[email protected]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs