Hi Simon,

>> With Shake, it's possible to run it single-threaded and dump a list of
>> all system commands performed. With a little care (provided your
>> Haskell Shake code doesn't query the global environment in
>> non-obvious ways), this could be used as the initial bootstrap. It's
>> not been done before, but I see no reason it won't work.
>>
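To make that concrete, a rough sketch of the command-dumping idea using
today's Shake API - 'traceCmd', 'bootstrap.sh' and the file layout are
all made up:

  import Development.Shake
  import Development.Shake.FilePath

  -- Hypothetical wrapper around Shake's 'cmd': run the command as
  -- normal, but also append it to a shell script. With shakeThreads = 1
  -- the script comes out in a usable linear order for bootstrapping.
  traceCmd :: String -> [String] -> Action ()
  traceCmd exe args = do
      liftIO $ appendFile "bootstrap.sh" (unwords (exe : args) ++ "\n")
      cmd exe args

  main :: IO ()
  main = shake shakeOptions{shakeThreads = 1} $ do
      want ["Main.o"]
      "*.o" %> \out -> do
          let src = takeBaseName out <.> "hs"
          need [src]
          traceCmd "ghc" ["-c", src, "-o", out]
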
>> Yhc died because of its choice of build system, but I'm sure that
>> won't happen to GHC - it will just suck away fun and valuable Haskell
>> hacking time. But Shake has other benefits, such as profiling. With
>> detailed profile statistics about your build system you can usually
>> make sensible modifications and get some build-time performance
>> improvement. Similarly, I'm sure Shake would increase the parallelism
>> of the build.
>
> I remember you made that claim before, but I don't remember the reasoning
> for it - why do you think there's more parallelism to be extracted?  As far
> as I know we already give make a pretty accurate dependency graph; the only
> place where we are a bit inaccurate is in the dependencies between the
> configure steps for packages, but those are quite a small part of the
> overall build time.  Oh, and we could calculate dependencies on Haskell
> source files individually rather than all at once, but that really only
> impacts rebuilds, and it could be fixed.
>
> Perhaps make's scheduling algorithm is suboptimal for our graph.  I seem to
> recall Shake does some kind of random scheduling - is that right?

There are a number of reasons I think Shake would be faster:

* Random scheduling is a big win (~20%, I think), since it tends to
interleave disk-heavy and CPU-heavy work. Ian's suggestion of
"remembering" how long a task takes would be a lot of work, whereas
randomness tends to paper over any discrepancies quite nicely.

* Shake can extract the maximum parallelism from the dependency graph,
whereas I've seen that make doesn't do as well.
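
A tiny illustration using Shake's standard 'need' (the file names are
made up): everything passed to a single 'need' call is built in
parallel, up to the thread limit.

  import Development.Shake

  linkRule :: Rules ()
  linkRule = "prog" %> \out -> do
      let os = ["A.o", "B.o", "C.o"]   -- hypothetical objects
      need os                          -- all three can build at once
      cmd "ghc -o" [out] os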

* I've previously heard that the GHC build doesn't really use all the
available CPUs while compiling? If that's true, then I suspect Shake
could push it up closer to 100%, or at least the Shake profiling tools
would let you figure out where the bottlenecks are. Getting Shake to
max out at least one of disk, memory, or CPU hasn't been very difficult
whenever I've tried it.
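
Turning profiling on is a one-liner - this assumes a recent Shake where
'shakeReport' takes a list of output files (older versions spell it
slightly differently):

  import Development.Shake

  -- After the build, report.html shows a timeline of which rules ran
  -- when, so idle CPUs show up as gaps.
  main :: IO ()
  main = shake shakeOptions{shakeReport = ["report.html"]} $ do
      want ["out.txt"]
      "out.txt" %> \out ->
          cmd "touch" [out]   -- stand-in for the real work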

* The GHC build currently proceeds in phases, which naturally causes
complete serialisation; with Shake you might be able to interleave the
phases accurately (although perhaps you really do need all of phase 1
complete before starting on phase 2).
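
Roughly what I have in mind, with entirely hypothetical file names: if
each stage-2 object depends only on the stage-1 compiler, rather than
on a "stage 1 is completely done" barrier, the two stages can overlap.

  import Development.Shake
  import Development.Shake.FilePath

  stageRules :: Rules ()
  stageRules = do
      "stage1/ghc" %> \out -> do
          need ["stage1/Main.o"]           -- plus the rest of stage 1
          cmd "ghc -o" [out] ["stage1/Main.o"]
      "stage2/*.o" %> \out -> do
          let src = "stage2" </> takeBaseName out <.> "hs"
          need [src, "stage1/ghc"]         -- the only cross-stage edge
          cmd "stage1/ghc" ["-c", src, "-o", out]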

But my basic experience is that Shake gives really good parallelism,
whereas make usually doesn't - I just don't trust make.

Thanks, Neil
