> On Jul 14, 2015, at 8:25 AM, David Blaikie <[email protected]> wrote:
>
> On Mon, Jul 13, 2015 at 7:25 PM, Richard Smith <[email protected]> wrote:
> On Mon, Jul 13, 2015 at 6:02 PM, Adrian Prantl <[email protected]> wrote:
>
>> On Jul 13, 2015, at 5:47 PM, Richard Smith <[email protected]> wrote:
>>
>> On Mon, Jul 13, 2015 at 3:06 PM, Adrian Prantl <[email protected]> wrote:
>> > On Jul 13, 2015, at 2:00 PM, Eric Christopher <[email protected]> wrote:
>> >
>> > Hi Adrian,
>> >
>> > Finally getting around to looking at some of this and I think it's going
>> > in slightly the wrong direction. In general I think being -able- to put
>> > modules in object files to simplify wrapping, use, etc is a good thing. I
>> > think being required to do so is somewhat problematic.
>> >
>>
>> Let me start by noting that the current infrastructure already allows
>> selecting whether you want wrapped modules or not by passing the appropriate
>> PCHContainerOperations object to CompilerInstance. Clang currently
>> unconditionally uses an object file wrapper; all of clang-tools-extra
>> doesn’t. We could easily control the behavior of clang based on a (new)
>> command line option.
>>
>> But... on a platform with a shared module cache you always have to assume
>> that a module once built will eventually be used by a client that wants to
>> read the debug info. Think llvm-dsymutil: it does not know and does not
>> want to know how to build clang modules, but does want to read all the debug
>> info from a clang module.
>>
>> > Imagine, for example, you have a giant distributed build system...
>> >
>> > You'd want to create a pile of modules (that may reference/include/etc
>> > other modules) that don't or may not have debug information as part
>> > of them (because you might want to build without it or have the debug info
>> > alongside it as a separate compilation). Waiting on the full build of the
>> > module including debug is going to adversely affect your overall build
>> > time and so shouldn't be necessary - especially if you want to be able to
>> > have information separate ultimately.
>> >
>> > Make sense?
>>
>> Not sure if you would be saving much by having the debug info separately;
>> from what I’ve measured so far the debug info for a module makes up less
>> than 10% of the total size. Admittedly, build-time-wise going through the
>> backend to emit the object file is a lot more expensive than just dumping
>> the raw PCH. [1]
>>
>> Yeah, I think wanting to be able to control the behavior is reasonable, we
>> just need to be careful what the implications for consumers are. If we add,
>> e.g., an “-fraw-modules” [2] or switch to clang to turn off the object
>> file wrapping, I’d strongly suggest that we add the value of this switch to
>> the module hash (or add an optional “-g” to the module file name after the
>> hash or something like that) to avoid ugly race conditions between debug
>> info and non-debug-info builds of the same module. This way we’d have
>> essentially two separate module caches, with and without debug info.
>>
>> That's fine, I think (we don't use a module cache at all in our build
>> system; it doesn't really make much sense for a distributed build) and most
>> command-line flag changes already have this effect.
>
> Great!
>>
>> Would that work for you?
>> -- adrian
>>
>> [1] If you want to be serious about building the module debug info in
>> parallel to the rest of the build, you could even have a clang-based tool
>> import the just-built raw clang module and emit the debug info without
>> having to parse the headers again :-)
>>
>> That is what we intend to do :) (Assuming this turns out to actually be
>> faster than re-parsing; faulting in the entire contents of a module has much
>> worse locality than parsing.)
>>
>> [2] -fraw-modules, -fmodule-format-raw, -fmodule-debug-info, ...?
>> I would imagine that the driver enables module debug info when
>> “-gmodules” is present and by default on Darwin.
>>
>> That seems reasonable to me. For the frontend flag, I think a flag to turn
>> this on or to select the module format makes more sense than a flag to
>> switch to the raw format.
>
> Okay then let’s narrow this down. Other possibilities in that direction
> include (sorted from subjectively best to worst)
>
> -fmodule-format=obj
> -fmodule-debug-info
> -ffat-modules
> -fmodule-container
> -fmodule-container-object
>
> It's a -cc1 flag, so it doesn't really matter much. If this will eventually
> govern whether we put code for inline functions into the module, then I think
> we should avoid names like -fmodule-debug-info. Other than that, I don't
> really have a preference.
>
Unless the “=” part turns out to be an implementation nightmare, I think I’ll
be going with -fmodule-format=[raw,obj] then and implicitly emit debug info in
the obj case. If necessary, we can make this more fine-grained later.

> What you're picturing there is essentially a flag that would indicate if we
> should build all module-related-object-things into the module, or not? That
> seems like a useful broad flag (with an eventual corresponding compiler mode
> where we pass another flag and explicitly pass just the module and say "build
> a separate object with all the module-related-object-things - for use in a
> non-implicit-cache build)
>
> (Hmm, we're going to have a weird middle ground in here - where the IR for
> the inline functions needs to go in the module itself (as an
> available_externally definition for use in non-LTO compilations of dependent
> object files) and then the build-separate-module-related-object-things would
> turn those into (weak?) definitions, compile them (& the debug info) into a
> separate object file, to be linked in at the end)

Can you elaborate on this use case? Are you saying you’d want a module object
file with ast+bitcode and another one with bitcode'+debug info built from the
first one? Or one raw ast file and two object files?

>
> Should this just be keyed/defaulted off implicit/explicit modules, or
> orthogonal to that choice?
>> [One other thing... I think we may have made a mistake by putting the reader
>> and writer code behind the same interface: it forces tools that want to read
>> the module format to link against all of LLVM IR, code generation, and so
>> on, when all they really need is something like libObject.]
>
> We can always split it into two implementations of the interface or two
> interfaces, that’s not a very big deal. My assumption was that every tool
> that wants to read the clang module format also wants to create modules
> (because module cache... but as you noted that’s a Darwin-centric view) and
> more low-level tools like llvm-bcanalyzer could be piped through llvm-objdump.

-- adrian
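
For illustration only, here is a rough C++17 sketch of the two ideas discussed in this thread: accepting a -fmodule-format=[raw,obj] value, and keeping raw and obj builds of the same module apart in the cache by appending the optional “-g” after the hash in the file name. None of this is Clang's actual code. The names ModuleFormat, parseModuleFormat, hashOptions and cacheFileName are hypothetical, the FNV-1a hash merely stands in for whatever the real module hash is, and the ".pcm" extension is assumed.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical container formats mirroring the proposed raw/obj values.
enum class ModuleFormat { Raw, Obj };

// Parse a "-fmodule-format=<value>" argument.
// Returns std::nullopt for any other argument or an unknown value.
std::optional<ModuleFormat> parseModuleFormat(const std::string &arg) {
  const std::string prefix = "-fmodule-format=";
  if (arg.rfind(prefix, 0) != 0) // arg does not start with the prefix
    return std::nullopt;
  const std::string value = arg.substr(prefix.size());
  if (value == "raw")
    return ModuleFormat::Raw;
  if (value == "obj")
    return ModuleFormat::Obj;
  return std::nullopt;
}

// FNV-1a over the remaining options; an assumed stand-in for the module hash.
std::uint64_t hashOptions(const std::string &options) {
  std::uint64_t h = 14695981039346656037ULL;
  for (unsigned char c : options) {
    h ^= c;
    h *= 1099511628211ULL;
  }
  return h;
}

// Build the cache file name. The obj format (which implies debug info) gets a
// "-g" suffix after the hash, so raw and obj builds never share a file.
std::string cacheFileName(const std::string &moduleName,
                          const std::string &otherOptions,
                          ModuleFormat format) {
  std::string name =
      moduleName + "-" + std::to_string(hashOptions(otherOptions));
  if (format == ModuleFormat::Obj)
    name += "-g";
  return name + ".pcm";
}
```

With this scheme a debug-info build and a non-debug-info build of the same module with otherwise identical options resolve to different files, which is the "two separate module caches" effect described above. Mixing the format into the hash input instead of the file name would presumably work just as well.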
