On Wed, 2013-06-26 at 20:21 +0000, Joseph S. Myers wrote: > For a shared library you need a well-defined namespace for GCC functions / > variables so it doesn't interfere with users. Are you going to put > everything inside a "gcc" namespace or similar?
FWIW I deliberately avoided talking about API/ABIs in the document: removal of global state is IMHO a necessary but not sufficient condition for being able to use GCC as a shared library; I want to focus on the global variables. That said, putting everything in a "gcc" namespace sounds reasonable to me (apart from classes used solely within a single source file, which can go in anonymous namespaces, right?). > (You also need to avoid > host libraries such as libiberty - which have C interfaces - interfering > with users of the shared library. If the initial hosts for shared library > builds are ELF, I suppose you can do that with a version map to hide > everything not in the "gcc" namespace.) (nods) > Some observations specifically on your top-40 globals: FWIW I wonder to what extent the discussions that follow all exhibit a tradeoff between the desire to provide a clean API vs the desire to minimize the size of the patch (to reduce backporting pain). I guess I have a vested interest in the latter approach, since a smaller patch is likely to be easier for me to write. However it also helps people backporting to 4.8 and 4.7 > * global_options: where available, using a pointer to a gcc_options > structure is of course better than using a universe (some attributes could > be implemented more cleanly if there were per-function gcc_options > pointers, for example). Note also the various TARGET_* macros that are > generated to AND a variable with a mask - really, such macros should gain > a parameter that is a pointer to a struct gcc_options (and then you'd pass > &global_options to them if no more specific structure is available). Interesting. > * For dump_file and associated variables such as dump_flags, I sort of > think there should be a proper API for code writing to dumps rather than > directly accessing dump_file all over the compiler. That should massively > reduce the number of places needing to access those fields of the > universe. Right, but doing this would be a big patch, touching numerous files. At one extreme, a minimal patch (with no new API) would be something like this: #define dump_file GET_UNIVERSE().dump_file_ #define dump_flags GET_UNIVERSE().dump_flags_ The approach currently in my plan document is somewhat more invasive, to try to avoid all of the GET_UNIVERSE() TLS uses in a shared-library build. What would a new API look like? The classic problem with logging/dumping is that you want to avoid doing work in the no-logging case, so that rather than: log ("some message: %s", perform_some_expensive_computation ()); with enable/disable in "log" (and thus always doing the expensive work and typically discarding it), you have things like: if (logging) { log ("some message: %s", perform_some_expensive_computation ()); } If we're going to have a "logging" conditional there, that's going to involve passing around or looking up a variable, so why not simply make it the dump_file? FWIW I originally came up with a very simple API, but I filed it in the "rejected ideas" appendix: http://dmalcolm.fedorapeople.org/gcc/global-state/rejected-ideas.html#rejected-idea-dump-if-details > * For targetm, make it const (which would move any further enhancements to > it firmly into the domain of Andrew MacLeod's proposal and out of yours, > because once it's const it's not global state for a compiler only > supporting one target at a time). There are only a few places, for a few > targets, that write to targetm at runtime; I think it should be possible > to fix those by e.g. changing some data hooks in targetm into function > hooks. Interesting; I'll explore this. > * For stderr and stdout, really the compiler should only be using them > through narrow diagnostic interfaces and not other code using them > directly. Put them in global_dc (which would, I suppose, in turn go in > the universe)? A shared library user should be able to replace them with > streams opened with open_memstream, for example, rather than having > diagnostics go direct to stderr/stdout at all. Yes, this requires dealing > with implicit uses through functions such as printf as well as explicit > references directly in GCC. Doublechecking, I think many of the reported uses of stdout/stderr are false-positives, where stdout is merely used in gen* tools when building the compiler, which isn't a problem; also in various debug functions intended to be invoked from inside the debugger - again, not an issue. > * For input_location, you no doubt know we want almost everywhere to use > the location of some well-defined source-code construct instead of the > global (the diagnostic functions that implicitly use input_location should > go away completely, for example). But changing that is a lot of work so > things probably do need to start by putting it in the universe as you > suggest. Yes; this is a case of minimizing patch size. > * For flag_isoc99 and associated variables defined in c-common.c, move > them to Variable entries in c.opt (i.e., into global_options). In > general, if a variable describes state determined by command-line options, > moving it into global_options is probably sensible. > > * For asm_out_file, see dump_file, stdout and stderr above - there should > be a well-defined API for writing assembler output and only the > implementation of that API should refer directly to this global. In the plan I suggested using TLS for this in the shared-library build, out of a desire to minimize the patch size. An alternative approach would be to move much of output.h into a class; something like this: class MAYBE_SINGLETON asm_out { public: void putc (char ); void puts (const char *); void printf (const char *, ...) some_attribute; /* Most out output.h gets moved to here. */ void assemble_zeros (unsigned HOST_WIDE_INT); void assemble_align (int); void assemble_string (const char *, int); void assemble_external_libcall (rtx); void assemble_label (FILE *, const char *); // etc private: FILE *asm_out_file; }; and have an "asm_out_" in the universe. The TLS approach would be much less work for me, so I prefer it. > * For lang_hooks, see targetm and make it const. Some hooks are > deliberately modified in free_lang_data - the answer to that is, I think, > to stop using those hooks directly except via a wrapper that checks > whether free_lang_data has been called (one new global) and decides > whether to call the langhook based on that. Interesting; again, I'll explore this approach. Thanks for the great feedback Dave