>> That is what "Discard all previous inputs" in stage 2 linking is for.
>
> But what does that mean? Are you saying that the linker interface to
> the plugin should change to work that way? If we do that, then we
> should change other aspects of the plugin interface as well. It could
> probably become quite a bit simpler.
>
> The only reason we would ever need to do a complete relink is if the LTO
> plugin can introduce arbitrary new symbol references. Is that ever
> possible? If it is, we need to rethink the whole approach. If the LTO
> plugin can introduce arbitrary new symbol references, that means that
> the LTO plugin can cause arbitrary objects to be pulled in from archives.
> And that means that if we only run the plugin once, we are losing
> possible optimizations, because the plugin will never see those new
> objects.
>
> My suspicion is that the LTO plugin can only introduce a small bounded
> set of new symbol references, namely those which we assume can be
> satisfied from -lc or -lgcc. Is that true?
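[As a concrete illustration of the bounded reference set in question
(my example, not from the thread): on a 32-bit target, GCC expands a
64-bit unsigned division into a call to the libgcc helper __udivdi3,
so the symbol reference exists only in the compiler's output, never in
the user's source.]

    /* The user never writes a call here, but on a 32-bit target such
       as i386, GCC emits "call __udivdi3" for this division, creating
       a symbol reference that appears only after compilation.  */
    unsigned long long
    div64 (unsigned long long a, unsigned long long b)
    {
      return a / b;
    }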
Exactly. The plugin API was designed for this model -- if you want to
start the link all over again, you may as well stick with the collect2
approach and enhance it to deal with archives of IR files.

The plugin API, as implemented in gold (not sure about gnu ld), does
maintain the original order of input files as far as symbol binding is
concerned. When IR files are claimed, the plugin provides the list of
symbols defined and referenced, and the linker builds the symbol table
as if those files were linked in at that particular spot in the command
line (see the sketch of this claim-file flow below). When the compiler
provides real definitions of those symbols later, the real definitions
simply replace the "placeholders" that were left in the linker's symbol
table. The only aspect of link order that isn't maintained is the
physical order of the sections in memory.

As Ian noted, if the compiler introduces new references that weren't
there before, the new references must come from a limited set of
libcalls that the backend can introduce, and those should all be
resolved with an extra pass through -lc or -lgcc. That's not exactly
pretty, but I don't see how it destroys the notion of link order -- the
only way those new symbols could have been resolved differently is if a
user library interposed definitions for the libcalls, and those
certainly can't be what the compiler intended to bind to.

In PR 12248, I think it's questionable to claim that the
compiler-introduced call to __udivdi3 should not resolve to the version
in libgcc. Sure, I understand it's useful for library developers while
debugging and testing, but an ordinary user certainly can't count on
his own definition of that routine being called -- the compiler might
generate the division inline, or call a different specialized version.
All of these routines are outside the user's namespace, and we should
be able to optimize without regard for what the user's libraries might
contain.

One improvement would be for the claim-file handler to determine what
libcalls might be introduced and add them to the list of referenced
symbols, so that the linker can bring in the definitions in the
original pass through the input files -- any that end up not being
referenced can be garbage collected (a sketch of this follows the one
below). Alternatively, we could do a whole-archive link of the library
that contains the libcalls, again discarding unreferenced routines via
garbage collection. Neither of these requires a change to the API.
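[To make the claim-file flow concrete, here is a minimal, untested
sketch against the linker plugin interface (include/plugin-api.h in
binutils). is_ir_file() and read_ir_symbols() are hypothetical helpers
standing in for the real IR readers.]

    #include <plugin-api.h>

    static ld_plugin_add_symbols add_symbols;  /* supplied by the linker */

    /* Called once per input file, in command-line order.  Claiming a
       file and calling add_symbols() makes the linker bind the symbols
       as if a real object had appeared at this spot on the command
       line; the entries act as placeholders until the compiler hands
       back real objects at all-symbols-read time.  */
    static enum ld_plugin_status
    claim_file_handler (const struct ld_plugin_input_file *file,
                        int *claimed)
    {
      struct ld_plugin_symbol *syms;
      int nsyms;

      *claimed = 0;
      if (!is_ir_file (file))                  /* hypothetical */
        return LDPS_OK;

      read_ir_symbols (file, &syms, &nsyms);   /* hypothetical */
      *claimed = 1;
      return add_symbols (file->handle, nsyms, syms);
    }

    /* Registration: the linker passes a transfer vector at load time.  */
    enum ld_plugin_status
    onload (struct ld_plugin_tv *tv)
    {
      for (; tv->tv_tag != LDPT_NULL; tv++)
        switch (tv->tv_tag)
          {
          case LDPT_REGISTER_CLAIM_FILE_HOOK:
            tv->tv_u.tv_register_claim_file (claim_file_handler);
            break;
          case LDPT_ADD_SYMBOLS:
            add_symbols = tv->tv_u.tv_add_symbols;
            break;
          default:
            break;
          }
      return LDPS_OK;
    }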
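[And the libcall pre-registration idea might look something like the
fragment below, added to the claim-file handler above. The fixed table
is a stand-in -- in practice the set of potential libcalls would have
to come from the backend.]

    #include <string.h>

    /* Hypothetical extension: pre-register libcalls the backend might
       introduce as undefined references, so the linker resolves them
       on the original pass through the inputs.  Any definitions that
       end up unreferenced can be garbage-collected later.  */
    static const char *const maybe_libcalls[] = { "__udivdi3", "__moddi3" };

    static enum ld_plugin_status
    add_libcall_refs (void *handle)
    {
      struct ld_plugin_symbol syms[2];
      int i;

      for (i = 0; i < 2; i++)
        {
          memset (&syms[i], 0, sizeof syms[i]);
          syms[i].name = (char *) maybe_libcalls[i];
          syms[i].def = LDPK_UNDEF;        /* a reference, not a definition */
          syms[i].visibility = LDPV_DEFAULT;
        }
      return add_symbols (handle, 2, syms);
    }

-cary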