On Tue, Mar 6, 2018 at 4:02 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> On Tue, Mar 6, 2018 at 2:30 PM, Hrishikesh Kulkarni
>> <hrishikeshpa...@gmail.com> wrote:
>> > Hi,
>> >
>> > Thank you Richard and Honza for the suggestions. If I understand correctly,
>> > the issue is that LTO file format keeps changing per compiler versions, so
>> > we need a more “stable” representation and the first step for that would be
>> > to “stabilize” representations for lto-cgraph and symbol table ?
>>
>> Yes. Note the issue is that the current format is a 1:1 representation of
>> the internal representation -- which means it is the internal representation
>> that changes frequently across releases. I'm not sure how Honza wants
>> to deal with those changes in the context of a "stable" IL format. Given
>> we haven't been able to provide a stable API to plugins I think it's much
>> harder to provide a stable streaming format for all the IL details....
>>
>> > Could you
>> > please elaborate on what initial steps need to be taken in this regard, and
>> > if it’s feasible within GSoC timeframe ?
>>
>> I don't think it is feasible in the GSoC timeframe (nor do I think it's
>> feasible at all ...)
>
> I skipped this, with GSoC timeframe I fully agree. With feasibility at all
> not so much - LLVM documents its bitcode to a reasonable extent
> https://llvm.org/docs/BitCodeFormat.html
>
> Reason why I mentioned it is that I would like to use this as an excuse to get
> things incrementally cleaned up and it would be nice to keep it in mind while
> working on this.

Ok. It's probably close enough to what I recommended doing with respect to
make the LTO bytecode "self-descriptive" -- thus start with making the
structure documented and parseable without assigning semantics to every
bit ;) I think that can be achieved top-down in a very incremental way if
you get the bottom implemented first (the data-streamer part).

Richard.

> Honza
>>
>> > Thanks!
>> >
>> >
>> > I am trying to break down the project into milestones for the proposal. So
>> > far, I have identified the following objectives:
>> >
>> > 1] Creating a separate driver, that can read LTO object files. Following
>> > Richard’s estimate, I’d leave around first half of the period for this
>> > task.
>> >
>> > Would that be OK ?
>>
>> Yes.
>>
>> > Coming to 2nd half:
>> >
>> > 2] Dumping pass summaries.
>> >
>> > 3] Stabilizing lto-cgraph and symbol table.
>>
>> So I'd instead do
>>
>> 3] Enhance the user-interface of the driver
>>
>> like providing a way to list all function bodies, a way to dump
>> the IL of a single function body, a way to create a dot graph file
>> for the cgraph in the file, etc.
>>
>> Basically while there's a lot of dumping infrastructure in GCC
>> it may not always fit the needs of a LTO IL dumping tool 1:1
>> and may need refactoring enhancement.
>>
>> Richard.
>>
>> >
>> > Thanks,
>> >
>> > Hrishikesh
>> >
>> >
>> >
>> > On Fri, Mar 2, 2018 at 6:31 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> >>
>> >> Hello,
>> >> > On Fri, Mar 2, 2018 at 10:24 AM, Hrishikesh Kulkarni
>> >> > <hrishikeshpa...@gmail.com> wrote:
>> >> > > Hello everyone,
>> >> > >
>> >> > >
>> >> > > Thanks for your suggestions and engaging response.
>> >> > >
>> >> > > Based on the feedback I think that the scope of this project comprises
>> >> > > of following three indicative actions:
>> >> > >
>> >> > >
>> >> > > 1. Creating separate driver i.e. separate dump tool that uses lto
>> >> > > object API for reading the lto file.
>> >> >
>> >> > Yes.
>> >> > I expect this will take the whole first half of the project,
>> >> > after this you should be somewhat familiar with the infrastructure
>> >> > as well. With the existing dumping infrastructure it should be
>> >> > possible to dump the callgraph and individual function bodies.
>> >> >
>> >> > >
>> >> > > 2. Extending LTO dump infrastructure:
>> >> > >
>> >> > > GCC already seems to have dump infrastructure for pretty-printing tree
>> >> > > nodes, gimple statements etc. However I suppose we’d need to extend
>> >> > > that for dumping pass summaries ? For instance, should we add a new
>> >> > > hook say “dump” to ipa_opt_pass_d that’d dump the pass summary ?
>> >> >
>> >> > That sounds like a good idea indeed. I'm not sure if this is the most
>> >> > interesting missing part - I guess we'll find out once a dump tool is
>> >> > available.
>> >>
>> >> Concerning the LTO file format my longer term aim is to make the symbol
>> >> table sections (symtab used by lto-plugin as well as the callgraph
>> >> section), and hopefully also the Gimple streams, documented and well
>> >> behaving without changing the format in every revision.
>> >>
>> >> On the other hand the summaries used by individual passes are intended
>> >> to be pass specific and evolving as individual passes become stronger
>> >> or new passes are added.
>> >>
>> >> It is quite a lot of work to stabilize the gimple representation to this
>> >> extent. For the callgraph and symbol table this is however more
>> >> realistic. That would mean moving some of the existing random stuff
>> >> streamed there into summaries and additionally cleaning up/rewriting
>> >> lto-cgraph so the on disk format actually makes sense.
>> >>
>> >> I will be happy to help with any steps in this direction as well.
>> >>
>> >> Honza
>> >
>> >