Re: GSOC 2018 - Textual LTO dump tool project

Richard Biener Fri, 02 Mar 2018 01:49:39 -0800

On Fri, Mar 2, 2018 at 10:24 AM, Hrishikesh Kulkarni
<[email protected]> wrote:
> Hello everyone,
>
>
> Thanks for your suggestions and engaging response.
>
> Based on the feedback I think that the scope of this project comprises the
> following three indicative actions:
>
>
> 1. Creating a separate driver, i.e. a separate dump tool that uses the LTO
> object API for reading the LTO file.


Yes.  I expect this will take the whole first half of the project.  After
this you should be somewhat familiar with the infrastructure as well.  With
the existing dumping infrastructure it should be possible to dump the
callgraph and individual function bodies.

>
> 2. Extending LTO dump infrastructure:
>
> GCC already seems to have dump infrastructure for pretty-printing tree
> nodes, gimple statements etc. However I suppose we’d need to extend that for
> dumping pass summaries ? For instance, should we add a new hook say “dump”
> to ipa_opt_pass_d that’d dump the pass
> summary ?

That sounds like a good idea indeed.  I'm not sure if this is the most
interesting missing part - I guess we'll find out once a dump tool is
available.

> 3. Refactoring streaming API - Could you please elaborate more on what
> improvements could be made to the streaming API ? Would it be a good idea to
> make it more “C++ style” similar to iostream interface ? Also while going
> thru ipa-cp/ipa-prop I noticed the following in ipa_prop_read_functions(),
> which looks like some kind of “preamble” for setting up header to read the
> summary. Perhaps this could be abstracted into streaming API too ?
>
> const struct lto_function_header *header =
>
>    (const struct lto_function_header *) data;
>
>  const int cfg_offset = sizeof (struct lto_function_header);
>
>  const int main_offset = cfg_offset + header->cfg_size;
>
>  const int string_offset = main_offset + header->main_size;

This is a very hard task, so I suggest not venturing into the area of
refactoring the API for this project.
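
For reference, the preamble quoted above only carves the section into three
consecutive regions (CFG, main stream, strings) that follow the header.  A
minimal sketch of that computation with bounds checks is below.  Note that
struct fn_header is a simplified, hypothetical stand-in for
lto_function_header (the real struct carries more fields), and
compute_layout and struct section_layout are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for GCC's lto_function_header.
   Only the three sizes used by the quoted preamble are modelled.  */
struct fn_header {
  int32_t cfg_size;
  int32_t main_size;
  int32_t string_size;
};

/* Byte offsets of each region, relative to the start of the section.  */
struct section_layout {
  size_t cfg_offset;
  size_t main_offset;
  size_t string_offset;
  size_t end_offset;
};

/* Fill OUT from the header at the start of DATA (LEN bytes long).
   Returns 0 on success, -1 if the header is truncated, a size is
   negative, or the regions would run past the end of the section.  */
int compute_layout(const void *data, size_t len, struct section_layout *out)
{
  struct fn_header h;

  assert(out != NULL);
  if (len < sizeof h)
    return -1;
  memcpy(&h, data, sizeof h);   /* avoids alignment assumptions on DATA */
  if (h.cfg_size < 0 || h.main_size < 0 || h.string_size < 0)
    return -1;

  out->cfg_offset = sizeof h;
  out->main_offset = out->cfg_offset + (size_t) h.cfg_size;
  out->string_offset = out->main_offset + (size_t) h.main_size;
  out->end_offset = out->string_offset + (size_t) h.string_size;
  return out->end_offset <= len ? 0 : -1;
}
```

The only point of the sketch is that the three offsets are derived from the
header sizes in a fixed order; it is not a proposal for the streaming API.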

What I thought would be nice for debugging streamer issues is to
(optionally!) make the LTO bytecode (if you can call it that...) more
self-descriptive.  Currently the bytecode is simply a series of bytes, and
if the reading part doesn't match the writing part 1:1 you get garbage.  So
a first baby step would be to emit markers, like

 '0x00' raw byte follows
 '0x01' uhwi follows
 '0x02' bitpack follows, 'n' with N bits
 ...

Basically, look at data-streamer.[ch] as the lowest level of the stream encoding
and make it self-descriptive and thus "dumpable" independent of the LTO reader.
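
As a rough illustration of that first step, here is a minimal sketch in C of
a marker-tagged stream and a dumper that needs no knowledge of what the
writer meant.  Everything in it is hypothetical: the names (struct stream,
write_uhwi, dump_stream), the ULEB128 encoding of values, and the single
byte used for the bit count are assumptions made for the example, not the
actual data-streamer.[ch] interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical marker values, following the list above.  */
enum marker { MARK_RAW_BYTE = 0x00, MARK_UHWI = 0x01, MARK_BITPACK = 0x02 };

/* Toy fixed-size output buffer standing in for an output block.  */
struct stream {
  uint8_t buf[256];
  size_t len;
};

static void put_byte(struct stream *s, uint8_t b)
{
  assert(s->len < sizeof s->buf);
  s->buf[s->len++] = b;
}

/* Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last.  */
static void put_uleb(struct stream *s, uint64_t v)
{
  do {
    uint8_t b = v & 0x7f;
    v >>= 7;
    if (v)
      b |= 0x80;
    put_byte(s, b);
  } while (v);
}

/* Each writer emits its marker first, then the payload.  */
void write_raw_byte(struct stream *s, uint8_t b)
{
  put_byte(s, MARK_RAW_BYTE);
  put_byte(s, b);
}

void write_uhwi(struct stream *s, uint64_t v)
{
  put_byte(s, MARK_UHWI);
  put_uleb(s, v);
}

void write_bitpack(struct stream *s, unsigned nbits, uint64_t bits)
{
  put_byte(s, MARK_BITPACK);
  put_byte(s, (uint8_t) nbits);
  put_uleb(s, bits);
}

static uint64_t get_uleb(const uint8_t *p, size_t len, size_t *pos)
{
  uint64_t v = 0;
  unsigned shift = 0;
  while (*pos < len) {
    uint8_t b = p[(*pos)++];
    if (shift < 64)
      v |= (uint64_t) (b & 0x7f) << shift;
    if (!(b & 0x80))
      break;
    shift += 7;
  }
  return v;
}

/* Dump the stream using only the markers.  Returns the number of records
   printed, or -1 on an unknown marker or a truncated record.  */
int dump_stream(const uint8_t *p, size_t len, FILE *out)
{
  size_t pos = 0;
  int n = 0;
  while (pos < len) {
    uint8_t m = p[pos++];
    switch (m) {
    case MARK_RAW_BYTE:
      if (pos >= len)
        return -1;
      fprintf(out, "raw byte 0x%02x\n", (unsigned) p[pos++]);
      break;
    case MARK_UHWI:
      fprintf(out, "uhwi %llu\n", (unsigned long long) get_uleb(p, len, &pos));
      break;
    case MARK_BITPACK: {
      if (pos >= len)
        return -1;
      unsigned nbits = p[pos++];
      fprintf(out, "bitpack %u bits: 0x%llx\n", nbits,
              (unsigned long long) get_uleb(p, len, &pos));
      break;
    }
    default:
      return -1;
    }
    n++;
  }
  return n;
}
```

Writing a raw byte, a uhwi and a 4-bit bitpack and then calling dump_stream
on the buffer prints one line per record.  A reader/writer mismatch shows up
as an unexpected marker (dump_stream returns -1) rather than as silent
garbage.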

Then go to tree-streamer-*.c and do the same for the various tree parts,
add tree streamer specific 'markers' (again optionally, just for debugging).

Then go to lto-streamer-*.c and repeat.

I'm not sure if we should go all the way and do something like DWARF with
its abbrevs to optimize the encoding, given it's just for debugging, but it
might be interesting to get an approximate idea of the overhead of
streaming trees with full abbrevs.  I suppose it wouldn't be too bad.

As said the refactoring shouldn't be part of the project - 1. and 2. are
large enough already.

Richard.

>
> I would be grateful for suggestions, on how to proceed further, especially
> with modifying makefiles for creating the new driver. Unfortunately I have
> some school exams next week and won’t be able to work much on GCC during the
> period.
>
>
> Best Regards,
>
> Hrishikesh
>
>
>
> On Wed, Feb 28, 2018 at 4:05 PM, Martin Liška <[email protected]> wrote:
>>
>> On 02/25/2018 10:46 AM, Martin Jambor wrote:
>> > Hello Hrishikesh,
>> >
>> > I apologize for replying to you this late, this has been a busy week
>> > and now I am traveling.
>> >
>> > On Mon, Feb 19 2018, Hrishikesh Kulkarni wrote:
>> >> Hi,
>> >>
>> >> I am Hrishikesh Kulkarni currently studying as an undergrad student in
>> >> Computer Engineering at Pune University, India. I find compilers quite
>> >> interesting as a subject, and would like to apply to GSoC to gain some
>> >> understanding of how real-world compilers work. So far, I have managed
>> >> to build gcc and perform some simple tweaks to the codebase. In
>> >> particular, I would like to apply to the Textual LTO dump tool project.
>> >>
>> >
>> > I must say I am impressed by the research you have already done.
>> > Nevertheless, please note that Ray Kim has also expressed interest in
>> > the project.  Martin Liska will be the mentor, so I will let him drive
>> > the selection process.  On the other hand, Ray also liked another
>> > project, so maybe he will pick that and everyone will be happy.
>>
>> Hello.
>>
>> I'm really happy that there are multiple volunteers who want to work on
>> the LTO dump tool project. From what I have seen, I would like to have
>> Hrishikesh working on the project. He has experience with C, C++ and also
>> with Python, which can be put to good use for prototyping. Apart from
>> that, he has spent quite some time investigating LTO internals in GCC.
>>
>> That said, may I please ask the other candidates to look for a different
>> GSoC project among those we offered? I believe the other topics are also
>> interesting and important for the project.
>>
>> >
>> >> As far as I understand, the motivation for LTO framework was to enable
>> >> cross file interprocedural optimizations, and for this purpose an ipa
>> >> pass is divided into following three stages:
>> >>
>> >>    1. LGEN: The pass does a local analysis of the function and
>> >>    generates a “summary”, i.e., the information relevant to the pass,
>> >>    and writes it to LTO object file.
>> >
>> > A pass might do that, but the output of the whole stage is not just the
>> > pass summaries, it also writes the function IL (the function gimple
>> > statements, above all) to the object file.
>> >
>> >>    2. WPA: The LTO object files are given as input to the linker, which
>> >>    then invokes the lto1 frontend to perform global ipa analysis over
>> >>    the call-graph and write optimized summaries to LTO object files
>> >>    (partitioning). The global ipa analysis is done over summary and not
>> >>    the actual function bodies.
>> >
>> > Well... note that partitioning actually means dividing the whole
>> > compiled program/library into chunks that are then compiled
>> > independently in the LTRANS stage.  But you are basically right that WPA
>> > does also do whole-program analysis based on summaries and then writes
>> > its decisions to optimization summaries, yes.
>> >
>> >>    3. LTRANS: The partitions are read back, and the function bodies are
>> >>    reconstructed from summary and are then compiled to produce real
>> >>    object files.
>> >
>> > Function bodies and the summaries are distinct things.  The body
>> > consists of gimple statements and all the associated stuff (such as
>> > types, so a lot of stuff), whereas when we refer to summaries, we mean
>> > small chunks of data that interprocedural optimizations such as inlining
>> > or IPA-CP scurry away because they cannot feasibly work on bodies of the
>> > entire program.
>> >
>> > But apart from this terminology issue, you are basically correct, at the
>> > LTRANS stage, IPA passes apply transformations to the bodies according
>> > to the optimization summary generated by the WPA phase.  And then, all
>> > normal, intra-procedural passes and code generation runs.
>> >
>> >>
>> >>
>> >> If I understand correctly, the motivation for textual LTO dump tool is
>> >> to easily analyze contents of LTO object file, similar to readelf or
>> >> objdump ?
>>
>> Yes. Richi in previous email defined how that could be done.
>>
>> >
>> > That is how I understand it too, but Martin may have some further uses
>> > in mind.
>> >
>> >>
>> >> Assume that LTO object file contains in pureconst section: 0b0110 (0b
>> >> for binary prefix) corresponding to values of fs->pure_const_state and
>> >> fs->state_previously_known.
>> >>
>> >> If I understand correctly, the output of dump tool should then be:
>> >>
>> >> pure_const pass:
>> >>
>> >> pure_const_state = IPA_PURE (enum value of pure_const_state_e
>> >> corresponding to 0b01)
>> >>
>> >> state_previously_known = IPA_NEITHER (enum value of pure_const_state_e
>> >> corresponding to 0b10)
>> >>
>> >> Is this the expected output of the dump tool ?
>> >
>> > I think the tool would have to do a bit more than just dumping summaries
>> > of IPA passes.  I tend to think that the task should also include dumping
>> > gimple bodies (but we already do that in GCC and so it should be mostly
>> > easy) and also of types (that are merged as one of the first steps of
>> > WPA, and interesting things happen when merging).  And perhaps quite a
>> > bit more.  Martin?
>>
>> Yes, as we transitioned to early-debug info in LTO mode, printing tree
>> types that reside in LTO stream would help us to reduce the stream in the
>> future.
>>
>> >
>> >>
>> >> I am reasonably familiar working with C, C++ and python. My prior
>> >> experience includes opportunities to work in areas of NLP. Some of my
>> >> accomplishments in the area include presenting project VicharDhara- A
>> >> thought Mapper that was selected among top five ideas in Accenture
>> >> Innovation Challenge among 7000 nationwide entries. My paper on this
>> >> topic won the best paper award in IEEE Conference ICCUBEA-2017. My
>> >> previous work was focused on simple parsers, student psychology,
>> >> thought process detection for team selection.
>> >
>> > Interesting, congratulations.
>> >
>> >>
>> >> In the interim, I have been through a few docs on GCC and LTO [1][2][3]
>> >> and am trying to write a toy ipa pass to better understand LTO/IPA
>> >> infrastructure.
>> >
>> > Great, I believe that's exactly what my advice would be
>> >
>> >> I would be grateful for feedback on the textual LTO dump
>> >> tool.
>> >
>> > I hope that Martin will shed a bit more light on what output he
>> > envisions the tool to have.  I will talk to him about it too when I get
>> > back to the office (so maybe on Tuesday but probably on Wednesday).
>>
>> As mentioned above, Richard has already described this. The first step
>> would be to provide a write-only mode, where lto-dump only provides
>> verbose information usable for debugging.
>>
>> Another topic is the current LTO dumping infrastructure. I know Honza
>> does not like the interface. Maybe it can be improved with respect to
>> bitpack_d, and maybe some generalization can be done. Honza?
>>
>> Thanks,
>> Martin
>>
>> >
>> > Thanks,
>> >
>> > Martin
>> >
>> >
>> >
>> >>
>> >> [1] http://www.ucw.cz/~hubicka/slides/labs2013.pdf
>> >>
>> >> [2] https://gcc.gnu.org/wiki/LinkTimeOptimization
>> >>
>> >> [3] https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html
>> >>
>> >> My two recent publications are listed below:
>> >>
>> >> [A] Hrishikesh Kulkarni, "Contextual Data Representation Using Prime
>> >> Number Route Mapping Method and Ontology", IEEE Conference, ICCUBEA, 2017
>> >>
>> >> [B] Hrishikesh Kulkarni, "Multi-Graph based Intent Hierarchy Generation
>> >> to Determine Action Sequence", Springer Conference, ICDECT, December
>> >> 2017, Pune
>> >>
>> >> Thanks,
>> >>
>> >> Hrishikesh Kulkarni
>>
>
