Re: New feature: -fdump-gimple-nodes (once more, with feeling)
On Tue, Feb 13, 2024 at 8:47 PM Robert Dubner wrote:
>
> I have not contributed to GCC before, so I am not totally sure how to
> go about it.
>
> So, I am letting you know what I want to do, so that I can get advice
> on a good way to do it. I have read
> https://gcc.gnu.org/contribute.html, and I have reviewed the GNU
> Coding Standards and the GCC additional coding standards, so I have
> some idea of what's needed. But there is a gulf between theory and
> practice, and I am hoping for guidance.
>
> Jim Lowden and I have been developing a COBOL front end for GCC. He's
> primarily been parsing the language. It's been my task to generate
> the GENERIC/GIMPLE trees for the parsed code. We've been working at
> this for a couple of years. We have reached the point where we want
> to start submitting patches for the community to evaluate.
>
> I figured I would start small, where "small" means mainly one new
> source code file of 1,580 lines.
>
> When I first started trying to generate GIMPLE trees to implement
> functions, it became clear to me that I needed to be able to
> reverse-engineer known good trees generated by the C front end. Oh, I
> could see what other front ends were doing in their source code. But
> I didn't know what the goal was. I wanted to see not just individual
> nodes, but how they all related to each other.
>
> There didn't seem to be any such functionality in GCC. I found a
> routine in print-tree.cc which printed out a single node, but I needed
> to understand the entire tree of nodes for a function. And I very
> quickly got tired -- very tired -- of trying to figure out the
> relationships between nodes, and I wanted more information than the
> print-tree routines were providing.
>
> So, I created the gcc/dump-gimple-nodes.cc source code, which
> implements the dump_gimple_nodes() function, which is controlled by
> the new -fdump-gimple-nodes GCC command-line option. That option
> hooks into the top of the gimplify_function_tree() function in
> gcc/gimplify.cc.

A first comment is that you seem to dump the GENERIC graph the frontend
feeds to the gimplifier. So this isn't GIMPLE just yet, so it possibly
should be dump_generic_nodes ().

We dump a textual representation at a similar state with
-fdump-tree-original. There's a -raw modifier that, for example for C,
streams

;; Function main (null)
;; enabled by -tree-original

@1      statement_list   0   : @2       1   : @3
@2      bind_expr        type: @4       body: @5
@3      return_expr      type: @4       expr: @6
@4      void_type        name: @7       algn: 8
@5      statement_list
@6      modify_expr      type: @8       op 0: @9       op 1: @10
@7      type_decl        name: @11      type: @4
@8      integer_type     name: @12      size: @13      algn: 32
                         prec: 32       sign: signed   min : @14
                         max : @15
...

I didn't track down where the C frontend triggers this or what utility
it uses in the end. It is also somewhat frontend specific, likely
before genericization.

I agree with Andi that these days something more structured might be
preferable (but your html example might be good to parse and click
through for a human).

> The dump_gimple_nodes() function does a depth-first walk of the
> specified function_decl, outputting each node once in a readable
> format. Each node gets an arbitrary identifying number. There are
> two output files; the first, "func_name.nodes", is pure text. After I
> got tired of endlessly searching through the text file for the next
> node of interest, I created the "func_name.nodes.html" file, which is
> the same information with internal hyperlinks between the nodes.
>
> Here are the first two nodes of a typical simple function:
>
> ***This is NodeNumber0
> (0x7f12e13b0d00) NodeNumber0
> tree_code: function_decl
> tree_code_class: tcc_declaration
> base_flags: static public
> type: NodeNumber1 function_type
> name: NodeNumber6410 identifier_node "main"
> context: NodeNumber107 translation_unit_decl "bigger.c"
> source_location: bigger.c:7:5
> uid: 3663
> initial(bindings): NodeNumber6411 block
> machine_mode: QI(15)
> align: 8
> warn_if_not_align: 0
> pt_uid: 3663
> raw_assembler_name: NodeNumber6410 identifier_node "main"
> visibility: default
> result: NodeNumber6412 result_decl
> function(pointer): 0x7f12e135d508
> arguments: NodeNumber6413 parm_decl "argc"
> saved_tree(function_body): NodeNumber6417 statement_list
> function_code: 0
> function_flags: public no_instrument_function_entry_exit
> ***This is NodeNumber1
> (0x7f12e13b3d20) NodeNumber1
> tree_code: function_type
> tree_code_class: tcc_type
> machine_mode: QI(15)
> type: NodeNumber2 integer_type
> address_space:0
> size(in bits): NodeNumber55 uint128 8
> size_unit(in bytes): NodeNumber12 uint64 1
> uid: 1515
> precision: 0
> contains_placeholder: 0
> align: 8
> warn_if_not_align: 0
> alias_set_type: -1
> canonical: NodeNu
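
For illustration, the walk described in the quoted message (depth-first, each node printed once, each node given an arbitrary number) can be sketched in Python. This is a toy model under stated assumptions, not the code in dump-gimple-nodes.cc: `Node`, `links` and `dump_nodes` are hypothetical names, and real tree nodes carry far more fields than a code and a set of named links.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a tree node: a tree code plus named links."""
    code: str
    links: dict = field(default_factory=dict)  # field name -> Node


def dump_nodes(root):
    """Return text lines describing every node reachable from root.

    Numbers are handed out in order of first discovery, so a node that is
    reachable along several paths (or through a cycle) is described once
    and referred to by number everywhere else.
    """
    numbers = {}  # id(node) -> assigned number
    order = []    # nodes in numbering order

    # Explicit stack instead of recursion: large graphs would otherwise
    # run into Python's recursion limit.
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in numbers:
            continue  # already numbered via another path
        numbers[id(node)] = len(order)
        order.append(node)
        # Push children in reverse so the first link is visited first.
        stack.extend(reversed(list(node.links.values())))

    lines = []
    for node in order:
        lines.append(f"***This is NodeNumber{numbers[id(node)]}")
        lines.append(f"tree_code: {node.code}")
        for name, child in node.links.items():
            lines.append(f"{name}: NodeNumber{numbers[id(child)]} {child.code}")
    return lines
```

The visited table keyed on object identity is what keeps shared nodes and self-references (a type whose canonical type is itself, say) from being printed twice or looping forever.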
Re: New feature: -fdump-gimple-nodes (once more, with feeling)
On Tue, 2024-02-13 at 23:40 -0800, Andi Kleen via Gcc wrote:
> Robert Dubner writes:
>
> > There didn't seem to be any such functionality in GCC. I found a
> > routine in print-tree.cc which printed out a single node, but I
> > needed to understand the entire tree of nodes for a function.
>
> FWIW the standard way to do this is to run the compiler in gdb with
> the .gdbinit in the object directory, set a suitable break point and
> then use pt etc to dump the trees. It prints all the fields and you
> can use the gdb command line to explore further.
>
> > ***This is NodeNumber0
> > (0x7f12e13b0d00) NodeNumber0
> > tree_code: function_decl
> > tree_code_class: tcc_declaration
>
> My suggestion if you go this route would be to generate some standard
> format like YAML or JSON that other tools can easily parse.

I'd love it if we had a JSON output for our IR.

FWIW, as of r14-6228-g3bd8241a1f1982 our JSON output routines can
nicely format the generated JSON in a way that I find very readable in
debugging (and I'm using this when debugging the analyzer).

Dave
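
As a rough idea of what a machine-readable form of such a dump could look like, here is a minimal Python sketch that writes a table of numbered nodes as indented JSON. The schema is an assumption made up for this example (a top-level "nodes" list, an "id" per node, and {"ref": N} for a link to node N); it is not GCC's JSON output and does not use GCC's JSON routines.

```python
import json


def nodes_to_json(nodes):
    """Serialize {node_number: {field: value}} as indented JSON text.

    Links to other nodes are expected to be stored as {"ref": N}, which
    lets a consumer tell a reference apart from a plain integer field.
    (Hypothetical schema, chosen only for this sketch.)
    """
    doc = {"nodes": [{"id": n, **fields} for n, fields in sorted(nodes.items())]}
    return json.dumps(doc, indent=2)


# Two nodes loosely based on the fields shown in the quoted dump.
example = {
    0: {"tree_code": "function_decl", "type": {"ref": 1}, "align": 8},
    1: {"tree_code": "function_type", "type": {"ref": 2}, "align": 8},
}
print(nodes_to_json(example))
```

Keeping references explicit rather than nesting child nodes inline matters here because the structure is a graph, not a tree: a nested encoding would duplicate shared nodes and could not represent cycles.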
Re: New feature: -fdump-gimple-nodes (once more, with feeling)
On Tue, Feb 13, 2024 at 01:46:11PM -0600, Robert Dubner wrote:
...
> An example of a complete dump is available at
> https://www.dubner.com/main.nodes.html. The C source code that
> generated it is available at the end of
> https://cobolworx.com/pages/dump-gimple-nodes.html
>

Hyperlinked text is useful. But I would love a graphical visualization
even more, e.g. via either Graphviz or PlantUML.

Regards,
Dimitar
Valid types for a binary op in GENERIC?
The ICE in PR analyzer/111441 is due to this assertion in
fold_binary_loc failing:

11722	  gcc_assert (TYPE_PRECISION (atype) == TYPE_PRECISION (type));

where code=MULT_EXPR, type=, and:

(gdb) p type
$1 =
(gdb) p atype
$2 =

due to the analyzer building a mult_expr node with those types for the
arguments.

I have a fix for this (by adding some missing casts within the
analyzer's svalue representation), but it got me wondering: is there a
way to check valid types for binary operations in GENERIC?

Looking at
https://gcc.gnu.org/onlinedocs/gccint/Unary-and-Binary-Expressions.html
I see that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR their "operands may
have either integral or floating type, but there will never be [sic]
case in which one operand is of floating type and the other is of
integral type."

Is it the case that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR, their
arguments *must* have the same precision? Or that types_compatible_p
is true? What about other binary operations?

FWIW I currently have this hacked-up assertion in my working copy:

const svalue *
region_model_manager::get_or_create_binop (tree type, enum tree_code op,
					   const svalue *arg0,
					   const svalue *arg1)
{
  if (arg0->get_type ()
      && arg1->get_type ()
      && op != POINTER_PLUS_EXPR)
    {
      // FIXME: what ops does this apply to?  MULT_EXPR?
      gcc_assert (types_compatible_p (arg0->get_type (), arg1->get_type ()));
    }

Is there a function to check type-compatibility of the args given a
particular enum tree_code?

Sorry if I'm missing something here
Dave
Re: Valid types for a binary op in GENERIC?
> On 14.02.2024 at 18:16, David Malcolm via Gcc wrote:
>
> The ICE in PR analyzer/111441 is due to this assertion in
> fold_binary_loc failing:
>
> 11722	  gcc_assert (TYPE_PRECISION (atype) == TYPE_PRECISION (type));
>
> where code=MULT_EXPR, type=, and:
>
> (gdb) p type
> $1 =
> (gdb) p atype
> $2 =
>
> due to the analyzer building a mult_expr node with those types for the
> arguments.
>
> I have a fix for this (by adding some missing casts within the
> analyzer's svalue representation), but it got me wondering: is there a
> way to check valid types for binary operations in GENERIC?
>
> Looking at
> https://gcc.gnu.org/onlinedocs/gccint/Unary-and-Binary-Expressions.html
> I see that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR their "operands may
> have either integral or floating type, but there will never be [sic]
> case in which one operand is of floating type and the other is of
> integral type."
>
> Is it the case that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR, their
> arguments *must* have the same precision? Or that types_compatible_p
> is true? What about other binary operations?
>
> FWIW I currently have this hacked-up assertion in my working copy:
>
> const svalue *
> region_model_manager::get_or_create_binop (tree type, enum tree_code op,
> 					   const svalue *arg0,
> 					   const svalue *arg1)
> {
>   if (arg0->get_type ()
>       && arg1->get_type ()
>       && op != POINTER_PLUS_EXPR)
>     {
>       // FIXME: what ops does this apply to?  MULT_EXPR?
>       gcc_assert (types_compatible_p (arg0->get_type (), arg1->get_type ()));
>     }
>
> Is there a function to check type-compatibility of the args given a
> particular enum tree_code?

No. The best source is the GIMPLE verifier in tree-cfg.cc.

> Sorry if I'm missing something here
> Dave
>
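
To make the idea of a per-tree-code operand check concrete, here is a small Python model of the kind of rule table a verifier encodes. Everything in it is an assumption for illustration: `Type`, `compatible` and `check_binop` are hypothetical, "compatible" is reduced to "same kind, precision and signedness", and the three rule groups are a simplified reading of the sort of checks such a verifier performs. The actual rules live in tree-cfg.cc and should be consulted there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Type:
    kind: str            # "integer", "real" or "pointer"
    precision: int       # width in bits
    unsigned: bool = False


def compatible(a, b):
    """Crude stand-in for a type-compatibility predicate (assumption)."""
    return a == b


# Assumed rule groups, deliberately incomplete.
SAME_TYPE_OPS = {"PLUS_EXPR", "MINUS_EXPR", "MULT_EXPR"}
SHIFT_OPS = {"LSHIFT_EXPR", "RSHIFT_EXPR"}


def check_binop(op, result, arg0, arg1):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    if op in SAME_TYPE_OPS:
        # Both operands are expected to match the result type.
        if not compatible(result, arg0):
            problems.append("operand 0 does not match the result type")
        if not compatible(result, arg1):
            problems.append("operand 1 does not match the result type")
    elif op in SHIFT_OPS:
        # The shift count only has to be integral; its width may differ.
        if not compatible(result, arg0):
            problems.append("operand 0 does not match the result type")
        if arg1.kind != "integer":
            problems.append("shift count is not integral")
    elif op == "POINTER_PLUS_EXPR":
        # Pointer base plus integer offset (simplified: any integer type).
        if arg0.kind != "pointer" or not compatible(result, arg0):
            problems.append("operand 0 is not a pointer matching the result")
        if arg1.kind != "integer":
            problems.append("offset is not integral")
    else:
        problems.append(f"no rule recorded for {op}")
    return problems
```

Under this model a MULT_EXPR whose operands differ in precision is flagged, which is the shape of the mismatch described in the quoted report, while a shift with a narrower count is accepted.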
RE: New feature: -fdump-gimple-nodes (once more, with feeling)
I have thought about a graphical representation more than once. Heck,
the connections between nodes are one of the things I needed to know in
the first place. And certainly the information necessary is all there
in the output I generate; I have drawn by hand pieces of the tree
connections many times.

But it doesn't seem to me to scale. A candidate for the absolute
minimally sized executable program one can write in C and compile with
GCC is

    void main(){}   /* I didn't say it would do anything *useful* */

The generic tree for that program has in excess of fifty nodes.

    #include <stdio.h>
    void main(){printf("hello, world\n");}

has in excess of 4,800 nodes because stdio.h was brought in.

Without a plotter that draws on the sides of buildings or on football
pitches (you'd sit in the stands with binoculars to read the results),
it's difficult for me to envision how the graphical representation
could be useful. I don't claim my imagination should be the limiting
factor. On the other hand, I don't think the compiler should be
generating that directly, anyway.

(I've managed to distract myself. Now I want to build a wheeled robot
that wanders around a football pitch drawing with colored chalk dust.)

My current takeaway from these responses -- thank you so much,
incidentally -- is that whatever utility I have created here would be
enhanced by JSON (one and a half "votes", so far) or YAML (half a vote)
output. Once the tree is available in JSON, separate utilities to take
that output and display it graphically would be straightforward.

Okay then. I'll change the naming from "*gimple*" to "*generic*" as
more accurate, and I'll generate JSON in addition to the other two
files.

Thanks again.

-----Original Message-----
From: Dimitar Dimitrov
Sent: Wednesday, February 14, 2024 11:31
To: Robert Dubner
Cc: 'GCC Mailing List'
Subject: Re: New feature: -fdump-gimple-nodes (once more, with feeling)

On Tue, Feb 13, 2024 at 01:46:11PM -0600, Robert Dubner wrote:
...
> An example of a complete dump is available at
> https://www.dubner.com/main.nodes.html. The C source code that
> generated it is available at the end of
> https://cobolworx.com/pages/dump-gimple-nodes.html
>

Hyperlinked text is useful. But I would love a graphical visualization
even more, e.g. via either Graphviz or PlantUML.

Regards,
Dimitar
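
A separate utility of the kind mentioned above could be quite small. The Python sketch below turns a JSON node dump into Graphviz DOT text and, to keep the picture readable given the node counts quoted, only follows links up to a chosen depth from a starting node. The input schema is an assumption invented for this example (a "nodes" list whose entries have an "id", a "tree_code", and links written as {"ref": N}); `json_to_dot`, `root` and `max_depth` are hypothetical names.

```python
import json
from collections import deque


def json_to_dot(text, root=0, max_depth=2):
    """Render the neighbourhood of one node as Graphviz DOT text.

    Nodes further than max_depth links away from root are left out, so
    a dump with thousands of nodes can be explored one region at a time.
    """
    nodes = {n["id"]: n for n in json.loads(text)["nodes"]}
    depth = {root: 0}        # node id -> distance from root
    queue = deque([root])    # breadth-first frontier
    decls, edges = [], []
    while queue:
        nid = queue.popleft()
        node = nodes[nid]
        decls.append(f'  n{nid} [label="{nid}: {node["tree_code"]}"];')
        if depth[nid] == max_depth:
            continue  # draw the node, but do not expand its links
        for name, value in node.items():
            if not (isinstance(value, dict) and "ref" in value):
                continue  # plain field, not a link
            target = value["ref"]
            if target not in nodes:
                continue  # dangling reference in a partial dump
            if target not in depth:
                depth[target] = depth[nid] + 1
                queue.append(target)
            edges.append(f'  n{nid} -> n{target} [label="{name}"];')
    return "\n".join(["digraph generic {"] + decls + edges + ["}"])
```

The resulting text can be passed to Graphviz, for example with `dot -Tsvg`, and re-run with a different `root` to look at another part of the graph.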