Re: New feature: -fdump-gimple-nodes (once more, with feeling)

2024-02-14 Thread Richard Biener via Gcc
On Tue, Feb 13, 2024 at 8:47 PM Robert Dubner wrote:
>
> I have not contributed to GCC before, so I am not totally sure how to go
> about it.
>
> So, I am letting you know what I want to do, so that I can get advice on a
> good way to do it.  I have read https://gcc.gnu.org/contribute.html, and I
> have reviewed the Gnu Coding Standards and the GCC additional coding
> standards, so I have some idea of what's needed.  But there is a gulf
> between theory and practice, and I am hoping for guidance.
>
> Jim Lowden and I have been developing a COBOL front end for GCC.  He's
> primarily been parsing the language.  It's been my task to generate the
> GENERIC/GIMPLE trees for the parsed code.  We've been working at this for
> a couple of years.  We have reached the point where we want to start
> submitting patches for the community to evaluate.
>
> I figured I would start small, where "small" means mainly one new source
> code file of 1,580 lines.
>
> When I first started trying to generate GIMPLE trees to implement
> functions, it became clear to me that I needed to be able to
> reverse-engineer known good trees generated by the C front end.  Oh, I
> could see what other front ends were doing in their source code.  But I
> didn't know what the goal was.  I wanted to see not just individual nodes,
> but how they all related to each other.
>
> There didn't seem to be any such functionality in GCC.  I found a routine
> in print-tree.cc which printed out a single node, but I needed to
> understand the entire tree of nodes for a function.  And I very quickly
> got tired -- very tired -- of trying to figure out the relationships
> between nodes, and I wanted more information than the print-tree routines
> were providing.
>
> So, I created the gcc/dump-gimple-nodes.cc source code, which implements
> the dump_gimple_nodes() function, which is controlled by the new
> -fdump-gimple-nodes GCC command-line option.  That option hooks into the
> top of the gimplify_function_tree() function in gcc/gimplify.cc.

A first comment is that you seem to dump the GENERIC graph the frontend
feeds to the gimplifier.  This isn't GIMPLE just yet, so the function should
possibly be named dump_generic_nodes ().

We dump a textual representation at a similar state with -fdump-tree-original.
There's also a -raw modifier, which for C, for example, emits:

;; Function main (null)
;; enabled by -tree-original

@1      statement_list   0   : @2       1   : @3
@2      bind_expr        type: @4       body: @5
@3      return_expr      type: @4       expr: @6
@4      void_type        name: @7       algn: 8
@5      statement_list
@6      modify_expr      type: @8       op 0: @9       op 1: @10
@7      type_decl        name: @11      type: @4
@8      integer_type     name: @12      size: @13      algn: 32
                         prec: 32       sign: signed   min : @14
                         max : @15
...

I didn't track down where the C frontend triggers this or what utility it
uses in the end.  It is also somewhat frontend specific, likely happening
before genericization.

I agree with Andi that these days something more structured might be
preferable (though your HTML example might be good for a human to parse and
click through).

> The dump_gimple_nodes() function does a depth-first walk of the specified
> function_decl, outputting each node once in a readable format.  Each node
> gets an arbitrary identifying number.  There are two output files; the
> first, "func_name.nodes", is pure text.  After I got tired of endlessly
> searching through the text file for the next node of interest, I created
> the "func_name.nodes.html" file, which is the same information with
> internal hyperlinks between the nodes.
>
> Here are the first two nodes of a typical simple function:
>
> ***This is NodeNumber0
> (0x7f12e13b0d00) NodeNumber0
> tree_code: function_decl
> tree_code_class: tcc_declaration
> base_flags: static public
> type: NodeNumber1 function_type
> name: NodeNumber6410 identifier_node "main"
> context: NodeNumber107 translation_unit_decl "bigger.c"
> source_location: bigger.c:7:5
> uid: 3663
> initial(bindings): NodeNumber6411 block
> machine_mode: QI(15)
> align: 8
> warn_if_not_align: 0
> pt_uid: 3663
> raw_assembler_name: NodeNumber6410 identifier_node "main"
> visibility: default
> result: NodeNumber6412 result_decl
> function(pointer): 0x7f12e135d508
> arguments: NodeNumber6413 parm_decl "argc"
> saved_tree(function_body): NodeNumber6417 statement_list
> function_code: 0
> function_flags: public no_instrument_function_entry_exit
> ***This is NodeNumber1
> (0x7f12e13b3d20) NodeNumber1
> tree_code: function_type
> tree_code_class: tcc_type
> machine_mode: QI(15)
> type: NodeNumber2 integer_type
> address_space:0
> size(in bits): NodeNumber55 uint128 8
> size_unit(in bytes): NodeNumber12 uint64 1
> uid: 1515
> precision: 0
> contains_placeholder: 0
> align: 8
> warn_if_not_align: 0
> alias_set_type: -1
> canonical: NodeNu
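
The depth-first, print-each-node-once walk described in the quoted text can
be illustrated with a small self-contained C++ sketch.  It does not use GCC's
tree type or the actual dump-gimple-nodes.cc code; Node and NodeDumper are
hypothetical stand-ins, and the output format only loosely imitates the
"NodeNumberN" style shown above.  The point is the bookkeeping: a node gets
its number on first visit, before its fields are followed, so shared and
cyclic references are emitted once and referred to by number afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for a tree node: a code name plus labelled outgoing edges.
// (Hypothetical type; real GENERIC nodes carry far more information.)
struct Node
{
  std::string code;
  std::vector<std::pair<std::string, const Node *>> fields;
};

// Numbers each reachable node on first visit and records one text entry
// per node, indexed by that number.
class NodeDumper
{
public:
  std::size_t walk (const Node *n)
  {
    assert (n != nullptr);          // this sketch does not model null fields
    auto it = ids_.find (n);
    if (it != ids_.end ())
      return it->second;            // already numbered: just refer to it

    std::size_t id = ids_.size ();
    ids_.emplace (n, id);           // number before recursing (handles cycles)
    lines_.emplace_back ();         // reserve this node's slot

    std::string text = "NodeNumber" + std::to_string (id)
                       + " tree_code: " + n->code;
    for (const auto &f : n->fields)
      text += "\n  " + f.first + ": NodeNumber"
              + std::to_string (walk (f.second));

    lines_[id] = text;              // index again; recursion may reallocate
    return id;
  }

  const std::vector<std::string> &lines () const { return lines_; }

private:
  std::unordered_map<const Node *, std::size_t> ids_;
  std::vector<std::string> lines_;
};
```

Calling walk () on the root and then printing lines () in order gives one
entry per node; a hyperlinked variant would only need to wrap each
"NodeNumberN" reference in an anchor.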

Re: New feature: -fdump-gimple-nodes (once more, with feeling)

2024-02-14 Thread David Malcolm via Gcc
On Tue, 2024-02-13 at 23:40 -0800, Andi Kleen via Gcc wrote:
> Robert Dubner writes:
> 
> > There didn't seem to be any such functionality in GCC.  I found a
> > routine
> > in print-tree.cc which printed out a single node, but I needed to
> > understand the entire tree of nodes for a function.
> 
> FWIW the standard way to do this is to run the compiler in gdb with
> the .gdbinit in the object directory, set a suitable break
> point and then use pt etc to dump the trees. It prints all the fields
> and you can use the gdb command line to explore further.
> 
> > ***This is NodeNumber0
> > (0x7f12e13b0d00) NodeNumber0
> > tree_code: function_decl
> > tree_code_class: tcc_declaration
> 
> My suggestion if you go this route would be to generate
> some standard format like YAML or JSON that other tools
> can easily parse.

I'd love it if we had a JSON output for our IR.  FWIW, as of
r14-6228-g3bd8241a1f1982 our JSON output routines can nicely format the
generated JSON in a way that I find very readable in debugging (and I'm
using this when debugging the analyzer).

Dave
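
As a rough illustration of what a JSON rendering of such a node dump could
look like, here is a self-contained C++ sketch.  It deliberately does not use
GCC's own JSON routines mentioned above; NodeRecord, json_escape and
nodes_to_json are hypothetical names, and the schema (an array of objects
with "id", "tree_code" and {"ref": N} links) is only an assumption, not a
proposed format.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flattened node record, as a dumper might collect it.
struct NodeRecord
{
  unsigned id;
  std::string code;                                    // e.g. "function_decl"
  std::vector<std::pair<std::string, unsigned>> refs;  // field name -> node id
};

// Escape the characters that JSON does not allow raw inside a string.
static std::string json_escape (const std::string &s)
{
  std::string out;
  for (unsigned char c : s)
    {
      if (c == '"' || c == '\\')
        {
          out += '\\';
          out += static_cast<char> (c);
        }
      else if (c == '\n')
        out += "\\n";
      else if (c < 0x20)
        {
          char buf[8];
          std::snprintf (buf, sizeof buf, "\\u%04x",
                         static_cast<unsigned> (c));
          out += buf;
        }
      else
        out += static_cast<char> (c);
    }
  return out;
}

// Render the records as a JSON array, one object per node.
// Assumes field names are unique within a node.
std::string nodes_to_json (const std::vector<NodeRecord> &nodes)
{
  std::string out = "[";
  for (std::size_t i = 0; i < nodes.size (); ++i)
    {
      const NodeRecord &n = nodes[i];
      if (i != 0)
        out += ",";
      out += "\n  {\"id\": " + std::to_string (n.id)
             + ", \"tree_code\": \"" + json_escape (n.code) + "\"";
      for (const auto &r : n.refs)
        out += ", \"" + json_escape (r.first) + "\": {\"ref\": "
               + std::to_string (r.second) + "}";
      out += "}";
    }
  out += nodes.empty () ? "]" : "\n]";
  return out;
}
```

Representing links as {"ref": N} rather than nesting the referenced node
keeps the output finite even when the node graph contains cycles.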



Re: New feature: -fdump-gimple-nodes (once more, with feeling)

2024-02-14 Thread Dimitar Dimitrov
On Tue, Feb 13, 2024 at 01:46:11PM -0600, Robert Dubner wrote:
...
> An example of a complete dump is available at
> https://www.dubner.com/main.nodes.html.  The C source code that generated
> it is available at the end of
> https://cobolworx.com/pages/dump-gimple-nodes.html
> 

Hyperlinked text is useful.  But I would love a graphical visualization
even more, e.g. via either Graphviz or PlantUML.

Regards,
Dimitar


Valid types for a binary op in GENERIC?

2024-02-14 Thread David Malcolm via Gcc
The ICE in PR analyzer/111441 is due to this assertion in
fold_binary_loc failing:

11722     gcc_assert (TYPE_PRECISION (atype) == TYPE_PRECISION (type));

where code=MULT_EXPR, type=, and:

(gdb) p type
$1 = 
(gdb) p atype
$2 = 

due to the analyzer building a mult_expr node with those types for the
arguments.

I have a fix for this (by adding some missing casts within the
analyzer's svalue representation), but it got me wondering: is there a
way to check valid types for binary operations in GENERIC?

Looking at
https://gcc.gnu.org/onlinedocs/gccint/Unary-and-Binary-Expressions.html
I see that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR their "operands may
have either integral or floating type, but there will never be [sic]
case in which one operand is of floating type and the other is of
integral type."

Is it the case that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR, their
arguments *must* have the same precision?  Or that types_compatible_p
is true?  What about other binary operations?

FWIW I currently have this hacked-up assertion in my working copy:

const svalue *
region_model_manager::get_or_create_binop (tree type, enum tree_code op,
                                           const svalue *arg0,
                                           const svalue *arg1)
{
  if (arg0->get_type ()
      && arg1->get_type ()
      && op != POINTER_PLUS_EXPR)
    {
      // FIXME: what ops does this apply to?  MULT_EXPR?
      gcc_assert (types_compatible_p (arg0->get_type (), arg1->get_type ()));
    }


Is there a function to check type-compatibility of the args given a
particular enum tree_code?

Sorry if I'm missing something here
Dave



Re: Valid types for a binary op in GENERIC?

2024-02-14 Thread Richard Biener via Gcc



> On 14.02.2024 at 18:16, David Malcolm via Gcc wrote:
> 
> The ICE in PR analyzer/111441 is due to this assertion in
> fold_binary_loc failing:
> 
> 11722     gcc_assert (TYPE_PRECISION (atype) == TYPE_PRECISION (type));
> 
> where code=MULT_EXPR, type=, and:
> 
> (gdb) p type
> $1 = 
> (gdb) p atype
> $2 = 
> 
> due to the analyzer building a mult_expr node with those types for the
> arguments.
> 
> I have a fix for this (by adding some missing casts within the
> analyzer's svalue representation), but it got me wondering: is there a
> way to check valid types for binary operations in GENERIC?
> 
> Looking at
> https://gcc.gnu.org/onlinedocs/gccint/Unary-and-Binary-Expressions.html
> I see that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR their "operands may
> have either integral or floating type, but there will never be [sic]
> case in which one operand is of floating type and the other is of
> integral type."
> 
> Is it the case that for PLUS_EXPR, MINUS_EXPR and MULT_EXPR, their
> arguments *must* have the same precision?  Or that types_compatible_p
> is true?  What about other binary operations?
> 
> FWIW I currently have this hacked-up assertion in my working copy:
> 
> const svalue *
> region_model_manager::get_or_create_binop (tree type, enum tree_code op,
>                                            const svalue *arg0,
>                                            const svalue *arg1)
> {
>   if (arg0->get_type ()
>       && arg1->get_type ()
>       && op != POINTER_PLUS_EXPR)
>     {
>       // FIXME: what ops does this apply to?  MULT_EXPR?
>       gcc_assert (types_compatible_p (arg0->get_type (), arg1->get_type ()));
>     }
> 
> 
> Is there a function to check type-compatibility of the args given a
> particular enum tree_code?

No.  The best source is the GIMPLE verifier in tree-cfg.cc
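
Since no such helper exists, a per-operation check has to be written by hand.
The following self-contained C++ sketch shows the general shape such a check
could take.  It is a toy model only: ToyType, BinOp, same_type and
binop_types_ok are hypothetical, the rules are simplified assumptions loosely
inspired by the kind of per-code checks a verifier performs, and nothing here
is GCC's actual definition of validity.

```cpp
// Toy type model: just enough to express "same precision and signedness".
enum class Kind { Integer, Real, Pointer };

struct ToyType
{
  Kind kind;
  unsigned precision;   // in bits
  bool is_unsigned;
};

enum class BinOp { Plus, Minus, Mult, PointerPlus, LShift, RShift };

// Stand-in for a type-compatibility predicate (assumption: exact match).
static bool same_type (const ToyType &a, const ToyType &b)
{
  return a.kind == b.kind
         && a.precision == b.precision
         && a.is_unsigned == b.is_unsigned;
}

// Return true if the operand types are acceptable for OP in this toy model.
bool binop_types_ok (BinOp op, const ToyType &result,
                     const ToyType &arg0, const ToyType &arg1)
{
  switch (op)
    {
    case BinOp::PointerPlus:
      // Pointer plus integer offset; the result has the pointer's type.
      return arg0.kind == Kind::Pointer
             && same_type (result, arg0)
             && arg1.kind == Kind::Integer;

    case BinOp::LShift:
    case BinOp::RShift:
      // The shift count may have a different integer type from the value.
      return arg0.kind == Kind::Integer
             && arg1.kind == Kind::Integer
             && same_type (result, arg0);

    case BinOp::Plus:
    case BinOp::Minus:
    case BinOp::Mult:
      // Both operands and the result must agree; no pointer arithmetic here.
      return arg0.kind != Kind::Pointer
             && same_type (result, arg0)
             && same_type (result, arg1);
    }
  return false;
}
```

In this model a multiplication of a 32-bit operand by a 64-bit operand is
rejected, which is the kind of mismatch that the missing casts in the
analyzer produced.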

> Sorry if I'm missing something here
> Dave
> 


RE: New feature: -fdump-gimple-nodes (once more, with feeling)

2024-02-14 Thread Robert Dubner
I have thought about a graphical representation more than once.  Heck, the
connections between nodes are one of the things I needed to know in the
first place.  And certainly the necessary information is all there in the
output I generate; I have drawn pieces of the tree connections by hand
many times.

But it doesn't seem to me to scale.

A candidate for the absolute minimally sized executable program one can
write in C and compile with GCC is

  void main(){}  /* I didn't say it would do anything *useful* */

The generic tree for that program has in excess of fifty nodes.

  #include <stdio.h>
  void main(){printf("hello, world\n");}

has in excess of 4,800 nodes because stdio.h was brought in.  Without a
plotter that draws on the sides of buildings or on football pitches (you'd
sit in the stands with binoculars to read the results), it's difficult for
me to envision how the graphical representation could be useful.  I don't
claim my imagination should be the limiting factor.  On the other hand, I
don't think the compiler should be generating that directly, anyway.

(I've managed to distract myself.  Now I want to build a wheeled robot
that wanders around a football pitch drawing with colored chalk dust.)

My current takeaway from these responses -- thank you so much!,
incidentally -- is that whatever utility I have created here would be
enhanced by JSON (one and a half "votes", so far) or YAML (half a vote)
output.

Once the tree is available in JSON, separate utilities to take that
output and display it graphically would be straightforward.
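
A minimal sketch of such a separate utility, under several assumptions: the
node data has already been loaded (from JSON or otherwise) into the
hypothetical NodeRecord vector below, where a node's index is its node
number; labels contain no characters that need quoting in DOT; and a depth
cutoff is used to keep the picture small, given the scaling concern above.
The function name to_dot is likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct Edge
{
  std::string label;     // field name, e.g. "type"
  std::size_t target;    // index of the referenced node
};

struct NodeRecord
{
  std::string code;      // e.g. "function_decl"
  std::vector<Edge> edges;
};

// Emit Graphviz DOT text for the nodes within MAX_DEPTH edges of ROOT,
// using a breadth-first traversal.
std::string to_dot (const std::vector<NodeRecord> &nodes,
                    std::size_t root, std::size_t max_depth)
{
  assert (root < nodes.size ());
  std::vector<bool> seen (nodes.size (), false);
  std::queue<std::pair<std::size_t, std::size_t>> work;  // (node, depth)
  std::string out = "digraph generic {\n";

  seen[root] = true;
  work.push (std::make_pair (root, std::size_t (0)));
  while (!work.empty ())
    {
      const std::size_t id = work.front ().first;
      const std::size_t depth = work.front ().second;
      work.pop ();

      out += "  n" + std::to_string (id)
             + " [label=\"" + nodes[id].code + "\"];\n";
      if (depth == max_depth)
        continue;                    // do not expand past the cutoff

      for (const Edge &e : nodes[id].edges)
        {
          assert (e.target < nodes.size ());
          out += "  n" + std::to_string (id) + " -> n"
                 + std::to_string (e.target)
                 + " [label=\"" + e.label + "\"];\n";
          if (!seen[e.target])
            {
              seen[e.target] = true;
              work.push (std::make_pair (e.target, depth + 1));
            }
        }
    }
  out += "}\n";
  return out;
}
```

The resulting text can be fed to Graphviz's dot tool.  Restricting the output
to a neighbourhood of one node is only one possible answer to the size
problem; the cutoff value is left to the caller.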

Okay then.  I'll change the naming from "*gimple*" to "*generic*" as more
accurate, and I'll generate JSON in addition to the other two files.

Thanks again.

-----Original Message-----
From: Dimitar Dimitrov
Sent: Wednesday, February 14, 2024 11:31
To: Robert Dubner
Cc: 'GCC Mailing List'
Subject: Re: New feature: -fdump-gimple-nodes (once more, with feeling)

On Tue, Feb 13, 2024 at 01:46:11PM -0600, Robert Dubner wrote:
...
> An example of a complete dump is available at 
> https://www.dubner.com/main.nodes.html.  The C source code that 
> generated it is available at the end of 
> https://cobolworx.com/pages/dump-gimple-nodes.html
> 

Hyperlinked text is useful.  But I would love a graphical visualization
even more, e.g. via either Graphviz or PlantUML.

Regards,
Dimitar