Re: [PATCH] JSON dumping for GENERIC trees

2024-09-12 Thread Thor Preimesberger
> There are three oddities I immediately notice:
>
> The PLUS_EXPR operands are in a array "operands" while the RETURN_EXPR
> "operand" or "child pointer" is refered to from "return_expr".  I think both 
> are
> tcc_expression trees and the operands are in exp.operands.  Ideally the
> JSON would more closely reflect that rather than possibly following the 
> "pretty"
> printing logic.

Ah - for the binary operator, operands may have been a poor choice of
words there.
There's an abstract nonsense definition that would not necessarily be
reasonable here.
Would it make more sense to dump it e.g. as
.
   "bin_operator": "+",
   "op0": {"addr": "0x7f8256bda360"
   ...}
   "op1": {"addr": ...}
.
I think there are some parts of the code that I wrote that don't have
the accessor used as their key when referring to a different node -
e.g. case PLACEHOLDER_EXPR. Would this be an issue?

> While the tree_code of a tree node is the most important bit (maybe besides of
> its address), the "tree code" attribute is after the locations (which
> are also quite
> verbose and distracting - at least when looking at raw JSON).  For locations
> one could honor TDF_LINENO and only dump those when using
> -fdump-tree-original-json-lineno.  I'd re-order "tree code" after "addr".

Sounds good - I'll implement dumping locs iff TDF_LINENO is enabled.

> The third issue is that above the tree node with address 0x7f8256a10c60
> (and its children) appear twice - while you maintain a splay tree and assign
> unique numbers the duplicate nodes are not output by reference?  I would
> suggest to use { "ref_addr" : "0x7f8256a10c60" } for the output of such
> reference for example.
>
> I'm not sure whether JSON allows different object kinds or if that's solely
> done by having a special attribute if that's needed.  With the above
> regular tree nodes would be "addr" and references be "ref_addr".  A recursive
> JSON structure like above is OK to look at in RAW, I'm not sure whether
> for automatic processing and for example translating to a graph a linear
> collection of nodes with references would be easier to handle.

I agree that it should be easier to process the JSON if the references have a
different key. Should be easy to implement.

> Few comments on the patch itself - the #include of tree-emit-json.h from
> dumpfile.cc doesn't seem to be necessary.  Since you declare
> dump_node_json in dumpfile.h it should be possible to elide the header
> and put the contents into the tree-emit-json.cc file.
>
> Another #include is duplicated (and also looks unnecessary).
>

All fixed now on my working tree.

> I know you have some crude JSON -> html translation script written in
> python - can you share that as well?  I'd suggest to post separate from
> this main patch, adding it to contrib/.

Sure - let me get the fixes suggested in this email done since it'll
change (and simplify) the logic a bit.

> Can we solve the multi-function issue somehow?  I know we have some
> abstraction for a dump file, we'd need a hook that's invoked on opening
> and closing to emit a something there - I guess it would be even OK to
> hard-code this into dumpfile.cc for the -JSON dump variant.  It might
> be possible to register dump specific data with that object and get
> to the "current" dump file in dump_node_json so the splay-tree could
> be kept live and the allocations released on dump-file close?  Again,
> two hard-coded hooks from dumpfile.cc at open/close time into
> the JSON dumping for this might be feasible and track the global state
> with global variables.  That's to allow references to global objects and
> types streamed in a previous function context.

If the multi-function issue is that the dump pass currently produces
a series of JSON objects rather than a single one - I think what you're
suggesting is essentially done by optrecord_json_writer, for
-fsave-optimization-record. One approach I have in my head
is for, let's call it a tree_json_writer, to hold a
json::array, append each node we traverse, and then
flush this array to the dumpfile at the end.

This would also enable a way to address what you brought
up at the very end.

(In the python script I have written up, I just call the bash command
I posted in the first email to turn the output into a single JSON object.
I don't expect that it's really possible to call sed from within gcc.)

Best,
Thor

On Thu, Sep 12, 2024 at 7:14 AM Richard Biener
 wrote:
>
> On Thu, Sep 12, 2024 at 12:51 PM David Malcolm  wrote:
> >
> > On Wed, 2024-09-11 at 20:49 -0500, tcpreimesber...@gmail.com wrote:
> > > From: Thor C Preimesberger 
> > >
> > > This patch allows the compiler to dump GENERIC trees as JSON objects.
> > >
> > > The dump flag -fdump-tree-original-json dumps each fndecl node in the
> > > C frontend's gimplifier as a JSON object and traverses related nodes
> > > in an analagous manner as to raw-dumping.
> >
> > Thanks for posting this patch.
> >
> > Are you able to upload somewhe

Re: [PATCH 1/2] JSON Dumping of GENERIC trees

2024-09-27 Thread Thor Preimesberger
That's all correct. I think I got it.

There are times where the code is emitting a json::object* that is
contained in another json object. Is it good style to return these
still as a unique_ptr? I'm looking over what I wrote again, and in
some parts I wrap the new json object in a unique_ptr (as a return in
some function calls) and in others I use new and delete.

Thanks,
Thor Preimesberger

On Fri, Sep 27, 2024 at 9:18 AM David Malcolm  wrote:
>
> On Sat, 2024-09-21 at 22:49 -0500, -thor wrote:
> > From: thor 
> >
> > This is the second revision of:
> >
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662849.html
> >
> > I've incorporated the feedback given both by Richard and David - I
> > didn't
> > find any memory leaks when testing in valgrind :)
>
> Thanks for the updated patch.
>
> [...snip...]
>
> > diff --git a/gcc/tree-emit-json.cc b/gcc/tree-emit-json.cc
> > new file mode 100644
> > index 000..df97069b922
> > --- /dev/null
> > +++ b/gcc/tree-emit-json.cc
>
> [...snip...]
>
> Thanks for using std::unique_ptr, but I may have been unclear in my
> earlier email - please use it to indicate ownership of a heap-allocated
> pointer...
>
> > +/* Adds Identifier information to JSON object */
> > +
> > +void
> > +identifier_node_add_json (tree t, std::unique_ptr & json_obj)
> > +  {
> > +const char* buff = IDENTIFIER_POINTER (t);
> > +json_obj->set_string("id_to_locale", identifier_to_locale(buff));
> > +buff = IDENTIFIER_POINTER (t);
> > +json_obj->set_string("id_point", buff);
> > +  }
>
> ...whereas here (and in many other places), the patch has a
>
>std::unique_ptr &json_obj
>
> (expressing a reference to a a unique_ptr i.e. a non-modifiable non-
> null pointer to a pointer that manages the lifetime of a modifiable
> json::object)
>
> where, if I'm reading the code correctly,
>
>json::object &json_obj
>
> (expressing a non-modifiable non-null pointer to a modifiable
> json::object).
>
> would be much clearer and simpler.
>
> [...snip...]
>
> Hope the above makes sense; sorry if I'm being unclear.
> Dave
>