> There are three oddities I immediately notice:
>
> The PLUS_EXPR operands are in a array "operands" while the RETURN_EXPR
> "operand" or "child pointer" is refered to from "return_expr". I think both
> are
> tcc_expression trees and the operands are in exp.operands. Ideally the
> JSON would more closely reflect that rather than possibly following the
> "pretty"
> printing logic.
Ah - for the binary operator, operands may have been a poor choice of
words there.
There's an abstract nonsense definition that would not necessarily be
reasonable here.
Would it make more sense to dump it e.g. as
.
"bin_operator": "+",
"op0": {"addr": "0x7f8256bda360"
...}
"op1": {"addr": ...}
.
I think there are some parts of the code that I wrote that don't have
the accessor used as their key when referring to a different node -
e.g. case PLACEHOLDER_EXPR. Would this be an issue?
> While the tree_code of a tree node is the most important bit (maybe besides of
> its address), the "tree code" attribute is after the locations (which
> are also quite
> verbose and distracting - at least when looking at raw JSON). For locations
> one could honor TDF_LINENO and only dump those when using
> -fdump-tree-original-json-lineno. I'd re-order "tree code" after "addr".
Sounds good - I'll implement dumping locs iff TDF_LINENO is enabled.
> The third issue is that above the tree node with address 0x7f8256a10c60
> (and its children) appear twice - while you maintain a splay tree and assign
> unique numbers the duplicate nodes are not output by reference? I would
> suggest to use { "ref_addr" : "0x7f8256a10c60" } for the output of such
> reference for example.
>
> I'm not sure whether JSON allows different object kinds or if that's solely
> done by having a special attribute if that's needed. With the above
> regular tree nodes would be "addr" and references be "ref_addr". A recursive
> JSON structure like above is OK to look at in RAW, I'm not sure whether
> for automatic processing and for example translating to a graph a linear
> collection of nodes with references would be easier to handle.
I agree that it should be easier to process the JSON if the references have a
different key. Should be easy to implement.
> Few comments on the patch itself - the #include of tree-emit-json.h from
> dumpfile.cc doesn't seem to be necessary. Since you declare
> dump_node_json in dumpfile.h it should be possible to elide the header
> and put the contents into the tree-emit-json.cc file.
>
> Another #include is duplicated (and also looks unnecessary).
>
All fixed now on my working tree.
> I know you have some crude JSON -> html translation script written in
> python - can you share that as well? I'd suggest to post separate from
> this main patch, adding it to contrib/.
Sure - let me get the fixes suggested in this email done since it'll
change (and simplify) the logic a bit.
> Can we solve the multi-function issue somehow? I know we have some
> abstraction for a dump file, we'd need a hook that's invoked on opening
> and closing to emit a something there - I guess it would be even OK to
> hard-code this into dumpfile.cc for the -JSON dump variant. It might
> be possible to register dump specific data with that object and get
> to the "current" dump file in dump_node_json so the splay-tree could
> be kept live and the allocations released on dump-file close? Again,
> two hard-coded hooks from dumpfile.cc at open/close time into
> the JSON dumping for this might be feasible and track the global state
> with global variables. That's to allow references to global objects and
> types streamed in a previous function context.
If the multi-function issue is that the dump pass currently produces
a series of JSON objects rather than a single one - I think what you're
suggesting is essentially done by optrecord_json_writer, for
-fsave-optimization-record. One approach I have in my head
is for, let's call it a tree_json_writer, to hold a
json::array, append each node we traverse, and then
flush this array to the dumpfile at the end.
This would also enable a way to address what you brought
up at the very end.
(In the python script I have written up, I just call the bash command
I posted in the first email to turn the output into a single JSON object.
I don't expect that it's really possible to call sed from within gcc.)
Best,
Thor
On Thu, Sep 12, 2024 at 7:14 AM Richard Biener
wrote:
>
> On Thu, Sep 12, 2024 at 12:51 PM David Malcolm wrote:
> >
> > On Wed, 2024-09-11 at 20:49 -0500, tcpreimesber...@gmail.com wrote:
> > > From: Thor C Preimesberger
> > >
> > > This patch allows the compiler to dump GENERIC trees as JSON objects.
> > >
> > > The dump flag -fdump-tree-original-json dumps each fndecl node in the
> > > C frontend's gimplifier as a JSON object and traverses related nodes
> > > in an analagous manner as to raw-dumping.
> >
> > Thanks for posting this patch.
> >
> > Are you able to upload somewhe