On 10/07/2016 03:26 PM, David Malcolm wrote:

We could simply print the INSN_UID for CODE_LABELs; something like this
(see the "(code_label 16" below):

I think that should work.

You appear to have trimmed the idea of enclosing the insns with
(basic-block) directives without commenting on it.  Did you like this
idea?

Sorry - I appear to have completely missed it.

It would make the above look like:

  (basic-block 2
    ;; insns snipped
    (jump_insn (set (pc)
            (if_then_else (ge (reg:CCGC 17 flags)
                    (const_int 0 [0]))
                (label_ref 16)
                (pc))) "test.c":3
   -> 16)
  ) ;; basic-block 2
  (basic-block 4
    (note [bb 4] NOTE_INSN_BASIC_BLOCK)
    ;; insns snipped
    (jump_insn (set (pc) (label_ref 20)) "test.c":4
     -> 20)
  ) ;; basic-block 4
  (barrier)
  (basic-block 5
    (code_label 16 [1 uses])
    (note [bb 5] NOTE_INSN_BASIC_BLOCK)
    ;; etc
  ) ;; basic-block 5

Note how the above format expresses clearly that:
* the (barrier) is part of the insn chain, but not in a basic block, and
* some insns can happen in a basic block

That looks really nice IMO. Except maybe drop the "-> 16" bit for the jump_insn (that's the JUMP_LABEL, isn't it?)

Taking this idea further: if we have (basic-block) directives
integrated into the insn-chain like this, we could express the CFG by
adding (edge) directives. Here's a suggestion for doing it with
(edge-from) and (edge-to) directives, expressing the predecessor and
successor edges in the CFG, along with :

That also looks reasonable. Probably a small but maybe not a huge improvement over the other syntax. Having both from and to edges seems redundant but might help readability. The reader should check consistency in that case.

Should we spell "0" and "1" as "entry" and "exit" when parsing/dumping
basic block indices? e.g.:

  (basic-block 2
    (edge-from entry)

If that can be done it would be an improvement.

I think maybe you want a separate compact form of insns and notes
(cinsn/cnote maybe), with a flag selecting which gets written out in
the
dumper. The reader could then read them in like any other rtx code,
and
there'd be a postprocessing stage to reconstruct the insn chain.

By this separate compact form, do you mean the form we've been
discussing above, with no INSN_UID/PREV/NEXT, etc?  Or something else?

Yes, the form we're discussing, except instead of (insn ...) you'd have (cinsn ...), which I assume would make it easier for the reader and less ambiguous overall.

As for "cinsn", "cnote", how about just "insn" and "note", and having
the compactness be expressed at the top of the dump e.g. implicitly by
the presence of a "(function" directive.  Could even have a format
version specifier in the function directive, to give us some future
-proofing e.g.
  (function (format 20161007)
or somesuch.

Having it implicit should also be ok - I have no strong preference really. I'm not sure we want versioning - better to have an automatic conversion if we ever feel we need to change the format.

Do you want to want to try hand-edited a test case, using some of the
format ideas we've been discussing?  That might suggest further
improvements to the format.

We'll definitely want to have a look at one or two. Also, we ought to try to set up situations we haven't discussed: ADDR_VECs (in light of the basic block dumping) and ASMs maybe. I'm probably forgetting some.

One other thing in terms of format is the printout of CONST_INT - I think it should suffice to have either decimal, or hex, but not really both. The reader should maybe accept either.

I think all hosts have 64-bit HWI these days, so CONST_INT ought to always stay valid through a roundtrip.

I may have missed it, but is there any kind of provision yet for providing an "after" dump for what is expected after a pass is run? Might be worth thinking about whether the reader could have a mode where it matches internal RTL against an input.

OK.  If so, do we need to print the regno for hard registers?  Or
should we just print the name for those, for consistency with virtual
regs?  (i.e. have it be controlled by the same flag?)

Just the name should work, leaving only pseudos with numbers - that ought to be reasonable. In the reader, when encountering a reg with a number, just add FIRST_PSEUDO_REGISTER, and you should end up with something that's consistent. Or maybe even dump the expected FIRST_PSEUDO_REGISTER, and adjust for it in case the one we have at run-time differs.

Does the C one have an advantage in terms of meaningful decls in
MEM_ATTRs/SYMBOL_REFs?

Probably.

I think that might be the more promising path then.


Bernd

Reply via email to