adriangb opened a new pull request, #21767: URL: https://github.com/apache/datafusion/pull/21767
## Which issue does this PR close?

- Closes #.

## Rationale for this change

DataFusion already emits PostgreSQL JSON (pgjson) for logical plans via `EXPLAIN (FORMAT pgjson) ...`. This PR extends that support to `EXPLAIN ANALYZE` so the physical plan, along with live execution metrics, can be fed into pgjson visualizers such as [Dalibo](https://explain.dalibo.com/) and PEV2.

Today, `EXPLAIN ANALYZE FORMAT pgjson` is explicitly rejected in the planner with `"EXPLAIN ANALYZE with FORMAT is not supported"`. With this PR the restriction is lifted for pgjson.

## What changes are included in this PR?

- Add a `format: ExplainFormat` field to the logical `Analyze` node and the physical `AnalyzeExec` operator, threaded through SQL parsing, logical planning, and physical planning.
- Accept `EXPLAIN ANALYZE FORMAT pgjson <stmt>`. `Tree` and `Graphviz` with `ANALYZE` still error with a clear message (out of scope for this PR).
- Add `DisplayableExecutionPlan::pgjson()` and a new `PgJsonExecutionPlanVisitor` that mirror the logical-plan `PgJsonVisitor`. Per-node output includes:
  - `Node Type`: `ExecutionPlan::name()`
  - `Details`: the one-line `DisplayAs::Default` rendering
  - `Actual Rows` / `Actual Total Time`: PG-canonical metric keys populated from `output_rows` / `elapsed_compute` (emitted as float milliseconds; note DataFusion records compute time, not wall time)
  - `Extras`: remaining DataFusion metrics keyed by their native name
  - `Plans`: child nodes
- Add an optional `set_summary()` builder so `AnalyzeExec` can attach `Total Rows` and `Duration` at the root in verbose mode.
- Honor existing `analyze_level` / `analyze_categories` config exactly as `indent()` does.

## Are these changes tested?
- Unit tests in `datafusion/physical-plan/src/display.rs`:
  - `pgjson_renders_plan_without_metrics`
  - `pgjson_includes_summary_when_set`
  - `pgjson_snapshot_of_sample_plan` (insta snapshot)
- sqllogictest coverage in `datafusion/sqllogictest/test_files/explain_analyze.slt`:
  - Structural golden for `EXPLAIN ANALYZE FORMAT pgjson` with `analyze_categories = 'none'`
  - Negative tests for `EXPLAIN ANALYZE FORMAT tree` and `EXPLAIN ANALYZE FORMAT graphviz`
- `cargo clippy --all-targets --all-features -- -D warnings` clean on the touched crates; `cargo fmt --all` clean.

## Are there any user-facing changes?

Yes, new syntax is accepted:

```sql
EXPLAIN ANALYZE FORMAT pgjson SELECT count(*) FROM t;
```

No existing behavior changes: the default (`EXPLAIN ANALYZE ...` with no `FORMAT`) still emits the indent-format plan with metrics, and `EXPLAIN (FORMAT pgjson) ...` on the logical plan is unchanged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
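To make the per-node shape listed under "What changes are included in this PR?" concrete, here is a minimal, std-only sketch of how such a node could be rendered. It is not the PR's code: `PlanNode`, `escape_json`, and `render` are hypothetical stand-ins rather than DataFusion types or the actual `PgJsonExecutionPlanVisitor`. The sketch also assumes that `elapsed_compute` arrives as nanoseconds, that keys are omitted when a metric is absent, and that extras are emitted as strings. The actual implementation may differ on all of these.

```rust
// Hypothetical stand-in types; NOT DataFusion's ExecutionPlan or visitor API.
use std::fmt::Write;

/// One physical-plan node carrying the fields the PR description lists.
struct PlanNode {
    name: String,                    // would come from ExecutionPlan::name()
    details: String,                 // one-line DisplayAs::Default rendering
    output_rows: Option<u64>,        // source for "Actual Rows"
    elapsed_compute_ns: Option<u64>, // source for "Actual Total Time" (assumed ns)
    extras: Vec<(String, String)>,   // remaining metrics, keyed by native name
    children: Vec<PlanNode>,         // rendered under "Plans"
}

/// Escape characters that may not appear raw inside a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => {
                // Other control characters use the \u00XX form.
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out
}

/// Recursively render a node as a pgjson-style object.
fn render(node: &PlanNode) -> String {
    let mut fields = vec![
        format!("\"Node Type\": \"{}\"", escape_json(&node.name)),
        format!("\"Details\": \"{}\"", escape_json(&node.details)),
    ];
    if let Some(rows) = node.output_rows {
        fields.push(format!("\"Actual Rows\": {}", rows));
    }
    if let Some(ns) = node.elapsed_compute_ns {
        // Convert nanoseconds of compute time to float milliseconds.
        fields.push(format!("\"Actual Total Time\": {:.3}", ns as f64 / 1_000_000.0));
    }
    if !node.extras.is_empty() {
        let extras: Vec<String> = node
            .extras
            .iter()
            .map(|(k, v)| format!("\"{}\": \"{}\"", escape_json(k), escape_json(v)))
            .collect();
        fields.push(format!("\"Extras\": {{{}}}", extras.join(", ")));
    }
    if !node.children.is_empty() {
        let plans: Vec<String> = node.children.iter().map(render).collect();
        fields.push(format!("\"Plans\": [{}]", plans.join(", ")));
    }
    format!("{{{}}}", fields.join(", "))
}
```

Calling `render` on a root node yields a single nested object. The sketch covers only the node level, so any top-level wrapping or root summary (such as the `Total Rows` and `Duration` attached via `set_summary()`) is out of its scope.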
