This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 51938fea36a [SPARK-45274][CORE][SQL][UI] Implementation of a new DAG
drawing approach for job/stage/plan graphics to avoid fork
51938fea36a is described below
commit 51938fea36af19824a657c0326af9de03393e1dd
Author: Kent Yao <[email protected]>
AuthorDate: Fri Sep 22 19:25:24 2023 -0700
[SPARK-45274][CORE][SQL][UI] Implementation of a new DAG drawing approach
for job/stage/plan graphics to avoid fork
### What changes were proposed in this pull request?
Currently, the UI uses a forked
[repo](https://github.com/yaooqinn/dagre-d3/commit/faf82545c4e6d8a72f690107ba0b372868e7b508)
of `dagre-d3` to render DAG graphics for jobs, stages and plans.
The reasons for the fork can be summarized as follows:
- (1) Add identifiers to the `class` of clusters and nodes for later
retrieval.
- (2) Calculate the coordinates and size for complex clusters. For example,
  - The width of `Stage 8` must fit the maximum width of its child
  clusters, which are `AQEShuffleRead`, `WholeStageCodegen (8)` and
  `mapPartitionsInternal`, and must also fit its label text length.
  - The height of `Stage 8` needs to be increased so that its label can be
  displayed in the top-right corner without overlapping; the increase must
  also include the maximum increment of its children.
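
The sizing rule in (2) can be sketched as a pure function. This is a minimal illustration that mirrors the arithmetic of `setupLayoutForClusters` in the diff below; the helper name `expandCluster`, its plain-object inputs, and the sample numbers are assumptions made for the example and are not part of the patch.

```javascript
// Hypothetical helper (not part of the patch): computes the expanded
// rectangle for one cluster.
//   rect:  {x, y, width, height} of the cluster as laid out by dagre-d3
//   label: {width, height} of the cluster label's bounding box
//   child: {width, paddingTop} maxima taken over the child clusters
function expandCluster(rect, label, child) {
  // The width must fit the old rectangle, the label and the widest child.
  const width = Math.max(rect.width, label.width, child.width) + 10;
  // The height grows by the label height plus the children's max increment.
  const height = rect.height + label.height + child.paddingTop;
  return {
    width,
    height,
    // Keep the rectangle centered on its original midpoint.
    x: rect.x - (width - rect.width) / 2,
    y: rect.y - (height - rect.height) / 2,
  };
}

// Made-up numbers, for illustration only.
const expanded = expandCluster(
  { x: -50, y: -20, width: 100, height: 40 },
  { width: 130, height: 14 },
  { width: 120, paddingTop: 14 }
);
console.log(expanded);
```

Processing clusters from the innermost outward matters here: a parent can only read a child's final width once the child has already been expanded.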

In this PR,
- To achieve the goal of (1), we set an `id` on each graph/cluster/node in
the dot file, which is rendered as the HTML/SVG element `id`, and we use that
id for selections.
- To achieve the goal of (2), we post-process each graph right after rendering
its dot file. For job graph rendering in particular, the post-processing must
finish before the next dot file is rendered, because the position of the next
stage depends on the final coordinates of the previous one.
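
The id-based approach in (1) can be sketched as follows. This is a simplified illustration, assuming the `node_` and `cluster_` prefixes that appear in the diff below; the helpers `dotNode`, `dotCluster` and `nodeSelector` are hypothetical names, and labels are assumed to contain no double quotes.

```javascript
// Assumed prefixes, mirroring the 'cluster_' / 'node_' naming in this change.
const NODE_PREFIX = "node_";
const CLUSTER_PREFIX = "cluster_";

// Hypothetical helper: a DOT node statement that carries an explicit id.
function dotNode(rddId, label) {
  return `${rddId} [id="${NODE_PREFIX}${rddId}" label="${label}"]`;
}

// Hypothetical helper: a DOT subgraph that declares its own id and is
// flagged as a cluster so it can be recognized after parsing.
function dotCluster(clusterId, label, bodyLines) {
  const id = CLUSTER_PREFIX + clusterId;
  return [
    `subgraph ${id} {`,
    `  id="${id}";`,
    `  isCluster="true";`,
    `  label="${label}";`,
    ...bodyLines.map((line) => `  ${line};`),
    `}`,
  ].join("\n");
}

// After rendering, an element is looked up by id rather than by class.
function nodeSelector(rddId) {
  return "#" + NODE_PREFIX + rddId;
}

const dot = dotCluster("3", "map", [dotNode(7, "MapPartitionsRDD [7]")]);
console.log(dot);
console.log(nodeSelector(7));
```

Because the id travels through the dot file into the rendered SVG, the selection logic no longer depends on class names injected by a patched renderer.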
Additionally, tooltips are now supported for WholeStageCodegen clusters on the
execution page. This makes it easier for users to read the information of a
plan node when the node is off screen.
### Why are the changes needed?
To avoid maintaining a fork and customizing upstream libraries.
### Does this PR introduce _any_ user-facing change?
Yes. Tooltips for WholeStageCodegen clusters on the execution page are now
supported; otherwise, the change is mainly code refactoring.
### How was this patch tested?
Tested manually.
#### Use case
```sql
SELECT Avg(t1.id)
FROM RANGE(1) t1
join RANGE(2) t2
join RANGE(3) t3
join RANGE(4) t4
join RANGE(5) t5
join RANGE(6) t6
join RANGE(7) t7
ON t1.id = t2.id
AND t2.id = t3.id
AND t3.id=t4.id
AND t4.id=t5.id
AND t5.id = t6.id
AND t6.id=t7.id
GROUP BY t1.id % 5;
```
#### Job DAG Visualization


##### Cached

#### Stage DAG Visualization

#### Plan Visualization

### Was this patch authored or co-authored using generative AI tooling?
NO
Closes #43053 from yaooqinn/SPARK-45274.
Authored-by: Kent Yao <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../org/apache/spark/ui/static/dagre-d3.min.js | 53 +-----
.../org/apache/spark/ui/static/spark-dag-viz.js | 201 ++++++++++++---------
.../apache/spark/ui/scope/RDDOperationGraph.scala | 28 ++-
.../org/apache/spark/ui/UISeleniumSuite.scala | 65 +++++--
.../spark/sql/execution/ui/static/spark-sql-viz.js | 59 ++++--
.../spark/sql/execution/ui/ExecutionPage.scala | 6 -
.../spark/sql/execution/ui/SparkPlanGraph.scala | 17 +-
7 files changed, 241 insertions(+), 188 deletions(-)
diff --git a/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
b/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
index 42aa133a28b..651b5380fea 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js
@@ -1,4 +1,6 @@
-/* This is a custom version of dagre-d3 on top of v0.6.4. The full list of
commits can be found at
https://github.com/yaooqinn/dagre-d3/releases/tag/v0.6.4-patch */
+/**
+ * https://github.com/dagrejs/dagre-d3/blob/v0.6.4/dist/dagre-d3.min.js
+ */
(function(f){if(typeof exports==="object"&&typeof
module!=="undefined"){module.exports=f()}else if(typeof
define==="function"&&define.amd){define([],f)}else{var g;if(typeof
window!=="undefined"){g=window}else if(typeof
global!=="undefined"){g=global}else if(typeof
self!=="undefined"){g=self}else{g=this}g.dagreD3=f()}})(function(){var
define,module,exports;return function(){function r(e,n,t){function
o(i,f){if(!n[i]){if(!e[i]){var c="function"==typeof
require&&require;if(!f&&c)return c(i, [...]
/**
* @license
@@ -22,37 +24,11 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
-module.exports={graphlib:require("./lib/graphlib"),dagre:require("./lib/dagre"),intersect:require("./lib/intersect"),render:require("./lib/render"),util:require("./lib/util"),version:require("./lib/version")}},{"./lib/dagre":8,"./lib/graphlib":9,"./lib/intersect":10,"./lib/render":25,"./lib/util":27,"./lib/version":28}],2:[function(require,module,exports){var
util=require("./util");module.exports={default:normal,normal:normal,vee:vee,undirected:undirected};function
normal(parent,id,edge, [...]
-// Clusters created from DOT subgraphs are prefixed with "cluster"
-// strip this prefix if it exists and use our own (i.e. "cluster_")
-var
makeClusterIdentifier=function(v){return"cluster_"+v.replace(/^cluster/,"")};svgClusters.selectAll("*").remove();svgClusters.enter().append("g").attr("class",makeClusterIdentifier).attr("name",function(v){return
g.node(v).label}).classed("cluster",true).attr("id",function(v){var
node=g.node(v);return
node.id}).style("opacity",0).append("rect");svgClusters=selection.selectAll("g.cluster");
-// Draw the label for each cluster and adjust the padding for it.
-// We position the labels later because the dimensions and the positions
-// of the enclosing rectangles are still subject to change. Note that
-// the ordering here is important because we build the parents' padding
-// based on the children's.
-var sortedClusters=util.orderByRank(g,svgClusters.data());for(var
i=0;i<sortedClusters.length;i++){var v=sortedClusters[i];var
node=g.node(v);if(node.label){var
thisGroup=selection.select("g.cluster."+makeClusterIdentifier(v));labelGroup=thisGroup.append("g").attr("class","label"),labelDom=addLabel(labelGroup,node),bbox=pick(labelDom.node().getBBox(),"width","height");
-// Add some padding for the label
-// Do this recursively to account for our descendants' labels.
-// To avoid double counting, we must start from the leaves.
-node.paddingTop+=bbox.height;node.paddingTop+=util.getMaxChildPaddingTop(g,v);
-// move the label to the right-top of the cluster
-// node.padding is the orginal padding
-var x=node.width/2-5;// move right to edge of cluster
-// move right to edge of cluster
-var y=-node.height/2-node.paddingTop+5;// move up to top of cluster
-// move up to top of cluster
-labelDom.attr("text-anchor","end").attr("transform","translate("+x+","+y+")")}}util.applyTransition(svgClusters,g).style("opacity",1);svgClusters.selectAll("rect").each(function(c){var
node=g.node(c);var
domCluster=d3.select(this);util.applyStyle(domCluster,node.style)});var
exitSelection;if(svgClusters.exit){exitSelection=svgClusters.exit()}else{exitSelection=svgClusters.selectAll(null);//
empty selection
-}util.applyTransition(exitSelection,g).style("opacity",0).remove();return
svgClusters}},{"./d3":7,"./label/add-label":18,"./util":27,"lodash/pick":331}],4:[function(require,module,exports){"use
strict";var _=require("./lodash");var
addLabel=require("./label/add-label");var util=require("./util");var
d3=require("./d3");module.exports=createEdgeLabels;function
createEdgeLabels(selection,g){var
svgEdgeLabels=selection.selectAll("g.edgeLabel").data(g.edges(),function(e){return
util.edgeToId( [...]
+module.exports={graphlib:require("./lib/graphlib"),dagre:require("./lib/dagre"),intersect:require("./lib/intersect"),render:require("./lib/render"),util:require("./lib/util"),version:require("./lib/version")}},{"./lib/dagre":8,"./lib/graphlib":9,"./lib/intersect":10,"./lib/render":25,"./lib/util":27,"./lib/version":28}],2:[function(require,module,exports){var
util=require("./util");module.exports={default:normal,normal:normal,vee:vee,undirected:undirected};function
normal(parent,id,edge, [...]
+}util.applyTransition(exitSelection,g).style("opacity",0).remove();return
svgClusters}},{"./d3":7,"./label/add-label":18,"./util":27}],4:[function(require,module,exports){"use
strict";var _=require("./lodash");var
addLabel=require("./label/add-label");var util=require("./util");var
d3=require("./d3");module.exports=createEdgeLabels;function
createEdgeLabels(selection,g){var
svgEdgeLabels=selection.selectAll("g.edgeLabel").data(g.edges(),function(e){return
util.edgeToId(e)}).classed("upda [...]
}util.applyTransition(exitSelection,g).style("opacity",0).remove();return
svgEdgeLabels}},{"./d3":7,"./label/add-label":18,"./lodash":21,"./util":27}],5:[function(require,module,exports){"use
strict";var _=require("./lodash");var
intersectNode=require("./intersect/intersect-node");var
util=require("./util");var
d3=require("./d3");module.exports=createEdgePaths;function
createEdgePaths(selection,g,arrows){var
previousPaths=selection.selectAll("g.edgePath").data(g.edges(),function(e){retur
[...]
// Save DOM element in the path group, and set ID and class
-svgPaths.each(function(e){var domEdge=d3.select(this);var
edge=g.edge(e);edge.elem=this;if(edge.id){domEdge.attr("id",edge.id)}util.applyClass(domEdge,edge["class"],(domEdge.classed("update")?"update
":"")+"edgePath")});svgPaths.selectAll("path.path").each(function(e){var
edge=g.edge(e);edge.arrowheadId=_.uniqueId("arrowhead");var
domEdge=d3.select(this).attr("marker-end",function(){return"url("+makeFragmentRef(location.href,edge.arrowheadId)+")"}).style("fill","none");util.applyTransiti
[...]
-// Stretch this node horizontally a little to account for ancestor cluster
-// labels. We must do this here because by the time we create the clusters,
-// we have already positioned all the nodes.
-var requiredWidth=0,requiredHeight=0;var
nextNode=g.node(g.parent(v));while(nextNode){var
tempGroup=thisGroup.append("g");var tempLabel=addLabel(tempGroup,nextNode);var
tempBBox=tempLabel.node().getBBox();
-// WARNING: this uses a hard-coded value of nodesep
-tempBBox.width-=50;requiredWidth=Math.max(requiredWidth,tempBBox.width);requiredHeight=Math.max(requiredHeight,tempBBox.height);tempLabel.remove();nextNode=g.node(g.parent(nextNode.label))}var
shapeBBox=shapeSvg.node().getBBox();shapeBBox.width=Math.max(shapeBBox.width,requiredWidth);shapeBBox.height=Math.max(shapeBBox.height,requiredHeight);node.width=shapeBBox.width;node.height=shapeBBox.height});var
exitSelection;if(svgNodes.exit){exitSelection=svgNodes.exit()}else{exitSelection=svgNo
[...]
+svgPaths.each(function(e){var domEdge=d3.select(this);var
edge=g.edge(e);edge.elem=this;if(edge.id){domEdge.attr("id",edge.id)}util.applyClass(domEdge,edge["class"],(domEdge.classed("update")?"update
":"")+"edgePath")});svgPaths.selectAll("path.path").each(function(e){var
edge=g.edge(e);edge.arrowheadId=_.uniqueId("arrowhead");var
domEdge=d3.select(this).attr("marker-end",function(){return"url("+makeFragmentRef(location.href,edge.arrowheadId)+")"}).style("fill","none");util.applyTransiti
[...]
}util.applyTransition(exitSelection,g).style("opacity",0).remove();return
svgNodes}},{"./d3":7,"./label/add-label":18,"./lodash":21,"./util":27}],7:[function(require,module,exports){
// Stub to get D3 either via NPM or from the global object
var d3;if(!d3){if(typeof require==="function"){try{d3=require("d3")}catch(e){
@@ -125,9 +101,9 @@ if(node.labelType==="svg"){addSVGLabel(labelSvg,node)}else
if(typeof label!=="st
/* global window */
var lodash;if(typeof
require==="function"){try{lodash={defaults:require("lodash/defaults"),each:require("lodash/each"),isFunction:require("lodash/isFunction"),isPlainObject:require("lodash/isPlainObject"),pick:require("lodash/pick"),has:require("lodash/has"),range:require("lodash/range"),uniqueId:require("lodash/uniqueId")}}catch(e){
// continue regardless of error
-}}if(!lodash){lodash=window._}module.exports=lodash},{"lodash/defaults":289,"lodash/each":290,"lodash/has":299,"lodash/isFunction":308,"lodash/isPlainObject":313,"lodash/pick":331,"lodash/range":333,"lodash/uniqueId":346}],22:[function(require,module,exports){"use
strict";var util=require("./util");var
d3=require("./d3");module.exports=positionClusters;function
positionClusters(selection,g){var
created=selection.filter(function(){return!d3.select(this).classed("update")});function
transl [...]
+}}if(!lodash){lodash=window._}module.exports=lodash},{"lodash/defaults":289,"lodash/each":290,"lodash/has":299,"lodash/isFunction":308,"lodash/isPlainObject":313,"lodash/pick":331,"lodash/range":333,"lodash/uniqueId":346}],22:[function(require,module,exports){"use
strict";var util=require("./util");var
d3=require("./d3");module.exports=positionClusters;function
positionClusters(selection,g){var
created=selection.filter(function(){return!d3.select(this).classed("update")});function
transl [...]
// This design is based on http://bost.ocks.org/mike/chart/.
-function render(){var createNodes=require("./create-nodes");var
createClusters=require("./create-clusters");var
createEdgeLabels=require("./create-edge-labels");var
createEdgePaths=require("./create-edge-paths");var
positionNodes=require("./position-nodes");var
positionEdgeLabels=require("./position-edge-labels");var
positionClusters=require("./position-clusters");var
shapes=require("./shapes");var arrows=require("./arrows");var
fn=function(svg,g){preProcessGraph(g);var outputGroup=creat [...]
+function render(){var createNodes=require("./create-nodes");var
createClusters=require("./create-clusters");var
createEdgeLabels=require("./create-edge-labels");var
createEdgePaths=require("./create-edge-paths");var
positionNodes=require("./position-nodes");var
positionEdgeLabels=require("./position-edge-labels");var
positionClusters=require("./position-clusters");var
shapes=require("./shapes");var arrows=require("./arrows");var
fn=function(svg,g){preProcessGraph(g);var outputGroup=creat [...]
// Save dimensions for restore during post-processing
if(_.has(node,"width")){node._prevWidth=node.width}if(_.has(node,"height")){node._prevHeight=node.height}});g.edges().forEach(function(e){var
edge=g.edge(e);if(!_.has(edge,"label")){edge.label=""}_.defaults(edge,EDGE_DEFAULT_ATTRS)})}function
postProcessGraph(g){_.each(g.nodes(),function(v){var node=g.node(v);
// Restore original dimensions
@@ -137,20 +113,11 @@
if(_.has(node,"_prevWidth")){node.width=node._prevWidth}else{delete node.width}i
// http://mathforum.org/kb/message.jspa?messageID=3750236
function diamond(parent,bbox,node){var w=bbox.width*Math.SQRT2/2;var
h=bbox.height*Math.SQRT2/2;var
points=[{x:0,y:-h},{x:-w,y:0},{x:0,y:h},{x:w,y:0}];var
shapeSvg=parent.insert("polygon",":first-child").attr("points",points.map(function(p){return
p.x+","+p.y}).join(" "));node.intersect=function(p){return
intersectPolygon(node,points,p)};return
shapeSvg}},{"./intersect/intersect-circle":11,"./intersect/intersect-ellipse":12,"./intersect/intersect-polygon":15,"./intersect/intersect-rect":
[...]
// Public utility functions
-module.exports={isSubgraph:isSubgraph,getMaxChildPaddingTop:getMaxChildPaddingTop,orderByRank:orderByRank,edgeToId:edgeToId,applyStyle:applyStyle,applyClass:applyClass,applyTransition:applyTransition};
+module.exports={isSubgraph:isSubgraph,edgeToId:edgeToId,applyStyle:applyStyle,applyClass:applyClass,applyTransition:applyTransition};
/*
* Returns true if the specified node in the graph is a subgraph node. A
* subgraph node is one that contains other nodes.
- */function isSubgraph(g,v){return!!g.children(v).length}
-/*
- * Returns the max "paddingTop" property among the specified node's children.
- * A return value of 0 means this node has no children.
- */function getMaxChildPaddingTop(g,v){var maxPadding=0;var
children=g.children(v);for(var i=0;i<children.length;i++){var
child=g.node(children[i]);if(child.paddingTop&&child.paddingTop>maxPadding){maxPadding=child.paddingTop}}return
maxPadding}
-/* Return the rank of the specified node. A rank of 0 means the node has no
children. */function getRank(g,v){var maxRank=0;var
children=g.children(v);for(var i=0;i<children.length;i++){var
thisRank=getRank(g,children[i])+1;if(thisRank>maxRank){maxRank=thisRank}}return
maxRank}
-/*
- * Order the following nodes by rank, from the leaves to the roots.
- * This mutates the list of nodes in place while sorting them.
- */function orderByRank(g,nodes){return nodes.sort(function(x,y){return
getRank(g,x)-getRank(g,y)})}function edgeToId(e){return
escapeId(e.v)+":"+escapeId(e.w)+":"+escapeId(e.name)}var ID_DELIM=/:/g;function
escapeId(str){return str?String(str).replace(ID_DELIM,"\\:"):""}function
applyStyle(dom,styleFn){if(styleFn){dom.attr("style",styleFn)}}function
applyClass(dom,classFn,otherClasses){if(classFn){dom.attr("class",classFn).attr("class",otherClasses+"
"+dom.attr("class"))}}function apply [...]
+ */function isSubgraph(g,v){return!!g.children(v).length}function
edgeToId(e){return escapeId(e.v)+":"+escapeId(e.w)+":"+escapeId(e.name)}var
ID_DELIM=/:/g;function escapeId(str){return
str?String(str).replace(ID_DELIM,"\\:"):""}function
applyStyle(dom,styleFn){if(styleFn){dom.attr("style",styleFn)}}function
applyClass(dom,classFn,otherClasses){if(classFn){dom.attr("class",classFn).attr("class",otherClasses+"
"+dom.attr("class"))}}function applyTransition(selection,g){var
graph=g.graph() [...]
// https://d3js.org/d3-array/ v1.2.4 Copyright 2018 Mike Bostock
(function(global,factory){typeof exports==="object"&&typeof
module!=="undefined"?factory(exports):typeof
define==="function"&&define.amd?define(["exports"],factory):factory(global.d3=global.d3||{})})(this,function(exports){"use
strict";function ascending(a,b){return a<b?-1:a>b?1:a>=b?0:NaN}function
bisector(compare){if(compare.length===1)compare=ascendingComparator(compare);return{left:function(a,x,lo,hi){if(lo==null)lo=0;if(hi==null)hi=a.length;while(lo<hi){var
mid=lo+hi>>>1;if(compare( [...]
if((value=values[i])!=null&&value>=value){min=max=value;while(++i<n){//
Compare the remaining values.
diff --git
a/core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js
b/core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js
index 952ca64d9cd..fd0baec8af6 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js
@@ -45,10 +45,7 @@
* by Spark's UI code. This is currently used only on the stage page and on
* the job page.
*
- * This requires jQuery, d3, and dagre-d3. Note that we use a custom release
- * of dagre-d3 (http://github.com/andrewor14/dagre-d3) for some specific
- * functionality. For more detail, please track the changes in that project
- * since it was forked (commit 101503833a8ce5fe369547f6addf3e71172ce10b).
+ * This requires jQuery, d3, and dagre-d3.
*/
/* global $, appBasePath, d3, dagreD3, graphlibDot, uiRoot */
@@ -171,34 +168,90 @@ function renderDagViz(forJob) {
metadataContainer().selectAll(".cached-rdd").each(function(_ignored_v) {
var rddId = d3.select(this).text().trim();
var nodeId = VizConstants.nodePrefix + rddId;
- svg.selectAll("g." + nodeId).classed("cached", true);
+ svg.selectAll("#" + nodeId).classed("cached", true);
});
metadataContainer().selectAll(".barrier-rdd").each(function() {
var opId = d3.select(this).text().trim();
var opClusterId = VizConstants.clusterPrefix + opId;
- var stageId = $(this).parents(".stage-metadata").attr("stage-id");
- var stageClusterId = VizConstants.graphPrefix + stageId;
- svg.selectAll("g[id=" + stageClusterId + "] g." +
opClusterId).classed("barrier", true)
+ svg.selectAll("#" + opClusterId).classed("barrier", true)
});
-
metadataContainer().selectAll(".indeterminate-rdd").each(function(_ignored_v) {
+ metadataContainer().selectAll(".indeterminate-rdd").each(function
(_ignored_v) {
var rddId = d3.select(this).text().trim();
var nodeId = VizConstants.nodePrefix + rddId;
- svg.selectAll("g." + nodeId).classed("indeterminate", true);
+ svg.selectAll("#" + nodeId).classed("indeterminate", true);
});
resizeSvg(svg);
interpretLineBreak(svg);
}
+/*
+ * Set up the layout for stage and child cluster in an inside-out(reverse) way.
+ * By default, the label of a cluster is placed in the middle of the cluster.
This function moves
+ * the label to the right top corner of the cluster and expands the cluster to
fit the label.
+ */
+function setupLayoutForClusters(g, svg) {
+ g.nodes().filter((v) => g.node(v).isCluster).reverse().forEach((v) => {
+ const node = g.node(v);
+ const cluster = svg.select("#" + node.id);
+ // Find the stage cluster and mark it for styling and post-processing
+ if (isStage(v)) {
+ cluster.classed("stage", true);
+ }
+ const labelGroup = cluster.select(".label");
+ const bbox = labelGroup.node().getBBox();
+ const rect = cluster.select("rect");
+ const maxChildSize = getMaxChildWidthAndPaddingTop(g, v, svg);
+ const oldWidth = parseFloat(rect.attr("width"));
+ const newWidth = Math.max(oldWidth, bbox.width, maxChildSize.width) + 10;
+ const oldHeight = parseFloat(rect.attr("height"));
+ const newHeight = oldHeight + bbox.height + maxChildSize.paddingTop;
+ rect
+ .attr("width", (_ignored_i) => newWidth)
+ .attr("height", (_ignored_i) => newHeight)
+ .attr("x", (_ignored_i) => parseFloat(rect.attr("x")) - (newWidth -
oldWidth) / 2)
+ .attr("y", (_ignored_i) => parseFloat(rect.attr("y")) - (newHeight -
oldHeight) / 2);
+
+ labelGroup
+ .select("g")
+ .attr("text-anchor", "end")
+ .attr("transform", "translate(" + (newWidth / 2 - 5) + "," + (-newHeight
/ 2 + 5) + ")");
+ })
+}
+
+
+/*
+ * Get the max width of all children and get the max padding top based on the
label text height.
+ */
+function getMaxChildWidthAndPaddingTop(g, v, svg) {
+ var maxWidth = 0;
+ var maxPaddingTop = 0;
+ g.children(v).filter((i) => g.node(i).isCluster).forEach((c) => {
+ const childCluster = svg.select("#" + g.node(c).id);
+ const rect = childCluster.select("rect");
+ if (!rect.empty()) {
+ const width = parseFloat(rect.attr("width"));
+ if (width > maxWidth) {
+ maxWidth = width;
+ }
+ }
+ const height = childCluster.select(".label").node().getBBox().height;
+ if (height > maxPaddingTop) {
+ maxPaddingTop = height;
+ }
+ });
+ return {paddingTop: maxPaddingTop, width: maxWidth};
+}
+
/* Render the RDD DAG visualization on the stage page. */
function renderDagVizForStage(svgContainer) {
var metadata = metadataContainer().select(".stage-metadata");
var dot = metadata.select(".dot-file").text().trim();
- var containerId = VizConstants.graphPrefix + metadata.attr("stage-id");
- var container = svgContainer.append("g").attr("id", containerId);
- renderDot(dot, container, false);
+ var g = graphlibDot.read(dot);
+ renderDot(g, svgContainer, false);
+ setupLayoutForClusters(g, svgContainer)
// Round corners on rectangles
svgContainer
@@ -224,43 +277,28 @@ function renderDagVizForJob(svgContainer) {
metadataContainer().selectAll(".stage-metadata").each(function(d, i) {
var metadata = d3.select(this);
var dot = metadata.select(".dot-file").text();
- var stageId = metadata.attr("stage-id");
- var containerId = VizConstants.graphPrefix + stageId;
var isSkipped = metadata.attr("skipped") === "true";
var container;
if (isSkipped) {
container = svgContainer
.append("g")
- .attr("id", containerId)
.attr("skipped", "true");
} else {
// Link each graph to the corresponding stage page (TODO: handle stage
attempts)
var attemptId = 0;
+ var stageId = metadata.attr("stage-id");
var stageLink = uiRoot + appBasePath + "/stages/stage/?id=" + stageId +
"&attempt=" + attemptId;
container = svgContainer
.append("a")
.attr("xlink:href", stageLink)
.attr("onclick",
"window.localStorage.setItem(expandDagVizArrowKey(false), true)")
.append("g")
- .attr("id", containerId);
- }
-
- // Now we need to shift the container for this stage so it doesn't overlap
with
- // existing ones, taking into account the position and width of the last
stage's
- // container. We do not need to do this for the first stage of this job.
- if (i > 0) {
- var existingStages = svgContainer.selectAll("g.cluster.stage").nodes();
- if (existingStages.length > 0) {
- var lastStage = d3.select(existingStages.pop());
- var lastStageWidth = toFloat(lastStage.select("rect").attr("width"));
- var lastStagePosition = getAbsolutePosition(lastStage);
- var offset = lastStagePosition.x + lastStageWidth +
VizConstants.stageSep;
- container.attr("transform", "translate(" + offset + ", 0)");
- }
}
+ var g = graphlibDot.read(dot);
// Actually render the stage
- renderDot(dot, container, true);
+ renderDot(g, container, true);
+ setupLayoutForClusters(g, container)
// Mark elements as skipped if appropriate. Unfortunately we need to mark
all
// elements instead of the parent container because of CSS override rules.
@@ -274,28 +312,39 @@ function renderDagVizForJob(svgContainer) {
.attr("rx", "4")
.attr("ry", "4");
+ // Now we need to shift the container for this stage so it doesn't overlap
with
+ // existing ones, taking into account the position and width of the last
stage's
+ // container. We do not need to do this for the first stage of this job.
+ if (i > 0) {
+ var existingStages = svgContainer.selectAll("g.cluster.stage").nodes();
+ if (existingStages.length > 0) {
+ var lastStage = d3.select(existingStages.pop());
+ var lastStageWidth = toFloat(lastStage.select("rect").attr("width"));
+ var lastStagePosition = getAbsolutePosition(lastStage);
+ var offset = lastStagePosition.x + lastStageWidth +
VizConstants.stageSep;
+ container.attr("transform", "translate(" + offset + ", 0)");
+ }
+ }
+
// If there are any incoming edges into this graph, keep track of them to
render
// them separately later. Note that we cannot draw them now because we
need to
// put these edges in a separate container that is on top of all stage
graphs.
- metadata.selectAll(".incoming-edge").each(function(_ignored_v) {
+ metadata.selectAll(".incoming-edge").each(function (_ignored_v) {
var edge = d3.select(this).text().trim().split(","); // e.g. 3,4 => [3,
4]
crossStageEdges.push(edge);
});
+
+ addTooltipsForRDDs(container, g);
});
- addTooltipsForRDDs(svgContainer);
drawCrossStageEdges(crossStageEdges, svgContainer);
}
/* Render the dot file as an SVG in the given container. */
-function renderDot(dot, container, forJob) {
- var g = graphlibDot.read(dot);
+function renderDot(g, container, forJob) {
var renderer = new dagreD3.render();
preprocessGraphLayout(g, forJob);
renderer(container, g);
-
- // Find the stage cluster and mark it for styling and post-processing
- container.selectAll("g.cluster[name^=\"Stage \"]").classed("stage", true);
}
/* -------------------- *
@@ -304,19 +353,22 @@ function renderDot(dot, container, forJob) {
// Helper d3 accessors
function graphContainer() { return d3.select("#dag-viz-graph"); }
+
function metadataContainer() { return d3.select("#dag-viz-metadata"); }
+function isStage(v) {
+ return v.indexOf(VizConstants.graphPrefix) === 0;
+}
+
/*
* Helper function to pre-process the graph layout.
* This step is necessary for certain styles that affect the positioning
* and sizes of graph elements, e.g. padding, font style, shape.
*/
function preprocessGraphLayout(g, forJob) {
- var nodes = g.nodes();
- for (var i = 0; i < nodes.length; i++) {
- var isCluster = g.children(nodes[i]).length > 0;
- if (!isCluster) {
- var node = g.node(nodes[i]);
+ g.nodes().filter((v) => !g.node(v).isCluster).forEach((v) => {
+ const node = g.node(v);
+ if (!node.isCluster) {
if (forJob) {
// Do not display RDD name on job page
node.shape = "circle";
@@ -324,9 +376,11 @@ function preprocessGraphLayout(g, forJob) {
} else {
node.labelStyle = "font-size: 12px";
}
- node.padding = "5";
}
- }
+
+ node.padding = "5";
+ })
+
// Curve the edges
g.edges().forEach(function (edge) {
g.setEdge(edge.v, edge.w, {
@@ -368,8 +422,8 @@ function resizeSvg(svg) {
var width = endX - startX;
var height = endY - startY;
svg.attr("viewBox", startX + " " + startY + " " + width + " " + height)
- .attr("width", width)
- .attr("height", height);
+ .attr("width", width)
+ .attr("height", height);
}
/*
@@ -449,14 +503,14 @@ function getAbsolutePosition(d3selection) {
function connectRDDs(fromRDDId, toRDDId, edgesContainer, svgContainer) {
var fromNodeId = VizConstants.nodePrefix + fromRDDId;
var toNodeId = VizConstants.nodePrefix + toRDDId;
- var fromPos = getAbsolutePosition(svgContainer.select("g." + fromNodeId));
- var toPos = getAbsolutePosition(svgContainer.select("g." + toNodeId));
+ var fromPos = getAbsolutePosition(svgContainer.select("#" + fromNodeId));
+ var toPos = getAbsolutePosition(svgContainer.select("#" + toNodeId));
// On the job page, RDDs are rendered as dots (circles). When rendering the
path,
// we need to account for the radii of these circles. Otherwise the arrow
heads
// will bleed into the circle itself.
var delta = toFloat(svgContainer
- .select("g.node." + toNodeId)
+ .select("#" + toNodeId)
.select("circle")
.attr("r"));
if (fromPos.x < toPos.x) {
@@ -500,44 +554,15 @@ function connectRDDs(fromRDDId, toRDDId, edgesContainer,
svgContainer) {
}
/* (Job page only) Helper function to add tooltips for RDDs. */
-function addTooltipsForRDDs(svgContainer) {
- svgContainer.selectAll("g.node").each(function() {
- var node = d3.select(this);
- var tooltipText = node.attr("name");
- if (tooltipText) {
- node.select("circle")
- .attr("data-toggle", "tooltip")
- .attr("data-placement", "top")
- .attr("data-html", "true") // to interpret line break, tooltipText is
showing <circle> title
- .attr("title", tooltipText);
- }
- // Link tooltips for all nodes that belong to the same RDD
- node.on("mouseenter", function() { triggerTooltipForRDD(node, true); });
- node.on("mouseleave", function() { triggerTooltipForRDD(node, false); });
+function addTooltipsForRDDs(svgContainer, g) {
+ g.nodes().filter((v) => !g.node(v).isCluster).forEach((v) => {
+ const node = g.node(v);
+ d3.select("#" + node.id).each(function () {
+ $(this).tooltip({
+ title: node.label, trigger: "hover focus", container: "body",
placement: "top", html: true
+ })
+ });
});
-
- $("[data-toggle=tooltip]")
- .filter("g.node circle")
- .tooltip({ container: "body", trigger: "manual" });
-}
-
-/*
- * (Job page only) Helper function to show or hide tooltips for all nodes
- * in the graph that refer to the same RDD the specified node represents.
- */
-function triggerTooltipForRDD(d3node, show) {
- var classes = d3node.node().classList;
- for (var i = 0; i < classes.length; i++) {
- var clazz = classes[i];
- var isRDDClass = clazz.indexOf(VizConstants.nodePrefix) == 0;
- if (isRDDClass) {
- graphContainer().selectAll("g." + clazz).each(function() {
- var circle = d3.select(this).select("circle").node();
- var showOrHide = show ? "show" : "hide";
- $(circle).tooltip(showOrHide);
- });
- }
- }
}
/* Helper function to convert attributes to numeric values. */
diff --git
a/core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala
b/core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala
index 260ed53b248..e531bcf3b7c 100644
--- a/core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala
+++ b/core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala
@@ -234,7 +234,10 @@ private[spark] object RDDOperationGraph extends Logging {
def makeDotFile(graph: RDDOperationGraph): String = {
val dotFile = new StringBuilder
dotFile.append("digraph G {\n")
- makeDotSubgraph(dotFile, graph.rootCluster, indent = " ")
+ val indent = " "
+ val graphId =
s"graph_${graph.rootCluster.id.replaceAll(STAGE_CLUSTER_PREFIX, "")}"
+ dotFile.append(indent).append(s"""id="$graphId";\n""")
+ makeDotSubgraph(dotFile, graph.rootCluster, indent = indent)
graph.edges.foreach { edge => dotFile.append(s"""
${edge.fromId}->${edge.toId};\n""") }
dotFile.append("}")
val result = dotFile.toString()
@@ -261,23 +264,32 @@ private[spark] object RDDOperationGraph extends Logging {
case _ => ""
}
val escapedCallsite = Utility.escape(node.callsite)
- val label = s"${node.name}
[${node.id}]$isCached$isBarrier$outputDeterministicLevel" +
- s"<br>${escapedCallsite}"
- s"""${node.id} [labelType="html"
label="${StringEscapeUtils.escapeJava(label)}"]"""
+ val label = StringEscapeUtils.escapeJava(
+ s"${node.name} [${node.id}]$isCached$isBarrier$outputDeterministicLevel" +
+ s"<br>$escapedCallsite")
+ s"""${node.id} [id="node_${node.id}" labelType="html" label="$label"]"""
}
- /** Update the dot representation of the RDDOperationGraph in cluster to
subgraph. */
+ /** Update the dot representation of the RDDOperationGraph in cluster to subgraph.
+ *
+ * @param prefix The prefix of the subgraph id. 'graph_' for stages, 'cluster_' for
+ * child clusters. See also VizConstants in `spark-dag-viz.js`
+ */
private def makeDotSubgraph(
subgraph: StringBuilder,
cluster: RDDOperationCluster,
- indent: String): Unit = {
- subgraph.append(indent).append(s"subgraph cluster${cluster.id} {\n")
+ indent: String,
+ prefix: String = "graph_"): Unit = {
+ val clusterId = s"$prefix${cluster.id}"
+ subgraph.append(indent).append(s"subgraph $clusterId {\n")
+ .append(indent).append(s""" id="$clusterId";\n""")
+ .append(indent).append(s""" isCluster="true";\n""")
.append(indent).append(s"""
label="${StringEscapeUtils.escapeJava(cluster.name)}";\n""")
cluster.childNodes.foreach { node =>
subgraph.append(indent).append(s" ${makeDotNode(node)};\n")
}
cluster.childClusters.foreach { cscope =>
- makeDotSubgraph(subgraph, cscope, indent + " ")
+ makeDotSubgraph(subgraph, cscope, indent + " ", "cluster_")
}
subgraph.append(indent).append("}\n")
}
diff --git a/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala
b/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala
index 79496bba667..dd9927d7ba1 100644
--- a/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala
+++ b/core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala
@@ -721,32 +721,57 @@ class UISeleniumSuite extends SparkFunSuite with
WebBrowser with Matchers {
eventually(timeout(5.seconds), interval(100.milliseconds)) {
val stage0 = Utils.tryWithResource(Source.fromURL(sc.ui.get.webUrl +
"/stages/stage/?id=0&attempt=0&expandDagViz=true"))(_.mkString)
- assert(stage0.contains("digraph G {\n subgraph clusterstage_0 {\n
" +
- "label="Stage 0";\n subgraph "))
- assert(stage0.contains("{\n label="parallelize";\n
" +
- "0 [labelType="html" label="ParallelCollectionRDD
[0]"))
- assert(stage0.contains("{\n label="map";\n " +
- "1 [labelType="html" label="MapPartitionsRDD [1]"))
- assert(stage0.contains("{\n label="groupBy";\n " +
- "2 [labelType="html" label="MapPartitionsRDD [2]"))
+ assert(stage0.contains("""digraph G {
+ | id="graph_0";
+ | subgraph graph_stage_0 {
+ | id="graph_stage_0";
+ | isCluster="true";
+ | label="Stage 0";""".stripMargin))
+ assert(stage0.contains("""
+ | isCluster="true";
+ | label="parallelize";
+ | 0
[id="node_0"""".stripMargin))
+ assert(stage0.contains("""
+ | isCluster="true";
+ | label="map";
+ | 1
[id="node_1"""".stripMargin))
+ assert(stage0.contains("""
+ | isCluster="true";
+ | label="groupBy";
+ | 2
[id="node_2"""".stripMargin))
val stage1 = Utils.tryWithResource(Source.fromURL(sc.ui.get.webUrl +
"/stages/stage/?id=1&attempt=0&expandDagViz=true"))(_.mkString)
- assert(stage1.contains("digraph G {\n subgraph clusterstage_1 {\n
" +
- "label="Stage 1";\n subgraph "))
- assert(stage1.contains("{\n label="groupBy";\n " +
- "3 [labelType="html" label="ShuffledRDD [3]"))
- assert(stage1.contains("{\n label="map";\n " +
- "4 [labelType="html" label="MapPartitionsRDD [4]"))
- assert(stage1.contains("{\n label="groupBy";\n " +
- "5 [labelType="html" label="MapPartitionsRDD [5]"))
+ assert(stage1.contains("""digraph G {
+ | id="graph_1";
+ | subgraph graph_stage_1 {
+ | id="graph_stage_1";
+ | isCluster="true";
+ | label="Stage 1";""".stripMargin))
+ assert(stage1.contains("""
+ | isCluster="true";
+ |
label="groupBy";""".stripMargin))
+ assert(stage1.contains(
+ "3 [id="node_3" labelType="html"
label="ShuffledRDD"))
+ assert(stage1.contains("""
+ | isCluster="true";
+ | label="map";""".stripMargin))
+ assert(stage1.contains(
+ "4 [id="node_4" labelType="html"
label="MapPartitionsRDD [4]"))
val stage2 = Utils.tryWithResource(Source.fromURL(sc.ui.get.webUrl +
"/stages/stage/?id=2&attempt=0&expandDagViz=true"))(_.mkString)
- assert(stage2.contains("digraph G {\n subgraph clusterstage_2 {\n
" +
- "label="Stage 2";\n subgraph "))
- assert(stage2.contains("{\n label="groupBy";\n " +
- "6 [labelType="html" label="ShuffledRDD [6]"))
+ assert(stage2.contains("""digraph G {
+ | id="graph_2";
+ | subgraph graph_stage_2 {
+ | id="graph_stage_2";
+ | isCluster="true";
+ | label="Stage 2";""".stripMargin))
+ assert(stage2.contains("""
+ | isCluster="true";
+ |
label="groupBy";""".stripMargin))
+ assert(stage2.contains(
+ "6 [id="node_6" labelType="html"
label="ShuffledRDD [6]"))
}
}
}
diff --git
a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js
b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js
index a926376f3c0..ea42877924d 100644
---
a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js
+++
b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js
@@ -46,11 +46,8 @@ function renderPlanViz() {
.attr("rx", "5")
.attr("ry", "5");
- var nodeSize = parseInt($("#plan-viz-metadata-size").text());
- for (var i = 0; i < nodeSize; i++) {
- setupTooltipForSparkPlanNode(i);
- }
-
+ setupLayoutForSparkPlanCluster(g, svg);
+ setupTooltipForSparkPlanNode(g);
resizeSvg(svg);
postprocessForAdditionalMetrics();
}
@@ -66,15 +63,44 @@ function planVizContainer() { return
d3.select("#plan-viz-graph"); }
* Set up the tooltip for a SparkPlan node using metadata. When the user moves
the mouse on the
* node, it will display the details of this SparkPlan node in the right.
*/
-function setupTooltipForSparkPlanNode(nodeId) {
- var nodeTooltip = d3.select("#plan-meta-data-" + nodeId).text();
- d3.select("svg g .node_" + nodeId)
- .each(function(_ignored_d) {
- var domNode = d3.select(this).node();
- $(domNode).tooltip({
- title: nodeTooltip, trigger: "hover focus", container: "body",
placement: "top"
+function setupTooltipForSparkPlanNode(g) {
+ g.nodes().forEach(function (v) {
+ const node = g.node(v);
+ d3.select("svg g #" + node.id).each(function () {
+ $(this).tooltip({
+ title: node.tooltip, trigger: "hover focus", container: "body", placement: "top"
});
- })
+ });
+ });
+}
+
+/*
+ * Set up the layout for SparkPlan cluster.
+ * By default, the label of a cluster is placed in the middle of the cluster. This function moves
+ * the label to the top-right corner of the cluster and expands the cluster to fit the label.
+ */
+function setupLayoutForSparkPlanCluster(g, svg) {
+ g.nodes().filter((v) => g.node(v).isCluster).forEach((v) => {
+ const node = g.node(v);
+ const cluster = svg.select("#" + node.id);
+ const labelGroup = cluster.select(".label");
+ const bbox = labelGroup.node().getBBox();
+ const rect = cluster.select("rect");
+ const oldWidth = parseFloat(rect.attr("width"));
+ const newWidth = Math.max(oldWidth, bbox.width) + 10;
+ const oldHeight = parseFloat(rect.attr("height"));
+ const newHeight = oldHeight + bbox.height;
+ rect
+ .attr("width", (_ignored_i) => newWidth)
+ .attr("height", (_ignored_i) => newHeight)
+ .attr("x", (_ignored_i) => parseFloat(rect.attr("x")) - (newWidth - oldWidth) / 2)
+ .attr("y", (_ignored_i) => parseFloat(rect.attr("y")) - (newHeight - oldHeight) / 2);
+
+ labelGroup
+ .select("g")
+ .attr("text-anchor", "end")
+ .attr("transform", "translate(" + (newWidth / 2 - 5) + "," + (-newHeight / 2 + 5) + ")");
+ })
}
// labelSeparator should be a non-graphical character in order not to affect
the width of boxes.
@@ -88,9 +114,8 @@ var stageAndTaskMetricsPattern =
"^(.*)(\\(stage.*task[^)]*\\))(.*)$";
*/
function preprocessGraphLayout(g) {
g.graph().ranksep = "70";
- var nodes = g.nodes();
- for (var i = 0; i < nodes.length; i++) {
- var node = g.node(nodes[i]);
+ g.nodes().forEach(function (v) {
+ const node = g.node(v);
node.padding = "5";
var firstSeparator;
@@ -113,7 +138,7 @@ function preprocessGraphLayout(g) {
newTexts[1] + firstSeparator + newTexts[2] + secondSeparator +
newTexts[3]);
}
});
- }
+ });
// Curve the edges
g.edges().forEach(function (edge) {
g.setEdge(edge.v, edge.w, {
diff --git
a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
index c0e6b65d634..aa8fd261c58 100644
---
a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
+++
b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
@@ -115,10 +115,6 @@ class ExecutionPage(parent: SQLTab) extends
WebUIPage("execution") with Logging
request: HttpServletRequest,
metrics: Map[Long, String],
graph: SparkPlanGraph): Seq[Node] = {
- val metadata = graph.allNodes.flatMap { node =>
- val nodeId = s"plan-meta-data-${node.id}"
- <div id={nodeId}>{node.desc}</div>
- }
<div>
<div id="plan-viz-graph"></div>
@@ -126,8 +122,6 @@ class ExecutionPage(parent: SQLTab) extends
WebUIPage("execution") with Logging
<div class="dot-file">
{graph.makeDotFile(metrics)}
</div>
- <div id="plan-viz-metadata-size">{graph.allNodes.size.toString}</div>
- {metadata}
</div>
{planVisualizationResources(request)}
<script>$(function() {{ if (shouldRenderPlanViz()) {{ renderPlanViz();
}} }})</script>
diff --git
a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala
b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala
index 1504207d39c..11ba3cd05e2 100644
---
a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala
+++
b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala
@@ -177,20 +177,22 @@ class SparkPlanGraphNode(
metric.name + ": " + value
}
}
-
- if (values.nonEmpty) {
+ val nodeId = s"node$id"
+ val tooltip = StringEscapeUtils.escapeJava(desc)
+ val labelStr = if (values.nonEmpty) {
// If there are metrics, display each entry in a separate line.
// Note: whitespace between two "\n"s is to create an empty line between
the name of
// SparkPlan and metrics. If removing it, it won't display the empty
line in UI.
builder ++= "<br><br>"
builder ++= values.mkString("<br>")
- val labelStr =
StringEscapeUtils.escapeJava(builder.toString().replaceAll("\n", "<br>"))
- s""" $id [labelType="html" label="${labelStr}"];"""
+ StringEscapeUtils.escapeJava(builder.toString().replaceAll("\n", "<br>"))
} else {
// SPARK-30684: when there is no metrics, add empty lines to increase
the height of the node,
// so that there won't be gaps between an edge and a small node.
- s""" $id [labelType="html" label="<br><b>$name</b><br><br>"];"""
+ s"<br><b>$name</b><br><br>"
}
+ s""" $id [id="$nodeId" labelType="html" label="$labelStr" tooltip="$tooltip"];"""
+
}
}
@@ -218,10 +220,13 @@ class SparkPlanGraphCluster(
} else {
name
}
+ val clusterId = s"cluster$id"
s"""
- | subgraph cluster${id} {
+ | subgraph $clusterId {
| isCluster="true";
+ | id="$clusterId";
| label="${StringEscapeUtils.escapeJava(labelStr)}";
+ | tooltip="${StringEscapeUtils.escapeJava(desc)}";
| ${nodes.map(_.makeDotNode(metricsValue)).mkString(" \n")}
| }
""".stripMargin
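
The cluster resizing done by `setupLayoutForSparkPlanCluster` in `spark-sql-viz.js` can be checked by hand. The sketch below repeats the same arithmetic as a pure function, without d3 or the DOM. The helper name `resizeClusterRect` and the sample numbers are assumptions made for illustration; they are not part of this commit.

```javascript
// Hypothetical helper, not part of the commit: mirrors the arithmetic in
// setupLayoutForSparkPlanCluster so it can be run without a browser.
// `rect` is the cluster rectangle, `labelBBox` the bounding box of its label.
function resizeClusterRect(rect, labelBBox) {
  // Widen to fit the label if needed, plus 10px of padding.
  const newWidth = Math.max(rect.width, labelBBox.width) + 10;
  // Grow the height by the label height so the label has its own room.
  const newHeight = rect.height + labelBBox.height;
  return {
    width: newWidth,
    height: newHeight,
    // Shift the origin by half of the growth so the rectangle stays centered.
    x: rect.x - (newWidth - rect.width) / 2,
    y: rect.y - (newHeight - rect.height) / 2,
    // Offset applied to the label group, 5px inside the top-right corner.
    labelTranslate: [newWidth / 2 - 5, -newHeight / 2 + 5]
  };
}

// Sample values chosen for illustration only.
const resized = resizeClusterRect(
  { x: -50, y: -20, width: 100, height: 40 },
  { width: 120, height: 16 }
);
console.log(resized);
```

With these sample inputs the label is wider than the rectangle, so the width becomes 130 and the rectangle shifts left by 15 to stay centered on the same point.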