[
https://issues.apache.org/jira/browse/SPARK-50374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923340#comment-17923340
]
Ahmed Hussein commented on SPARK-50374:
---------------------------------------
bq. Hi Ahmed Hussein, I'd like to confirm that this won't affect the
accuracy of the data, correct?
Hi [~LuciferYang], this mainly fixes the correctness of the graph generated by
the Spark History Server. Some of the conditions in the code are inconsistent.
For example, the condition at [SparkPlanGraph-Line140|
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala#L140]
evaluates to false for execs such as {{ReusedSubquery}} and {{SubqueryBroadcast}}.
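To illustrate the kind of inconsistency meant here, below is a minimal, language-neutral sketch in Python. It is not the actual SparkPlanGraph code: {{PlanInfo}}, {{build_graph}} and {{reusable_names}} are hypothetical names made up for this example. It only shows how a graph builder that records reusable nodes by name ends up duplicating nodes when an exec name (here {{SubqueryBroadcast}}) is missing from that check.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class PlanInfo:
    """Hypothetical stand-in for a plan node: a name plus its children."""
    name: str
    children: Tuple["PlanInfo", ...] = ()


def build_graph(root: PlanInfo, reusable_names: Set[str]):
    """Build (nodes, edges); equal plans whose name is reusable share one node."""
    nodes: List[str] = []
    edges: List[Tuple[int, int]] = []   # (child_id, parent_id)
    seen: Dict[PlanInfo, int] = {}      # reusable plan -> id of its graph node

    def visit(plan: PlanInfo, parent_id: Optional[int]) -> None:
        if plan.name in reusable_names and plan in seen:
            # Already drawn once: point the parent at the existing node.
            if parent_id is not None:
                edges.append((seen[plan], parent_id))
            return
        node_id = len(nodes)
        nodes.append(plan.name)
        if plan.name in reusable_names:
            seen[plan] = node_id
        if parent_id is not None:
            edges.append((node_id, parent_id))
        for child in plan.children:
            visit(child, node_id)

    visit(root, None)
    return nodes, edges


# The same SubqueryBroadcast plan is referenced from two places.
broadcast = PlanInfo("SubqueryBroadcast", (PlanInfo("Scan"),))
root = PlanInfo("Join", (PlanInfo("FilterA", (broadcast,)),
                         PlanInfo("FilterB", (broadcast,))))

# Name check that misses SubqueryBroadcast: the subtree is drawn twice.
dup_nodes, _ = build_graph(root, {"Subquery"})
# Name check that includes it: both filters point at a single node.
ok_nodes, _ = build_graph(root, {"Subquery", "SubqueryBroadcast"})
print(len(dup_nodes), len(ok_nodes))  # 7 5
```

In this toy example the first call yields two {{SubqueryBroadcast}} nodes (and two copies of the scan below them), while the second call yields one shared node with two outgoing edges.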
bq. To Ahmed Hussein, according to the Apache Spark guideline, I removed `Fix
Versions`. Please do not set them next time
Thanks [~dongjoon] for the tips. This is very helpful.
> SubqueryBroadcast is not reused in the SparkPlanGraph
> -----------------------------------------------------
>
> Key: SPARK-50374
> URL: https://issues.apache.org/jira/browse/SPARK-50374
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.5.3
> Reporter: Ahmed Hussein
> Priority: Major
> Attachments: spark-ui-expected-subquery-broadcast-2024-11-20.png,
> spark-ui-invalid-subquery-broadcast-2024-11-20.png
>
>
> Currently, the SparkPlanGraph construction ignores that
> SubqueryBroadcast can be used as a subquery. This results in duplicating
> graph nodes.
> I uploaded two screenshots to show the difference between the current UI
> graph and the expected one.