[ 
https://issues.apache.org/jira/browse/SPARK-50374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923340#comment-17923340
 ] 

Ahmed Hussein commented on SPARK-50374:
---------------------------------------

bq. Hi ~ Ahmed Hussein , I'd like to confirm that this won't affect the 
accuracy of the data, correct?
Hi [~LuciferYang], this mainly fixes the correctness of the graph generated by 
the Spark History Server. Some of the conditions in the graph-building code 
are inconsistent.
For example, the condition at 
[SparkPlanGraph-Line140|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SparkPlanGraph.scala#L140]
evaluates to false for execs such as {{ReusedSubquery}} and {{SubqueryBroadcast}}.
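
To make the kind of inconsistency concrete, here is a toy Python model of a plan-graph builder. It is only a sketch and not the Scala code in SparkPlanGraph: {{PlanInfo}}, {{build_graph}}, and {{reusable_names}} are hypothetical names, and the reuse rule is a simplified assumption. It shows how a reuse check that knows about {{Subquery}} but not {{SubqueryBroadcast}} ends up drawing the same subquery twice.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class PlanInfo:
    """Hypothetical stand-in for a plan node: exec name, identity key, children."""
    name: str
    key: str
    children: Tuple["PlanInfo", ...] = ()


def build_graph(root: PlanInfo, reusable_names: Set[str]):
    """Build (nodes, edges); execs named in reusable_names are drawn only once."""
    nodes: List[Tuple[int, str]] = []   # (node id, exec name)
    edges: List[Tuple[int, int]] = []   # (child id, parent id)
    seen: Dict[str, int] = {}           # identity key -> id of the node already drawn

    def visit(plan: PlanInfo, parent_id: Optional[int]) -> None:
        if plan.name == "ReusedSubquery":
            # Skip the wrapper and let its child decide whether it is reused.
            visit(plan.children[0], parent_id)
            return
        if plan.name in reusable_names and plan.key in seen:
            # Already drawn: add an edge to the existing node instead of a copy.
            edges.append((seen[plan.key], parent_id))
            return
        node_id = len(nodes)
        nodes.append((node_id, plan.name))
        if plan.name in reusable_names:
            seen[plan.key] = node_id
        if parent_id is not None:
            edges.append((node_id, parent_id))
        for child in plan.children:
            visit(child, node_id)

    visit(root, None)
    return nodes, edges


def count_nodes(nodes: List[Tuple[int, str]], name: str) -> int:
    """Count how many drawn nodes carry the given exec name."""
    return sum(1 for _, node_name in nodes if node_name == name)


# The same SubqueryBroadcast appears directly and under a ReusedSubquery.
sb = PlanInfo("SubqueryBroadcast", "dpp#1", (PlanInfo("Scan", "scan-dim"),))
root = PlanInfo("Join", "join", (
    PlanInfo("Filter", "f1", (sb,)),
    PlanInfo("Filter", "f2", (PlanInfo("ReusedSubquery", "r1", (sb,)),)),
))

# Inconsistent rule: only "Subquery" is treated as reusable.
nodes_before, edges_before = build_graph(root, {"Subquery"})
# Consistent rule: "SubqueryBroadcast" is treated as reusable as well.
nodes_after, edges_after = build_graph(root, {"Subquery", "SubqueryBroadcast"})
```

In this toy model, {{nodes_before}} contains two {{SubqueryBroadcast}} nodes, while {{nodes_after}} contains one, with the second {{Filter}} linked to the existing node.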

bq. To Ahmed Hussein, according to the Apache Spark guideline, I removed `Fix 
Versions`. Please do not set them next time

Thanks [~dongjoon] for the tips. This is very helpful.

> SubqueryBroadcast is not reused in the SparkPlanGraph
> -----------------------------------------------------
>
>                 Key: SPARK-50374
>                 URL: https://issues.apache.org/jira/browse/SPARK-50374
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.3
>            Reporter: Ahmed Hussein
>            Priority: Major
>         Attachments: spark-ui-expected-subquery-broadcast-2024-11-20.png, 
> spark-ui-invalid-subquery-broadcast-2024-11-20.png
>
>
> Currently, the SparkPlanGraph construction ignores that 
> SubqueryBroadcast can be used as a subquery. This results in duplicated 
> graph nodes.
> I uploaded two screenshots to show the difference between the current UI 
> graph and the expected one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
