[ 
https://issues.apache.org/jira/browse/SPARK-50595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17906303#comment-17906303
 ] 

Tianqi Wan commented on SPARK-50595:
------------------------------------

I found that the delta file was deleted by another attempt of the same task 
(spark.speculation is enabled). The other attempt was expected to be killed, 
but its process did not exit. It tried to rename its tmp file to the delta 
file, found that the delta file already existed, and so 
#AbstractFilesystem.rename deleted the delta file.

 

So is the non-atomicity of #AbstractFilesystem.rename to blame? Am I right?
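
To make the suspected sequence concrete, here is a minimal sketch in Python. 
It is only an illustration under assumptions: the helper 
overwrite_rename_in_two_steps and the file names are hypothetical, and it 
models an overwriting rename as "delete the destination, then rename", which 
is not a copy of the Hadoop or Spark code. If the late attempt is interrupted 
between the two steps, the committed delta file is gone while that attempt's 
tmp file is still there, which matches what we observed.

```python
import os
import tempfile


def overwrite_rename_in_two_steps(src, dst, interrupt_after_delete=False):
    """Hypothetical stand-in for a delete-then-rename overwrite (not Hadoop's code)."""
    if os.path.exists(dst):
        os.remove(dst)  # step 1: the committed destination disappears here
    if interrupt_after_delete:
        # Simulates the attempt being killed (or failing) between the two steps.
        raise RuntimeError("attempt interrupted before rename")
    os.rename(src, dst)  # step 2: the temp file takes the destination's place


def simulate_race(workdir):
    # Hypothetical file names, loosely modelled on the ones in this issue.
    delta = os.path.join(workdir, "506.delta")
    tmp_a = os.path.join(workdir, ".506.delta.attempt-a.tmp")
    tmp_b = os.path.join(workdir, ".506.delta.attempt-b.tmp")
    for path in (tmp_a, tmp_b):
        with open(path, "w") as f:
            f.write("state")

    # Attempt A commits first: its tmp file becomes the delta file.
    overwrite_rename_in_two_steps(tmp_a, delta)
    committed = os.path.exists(delta)

    # Attempt B (speculative, should already be dead) commits late
    # and is interrupted after deleting the existing delta file.
    try:
        overwrite_rename_in_two_steps(tmp_b, delta, interrupt_after_delete=True)
    except RuntimeError:
        pass

    # (delta existed after A, delta exists now, B's tmp file still exists)
    return committed, os.path.exists(delta), os.path.exists(tmp_b)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(simulate_race(d))  # prints (True, False, True)
```

An atomic replace (for example os.replace on a local POSIX filesystem) would 
not leave this window, because the destination is never absent between the 
two steps.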

> Temp file storing state doesn't renamed to delta file, but was read by next 
> task
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-50595
>                 URL: https://issues.apache.org/jira/browse/SPARK-50595
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>         Environment: spark on yarn
>            Reporter: Tianqi Wan
>            Priority: Major
>         Attachments: Snipaste_2024-12-17_10-38-42.png
>
>
> The Structured Streaming job failed because a delta file does not exist:
> !Snipaste_2024-12-17_10-38-42.png!
>  
> We checked the access log for the missing file and found that the file was 
> never created. We also found that the corresponding tmp file exists; its 
> file name is 
> [.506.delta.936315d9-b58b-4f18-b3ed-b413cd92646f.TID3368893.tmp|https://www.cosmos09.osdinfra.net/cosmos/searchDM/raw/AnaheimRawLogs/Anaheim/RealTimeCheckpoints/Execution0/20241215_application_1733980441401_12313/state/0/1721/.506.delta.936315d9-b58b-4f18-b3ed-b413cd92646f.TID3368893.tmp?property=info]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
