This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-2.4 by this push:
new 55f92a3 [SPARK-28302][CORE] Make sure to generate unique output file
for SparkLauncher on Windows
55f92a3 is described below
commit 55f92a31d7c1a6f02a9b0fc2ace6c5a5e0871ec4
Author: wuyi <[email protected]>
AuthorDate: Tue Jul 9 15:49:31 2019 +0900
[SPARK-28302][CORE] Make sure to generate unique output file for
SparkLauncher on Windows
## What changes were proposed in this pull request?
When SparkLauncher is used to submit applications **concurrently** from
multiple threads on **Windows**, some apps fail with "The process cannot
access the file because it is being used by another process" and remain in
the LOST state at the end. The issue can be reproduced with this
[demo](https://issues.apache.org/jira/secure/attachment/12973920/Main.scala).
After digging into the code, I found that the Windows cmd `%RANDOM%` variable
returns the same number when it is read again shortly (e.g. < 500ms) after the
previous call. As a result, SparkLauncher gets the same output file
(spark-class-launcher-output-%RANDOM%.txt) for several apps. A later app then
hits the issue when it tries to write to a file that another app has already
opened for writing.
We should make sure to generate a unique output file for SparkLauncher on
Windows to avoid this issue.
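The retry logic described above can be sketched outside of batch. The following is a minimal Python illustration, not the patch itself: `pick_launcher_output`, `next_random`, and `demo` are hypothetical names, and the fixed sequence of numbers only stands in for a `%RANDOM%` that repeats itself. It assumes the first app's output file already exists by the time the second app checks for it.

```python
import os
import tempfile
from typing import Callable, Tuple


def pick_launcher_output(temp_dir: str, next_random: Callable[[], int]) -> str:
    """Keep drawing numbers until the candidate file name is not taken.

    Mirrors the `:gen` label plus `if exist ... goto :gen` check in the patch.
    """
    while True:
        name = f"spark-class-launcher-output-{next_random()}.txt"
        candidate = os.path.join(temp_dir, name)
        if not os.path.exists(candidate):
            return candidate


def demo() -> Tuple[str, str]:
    # Hypothetical stand-in for %RANDOM%: repeats 4242 before moving on to 17.
    values = iter([4242, 4242, 4242, 17])
    with tempfile.TemporaryDirectory() as temp_dir:
        first = pick_launcher_output(temp_dir, lambda: next(values))
        # The first app creates its output file.
        open(first, "w").close()
        # The second app draws 4242 twice more, sees the file, and retries.
        second = pick_launcher_output(temp_dir, lambda: next(values))
        return os.path.basename(first), os.path.basename(second)


if __name__ == "__main__":
    print(demo())
```

Without the existence check, the second call would have returned the same `4242` file name as the first, which is the collision described above.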
## How was this patch tested?
Tested manually on Windows.
Closes #25076 from Ngone51/SPARK-28302.
Authored-by: wuyi <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 925f620570a022ff8229bfde076e7dde6bf242df)
Signed-off-by: HyukjinKwon <[email protected]>
---
bin/spark-class2.cmd | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 5da7d7a..34d04c9 100644
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -63,7 +63,12 @@ if not "x%JAVA_HOME%"=="x" (
rem The launcher library prints the command to be executed in a single line suitable for being
rem executed by the batch interpreter. So read all the output of the launcher into a variable.
+:gen
set LAUNCHER_OUTPUT=%temp%\spark-class-launcher-output-%RANDOM%.txt
+rem SPARK-28302: %RANDOM% would return the same number if we call it instantly after last call,
+rem so we should make it sure to generate unique file to avoid process collision of writing into
+rem the same file concurrently.
+if exist %LAUNCHER_OUTPUT% goto :gen
"%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main %* > %LAUNCHER_OUTPUT%
for /f "tokens=*" %%i in (%LAUNCHER_OUTPUT%) do (
set SPARK_CMD=%%i