[
https://issues.apache.org/jira/browse/PIG-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906709#comment-15906709
]
liyunzhang_intel commented on PIG-5177:
---------------------------------------
[~szita]: thanks for the detailed explanation. In mr mode, it creates a jar and puts
the script files into it. mr uploads the jar to the hdfs distributed cache and saves
the jar path in ${{mapred.job.classpath.files}} of the configuration (for details see
JobControlCompiler#putJarOnClassPathThroughDistributedCache). So in
ScriptEngine#getScriptAsStream, mr can later load the script file from the class
loader in the yarn container. Is my understanding right?
If yes, my question is: can a spark executor load the script file from the class
loader if we also wrap the script files into a jar, upload the jar to the hdfs
distributed cache and save the jar path in ${{mapred.job.classpath.files}}?
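To make the question concrete, here is a minimal sketch of the class-loader
lookup itself, using only plain JDK classes. It is not Pig's or Spark's actual
code: ScriptJarSketch, packScript and loadScript are hypothetical names, the
jar is a local temp file rather than something in the hdfs distributed cache,
and a URLClassLoader stands in for whatever class loader the container or
executor really uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical illustration only; not taken from the Pig code base.
public class ScriptJarSketch {

    // Wrap a single script into a temporary jar under the given entry name.
    static Path packScript(String entryName, String scriptBody) throws IOException {
        Path jar = Files.createTempFile("scripts", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry(entryName));
            jos.write(scriptBody.getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    // Read the script back as a classpath resource. The URLClassLoader is an
    // assumption standing in for the class loader of the remote process.
    static String loadScript(Path jar, String entryName) throws IOException {
        URL[] classpath = { jar.toUri().toURL() };
        try (URLClassLoader loader = new URLClassLoader(classpath, null);
             InputStream in = loader.getResourceAsStream(entryName)) {
            if (in == null) {
                throw new IOException("Script not found on classpath: " + entryName);
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = packScript("udfs/square.py", "def square(x):\n    return x * x\n");
        try {
            System.out.print(loadScript(jar, "udfs/square.py"));
        } finally {
            Files.deleteIfExists(jar);
        }
    }
}
```

The lookup only succeeds if the jar is actually on the classpath of the process
that calls getResourceAsStream, which is exactly the part in doubt for spark
executors.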
> Scripting and StreamingPythonUDFs fail with Spark exec type
> -----------------------------------------------------------
>
> Key: PIG-5177
> URL: https://issues.apache.org/jira/browse/PIG-5177
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Adam Szita
> Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5177.0.patch, PIG-5177.1.patch, PIG-5177.2.patch
>
>
> An exception is thrown because the Python script file is not found on the
> backend side (on the Spark executors).
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)