I've built Spark 3.0.0-preview2 with the -Phadoop-3.2 profile and deployed it on Kubernetes.
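For reference, my launch looks roughly like this (I'm using spark-shell here for brevity; the master URL, account, and container names below are placeholders, not my real ones, and I've omitted the credentials configuration):

```shell
# K8S_MASTER, ACCOUNT and CONTAINER are placeholder names.
bin/spark-shell \
  --master k8s://https://K8S_MASTER:6443 \
  --packages org.apache.hadoop:hadoop-azure:3.2.0,org.apache.hadoop:hadoop-azure-datalake:3.2.0

# Inside the shell, a Blob-endpoint URL works:
#   spark.read.parquet("wasb://CONTAINER@ACCOUNT.blob.core.windows.net/path/file.parquet")
# but the Data Lake Gen2 endpoint fails with the REST-version error:
#   spark.read.parquet("abfs://CONTAINER@ACCOUNT.dfs.core.windows.net/path/file.parquet")
```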
I launch Spark with a switch to pull in the relevant Hadoop/Azure dependencies, --packages org.apache.hadoop:hadoop-azure:3.2.0,org.apache.hadoop:hadoop-azure-datalake:3.2.0, and can see that com.microsoft.azure#azure-storage;7.0.0 is indeed pulled in. I can see files using a blob.core.windows.net URL, but a dfs.core.windows.net URL throws an exception saying "The specified Rest Version is Unsupported". Using tcpdump, I can see that my client is sending x-ms-version: 2017-07-29 in its HTTP headers.

If I upgrade to azure-storage:8.6.0, the HTTP headers show x-ms-version: 2019-02-02 and the job gets slightly further, but reading the Parquet file now fails with "Incorrect Blob type, please use the correct Blob type to access a blob on the server. Expected BLOCK_BLOB, actual UNSPECIFIED". This is not overly surprising, as I am shoe-horning in a binary that Hadoop was unprepared for; I did this only to demonstrate that this version of the library seems able to talk to Azure, since its REST version is more recent.

Does anybody have any ideas on how I can talk to Azure? [Note: for various non-technical reasons, I cannot use HDInsight or Databricks.]

Kind regards,

Phillip
