Hi Lian, I don't know much about Spark structured streaming, but judging
from the stack trace, your application was trying to access
HftpFileSystem, which was removed in Apache Hadoop 3. Most likely it is
removed in HDP 3.0 too (Hortonworks folks can confirm).
This is documented in the CDH 6.0 release notes:
https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_600_incompatible_changes.html#hadoop_600_ic

Please use WebHDFS or HttpFS instead; a quick sketch of the change is below.
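
If your job reads with an hftp:// URI, switching the scheme to webhdfs://
usually covers the same read-over-HTTP use case. A minimal sketch (Scala,
Spark 2.3) -- the host, port, path, and schema here are hypothetical
placeholders, not anything from your job, and note that Hadoop 3 moved the
NameNode HTTP default port from 50070 to 9870:

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.types._

  val spark = SparkSession.builder.appName("webhdfs-read").getOrCreate()

  // File-based structured streaming sources need an explicit schema.
  val schema = new StructType().add("id", LongType).add("payload", StringType)

  // Before (Hadoop 2): .json("hftp://namenode:50070/events")
  // After (Hadoop 3): same call, webhdfs:// scheme and the new HTTP port.
  val events = spark.readStream
    .schema(schema)
    .json("webhdfs://namenode:9870/events")

Everything else about the query stays the same; only the FileSystem
implementation resolved from the URI scheme changes.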

On Thu, Aug 30, 2018 at 9:36 AM Lian Jiang <[email protected]> wrote:

> I am using HDP 3.0, which uses Hadoop 3.1.0 and Spark 2.3.1. My Spark
> streaming jobs, which ran fine in HDP 2.6.4 (Hadoop 2.7.3, Spark 2.2.0),
> fail in HDP 3:
>
> java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
>     at java.lang.ClassLoader.defineClass1(Native Method)
>     at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>     at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:348)
>     at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
>     at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
>     at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
>     at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:3268)
>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3313)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3352)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
>     at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:85)
>     at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:46)
>     at org.apache.spark.sql.execution.datasources.json.TextInputJsonDataSource$.readFile(JsonDataSource.scala:125)
>     at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:132)
>     at org.apache.spark.sql.execution.datasources.json.JsonFileFormat$$anonfun$buildReader$2.apply(JsonFileFormat.scala:130)
>     at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:148)
>     at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:132)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:128)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:182)
>     at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:109)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
>     at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:216)
>     at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:108)
>     at org.apache.spark.sql.execution.SortExec$$anonfun$1.apply(SortExec.scala:101)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>     at org.apache.spark.scheduler.Task.run(Task.scala:109)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> Any idea? Thanks.
>
>
> I sent the same question to the Spark user group. Sorry if you got it twice,
> but this is a little urgent.
>


-- 
A very happy Clouderan
