thompson0012 opened a new issue, #6171: URL: https://github.com/apache/iceberg/issues/6171
### Apache Iceberg version 1.0.0 (latest release) ### Query engine Spark ### Please describe the bug 🐞 as the official docker image has several problem i use the pyspark image with the following jars as instead but when i reading the parquet files, it occurs ```java Py4JJavaError: An error occurred while calling o43.parquet. : java.lang.NoClassDefFoundError: scala/$less$colon$less at org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions.$anonfun$apply$2(IcebergSparkSessionExtensions.scala:50) at org.apache.spark.sql.SparkSessionExtensions.$anonfun$buildResolutionRules$1(SparkSessionExtensions.scala:152) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at org.apache.spark.sql.SparkSessionExtensions.buildResolutionRules(SparkSessionExtensions.scala:152) at org.apache.spark.sql.internal.BaseSessionStateBuilder.customResolutionRules(BaseSessionStateBuilder.scala:216) at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anon$1.<init>(BaseSessionStateBuilder.scala:190) at org.apache.spark.sql.internal.BaseSessionStateBuilder.analyzer(BaseSessionStateBuilder.scala:182) at org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$build$2(BaseSessionStateBuilder.scala:360) at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:87) at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:87) at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76) at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185) at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510) at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779) at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184) at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76) at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74) at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66) at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:91) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:89) at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:444) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228) at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210) at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:562) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182) at py4j.ClientServerConnection.run(ClientServerConnection.java:106) at java.base/java.lang.Thread.run(Thread.java:833) Caused by: java.lang.ClassNotFoundException: scala.$less$colon$less at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520) ... 47 more ``` my pyspark configuration as follow: ```python from pyspark.sql import SparkSession builder = SparkSession.builder.appName('iceberg')\ .config('spark.jars.packages','org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:1.0.0')\ .config('spark.sql.extensions','org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions')\ .config('spark.sql.catalog.local','org.apache.iceberg.spark.SparkSessionCatalog')\ .config('spark.sql.catalog.spark_catalog.type','hadoop') spark = builder.getOrCreate() df = spark.read.parquet('./yellow_tripdata_2022-01.parquet') ``` it doesn't have any problem when i didn't add any iceberg configuration -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org