abharath9 opened a new issue, #9488:
URL: https://github.com/apache/iceberg/issues/9488
The following error is thrown when trying to insert data into an Iceberg table with not-null column constraints:
**_Cannot write nullable values to non-null column 'id' exception_**
Here is sample code to reproduce the issue.
```python
from pyspark.conf import SparkConf
from pyspark.context import SparkContext
from pyspark.sql import SparkSession, Row
from pyspark.sql.functions import col, concat, when, countDistinct, lit, size
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, ArrayType

conf = (
    SparkConf()
    .setAppName("analyst-hero-spark-write-iceberg")
    .set("spark.sql.adaptive.coalescePartitions.enabled", "true")
    .set("spark.sql.parquet.filterPushdown", "true")
    .set("spark.hadoop.fs.s3a.impl.disable.cache", "false")
    .set("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog")
    .set("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .set("spark.hadoop.iceberg.hive.lock-timeout-ms", "180")
    .set("spark.hadoop.iceberg.delete-files-on-table-drop", "true")
    .set("spark.hadoop.iceberg.mr.commit.table.thread.pool.size", "1")
    .set("spark.hadoop.iceberg.mr.commit.file.thread.pool.size", "1")
    .set("spark.hadoop.iceberg.mr.commit.retry.num-retries", "300")
    .set("spark.sql.iceberg.check-nullability", "false")
)

spark = SparkSession.builder.enableHiveSupport().config(conf=conf).getOrCreate()

spark.sql(
    "CREATE TABLE IF NOT EXISTS sandbox.iceberg_table_mandatory_column_test "
    "(id string not null, name string, nationality string) USING iceberg"
)
spark.sql(
    "show create table sandbox.iceberg_table_mandatory_column_test"
).select("createtab_stmt").show(truncate=False)

flatData = spark.createDataFrame([
    Row(id="100", name="bob", nationality="welsh"),
    Row(id="200", name="john", nationality="british"),
])
flatData.show(truncate=False)
```
I tried inserting the data both with a SQL `INSERT INTO` statement and with the DataFrame API `insertInto`; both fail with **_Cannot write nullable values to non-null column 'id' exception_**.
```python
flatData.createOrReplaceTempView("test_data")
spark.sql("insert into sandbox.iceberg_table_mandatory_column_test select * from test_data")

# or

flatData.write.insertInto("sandbox.iceberg_table_mandatory_column_test")
```
Complete stack trace in the attachment.
[error-iceberg-test.txt](https://github.com/apache/iceberg/files/13955530/error-iceberg-test.txt)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]