Benjamin-Lemaire commented on issue #1835:
URL: 
https://github.com/apache/iceberg-python/issues/1835#issuecomment-2748182671

   Sure! If you run the code below against your own AWS account using Glue and 
S3, it creates an Iceberg table in the Glue catalog and adds one row where 
the column `bar` is initially NULL. The second upsert is supposed to 
update that NULL to 7, but it doesn't, because the filter sent to the table 
to identify the difference is `7 != NULL`. That condition is never true: 
in SQL, a comparison involving NULL yields unknown rather than true or 
false, so the row is never selected as changed.
   
   `import boto3
   import pyarrow
   import pandas
   from pyiceberg.catalog import load_catalog
   
   schema = pyarrow.schema([
       ('foo', pyarrow.string()),
       ('bar', pyarrow.int32()),  # 'bar' is nullable
       ('baz', pyarrow.bool_())
   ])
   
   database = "--> YOUR GLUE DATABASE <--"
   table = "mytable"
   bucket = "--> YOUR S3 BUCKET <--"
   
   # Initialize S3 client
   s3 = boto3.client('s3', region_name='us-east-1')
   
   # Load Catalog
   catalog = load_catalog("default", **{"type": "glue"})
   table_location = f"s3://{bucket}/{database}/{table}"
   
   # Create table in Glue Catalog
   table = catalog.create_table_if_not_exists(
       identifier=(database,table),
       schema=schema,
       location=table_location
   )
   
   # Upsert Data with NULL
   data_with_null = [
       {"foo": "apple", "bar": None, "baz": False}, # Ensuring nullable fields
   ]
   data = pyarrow.Table.from_pandas(pandas.DataFrame(data_with_null), schema=schema)
   table.upsert(data,join_cols=["foo"])
       
   # Upsert Data without NULL
   data_without_null = [
       {"foo": "apple", "bar": 7, "baz": False}, 
   ]
   data = pyarrow.Table.from_pandas(pandas.DataFrame(data_without_null), schema=schema)
   table.upsert(data,join_cols=["foo"])
   `
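
For anyone who wants to see the NULL comparison behavior without an AWS account, here is a minimal sketch using Python's built-in `sqlite3`. It is only an illustration of SQL three-valued logic, on the assumption that the upsert filter behaves the same way; PyIceberg does not use SQLite, and the table `t` is made up for the example. SQLite's `IS NOT` is used as an example of a null-safe inequality.

```python
import sqlite3

# Hypothetical in-memory table standing in for the Iceberg table above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (foo TEXT, bar INTEGER)")
conn.execute("INSERT INTO t VALUES ('apple', NULL)")

# Plain inequality: NULL != 7 evaluates to NULL (unknown), so the row is
# filtered out even though the stored value differs from 7.
plain = conn.execute("SELECT foo FROM t WHERE bar != 7").fetchall()

# Null-safe inequality: in SQLite, IS NOT compares like != but treats NULL
# as an ordinary value, so the row is found.
null_safe = conn.execute("SELECT foo FROM t WHERE bar IS NOT 7").fetchall()

print(plain)      # []
print(null_safe)  # [('apple',)]
conn.close()
```

The first query misses the row for the same reason the second upsert above changes nothing; a null-aware comparison finds it.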


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org