Re: [I] Cannot use MERGE INTO query on Iceberg table. Getting `java.lang.IllegalArgumentException: Comparison method violates its general contract!` error. [iceberg]

via GitHub Sun, 14 Apr 2024 13:08:24 -0700


flisboac commented on issue #9650:
URL: https://github.com/apache/iceberg/issues/9650#issuecomment-2054171388


   Well, the error is now happening on a `MERGE INTO` for me as well. My usage is the same as what the OP reported. Because I'm using PySpark, the more detailed JVM error report is hidden behind PySpark's abstractions, so the only stack trace immediately available is this one:
   
   ```text
   Traceback (most recent call last):
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/run_cdc_job.py", line 2110, in <module>
       main()
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/run_cdc_job.py", line 2106, in main
       cdc_processor.run()
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/run_cdc_job.py", line 344, in run
       self._do_run(execution_context)
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/run_cdc_job.py", line 412, in _do_run
       self._do_run_cdc(manifest_file, now=now)
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/run_cdc_job.py", line 602, in _do_run_cdc
       merge_sql_count = self._spark_session.sql(merge_sql, args=merge_sql_params).count()
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/pyspark.zip/pyspark/sql/session.py", line 1631, in sql
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
     File "/mnt1/yarn/usercache/hadoop/appcache/application_1713122184304_0002/container_1713122184304_0002_01_000001/pyspark.zip/pyspark/errors/exceptions/captured.py", line 185, in deco
   pyspark.errors.exceptions.captured.IllegalArgumentException: Comparison method violates its general contract!
   ```
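
One way the hidden JVM frames could probably be surfaced is Spark's `spark.sql.pyspark.jvmStacktrace.enabled` setting, which asks PySpark to append the JVM stack trace to the converted exception text. The sketch below is only an illustration under assumptions: the helper name `sql_with_jvm_stacktrace` is hypothetical and not part of the job above, `spark` is assumed to be a `SparkSession` whose `sql()` accepts `args=` (as in the call shown in the traceback), and I have not verified it on this cluster.

```python
import traceback

# Spark SQL setting that makes PySpark include the JVM stack trace
# in the text of exceptions it converts from Java.
JVM_STACKTRACE_CONF = "spark.sql.pyspark.jvmStacktrace.enabled"


def sql_with_jvm_stacktrace(spark, query, args=None):
    """Run spark.sql(query, args=args), printing the full error on failure.

    Hypothetical helper: `spark` is expected to behave like a
    pyspark.sql.SparkSession (it needs `.conf.set` and `.sql`).
    """
    # Enable the JVM stack trace before running the statement.
    spark.conf.set(JVM_STACKTRACE_CONF, "true")
    try:
        return spark.sql(query, args=args)
    except Exception:
        # Print the Python traceback; with the setting above, the exception
        # message is expected to carry a "JVM stacktrace:" section as well.
        traceback.print_exc()
        # Re-raise so the job still fails the same way as before.
        raise
```

In the job above this would stand in for the direct `self._spark_session.sql(merge_sql, args=merge_sql_params)` call; the JVM frames should then show which comparator is raising the `IllegalArgumentException`.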
   
   The `MERGE INTO` query looks like this:
   
   
   ```sql
   MERGE INTO spark_catalog.DATABASE.TABLE target
   USING __cdc_source_data__ source
   ON (
       1=1
       AND (target.int_field_1 BETWEEN :__CDC_PDFILTER_MIN__int_field_1 AND :__CDC_PDFILTER_MAX__int_field_1)
       AND (target.int_field_2 BETWEEN :__CDC_PDFILTER_MIN__int_field_2 AND :__CDC_PDFILTER_MAX__int_field_2)
       AND (target.int_field_3 BETWEEN :__CDC_PDFILTER_MIN__int_field_3 AND :__CDC_PDFILTER_MAX__int_field_3)
       AND target.int_field_1 <=> source.int_field_1
       AND target.int_field_2 <=> source.int_field_2
       AND target.int_field_3 <=> source.int_field_3
   )
   WHEN MATCHED AND source.operation = 'D' THEN DELETE
   WHEN MATCHED AND source.operation IN ('U', 'I') THEN UPDATE SET
       target.int_field_1 = source.int_field_1,
       target.int_field_2 = source.int_field_2,
       target.int_field_3 = source.int_field_3,
       -- A lot more field assignments here
       target.dt_geracao = cast('2024-04-14T19:20:06.774Z' as timestamp)
   WHEN NOT MATCHED AND source.operation IN ('I', 'U') THEN INSERT
       (
           int_field_1,
           int_field_2,
           int_field_3,
           -- A lot more column names here
       ) VALUES (
           source.int_field_1,
           source.int_field_2,
           source.int_field_3,
           -- A lot more column values here
           cast('2024-04-14T19:20:06.774Z' as timestamp)
       )
   ```
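
For context on the `:__CDC_PDFILTER_MIN__...` / `:__CDC_PDFILTER_MAX__...` markers: they are named parameters bound through the `args=` dictionary passed to `spark.sql`, keyed by the marker name without the leading colon. A minimal sketch of how such a dictionary and the matching `BETWEEN` predicates could be assembled is below. The function names and the `bounds` input shape are hypothetical, made up for illustration, and are not the actual code in `run_cdc_job.py`; how Spark converts the Python values to SQL literals also depends on the Spark version.

```python
def build_pdfilter_params(bounds):
    """Map {column: (min, max)} to a named-parameter dict for spark.sql(args=...).

    Hypothetical helper; the key names mirror the markers in the query above.
    """
    params = {}
    for column, (low, high) in bounds.items():
        if low > high:
            raise ValueError(f"min > max for column {column!r}")
        # Keys omit the leading ':' that the SQL text uses for the marker.
        params[f"__CDC_PDFILTER_MIN__{column}"] = low
        params[f"__CDC_PDFILTER_MAX__{column}"] = high
    return params


def build_pdfilter_predicates(columns, alias="target"):
    """Return one BETWEEN predicate per column, referencing the named markers."""
    return [
        f"({alias}.{column} BETWEEN :__CDC_PDFILTER_MIN__{column} "
        f"AND :__CDC_PDFILTER_MAX__{column})"
        for column in columns
    ]


# Example with made-up bounds for one of the fields in the query.
example_params = build_pdfilter_params({"int_field_1": (1, 10)})
example_predicates = build_pdfilter_predicates(["int_field_1"])
```

The predicates would be joined with `AND` inside the `ON (...)` clause, and the dictionary passed as `merge_sql_params`.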


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

