matepek commented on issue #10003: URL: https://github.com/apache/iceberg/issues/10003#issuecomment-2008652290
What do you mean when you say I'm using the JDBC catalog? I thought `spark.sql.catalogImplementation = hive` sets it to the Hive catalog. (I know I have a knowledge gap and I'm trying to catch up, so I'd appreciate it if you corrected me and explained.)

My understanding of Spark catalogs is that there is always a `spark_catalog`, which is a `hive` catalog because of `spark.sql.catalogImplementation = hive`. We also created an `iceberg_catalog` that uses `org.apache.iceberg.spark.SparkCatalog`, which was good for managing Iceberg tables until v1.5, and now views too.

So before v1.5 we needed to store the views and "non-managed tables" in the Hive catalog and have them work together with the Iceberg (managed) tables. For that we wrapped `spark_catalog` by setting it to `org.apache.iceberg.spark.SparkSessionCatalog`, which is meant to delegate functionality between the Hive and Iceberg catalogs. That worked okay. Actually, we needed some customisation, because `SparkSessionCatalog` was unable to properly list items from both catalogs, so whenever we needed this functionality we listed the items of the two catalogs and concatenated the results. So it was not working properly even before; it was a necessity in order to work with views, tables, and "non-managed tables". Since v1.5 the listing seems even less reliable (see this issue).

But as we talk more about it, I've started to think that I might not need `SparkSessionCatalog` anymore, since views are now managed entities in `org.apache.iceberg.spark.SparkCatalog`. I can just use `iceberg_catalog` by default, and whenever there is a rare need for a "non-managed" table I can specify the catalog explicitly, like `spark_catalog.schema_for_non_managed_tables.table_name`. And I'm good.

So now I'm gonna try removing the definition for `spark_catalog`, and I hope that will make this work. BRB.

## REMARKS:

By "non-managed table" I mean something which is not managed by Iceberg, i.e. a regular Hive table.
Ex.:

```sql
create table schema_name.external_table (
  id LONG,
  dt DATE
)
partitioned by (dt)
stored as PARQUET
location 'gs://bucket/folder/'
```
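
For reference, the two catalog setups discussed in this comment could be sketched roughly as below. This is only an illustration under assumptions, not the actual configuration: `build_catalog_conf` and its parameters are hypothetical helper names, and `catalog_type` is left as a required argument because the comment does not say which Iceberg catalog backend sits behind `iceberg_catalog`. The property keys themselves (`spark.sql.catalog.<name>`, `spark.sql.defaultCatalog`, `spark.sql.catalogImplementation`) are standard Spark and Iceberg settings.

```python
from typing import Dict

# Catalog name taken from the comment above.
ICEBERG_CATALOG = "iceberg_catalog"


def build_catalog_conf(wrap_session_catalog: bool, catalog_type: str) -> Dict[str, str]:
    """Return Spark settings for the old (wrapped) or planned (unwrapped) setup.

    catalog_type is an assumption supplied by the caller; the comment does not
    state which Iceberg catalog backend is in use.
    """
    conf = {
        # Built-in session catalog backed by the Hive metastore.
        "spark.sql.catalogImplementation": "hive",
        # Dedicated Iceberg catalog, as described in the comment.
        f"spark.sql.catalog.{ICEBERG_CATALOG}": "org.apache.iceberg.spark.SparkCatalog",
        f"spark.sql.catalog.{ICEBERG_CATALOG}.type": catalog_type,
    }
    if wrap_session_catalog:
        # Old setup: spark_catalog is replaced by Iceberg's delegating wrapper.
        conf["spark.sql.catalog.spark_catalog"] = (
            "org.apache.iceberg.spark.SparkSessionCatalog"
        )
        conf["spark.sql.catalog.spark_catalog.type"] = catalog_type
    else:
        # Planned setup: leave spark_catalog untouched and make the Iceberg
        # catalog the default, so Hive tables need the spark_catalog. prefix.
        conf["spark.sql.defaultCatalog"] = ICEBERG_CATALOG
    return conf
```

Each key/value pair would then be passed to `SparkSession.builder.config(key, value)` before the session is created. With the unwrapped variant, a regular Hive table is addressed by its full name, for example `spark_catalog.schema_for_non_managed_tables.table_name`.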