amogh-jahagirdar commented on code in PR #10074:
URL: https://github.com/apache/iceberg/pull/10074#discussion_r1561209606
##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/BaseCatalog.java:
##########
@@ -21,19 +21,31 @@
 import org.apache.iceberg.spark.procedures.SparkProcedures;
 import org.apache.iceberg.spark.procedures.SparkProcedures.ProcedureBuilder;
 import org.apache.iceberg.spark.source.HasIcebergCatalog;
+import org.apache.iceberg.util.PropertyUtil;
 import org.apache.spark.sql.catalyst.analysis.NoSuchProcedureException;
 import org.apache.spark.sql.connector.catalog.Identifier;
 import org.apache.spark.sql.connector.catalog.StagingTableCatalog;
 import org.apache.spark.sql.connector.catalog.SupportsNamespaces;
 import org.apache.spark.sql.connector.iceberg.catalog.Procedure;
 import org.apache.spark.sql.connector.iceberg.catalog.ProcedureCatalog;
+import org.apache.spark.sql.util.CaseInsensitiveStringMap;

 abstract class BaseCatalog
     implements StagingTableCatalog,
         ProcedureCatalog,
         SupportsNamespaces,
         HasIcebergCatalog,
         SupportsFunctions {

+  /**
+   * Controls whether to mark all the fields as nullable when executing CREATE/REPLACE TABLE ... AS
+   * SELECT ... and creating the table. If false, fields' nullability will be preserved when
+   * creating the table.
+   */
+  private static final String TABLE_CREATE_NULLABLE_QUERY_SCHEMA = "use-nullable-query-schema";

Review Comment:
   Not super opinionated though; if there's something in Spark that's already following this naming, then I'd agree to just follow that, since it's less of a burden on a user to be aware of these different namings.

Review Comment:
   Hm, if there's a pointer to where `use-nullable-query-schema` appears in the Spark API, that would be good to see (I couldn't find anything by searching). IMO "preserve" would be a better verb than "use", and the name should also make clear that this applies to CTAS/RTAS, since that makes it clearer we are essentially preserving the nullability from the source query. So something like `preserve-ctas-rtas-nullability` feels a bit more direct.
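For readers following the naming discussion, here is a minimal sketch of how a catalog flag like the one in the diff could be wired up: the option is read once from the Spark catalog options and then consulted when deriving the CTAS/RTAS table schema from the query schema. The class name, the `schemaForCreate` helper, and the default of `true` are illustrative assumptions, not the PR's actual implementation; `PropertyUtil.propertyAsBoolean` and `CaseInsensitiveStringMap` correspond to the imports added in the diff above.

```java
// Illustrative sketch only; not the PR's implementation. Assumes the flag
// defaults to true (i.e. the existing behavior of relaxing CTAS/RTAS schemas
// to all-nullable fields).
import java.util.Arrays;

import org.apache.iceberg.util.PropertyUtil;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import org.apache.spark.sql.util.CaseInsensitiveStringMap;

abstract class NullableQuerySchemaSketch {
  // Same option name as in the diff above; the surrounding class is hypothetical.
  private static final String TABLE_CREATE_NULLABLE_QUERY_SCHEMA = "use-nullable-query-schema";

  private boolean useNullableQuerySchema = true;

  // A Spark TableCatalog receives its options through initialize(name, options);
  // the flag would be read once there.
  public void initialize(String name, CaseInsensitiveStringMap options) {
    this.useNullableQuerySchema =
        PropertyUtil.propertyAsBoolean(options, TABLE_CREATE_NULLABLE_QUERY_SCHEMA, true);
  }

  // Applied to the query schema before the table is created for CTAS/RTAS.
  protected StructType schemaForCreate(StructType querySchema) {
    if (!useNullableQuerySchema) {
      // Preserve the nullability reported by the source query.
      return querySchema;
    }

    // Relax every top-level field to nullable (nested fields are left as-is in this sketch).
    StructField[] relaxed =
        Arrays.stream(querySchema.fields())
            .map(f -> new StructField(f.name(), f.dataType(), true, f.metadata()))
            .toArray(StructField[]::new);
    return new StructType(relaxed);
  }
}
```

Whatever the option is ultimately named, the behavior it gates is the same: with the flag on, the created table drops NOT NULL constraints from the query schema; with it off, the source query's nullability is carried into the new table.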