dependabot[bot] opened a new pull request, #10320: URL: https://github.com/apache/iceberg/pull/10320
Bumps [io.delta:delta-spark_2.12](https://github.com/delta-io/delta) from 3.1.0 to 3.2.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/delta-io/delta/releases">io.delta:delta-spark_2.12's releases</a>.</em></p> <blockquote> <h2>Delta Lake 3.2.0</h2> <p>We are excited to announce the release of Delta Lake 3.2.0! This release includes several exciting new features.</p> <h2>Highlights</h2> <ul> <li><a href="https://github.com/delta-io/delta/commit/4456a122929b834e5c2652f99cc64ff8a71f4113">Support for Liquid clustering</a> to reduce write amplification using incremental clustering.</li> <li>Preview <a href="https://github.com/delta-io/delta/commit/9b3fa0a1a05e51b38cec083afb41226beb399b0f">support for Type Widening</a> to allow users to change the type of columns without having to rewrite data.</li> <li>Preview <a href="https://github.com/delta-io/delta/commit/902830369662f5a84e987b3a97e23f916da104ca">support</a> for <a href="https://hudi.apache.org/">Apache Hudi</a> in Delta UniForm tables.</li> </ul> <h2>Delta Spark</h2> <p>Delta Spark 3.2.0 is built on <a href="https://spark.apache.org/releases/spark-release-3-5-0.html">Apache Spark™ 3.5</a>. 
Similar to Apache Spark, we have released Maven artifacts for both Scala 2.12 and Scala 2.13.</p> <ul> <li>Documentation: <a href="https://docs.delta.io/3.2.0/index.html">https://docs.delta.io/3.2.0/index.html</a></li> <li>API documentation: <a href="https://docs.delta.io/3.2.0/delta-apidoc.html#delta-spark">https://docs.delta.io/3.2.0/delta-apidoc.html#delta-spark</a></li> <li>Maven artifacts: <a href="https://repo1.maven.org/maven2/io/delta/delta-spark_2.12/3.2.0/">delta-spark_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-spark_2.13/3.2.0/">delta-spark_2.13</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-contribs_2.12/3.2.0/">delta-contribs_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-contribs_2.13/3.2.0/">delta-contribs_2.13</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-storage/3.2.0/">delta-storage</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-storage-s3-dynamodb/3.2.0/">delta-storage-s3-dynamodb</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.12/3.2.0/">delta-iceberg_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.13/3.2.0/">delta-iceberg_2.13</a></li> <li>Python artifacts: <a href="https://pypi.org/project/delta-spark/3.2.0/">https://pypi.org/project/delta-spark/3.2.0/</a></li> </ul> <p>The key features of this release are:</p> <ul> <li><a href="https://redirect.github.com/delta-io/delta/issues/1874">Support for Liquid clustering</a>: This allows for <a href="https://github.com/delta-io/delta/commit/4456a122929b834e5c2652f99cc64ff8a71f4113">incremental clustering</a> based on ZCubes and reduces write amplification by not touching files that are already well clustered (i.e., files in stable ZCubes). Users can now use the <a href="https://github.com/delta-io/delta/commit/6f4e05197">ALTER TABLE CLUSTER BY</a> syntax to change clustering columns and use the DESCRIBE DETAIL command to check the clustering columns. 
In addition, Delta Spark now supports the DeltaTable <code>clusterBy</code> API in both Python and Scala, so clustered tables can be created through the DeltaTable API. See the <a href="https://docs.delta.io/3.2.0/delta-clustering.html">documentation</a> and <a href="https://github.com/delta-io/delta/blob/branch-3.2/examples/scala/src/main/scala/example/Clustering.scala">examples</a> for more information.</li> <li>Preview <a href="https://github.com/delta-io/delta/commit/9b3fa0a1a05e51b38cec083afb41226beb399b0f">support for Type Widening</a>: Delta Spark can now change the type of a column from <code>byte</code> to <code>short</code> to <code>integer</code> using the <a href="https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-alter-table.html#alter-or-change-column">ALTER TABLE t CHANGE COLUMN col TYPE type</a> command or with schema evolution during MERGE and INSERT operations. The table remains readable by Delta 3.2 readers without requiring the data to be rewritten. For compatibility with older versions, a rewrite of the data can be triggered using the <code>ALTER TABLE t DROP FEATURE 'typeWidening-preview'</code> command. <ul> <li>Note that this feature is in preview and that tables created with this preview feature enabled may not be compatible with future Delta Spark releases.</li> </ul> </li> <li><a href="https://github.com/delta-io/delta/commit/7d41fb7bbf63af33ad228007dd6ba3800b4efe81">Support for Vacuum Inventory</a>: Delta Spark now extends the VACUUM SQL command to allow users to specify an inventory table in a VACUUM command. When an inventory table is provided, VACUUM will consider the files listed there instead of doing the full listing of the table directory, which can be time-consuming for very large tables. 
See the docs <a href="https://docs.delta.io/3.2.0/delta-utility.html#inventory-table">here</a>.</li> <li><a href="https://github.com/delta-io/delta/commit/2e197f130765d91f201b6b649f30190a44304b29">Support for Vacuum Writer Protocol Check</a>: Delta Spark now supports the <code>vacuumProtocolCheck</code> ReaderWriter feature, which ensures consistent application of reader and writer protocol checks during <code>VACUUM</code> operations, addressing potential protocol discrepancies and mitigating the risk of data corruption due to skipped writer checks.</li> <li>Preview <a href="https://github.com/delta-io/delta/commit/b15a2c97432c8892f986c1526ceb2c3f63ed5d2c">support for In-Commit Timestamps</a>: When enabled, this <a href="https://redirect.github.com/delta-io/delta/issues/2532">preview feature</a> persists monotonically increasing timestamps within Delta commits, ensuring they are not affected by file operations. As a result, time travel queries yield consistent results even if the table directory is relocated. <ul> <li>Note that this feature is in preview and that tables created with this preview feature enabled may not be compatible with future Delta Spark releases.</li> </ul> </li> <li>Deletion Vectors Read Performance Improvements: Two improvements were introduced to DVs in Delta 3.2. <ul> <li><a href="https://github.com/delta-io/delta/commit/be7183bef85feaebfc928d5f291c5a90246cde87">Removing broadcasting of DV information to executors</a>: This work improves stability by reducing driver memory consumption, preventing potential driver OOM on very large Delta tables (1TB+). It also improves performance by avoiding the fixed broadcasting overhead when reading small Delta tables.</li> <li><a href="https://redirect.github.com/delta-io/delta/pull/2982">Supporting predicate pushdown and splitting in scans with DVs</a>: This improves the performance of DV reads in queries with filters, thanks to predicate pushdown and splitting. 
This feature gains 2x performance improvement on average.</li> </ul> </li> <li><a href="https://github.com/delta-io/delta/commit/23b7c17628c21881fbefd04db11a31c973205d95">Support for Row Tracking</a>: Delta Spark can now write to tables that maintain information that allows identifying rows across multiple versions of a Delta table. Delta Spark can now also access this tracking information using the two metadata fields <code>_metadata.row_id</code> and <code>_metadata.row_commit_version</code>.</li> </ul> <p>Other notable changes include:</p> <ul> <li><a href="https://github.com/delta-io/delta/commit/8b4b6cce7071046da3d6d3fda4b85120a7445771">Delta Sharing</a>: reduce the minimum RPC interval in delta sharing streaming from 30 seconds to 10 seconds</li> <li><a href="https://github.com/delta-io/delta/commit/bba0e94f0">Improve</a> the performance of write operations by skipping collecting commit stats</li> <li><a href="https://github.com/delta-io/delta/commit/3f0496ba3">New SQL configurations</a> to specify Delta Log cache size (<code>spark.databricks.delta.delta.log.cacheSize</code>) and retention duration (<code>spark.databricks.delta.delta.log.cacheRetentionMinutes</code>)</li> <li><a href="https://github.com/delta-io/delta/commit/8db9617b5">Fix</a> bug in plan validation due to inconsistent field metadata in MERGE</li> <li><a href="https://github.com/delta-io/delta/commit/ef751d236">Improved</a> metrics during VACUUM for better visibility</li> <li>Hive Metastore schema sync: The truncation threshold for schemas with long fields is now <a href="https://github.com/delta-io/delta/commit/3c09d95a34b71fff20cb23753c65af95da5cb48f">user configurable</a></li> </ul> <h2>Delta Universal Format (UniForm)</h2> <ul> <li>Documentation: <a href="https://docs.delta.io/3.2.0/delta-uniform.html">https://docs.delta.io/3.2.0/delta-uniform.html</a></li> <li>Maven artifacts: <a href="https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.12/3.2.0/">delta-iceberg_2.12</a>, <a 
href="https://repo1.maven.org/maven2/io/delta/delta-iceberg_2.13/3.2.0/">delta-iceberg_2.13</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-hudi_2.12/3.2.0/">delta-hudi_2.12</a>, <a href="https://repo1.maven.org/maven2/io/delta/delta-hudi_2.13/3.2.0/">delta-hudi_2.13</a></li> </ul> <p>Hudi is now <a href="https://github.com/delta-io/delta/commit/902830369662f5a84e987b3a97e23f916da104ca">supported</a> by Delta Universal Format in addition to Iceberg. Writing to a Delta UniForm table can generate Hudi metadata alongside Delta. This feature was contributed by XTable.</p> <p>Create a UniForm-enabled table that automatically generates Hudi metadata using the following command:</p> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/delta-io/delta/commit/4e7a342348f5d746d6407f79f43e8281281c7b49"><code>4e7a342</code></a> Setting version to 3.2.0</li> <li><a href="https://github.com/delta-io/delta/commit/03759c960ca6ce1e67b1a8eca99e75522e4e02a3"><code>03759c9</code></a> [3.2][Kernel][Writes] Allow transaction retries for blind append (<a href="https://redirect.github.com/delta-io/delta/issues/3055">#3055</a>)</li> <li><a href="https://github.com/delta-io/delta/commit/4ae6df6ceb09cd1f0d967a4abdc2afb434f921c2"><code>4ae6df6</code></a> [3.2][Kernel][Writes] Support idempotent writes (<a href="https://redirect.github.com/delta-io/delta/issues/3051">#3051</a>)</li> <li><a href="https://github.com/delta-io/delta/commit/f365eb0929350e9bcc3743d64c5a1f101c80261f"><code>f365eb0</code></a> Add documentation link for Vacuum Protocol Check (<a href="https://redirect.github.com/delta-io/delta/issues/3041">#3041</a>)</li> <li><a href="https://github.com/delta-io/delta/commit/8cb2e7826914033698b0cc7e4435c7c2065c75db"><code>8cb2e78</code></a> [Doc] Type Widening documentation (<a href="https://redirect.github.com/delta-io/delta/issues/3025">#3025</a>)</li> <li><a 
href="https://github.com/delta-io/delta/commit/1ba483226363d8056ecb2ffb029ac63f4c1ac5a9"><code>1ba4832</code></a> [Spark][3.2] Fix CommitInfo.inCommitTimestamp deserialization for very small ...</li> <li><a href="https://github.com/delta-io/delta/commit/6453fe50b7e13c3d4bee1c6710e075186cdcda6a"><code>6453fe5</code></a> [3.2][Kernel][Writes] Add support of inserting data into tables (<a href="https://redirect.github.com/delta-io/delta/issues/3030">#3030</a>)</li> <li><a href="https://github.com/delta-io/delta/commit/fe5d931603c8ccc1c6af04f9cb3586b1d78210b3"><code>fe5d931</code></a> [3.2][Kernel][Writes] APIs and impl. for creating new tables (<a href="https://redirect.github.com/delta-io/delta/issues/3016">#3016</a>)</li> <li><a href="https://github.com/delta-io/delta/commit/c8bbd5b6db29ab4607a34ffb5b5023f78fe61869"><code>c8bbd5b</code></a> [Kernel] Refactor all user-facing exceptions to be "KernelExceptions" (<a href="https://redirect.github.com/delta-io/delta/issues/3014">#3014</a>)</li> <li><a href="https://github.com/delta-io/delta/commit/f4555f545895ac474bce01fa4f84e7795a4fb0ea"><code>f4555f5</code></a> [Kernel] Remove unused <code>ExpressionHandler.isSupported(...)</code> for now (<a href="https://redirect.github.com/delta-io/delta/issues/3018">#3018</a>)</li> <li>Additional commits viewable in <a href="https://github.com/delta-io/delta/compare/v3.1.0...v3.2.0">compare view</a></li> </ul> </details> <br /> Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
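
For readers evaluating the Type Widening preview mentioned in the release notes above, the following is a minimal Python sketch of the widening order they describe (`byte` to `short` to `integer`). It is an illustration only, not Delta Lake code: the names `WIDENING_ORDER`, `is_widening`, and `alter_statement` are hypothetical, and treating a direct `byte` to `integer` change as allowed is an assumption based on that ordering. The generated statement follows the `ALTER TABLE t CHANGE COLUMN col TYPE type` form quoted in the notes.

```python
# Illustrative model only; this is not Delta Lake's implementation.
# The order below is taken from the release notes: byte -> short -> integer.
WIDENING_ORDER = ["byte", "short", "integer"]


def is_widening(from_type: str, to_type: str) -> bool:
    """Return True if to_type comes strictly later than from_type in the chain."""
    if from_type not in WIDENING_ORDER or to_type not in WIDENING_ORDER:
        # Types outside the chain are not covered by this sketch.
        return False
    return WIDENING_ORDER.index(from_type) < WIDENING_ORDER.index(to_type)


def alter_statement(table: str, column: str, to_type: str) -> str:
    """Build a statement in the form quoted in the release notes."""
    return f"ALTER TABLE {table} CHANGE COLUMN {column} TYPE {to_type}"


if __name__ == "__main__":
    # Hypothetical table and column names, for demonstration only.
    if is_widening("byte", "short"):
        print(alter_statement("t", "col", "short"))
```

Narrowing changes such as `integer` to `short` are rejected by `is_widening`; the release notes do not describe them as part of this feature.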
---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org