aihuaxu commented on code in PR #16516:
URL: https://github.com/apache/iceberg/pull/16516#discussion_r3290329952
##########
site/docs/blog/posts/2026-05-19-iceberg-1.11.0-release.md:
##########
@@ -0,0 +1,194 @@
+---
+date: 2026-05-19
+title: Apache Iceberg 1.11.0 Release
+slug: apache-iceberg-1.11.0-release
+authors:
+ - iceberg-pmc
+categories:
+ - release
+---
+
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements. See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+The Apache Iceberg community is pleased to announce the release of Apache Iceberg 1.11.0. This release is the result of over **1,000 commits** from **200+ contributors**. See the [release notes](https://iceberg.apache.org/releases/#1110-release) for the complete list of changes.
+
+<!-- more -->
+
+## Release Highlights
+
+### REST Catalog: A More Complete Protocol
+
+1.11.0 represents the most significant step forward for the REST catalog protocol since it was introduced.
+
+[Remote scan planning](https://github.com/apache/iceberg/pull/13400) allows catalog servers to plan table scans and stream back file scan tasks directly. Previously, every client had to fetch manifest lists and manifests itself to determine which files to read. With server-side planning, clients receive only the relevant scan tasks, reducing driver memory pressure and enabling server-side optimizations that are transparent to the query engine. This release extends remote scan planning to cover [incremental scans](https://github.com/apache/iceberg/pull/14661) for Structured Streaming workloads and [metadata tables](https://github.com/apache/iceberg/pull/14881) such as `history` and `snapshots`. A [per-table override](https://github.com/apache/iceberg/pull/15572) allows individual tables to opt out of catalog-level scan planning mode when needed.
+
+[Freshness-aware table loading](https://github.com/apache/iceberg/pull/14398) adds ETag-based caching to table metadata. When a client loads a table it already has metadata for, the server can return a `304 Not Modified` response instead of re-sending the full metadata payload, cutting unnecessary round-trips in tight read loops and interactive workloads.
+
+[Idempotency key support](https://github.com/apache/iceberg/pull/14740) introduces a standard `Idempotency-Key` header for mutating catalog operations. Retried writes — commits, creates, and drops — are now guaranteed not to execute twice, preventing duplicate snapshots and corrupted state from network timeouts.
+
+[Register View](https://github.com/apache/iceberg/pull/14868) completes the view lifecycle in the REST catalog. Just as tables can be registered from their metadata location, views can now be [registered via the REST API](https://github.com/apache/iceberg/pull/14870) too — enabling cross-catalog migrations and re-attaching orphaned view metadata.
+
+[Custom Table and View Operations](https://github.com/apache/iceberg/pull/14465) can now be injected into the REST catalog, allowing users to extend or override default `TableOperations` and `ViewOperations` behavior without forking the catalog implementation.
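As an illustration of the freshness-aware loading paragraph quoted above, here is a minimal Python sketch of an ETag handshake. It is not taken from the PR and does not use the Iceberg REST client: `FakeCatalogServer`, `CachingClient`, and `compute_etag` are hypothetical names, and hashing the metadata is only one assumed way a server might derive an ETag.

```python
import hashlib
import json
from typing import Optional, Tuple


def compute_etag(metadata: dict) -> str:
    """Derive an opaque ETag from the metadata content (assumed scheme)."""
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


class FakeCatalogServer:
    """In-memory stand-in for a catalog server; not the Iceberg REST API."""

    def __init__(self, metadata: dict) -> None:
        self.metadata = metadata

    def load_table(
        self, if_none_match: Optional[str] = None
    ) -> Tuple[int, Optional[dict], str]:
        etag = compute_etag(self.metadata)
        if if_none_match == etag:
            # The client's copy is current: answer 304 with no body.
            return 304, None, etag
        return 200, dict(self.metadata), etag


class CachingClient:
    """Keeps the last metadata and its ETag, and revalidates on every load."""

    def __init__(self, server: FakeCatalogServer) -> None:
        self.server = server
        self.cached: Optional[dict] = None
        self.etag: Optional[str] = None
        self.full_loads = 0  # number of times a full payload was transferred

    def load_table(self) -> dict:
        status, body, etag = self.server.load_table(if_none_match=self.etag)
        if status == 304:
            return self.cached  # reuse the cached metadata
        self.cached, self.etag = body, etag
        self.full_loads += 1
        return self.cached
```

In this sketch a second `load_table()` call against unchanged metadata still costs a request, but the server skips the payload; only a metadata change triggers another full transfer.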
+
+### OpenAPI Specification Updates
+
+Several protocol-level additions land in the OpenAPI spec this release, tightening the contract between clients and catalog servers.
+
+- **Namespace separator configurable by server**: [The server can now advertise a custom namespace separator](https://github.com/apache/iceberg/pull/14448) in the config endpoint, allowing catalogs that use separators other than `.` to communicate this to clients without out-of-band configuration.
+- **ETag for `CommitTableResponse`**: [ETag support on commit responses](https://github.com/apache/iceberg/pull/14760) enables clients to detect whether a concurrent write changed the table between their load and commit, complementing the existing ETag on `LoadTableResult`.
+- **S3 signing endpoint promoted to main spec**: [The S3 signing endpoint](https://github.com/apache/iceberg/pull/15450) moves from an extension into the main OpenAPI spec, making it an official part of the REST catalog protocol.
+- **Partition statistics in `TableUpdate`**: [`SetPartitionStatisticsUpdate` and `RemovePartitionStatisticsUpdate`](https://github.com/apache/iceberg/pull/14957) are now included in the `TableUpdate` union type, allowing partition stats to be managed through the standard commit path.
+- **Storage credentials in scan planning responses**: [Storage credentials are now returned](https://github.com/apache/iceberg/pull/15524) in `PlanTableScanResponse` and `FetchPlanningResultResponse` when the `include-credentials` flag is set, so clients performing remote scan planning can access data without a separate credential fetch.
+
+### Spec: SQL UDFs and Geospatial Types
+
+The [SQL UDF Specification](https://github.com/apache/iceberg/pull/14117) introduces a new spec for storing SQL user-defined functions in Iceberg catalogs. UDFs are versioned, support multiple SQL dialects, and are portable across engines, bringing function management into the catalog layer for the first time.
+
+[Geospatial bounding box types](https://github.com/apache/iceberg/pull/12667) add native bounding box types and an `INTERSECTS` predicate to Iceberg's type system, enabling spatial partition pruning and file skipping for geospatial workloads directly on Iceberg tables. [Restrictions for geometry types in V3](https://github.com/apache/iceberg/pull/14250) are also clarified in this release.
+
+Several smaller but meaningful spec additions round out the release:
+
+- **`added-rows` in snapshot fields**: The [`added-rows` field is restored to snapshot metadata](https://github.com/apache/iceberg/pull/14048), giving engines and monitoring tools a reliable row count per snapshot without scanning data files.
+- **`referenced-by` in `loadTable` response**: [`loadTable` now returns a `referenced-by` field](https://github.com/apache/iceberg/pull/13810) listing views and other objects that depend on the table, making dependency tracking possible at the protocol level.
+- **`scan-planning-mode` in `LoadTableResult`**: [The server can now advertise its preferred scan planning mode](https://github.com/apache/iceberg/pull/14867) in the `LoadTableResult` config, letting clients know upfront whether to use remote or local scan planning without probing.
+- **404 for missing warehouse on config endpoint**: [The `/v1/config` endpoint now returns 404](https://github.com/apache/iceberg/pull/15746) when the requested warehouse does not exist, replacing the previous ambiguous error.
+
+### Performance and Reliability
+
+[LIMIT pushdown to scan](https://github.com/apache/iceberg/pull/14615) stops scanning after enough rows are found when a query includes a `LIMIT` clause, rather than reading all matching files. For exploratory queries, this can reduce I/O by orders of magnitude.
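The I/O saving described in the LIMIT pushdown paragraph can be sketched generically in Python. This is an assumption-laden illustration of the early-termination idea only, not Iceberg's implementation: `open_files` and `scan_with_limit` are hypothetical helpers, and files are modeled as in-memory lists of rows.

```python
from typing import Dict, Iterable, Iterator, List


def open_files(files: Dict[str, List[int]], opened: List[str]) -> Iterator[List[int]]:
    """Lazily 'open' each file, recording its name so reads can be counted."""
    for name, rows in files.items():
        opened.append(name)
        yield rows


def scan_with_limit(file_rows: Iterable[List[int]], limit: int) -> Iterator[int]:
    """Yield at most `limit` rows and stop before touching further files."""
    if limit <= 0:
        return
    emitted = 0
    for rows in file_rows:
        for row in rows:
            yield row
            emitted += 1
            if emitted >= limit:
                # Enough rows found: remaining files are never opened.
                return
```

For example, with three files and `limit=4`, the sketch opens only the first two files because the fourth row is found in the second one; without the limit check every file would be read.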
+
+Vectorized reads now cover additional Parquet encodings, eliminating the row-at-a-time fallback for [BYTE_STREAM_SPLIT](https://github.com/apache/iceberg/pull/15373), [DELTA_LENGTH_BYTE_ARRAY, and DELTA_BYTE_ARRAY](https://github.com/apache/iceberg/pull/15362). This is particularly impactful for scientific and ML datasets using float or double columns with `BYTE_STREAM_SPLIT` encoding.
+
+[Snapshot expiration cleanup modes](https://github.com/apache/iceberg/pull/14287) introduce a new `cleanupMode` API that gives finer control over what gets cleaned up when snapshots expire.
+
+[Unique table locations](https://github.com/apache/iceberg/pull/12892) via a new catalog property append a UUID to table storage paths, preventing a data loss scenario where `DeleteOrphanFiles` could remove files from a renamed table. This also enables per-table storage lifecycle policies and cost attribution.
+
+Scheduled credential refresh for [AWS S3FileIO](https://github.com/apache/iceberg/pull/15678) and [GCS FileIO](https://github.com/apache/iceberg/pull/15696) proactively rotates credentials before they expire, eliminating transient failures in long-running Spark and Flink jobs that outlive their initial credential lease.
+
+The [GCSAnalyticsCore library](https://github.com/apache/iceberg/pull/14333) is now integrated into GCSFileIO, bringing analytics-optimized I/O for Google Cloud Storage. The library improves read throughput for large-scale analytical workloads on GCS, complementing the existing AWS Analytics Accelerator integration on S3.
+
+### Format V4 Foundations
+
+1.11.0 begins laying the groundwork for Table Format V4.
+
+New foundational types — [TrackedFile, TrackingInfo, ContentInfo, and ManifestStats](https://github.com/apache/iceberg/pull/15049) — are the building blocks for V4's adaptive metadata tree. These interfaces define how Iceberg will track files at scale, with [implementations](https://github.com/apache/iceberg/pull/15854), [builders](https://github.com/apache/iceberg/pull/16092), and [partition support](https://github.com/apache/iceberg/pull/16253) being added iteratively across the release cycle.
+
+The new [FormatModel abstraction](https://github.com/apache/iceberg/pull/12774) replaces hardcoded file format handling with a pluggable interface. [Parquet](https://github.com/apache/iceberg/pull/15253), [ORC](https://github.com/apache/iceberg/pull/15255), Avro, and Arrow each now implement a `FormatModel` contract, making it simpler to add new formats or customize read/write behavior.
+
+### Encryption

Review Comment:
I will check more on this. @ggershinsky If you have a text line available, please share it as well.
