Iceberg Community Meetings are open to everyone. To receive an invitation to
the next meeting, please join the [email protected]
<https://groups.google.com/g/iceberg-sync> list. Special thanks to Ryan Blue
for contributing most of these notes.

Attendees: Anjali Norwood, Badrul Chowdhury, Ben Mears, Dan Weeks, Gustavo
Torres Torres, Jack Ye, Karuppayya Rajendran, Kyle Bendickson, Parth
Brahmbhatt, Russell Spitzer, Ryan Blue, Sreeram Garlapati, Szehon Ho, Wing
Yew Poon, Xinbin Huang, Yan Yan, Carl Steinbach
- Highlights
  - JDBC catalog was committed (Thanks, Ismail!)
  - DynamoDB catalog was committed (Thanks, Jack!)
  - Added predicate pushdown for the partitions metadata table (Thanks, Szehon!)
- Releases
  - 0.12.0
    - New Actions API update
      - Almost done with compaction.
      - Need to make the old API deprecated (to confirm).
    - Spark 3.1 support
      - Recently rebased on master: https://github.com/apache/iceberg/pull/2512
      - No longer adds new modules, should be ready to commit.
- Feature-based or time-based release cycle?
  - Carl: A time-based release cycle would be more predictable and would not
    slip because of some feature that isn't quite ready. This could be
    monthly or quarterly.
  - Ryan: We already try not to hold back releases to get features in,
    because it is better to release more often than to let releases slip.
    But we could be better about this. It's important to release
    continuously so that changes get back out to contributors.
  - The consensus was to discuss this on the dev list. It is a promising idea.
- Iceberg 1.0?
  - Carl: Semver is a lie, and there is a public perception around 1.0
    releases. Should we go ahead and target a 1.0 soon?
  - Ryan: What do you mean that semver is a lie?
  - Carl: If semver were followed carefully, most projects would be on a
    major version in the 100s. Many things change, and the version doesn't
    always reflect it.
  - Ryan: That's fair, but I think people still make downstream decisions
    based on how those version numbers change.
  - Jack: There is an expectation that breaking changes are signaled by
    increasing the major version, or more accurately, that not increasing
    the major version indicates no major APIs are broken.
  - Ryan: Also, bumping up to 1.0 is when people start expecting more rigid
    enforcement of semver, even if it isn't always done. If we want to
    update to 1.0 and/or drop semver, we should figure out our guarantees
    and document them clearly. And we should also prepare for more API
    stability. Maybe add binary compatibility checks to the build.
  - The consensus was to discuss this more on the dev list and target a 1.0
    for later this year with clear guidelines about API compatibility.
- New slack community: apache-iceberg.slack.com
  <https://communityinviter.com/apps/apache-iceberg/apache-iceberg-website>
  - It's easy to sign up for ASF Slack here: https://s.apache.org/slack-invite
  - No need for an independent Iceberg workspace.
- Any updates on the secondary index design?
  - Miao and Guy weren't at the meeting, so no update.
  - Jack is going to look into this and help out.
- Github triage permissions for project contributors
  - Carl opened an INFRA ticket for anyone with 2 or more contributions.
  - We will see if Infra can add everyone.
  - Ref: INFRA-22026, INFRA-22031
- Updating partitioning via Optimize/RewriteDataFiles
  - Russell: We ran into an issue where compaction with multiple partition
    specs will create many small files: planning groups files by the
    current spec, but writing can split data for the new spec. Since this
    is a rare event (unmerged data in an old spec), the solution is to
    merge files for the old spec separately.
  - Ryan: Sounds reasonable.
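The fix Russell describes can be sketched as grouping data files by the
partition spec they were written with, so each spec's files are compacted
separately. This is an illustrative sketch only; the types and names below
are hypothetical, not Iceberg's actual planning API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for a data file tagged with the partition spec
// it was written under.
record DataFile(int specId, String path) {}

class CompactionPlanner {
    // Group files by partition spec ID so files written under an old
    // spec are merged with each other, rather than being rewritten into
    // many small files under the current spec.
    static Map<Integer, List<DataFile>> groupBySpec(List<DataFile> files) {
        return files.stream().collect(Collectors.groupingBy(DataFile::specId));
    }
}
```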
- Low-latency streaming
  - Sreeram: We are trying to see how frequently we can commit to an
    Iceberg table, looking to get to commits every 1-2 seconds. One main
    issue we've found is that several metadata files are written for every
    commit: at least one manifest, the manifest list, and the metadata JSON
    file. Plus, the metadata JSON file tracks many snapshots and gets quite
    large (3MB+) after a day of frequent commits. Is there a way to improve
    how the JSON file tracks snapshots?
  - Ryan: There is space to improve this. I've thought about replacing the
    JSON file with a database so that changes are more targeted and don't
    require rewriting all of the information. This is supported by the
    TableOperations API, which swaps TableMetadata objects. The JSON file
    isn't really required by the implementation, although it has become
    popular because it places all of the table metadata in the file system,
    so the source of truth is entirely in the table's files.
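The swap Ryan mentions can be illustrated with a simplified, self-contained
sketch. These are not the real Iceberg interfaces; the names and shapes
below are stand-ins that show only the optimistic metadata swap that a JSON
file, or a database row, would back.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for Iceberg's TableMetadata.
record TableMetadata(int version, String location) {}

class InMemoryTableOperations {
    private final AtomicReference<TableMetadata> current =
        new AtomicReference<>(new TableMetadata(0, "v0.json"));

    TableMetadata current() { return current.get(); }

    // A commit succeeds only if the caller's base is still the current
    // metadata: the same compare-and-swap contract whether the pointer
    // lives in a JSON file or a database.
    boolean commit(TableMetadata base, TableMetadata updated) {
        return current.compareAndSet(base, updated);
    }
}
```

A writer that loses the race gets a failed commit and must refresh and
retry against the new current metadata.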
  - Sreeram: What about writing diffs of the JSON files? We could, for
    example, write a new snapshot as the only content in a new JSON file.
  - Ryan: You could come up with a way to do that, but what you'd want to
    avoid is needing to read lots of files to reconstruct the table's
    current state. If you're trying to put together the history or
    snapshots metadata tables, you don't want to read the current file, its
    parent, that file's parent, and so on. (That's an easy design flaw to
    fall into.) What you should do instead is choose a base version and
    write all differences against that. We'd need to define the format for
    JSON diffs.
  - Ryan: And I think it may be more useful to replace the JSON file with a
    database, because writing diffs could introduce more commit conflicts:
    when the JSON file is periodically rewritten entirely to produce a new
    base version, that operation may fail due to faster commits from other
    writers. That would be bad for a table.
  - Ryan: What is the use case for this? 1-2 seconds is very frequent and
    causes other issues, like small data files that need to be compacted,
    plus compaction commit retries because of frequent, ongoing commits.
  - Sreeram: The idea is to see if we can replace Kafka with an Iceberg
    table in workflows.
  - Ryan: I don't think that's something you'd want to do. Iceberg just
    isn't designed for that kind of use case, and that is what Kafka does
    really well.
  - Kyle: Yeah, you'd definitely want to use Kafka for that. Iceberg is
    good for long-term storage and isn't a good replacement.
- Purge Behaviors
  - Russell: Spark's new API passes a purge flag through DROP TABLE. Do we
    want to respect that flag?
  - Ryan: Yes? Why wouldn't we?
  - Russell: Not everyone wants to purge data.
  - Ryan: Agreed. Netflix wouldn't do this because they often have to
    restore tables, but that's something that Netflix can turn off in their
    catalog. For the built-in catalogs, we should probably support the
    expected behavior.
- Deduplication as part of rewrite?
  - Kyle [I think]: What is the story around deduplication? Duplicate
    records are a common problem.
  - Ryan: Iceberg didn't have one before, but now that we have a way to
    identify records, thanks to Jack adding the row identifier fields, we
    could build something in this space. Maybe a background service that
    detects duplicates and rewrites? But we would want to be careful here,
    because it could easily attempt to read an entire table if the
    partition spec is not aligned with the identifier fields.
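The core of such a service would be deduplication keyed on the row
identifier fields. A minimal sketch, with hypothetical record shapes (not
Iceberg APIs), assuming rows arrive ordered by write time so the latest
copy per identifier wins:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row with an identifier field and a write timestamp.
record Row(String id, long writtenAt, String payload) {}

class Deduplicator {
    // Keep only the most recent row per identifier. Overwriting the map
    // entry keeps the later row because input is ordered by write time.
    static List<Row> dedupe(List<Row> rows) {
        Map<String, Row> latest = new LinkedHashMap<>();
        for (Row r : rows) {
            latest.put(r.id(), r);
        }
        return List.copyOf(latest.values());
    }
}
```

The caution from the meeting applies here: doing this efficiently requires
scoping the scan, since duplicates can live in any partition when the
partition spec is not aligned with the identifier fields.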