Re: [I] Dectect schema evolution or partition evolution for append DataFile [iceberg-rust]

via GitHub Thu, 12 Dec 2024 00:47:24 -0800


Fokko commented on issue #777:
URL: https://github.com/apache/iceberg-rust/issues/777#issuecomment-2538223320

This is a very interesting question that I'm happy to elaborate on.

> But there are some case it can't detect for this way, e.g. partition spec
type <p1: int, p2: int> reorder to <p2: int, p1: int>

This is true for V1 tables, where the field-IDs are omitted and not written
to the files. Therefore, for V1 tables there are [special
rules](https://iceberg.apache.org/spec/?column-projection#partition-evolution):

![image](https://github.com/user-attachments/assets/12c756be-de3a-4e87-a23b-39953c6c15be)

There is an issue to create a dedicated API to evolve the partition spec that
enforces these rules for V1 tables:
https://github.com/apache/iceberg-rust/issues/732. I think this would be a
great thing to have, since violating these rules might brick the table or,
even worse, cause data corruption.

For V2 tables, we use field-ID projection, where the reader will read and
project the files correctly into the structs based on the field-IDs. This
allows for re-ordering: when reading the files, the values will be read into
the correct position of the struct. The write order of fields doesn't make any
difference for V2, as they will be re-ordered on read. Of course, I would
suggest keeping the same order as the partition spec, to keep everyone sane.
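
To make the V2 behaviour concrete, here is a minimal sketch of projecting partition values by field-ID. The `PartitionField` struct and `project_by_field_id` function are hypothetical stand-ins for illustration, not iceberg-rust APIs, and partition values are simplified to `i64`:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a partition field (not the iceberg-rust type).
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionField {
    pub field_id: i32,
    pub name: String,
}

/// Reorders values written in `file_fields` order into `spec_fields` order
/// by matching on field-IDs. A spec field that is missing from the file is
/// projected as `None`.
pub fn project_by_field_id(
    file_fields: &[PartitionField],
    file_values: &[i64],
    spec_fields: &[PartitionField],
) -> Vec<Option<i64>> {
    // Index the values that were written to the file by their field-ID.
    let by_id: HashMap<i32, i64> = file_fields
        .iter()
        .zip(file_values.iter())
        .map(|(field, value)| (field.field_id, *value))
        .collect();

    // Emit the values in the order the partition spec expects.
    spec_fields
        .iter()
        .map(|field| by_id.get(&field.field_id).copied())
        .collect()
}
```

With a file written as `<p2, p1>` and a spec of `<p1, p2>`, the values end up in the spec's order because only the IDs are compared, never the position or the name.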

> Ensure that the partition value schema matches the existing partition spec
in terms of field name or field id.

This ties in with a discussion I had early this week with @c-thiel that
resulted in https://github.com/apache/iceberg-rust/pull/771.

My suggestion was to make the field-ID required regardless of the version
(see https://github.com/apache/iceberg-rust/pull/763). This is safe to do if we
adhere to the limitations of partition evolution mentioned above. When reading
V1 tables, we can sequentially add the IDs to each of the partition specs,
starting at 1000: `<1000 p1: int, 1001 p2: int>`. This way we can fully rely on
the field-IDs, instead of the order, for V1. We can **never** match these on
names. Also keep in mind that this simplifies the case where someone has a V1
table, writes a couple of petabytes, and then upgrades it to a V2 table. We
still have to correctly handle the old V1 DataFiles, since they are not
rewritten.
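
As a rough sketch of the sequential assignment described above (the constant and function names are made up for this example and are not taken from iceberg-rust):

```rust
/// First ID handed out to partition fields, as described above.
pub const FIRST_PARTITION_FIELD_ID: i32 = 1000;

/// Assigns field-IDs to the partition fields of a V1 spec in their declared
/// order: the first field gets 1000, the second 1001, and so on.
/// Hypothetical helper: fields are represented by their names only.
pub fn assign_sequential_partition_ids(field_names: &[&str]) -> Vec<(i32, String)> {
    field_names
        .iter()
        .enumerate()
        .map(|(pos, name)| (FIRST_PARTITION_FIELD_ID + pos as i32, name.to_string()))
        .collect()
}
```

This only yields stable IDs as long as the V1 rules above are respected, i.e. fields are never reordered or physically removed from the spec.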

---

> The partition in DataFile should include types to facilitate validation.
e.g. the field name and field id

I think that's a great thing to do anyway. It isn't super expensive, and
will avoid folks bricking their table. Preferably we validate by field-ID for
both V1 and V2; otherwise by order for V1 and by field-ID for V2.
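
A minimal sketch of what such a check could look like, assuming the partition value carries an optional field-ID and a type. All types and names here are hypothetical, not the iceberg-rust API:

```rust
/// Simplified primitive types, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PrimType {
    Int,
    Long,
    Str,
}

/// A field of the table's partition spec (hypothetical stand-in).
#[derive(Debug, Clone, PartialEq)]
pub struct SpecField {
    pub field_id: i32,
    pub ty: PrimType,
}

/// A field of the partition value attached to a DataFile. The ID is `None`
/// when it was not recorded, in which case we fall back to position.
#[derive(Debug, Clone, PartialEq)]
pub struct ValueField {
    pub field_id: Option<i32>,
    pub ty: PrimType,
}

/// Validates partition values against the spec: by field-ID when an ID is
/// available, otherwise by position.
pub fn validate_partition(spec: &[SpecField], values: &[ValueField]) -> Result<(), String> {
    if spec.len() != values.len() {
        return Err(format!(
            "expected {} partition fields, got {}",
            spec.len(),
            values.len()
        ));
    }
    for (pos, value) in values.iter().enumerate() {
        let expected = match value.field_id {
            // Field-ID match: the position in the struct is irrelevant.
            Some(id) => spec
                .iter()
                .find(|field| field.field_id == id)
                .ok_or_else(|| format!("unknown partition field id {}", id))?,
            // No ID available: fall back to the position.
            None => &spec[pos],
        };
        if expected.ty != value.ty {
            return Err(format!(
                "partition field at position {} has type {:?}, expected {:?}",
                pos, value.ty, expected.ty
            ));
        }
    }
    Ok(())
}
```

Note that the positional fallback cannot catch the `<p1: int, p2: int>` to `<p2: int, p1: int>` reorder from the question, since both types match. That is exactly why relying on field-IDs is preferable.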

> Append operations need to add validation checks for scheme evolution:
lower bounds, upper_bound.

I'm not sure if I fully understand this one. We know the type in the file,
and we know what to project to. Iceberg currently has a [fairly limited set of
promotions](https://iceberg.apache.org/spec/?column-projection#schema-evolution).
This is because we encode the upper and lower bounds into binary. Based on the
number of bytes, we can safely determine whether a bound is a float (4 bytes)
or a double (8 bytes), and whether we need to promote the type based on the
current schema.
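
For illustration, a small sketch of decoding a bound for a column whose current type is `double`, using the byte length to tell a float from a double. It assumes the little-endian single-value encoding from the spec; the function name is hypothetical:

```rust
/// Decodes a serialized lower or upper bound for a column that is `double`
/// in the current schema. A 4-byte bound was written while the column was
/// still a `float` and is promoted on read. Any other length is rejected.
pub fn decode_bound_as_double(bytes: &[u8]) -> Option<f64> {
    match bytes.len() {
        4 => {
            let mut raw = [0u8; 4];
            raw.copy_from_slice(bytes);
            // float -> double is a lossless widening.
            Some(f64::from(f32::from_le_bytes(raw)))
        }
        8 => {
            let mut raw = [0u8; 8];
            raw.copy_from_slice(bytes);
            Some(f64::from_le_bytes(raw))
        }
        _ => None,
    }
}
```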

We can do some cool stuff here. For example, if you query `id >= 2^31+1`,
then we know that the literal doesn't fit into an int field. If you have
promoted the `id` column over time, then we can skip the files that were
written while it was still an int, based on the schema alone :) In
PyIceberg/Java we have the
[AboveMax/BelowMin](https://github.com/apache/iceberg-python/blob/547d881948dfe17c92bdde9e5b63a94d095a110d/pyiceberg/expressions/literals.py#L152-L169)
literals to indicate this. This will be done when we [bind the evaluator to the
schema](https://github.com/apache/iceberg-python/blob/547d881948dfe17c92bdde9e5b63a94d095a110d/pyiceberg/expressions/__init__.py#L674-L687).
Looping in @sdd since he did a lot on this part 🥳
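
A rough Rust sketch of the AboveMax/BelowMin idea, loosely modelled on the PyIceberg behaviour linked above. The enum and function names are invented for this example and are not part of iceberg-rust:

```rust
/// Result of converting a long literal to the int type of a file's column.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum IntLiteral {
    /// The literal is smaller than any value an int column can hold.
    BelowMin,
    /// The literal fits into an int.
    Value(i32),
    /// The literal is larger than any value an int column can hold.
    AboveMax,
}

/// Converts a long literal to an int literal, flagging out-of-range values
/// instead of overflowing.
pub fn long_to_int_literal(value: i64) -> IntLiteral {
    if value > i64::from(i32::MAX) {
        IntLiteral::AboveMax
    } else if value < i64::from(i32::MIN) {
        IntLiteral::BelowMin
    } else {
        IntLiteral::Value(value as i32)
    }
}

/// Returns whether `col >= literal` can match any row of a file whose
/// column was written as an int. `AboveMax` can never match, so the file
/// can be skipped without looking at its bounds.
pub fn gt_eq_can_match_int_column(literal: IntLiteral) -> bool {
    !matches!(literal, IntLiteral::AboveMax)
}
```

For the `id >= 2^31+1` example, the literal converts to `AboveMax`, so files written while `id` was still an int are pruned during binding.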

I know that this is a lot of text. I hope this helps, and I'm always happy to
elaborate.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
