liurenjie1024 commented on code in PR #11772: URL: https://github.com/apache/iceberg/pull/11772#discussion_r1884789740
########## site/docs/status.md: ##########
@@ -0,0 +1,362 @@
+---
+title: "Implementation Status"
+---
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one or more
+ - contributor license agreements. See the NOTICE file distributed with
+ - this work for additional information regarding copyright ownership.
+ - The ASF licenses this file to You under the Apache License, Version 2.0
+ - (the "License"); you may not use this file except in compliance with
+ - the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing, software
+ - distributed under the License is distributed on an "AS IS" BASIS,
+ - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ - See the License for the specific language governing permissions and
+ - limitations under the License.
+ -->
+
+# Implementations Status
+
+Apache iceberg now has implementations of the iceberg spec in multiple languages. This page provides a summary of the
+current status of these implementations.
+
+## Versions
+
+This section describes the versions of each implementation that are being tracked in this page.
+
+| Language  | Version |
+|-----------|---------|
+| Java      | 1.7.1   |
+| PyIceberg | 0.7.0   |
+| Rust      | 0.3.0   |
+| Go        | 0.1.0   |
+
+## Data Types
+
+| Data Type      | Java | PyIceberg | Rust | Go |
+|----------------|------|-----------|------|----|
+| boolean        | Y    | Y         | Y    | Y  |
+| int            | Y    | Y         | Y    | Y  |
+| long           | Y    | Y         | Y    | Y  |
+| float          | Y    | Y         | Y    | Y  |
+| double         | Y    | Y         | Y    | Y  |
+| decimal        | Y    | Y         | Y    | Y  |
+| date           | Y    | Y         | Y    | Y  |
+| time           | Y    | Y         | Y    | Y  |
+| timestamp      | Y    | Y         | Y    | Y  |
+| timestamptz    | Y    | Y         | Y    | Y  |
+| timestamp_ns   | Y    | Y         | Y    | Y  |
+| timestamptz_ns | Y    | Y         | Y    | Y  |
+| string         | Y    | Y         | Y    | Y  |
+| uuid           | Y    | Y         | Y    | Y  |
+| fixed          | Y    | Y         | Y    | Y  |
+| binary         | Y    | Y         | Y    | Y  |
+
+## Data File Formats
+
+| Format  | Java | PyIceberg | Rust | Go |
+|---------|------|-----------|------|----|
+| Parquet | Y    | Y         | Y    | Y  |
+| ORC     | Y    | N         | N    | N  |
+| Puffin  | Y    | N         | N    | N  |

Review Comment:
For Puffin, I think we should split the capabilities into different parts:
1. Basic support for the Puffin format, e.g. read/write capability. This is what the file format section means.
2. Planning with Puffin table statistics. This should appear in the table read part.
3. Reading/writing Puffin deletion vectors. This should appear in the table read/write part.

What do you think?

########## site/docs/status.md: ##########
@@ -0,0 +1,362 @@
+
+## File IO
+
+| Storage              | Java | PyIceberg | Rust | Go |
+|----------------------|------|-----------|------|----|
+| Local Filesystem     | Y    | Y         | Y    | Y  |
+| Hadoop Filesystem    | Y    | Y         | Y    | Y  |
+| Aws S3               | Y    | Y         | Y    | Y  |
+| Google Cloud Storage | Y    | Y         | Y    | Y  |
+| Memory Fs            | Y    | Y         | Y    | Y  |
+
+## Table Update Operations
+
+### Table Spec V1
+
+| Operation                   | Java | PyIceberg | Rust | Go |
+|-----------------------------|------|-----------|------|----|
+| Update schema               | Y    | N         | Y    | N  |
+| Update partition spec       | Y    | N         | Y    | N  |
+| Update table properties     | Y    | Y         | Y    | N  |
+| Replace sort order          | Y    | N         | N    | N  |
+| Update table location       | Y    | N         | N    | N  |
+| Append data files           | Y    | Y         | N    | N  |
+| Rewrite files               | Y    | Y         | N    | N  |
+| Rewrite manifests           | Y    | Y         | N    | N  |
+| Overwrite files             | Y    | Y         | N    | N  |
+| Row delta                   | Y    | N         | N    | N  |
+| Delete files                | Y    | N         | N    | N  |
+| Update statistics           | Y    | N         | N    | N  |
+| Update partition statistics | Y    | N         | N    | N  |
+| Expire snapshots            | Y    | N         | N    | N  |
+| Manage snapshots            | Y    | N         | N    | N  |
+
+### Table Spec V2
+

Review Comment:
Do you mean splitting the table update operations into two parts?
1. Data-related operations, such as append data files, delete files, row delta, etc.
2. Table maintenance operations, such as update statistics, manage snapshots, etc.

########## site/docs/status.md: ##########
@@ -0,0 +1,362 @@
+| Operation                   | Java | PyIceberg | Rust | Go |
+|-----------------------------|------|-----------|------|----|
+| Update schema               | Y    | Y         | N    | N  |
+| Update partition spec       | Y    | Y         | N    | N  |
+| Update table properties     | Y    | Y         | Y    | N  |
+| Replace sort order          | Y    | N         | N    | N  |
+| Update table location       | Y    | N         | N    | N  |
+| Append data files           | Y    | Y         | N    | N  |
+| Rewrite files               | Y    | Y         | N    | N  |
+| Rewrite manifests           | Y    | Y         | N    | N  |
+| Overwrite files             | Y    | Y         | N    | N  |
+| Row delta                   | Y    | N         | N    | N  |
+| Delete files                | Y    | Y         | N    | N  |
+| Update statistics           | Y    | N         | N    | N  |
+| Update partition statistics | Y    | N         | N    | N  |
+| Expire snapshots            | Y    | N         | N    | N  |
+| Manage snapshots            | Y    | N         | N    | N  |
+

Review Comment:
I'm hesitant to add this detail here. As the Iceberg spec only supports the serializable isolation level, do we really need to mention it? I mean, a database usually mentions the isolation levels it supports only when it supports several of them, such as snapshot isolation, repeatable read, and serializable, whereas Iceberg only supports serializable, by using retries.

As for the other parts, such as version history, that is more a feature of time travel. Maybe we should add features like incremental planning, incremental read, and time travel to the table read part?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.