date:20231005

Re: [I] S3 compression Issue with Iceberg [iceberg]

2023-10-05 Thread via GitHub



swat1234 commented on issue #8713:
URL: https://github.com/apache/iceberg/issues/8713#issuecomment-1748235072

   I have tried with huge data. Below are the outcomes. 
   
   1. File with UNCOMPRESSED codec. - 1.8GB
   
   2. File with gzip codec - 1.8GB
   3. File with code snappy codec - 2.8GB
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: Add partition stats spec [iceberg]

2023-10-05 Thread via GitHub



ajantha-bhat commented on PR #7105:
URL: https://github.com/apache/iceberg/pull/7105#issuecomment-1748239349

   @rdblue: Can you also please take a look?  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Include summary without 'operation' when comparing view versions [iceberg]

2023-10-05 Thread via GitHub



nastra commented on code in PR #8678:
URL: https://github.com/apache/iceberg/pull/8678#discussion_r1346975335


##
core/src/main/java/org/apache/iceberg/view/ViewMetadata.java:
##
@@ -321,14 +321,20 @@ private int reuseOrCreateNewViewVersionId(ViewVersion 
viewVersion) {
 
 /**
  * Checks whether the given view versions would behave the same while 
ignoring the view version
- * id, the creation timestamp, and the summary.
+ * id, the creation timestamp, and the operation.
  *
  * @param one the view version to compare
  * @param two the view version to compare
  * @return true if the given view versions would behave the same
  */
 private boolean sameViewVersion(ViewVersion one, ViewVersion two) {
-  return Objects.equals(one.representations(), two.representations())
+  // ignore the "operation" contained in the summary for the comparison
+  Map summaryOne = Maps.newHashMap(one.summary());
+  Map summaryTwo = Maps.newHashMap(two.summary());
+  summaryOne.remove("operation");
+  summaryTwo.remove("operation");
+  return Objects.equals(summaryOne, summaryTwo)
+  && Objects.equals(one.representations(), two.representations())

Review Comment:
   if the SQL is different, then both view versions will be treated separately 
and won't be de-duplicated. But if a create+replace operation creates a view 
version with the same SQL, then this is de-duplicated and only one of those 
remains.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Need to support AWS storage class [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on issue #8153:
URL: https://github.com/apache/iceberg/issues/8153#issuecomment-1748352603

   Was going through issues, looks like this is already fixed by #8154 . 
Closing this issue


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Need to support AWS storage class [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar closed issue #8153: Need to support AWS storage class
URL: https://github.com/apache/iceberg/issues/8153


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add logic to generate a new snapshot-id [iceberg-python]

2023-10-05 Thread via GitHub



Fokko commented on PR #37:
URL: https://github.com/apache/iceberg-python/pull/37#issuecomment-1748353368

   In Python, you cannot xor on ints/longs (probably because they are also 
unbound in terms of size :), so I had to deviate a bit there. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add logic to generate a new snapshot-id [iceberg-python]

2023-10-05 Thread via GitHub



Fokko commented on code in PR #37:
URL: https://github.com/apache/iceberg-python/pull/37#discussion_r1346999502


##
pyiceberg/table/__init__.py:
##
@@ -1566,3 +1575,15 @@ def _add_and_move_fields(
 elif len(moves) > 0:
 return _move_fields(fields, moves)
 return None if len(adds) == 0 else tuple(*fields, *adds)
+
+
+def _generate_snapshot_id() -> int:
+"""Generate a new Snapshot ID from a UUID.
+
+Right shifting the 64 bits removes the MAC address and time
+leaving only the part that's based on the clock (and has the
+highest entropy).
+
+Returns: An 64 bit long
+"""
+return uuid.uuid4().int & (1 << 64) - 1

Review Comment:
   It was because I read unsigned. Fixed that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add logic to generate a new snapshot-id [iceberg-python]

2023-10-05 Thread via GitHub



Fokko merged PR #37:
URL: https://github.com/apache/iceberg-python/pull/37


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



jbonofre opened a new pull request, #8722:
URL: https://github.com/apache/iceberg/pull/8722

   This PR rename resources using `master` to `main`. This PR should be merged 
only when the renaming has been done by Apache INRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Upsert support for keyless Apache Flink tables [iceberg]

2023-10-05 Thread via GitHub



pvary commented on issue #8719:
URL: https://github.com/apache/iceberg/issues/8719#issuecomment-1748445456

   Have you tried turning off the upsert mode?
   https://iceberg.apache.org/docs/latest/flink-configuration/#write-options
   ```
   upsert-enabled | Table write.upsert.enabled | Overrides this table’s 
write.upsert.enabled
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Null support in Apache Flink [iceberg]

2023-10-05 Thread via GitHub



pvary commented on issue #8720:
URL: https://github.com/apache/iceberg/issues/8720#issuecomment-1748453178

   I would guess the `NULL` handling becomes non-trivial, if you start 
executing queries filtering for the fields containing `NULL` values.
   
   What is the results / expected result for:
   ```
   SELECT * FROM word_count WHERE cnt = NULL'
   SELECT * FROM word_count WHERE cnt is NULL;
   SELECT * FROM word_count WHERE cnt <> NULL;
   SELECT * FROM word_count w1, word_count w2 WHERE w1.cnt = w2.cnt;
   ```
   
   These are things which might return non-excepted results, or handled 
differently in Iceberg scans and Flink SQL.
   Ot it could be that simply NULL handling has not been tested/documented yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



nastra commented on code in PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#discussion_r1347081505


##
README.md:
##
@@ -36,9 +36,9 @@ The core Java library that tracks table snapshots and 
metadata is complete, but
 
 The [Iceberg format specification][iceberg-spec] is being actively updated and 
is open for comment. Until the specification is complete and released, it 
carries no compatibility guarantees. The spec is currently evolving as the Java 
reference implementation changes.
 
-[Java API javadocs][iceberg-javadocs] are available for the master.
+[Java API javadocs][iceberg-javadocs] are available for the main branch.
 
-[iceberg-javadocs]: https://iceberg.apache.org/javadoc/master
+[iceberg-javadocs]: https://iceberg.apache.org/javadoc/main

Review Comment:
   will this new link work after the rename?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



jbonofre commented on code in PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#discussion_r1347085171


##
README.md:
##
@@ -36,9 +36,9 @@ The core Java library that tracks table snapshots and 
metadata is complete, but
 
 The [Iceberg format specification][iceberg-spec] is being actively updated and 
is open for comment. Until the specification is complete and released, it 
carries no compatibility guarantees. The spec is currently evolving as the Java 
reference implementation changes.
 
-[Java API javadocs][iceberg-javadocs] are available for the master.
+[Java API javadocs][iceberg-javadocs] are available for the main branch.
 
-[iceberg-javadocs]: https://iceberg.apache.org/javadoc/master
+[iceberg-javadocs]: https://iceberg.apache.org/javadoc/main

Review Comment:
   It should if we update website repo. I will do another PR for website and 
docs. Let me check 😃 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] OpenAPI: Add AssignUUID update to metadata updates [iceberg]

2023-10-05 Thread via GitHub



nastra merged PR #8716:
URL: https://github.com/apache/iceberg/pull/8716


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] OpenAPI: Add AssignUUID update to metadata updates [iceberg]

2023-10-05 Thread via GitHub



nastra commented on PR #8716:
URL: https://github.com/apache/iceberg/pull/8716#issuecomment-1748467044

   thanks for reviewing @Fokko 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



ajantha-bhat commented on code in PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#discussion_r1347108071


##
docs/metrics-reporting.md:
##
@@ -171,4 +171,4 @@ TableScan tableScan =
 try (CloseableIterable fileScanTasks = tableScan.planFiles()) {
   // ...
 }
-```
\ No newline at end of file
+```

Review Comment:
   same here.



##
docs/flink-writes.md:
##
@@ -270,4 +270,4 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') 
*/
 ...
 ```
 
-Check out all the options here: 
[write-options](/flink-configuration#write-options) 

Review Comment:
   nit: why these are shown as modified? Shall we revert the unrelated changed?



##
.github/workflows/spark-ci.yml:
##
@@ -143,4 +143,4 @@ jobs:
 with:
   name: test logs
   path: |
-**/build/testlogs
\ No newline at end of file
+**/build/testlogs

Review Comment:
   nit: unrelated modification?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



jbonofre commented on code in PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#discussion_r1347144486


##
docs/flink-writes.md:
##
@@ -270,4 +270,4 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') 
*/
 ...
 ```
 
-Check out all the options here: 
[write-options](/flink-configuration#write-options) 

Review Comment:
   Weird. Let me check and clean this. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



nastra opened a new pull request, #8723:
URL: https://github.com/apache/iceberg/pull/8723

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



bitsondatadev commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1347232147


##
docs-new/site/releases.md:
##
@@ -0,0 +1,777 @@
+---
+title: "Releases"
+---
+
+
+## Downloads
+
+The latest version of Iceberg is [{{ icebergVersion 
}}](https://github.com/apache/iceberg/releases/tag/apache-iceberg-{{ 
icebergVersion }}).
+
+* [{{ icebergVersion }} source 
tar.gz](https://www.apache.org/dyn/closer.cgi/iceberg/apache-iceberg-{{ 
icebergVersion }}/apache-iceberg-{{ icebergVersion }}.tar.gz) -- 
[signature](https://downloads.apache.org/iceberg/apache-iceberg-{{ 
icebergVersion }}/apache-iceberg-{{ icebergVersion }}.tar.gz.asc) -- 
[sha512](https://downloads.apache.org/iceberg/apache-iceberg-{{ icebergVersion 
}}/apache-iceberg-{{ icebergVersion }}.tar.gz.sha512)
+* [{{ icebergVersion }} Spark 3.4\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.4_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.4_2.12-{{ icebergVersion }}.jar) -- 
[3.4\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.4_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.4_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.3\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.3_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.3_2.12-{{ icebergVersion }}.jar) -- 
[3.3\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.3_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.3_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.2\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.2_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.2_2.12-{{ icebergVersion }}.jar) -- 
[3.2\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.2_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.2_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.1 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.1_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.1_2.12-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.17 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.17/{{
 icebergVersion }}/iceberg-flink-runtime-1.17-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.16 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.16/{{
 icebergVersion }}/iceberg-flink-runtime-1.16-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.15 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.15/{{
 icebergVersion }}/iceberg-flink-runtime-1.15-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Hive runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-hive-runtime/{{
 icebergVersion }}/iceberg-hive-runtime-{{ icebergVersion }}.jar)
+
+To use Iceberg in Spark or Flink, download the runtime JAR for your engine 
version and add it to the jars folder of your installation.
+
+To use Iceberg in Hive 2 or Hive 3, download the Hive runtime JAR and add it 
to Hive using `ADD JAR`.
+
+### Gradle
+
+To add a dependency on Iceberg in Gradle, add the following to `build.gradle`:
+
+```
+dependencies {
+  compile 'org.apache.iceberg:iceberg-core:{{ icebergVersion }}'
+}
+```
+
+You may also want to include `iceberg-parquet` for Parquet file support.
+
+### Maven
+
+To add a dependency on Iceberg in Maven, add the following to your `pom.xml`:
+
+```
+
+  ...
+  
+org.apache.iceberg
+iceberg-core
+{{ icebergVersion }}
+  
+  ...
+
+```
+
+## 1.3.1 release
+
+Apache Iceberg 1.3.1 was released on July 25, 2023.
+The 1.3.1 release addresses various issues identified in the 1.3.0 release.
+
+* Core
+  - Table Metadata parser now accepts null for fields: current-snapshot-id, 
properties, and snapshots 
([\#8064](https://github.com/apache/iceberg/pull/8064))
+* Hive
+  - Fix HiveCatalog deleting metadata on failures in checking lock status 
([\#7931](https://github.com/apache/iceberg/pull/7931))
+* Spark
+  - Fix RewritePositionDeleteFiles failure for certain partition types 
([\#8059](https://github.com/apache/iceberg/pull/8059))
+  - Fix RewriteDataFiles concurrency edge-case on commit timeouts 
([\#7933](https://github.com/apache/iceberg/pull/7933))
+  - Fix partition-level DELETE operations for WAP branches 
([\#7900](https://github.com/apache/iceberg/pull/7900))
+* Flink
+  - FlinkCatalog creation no longer creates the default database 
([\#8039](https://github.com/apache/iceberg/pull/8039))
+
+## Past releases
+
+### 1.3.0 release
+
+Apache Iceberg 1.3.0 was released on May 30th, 2023.
+The 1.3.0 release adds a va

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



bitsondatadev commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1347234000


##
docs-new/site/releases.md:
##
@@ -0,0 +1,777 @@
+---
+title: "Releases"
+---
+
+
+## Downloads
+
+The latest version of Iceberg is [{{ icebergVersion 
}}](https://github.com/apache/iceberg/releases/tag/apache-iceberg-{{ 
icebergVersion }}).
+
+* [{{ icebergVersion }} source 
tar.gz](https://www.apache.org/dyn/closer.cgi/iceberg/apache-iceberg-{{ 
icebergVersion }}/apache-iceberg-{{ icebergVersion }}.tar.gz) -- 
[signature](https://downloads.apache.org/iceberg/apache-iceberg-{{ 
icebergVersion }}/apache-iceberg-{{ icebergVersion }}.tar.gz.asc) -- 
[sha512](https://downloads.apache.org/iceberg/apache-iceberg-{{ icebergVersion 
}}/apache-iceberg-{{ icebergVersion }}.tar.gz.sha512)
+* [{{ icebergVersion }} Spark 3.4\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.4_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.4_2.12-{{ icebergVersion }}.jar) -- 
[3.4\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.4_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.4_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.3\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.3_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.3_2.12-{{ icebergVersion }}.jar) -- 
[3.3\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.3_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.3_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.2\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.2_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.2_2.12-{{ icebergVersion }}.jar) -- 
[3.2\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.2_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.2_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.1 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.1_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.1_2.12-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.17 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.17/{{
 icebergVersion }}/iceberg-flink-runtime-1.17-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.16 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.16/{{
 icebergVersion }}/iceberg-flink-runtime-1.16-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.15 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.15/{{
 icebergVersion }}/iceberg-flink-runtime-1.15-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Hive runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-hive-runtime/{{
 icebergVersion }}/iceberg-hive-runtime-{{ icebergVersion }}.jar)
+
+To use Iceberg in Spark or Flink, download the runtime JAR for your engine 
version and add it to the jars folder of your installation.
+
+To use Iceberg in Hive 2 or Hive 3, download the Hive runtime JAR and add it 
to Hive using `ADD JAR`.
+
+### Gradle
+
+To add a dependency on Iceberg in Gradle, add the following to `build.gradle`:
+
+```
+dependencies {
+  compile 'org.apache.iceberg:iceberg-core:{{ icebergVersion }}'
+}
+```
+
+You may also want to include `iceberg-parquet` for Parquet file support.
+
+### Maven
+
+To add a dependency on Iceberg in Maven, add the following to your `pom.xml`:
+
+```
+
+  ...
+  
+org.apache.iceberg
+iceberg-core
+{{ icebergVersion }}
+  
+  ...
+
+```
+
+## 1.3.1 release
+
+Apache Iceberg 1.3.1 was released on July 25, 2023.
+The 1.3.1 release addresses various issues identified in the 1.3.0 release.
+
+* Core
+  - Table Metadata parser now accepts null for fields: current-snapshot-id, 
properties, and snapshots 
([\#8064](https://github.com/apache/iceberg/pull/8064))
+* Hive
+  - Fix HiveCatalog deleting metadata on failures in checking lock status 
([\#7931](https://github.com/apache/iceberg/pull/7931))
+* Spark
+  - Fix RewritePositionDeleteFiles failure for certain partition types 
([\#8059](https://github.com/apache/iceberg/pull/8059))
+  - Fix RewriteDataFiles concurrency edge-case on commit timeouts 
([\#7933](https://github.com/apache/iceberg/pull/7933))
+  - Fix partition-level DELETE operations for WAP branches 
([\#7900](https://github.com/apache/iceberg/pull/7900))
+* Flink
+  - FlinkCatalog creation no longer creates the default database 
([\#8039](https://github.com/apache/iceberg/pull/8039))
+
+## Past releases
+
+### 1.3.0 release
+
+Apache Iceberg 1.3.0 was released on May 30th, 2023.
+The 1.3.0 release adds a va

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



ajantha-bhat commented on code in PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#discussion_r1347234482


##
docs/flink-writes.md:
##
@@ -270,4 +270,4 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') 
*/
 ...
 ```
 
-Check out all the options here: 
[write-options](/flink-configuration#write-options) 

Review Comment:
   Sometimes I have observed that editing .md files from browser will cause 
this kind of issues. So, I use IDE only for doing changes now to avoid these 
type of changes. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



bitsondatadev commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1347234000


##
docs-new/site/releases.md:
##
@@ -0,0 +1,777 @@
+---
+title: "Releases"
+---
+
+
+## Downloads
+
+The latest version of Iceberg is [{{ icebergVersion 
}}](https://github.com/apache/iceberg/releases/tag/apache-iceberg-{{ 
icebergVersion }}).
+
+* [{{ icebergVersion }} source 
tar.gz](https://www.apache.org/dyn/closer.cgi/iceberg/apache-iceberg-{{ 
icebergVersion }}/apache-iceberg-{{ icebergVersion }}.tar.gz) -- 
[signature](https://downloads.apache.org/iceberg/apache-iceberg-{{ 
icebergVersion }}/apache-iceberg-{{ icebergVersion }}.tar.gz.asc) -- 
[sha512](https://downloads.apache.org/iceberg/apache-iceberg-{{ icebergVersion 
}}/apache-iceberg-{{ icebergVersion }}.tar.gz.sha512)
+* [{{ icebergVersion }} Spark 3.4\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.4_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.4_2.12-{{ icebergVersion }}.jar) -- 
[3.4\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.4_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.4_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.3\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.3_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.3_2.12-{{ icebergVersion }}.jar) -- 
[3.3\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.3_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.3_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.2\_2.12 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.2_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.2_2.12-{{ icebergVersion }}.jar) -- 
[3.2\_2.13](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.2_2.13/{{
 icebergVersion }}/iceberg-spark-runtime-3.2_2.13-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Spark 3.1 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime-3.1_2.12/{{
 icebergVersion }}/iceberg-spark-runtime-3.1_2.12-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.17 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.17/{{
 icebergVersion }}/iceberg-flink-runtime-1.17-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.16 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.16/{{
 icebergVersion }}/iceberg-flink-runtime-1.16-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Flink 1.15 runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-flink-runtime-1.15/{{
 icebergVersion }}/iceberg-flink-runtime-1.15-{{ icebergVersion }}.jar)
+* [{{ icebergVersion }} Hive runtime 
Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-hive-runtime/{{
 icebergVersion }}/iceberg-hive-runtime-{{ icebergVersion }}.jar)
+
+To use Iceberg in Spark or Flink, download the runtime JAR for your engine 
version and add it to the jars folder of your installation.
+
+To use Iceberg in Hive 2 or Hive 3, download the Hive runtime JAR and add it 
to Hive using `ADD JAR`.
+
+### Gradle
+
+To add a dependency on Iceberg in Gradle, add the following to `build.gradle`:
+
+```
+dependencies {
+  compile 'org.apache.iceberg:iceberg-core:{{ icebergVersion }}'
+}
+```
+
+You may also want to include `iceberg-parquet` for Parquet file support.
+
+### Maven
+
+To add a dependency on Iceberg in Maven, add the following to your `pom.xml`:
+
+```
+
+  ...
+  
+org.apache.iceberg
+iceberg-core
+{{ icebergVersion }}
+  
+  ...
+
+```
+
+## 1.3.1 release
+
+Apache Iceberg 1.3.1 was released on July 25, 2023.
+The 1.3.1 release addresses various issues identified in the 1.3.0 release.
+
+* Core
+  - Table Metadata parser now accepts null for fields: current-snapshot-id, 
properties, and snapshots 
([\#8064](https://github.com/apache/iceberg/pull/8064))
+* Hive
+  - Fix HiveCatalog deleting metadata on failures in checking lock status 
([\#7931](https://github.com/apache/iceberg/pull/7931))
+* Spark
+  - Fix RewritePositionDeleteFiles failure for certain partition types 
([\#8059](https://github.com/apache/iceberg/pull/8059))
+  - Fix RewriteDataFiles concurrency edge-case on commit timeouts 
([\#7933](https://github.com/apache/iceberg/pull/7933))
+  - Fix partition-level DELETE operations for WAP branches 
([\#7900](https://github.com/apache/iceberg/pull/7900))
+* Flink
+  - FlinkCatalog creation no longer creates the default database 
([\#8039](https://github.com/apache/iceberg/pull/8039))
+
+## Past releases
+
+### 1.3.0 release
+
+Apache Iceberg 1.3.0 was released on May 30th, 2023.
+The 1.3.0 release adds a va

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



ajantha-bhat commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347242330


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   But In case of table metadata we are always writing it. 
   
https://github.com/apache/iceberg/blob/dd26f3630954e69c584faa06c9395f77d49337b1/core/src/main/java/org/apache/iceberg/TableMetadataParser.java#L214
   
   Should we make it similar in a follow up?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



jbonofre commented on code in PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#discussion_r1347247075


##
docs/flink-writes.md:
##
@@ -270,4 +270,4 @@ INSERT INTO tableName /*+ OPTIONS('upsert-enabled'='true') 
*/
 ...
 ```
 
-Check out all the options here: 
[write-options](/flink-configuration#write-options) 

Review Comment:
   I used vim. It's maybe my vimrc :)  I gonna clean this ;)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



shreyanshR7 commented on PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#issuecomment-1748692640

   I've made the changes :)@nastra


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



nastra commented on code in PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#discussion_r1347350422


##
api/src/test/java/org/apache/iceberg/metrics/TestDefaultTimer.java:
##
@@ -104,7 +104,7 @@ public void measureRunnable() {
 Runnable runnable =
 () -> {
   try {
-Thread.sleep(100);
+   Thread.sleep(100);

Review Comment:
   unnecessary changes that need to be reverted, same for the other files



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



nastra commented on code in PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#discussion_r1347351094


##
core/src/test/java/org/apache/iceberg/util/TestInMemoryLockManager.java:
##
@@ -112,7 +114,7 @@ public void testAcquireSingleProcess() throws Exception {
 CompletableFuture.supplyAsync(
 () -> {
   try {
-Thread.sleep(200);
+Awaitility.await().atLeast(Duration.ofMillis(200)).until(() -> 
true);

Review Comment:
   does it make sense here to switch to awaitility?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



nastra commented on code in PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#discussion_r1347351835


##
flink/v1.16/flink/src/test/java/org/apache/iceberg/flink/source/TestIcebergSourceFailover.java:
##
@@ -208,7 +210,7 @@ private void testContinuousIcebergSource(FailoverType 
failoverType) throws Excep
 JobID jobId = jobClient.getJobID();
 
 for (int i = 1; i < 5; i++) {
-  Thread.sleep(10);
+  Awaitility.await().atLeast(Duration.ofMillis(10)).until(() -> true);

Review Comment:
   does it make sense here to switch to awaitility?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



nastra commented on code in PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#discussion_r1347352377


##
spark/v3.5/spark/src/test/java/org/apache/iceberg/spark/source/TestIcebergSourceTablesBase.java:
##
@@ -1955,7 +1958,6 @@ public void testRemoveOrphanFilesActionSupport() throws 
InterruptedException {
 
 // sleep for 1 second to ensure files will be old enough
 Thread.sleep(1000);
-

Review Comment:
   why are these changes here necessary?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347356619


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   I think we can probably leave the `TableMetadata` as is. I'm curious why we 
want to make this change for views, sorry if this was already discussed before 
@nastra . Why can't we just keep it similar to tables, and have the property 
always exist, and by default it'll just be the empty collection that Immutables 
creates?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



shreyanshR7 commented on PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#issuecomment-1748822301

   wait let me check
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



ajantha-bhat commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347371858


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   @amogh-jahagirdar: Table metadata , the read is still optional. But always 
writes. So, else case is a dead code. 
   For view this PR is making both read and write optional.  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Adopt `Catalog` API to include references to the `TableMetadata` and the `metadata_location` in the `TableCommit` payload for the `update_table` method [iceberg-rust]

2023-10-05 Thread via GitHub



Xuanwo commented on issue #75:
URL: https://github.com/apache/iceberg-rust/issues/75#issuecomment-1748854108

   > It would therefore be helpful to include references to the 
`metadata_location` and the `TableMetadata` in the 
[`TableCommit`](https://github.com/apache/iceberg-rust/blob/main/crates/iceberg/src/catalog/mod.rs#L179)
 payload of the `update_table` operation.
   
   Could you please provide further explanation on why we require those in 
`TableCommit`? I was under the impression that the catalog should be 
responsible for maintaining them. For instance, the catalog can locate the 
metadata_location and metadata of a table using its TableIdent.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



nastra commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347392925


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   @amogh-jahagirdar in newer parsers we typically try to avoid sending empty 
collections, hence the suggested change.
   We could still send an empty collection here for views, but we still need to 
adjust the read-side to not require properties



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347395043


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   @ajantha-bhat Right, my point is I think what TableMetadata is doing already 
is fine and we should be able to follow the same thing here. 
   
   In other words, why make it optional in the first place for views on the 
write side when going through the Iceberg provided metadata writer?
   
I can see the argument for optional reads since there's no point failing 
reading view metadata if it's missing properties, in case somebody messed with 
metadata or has their own metadata writer that's OK to me because properties 
are optional from a spec perspective. But I feel that on the write side the 
Iceberg library provided metadata writer should just write out the properties.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347395043


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   @ajantha-bhat Right, my point is I think what TableMetadata is doing already 
is fine and we should be able to follow the same idea here. 
   
   In other words, why make it optional in the first place for views on the 
write side when going through the Iceberg provided metadata writer?
   
   I can see the argument for optional reads since there's no point failing 
reading view metadata if it's missing properties, in case somebody messed with 
metadata or has their own metadata writer (so it's not really dead code in this 
case) 
   . That's OK to me because properties are optional from a spec perspective. 
But I feel that on the write side the Iceberg library provided metadata writer 
should just write out the properties.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347395043


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   @ajantha-bhat Right, my point is I think what TableMetadata is doing already 
is fine and we should be able to follow the same idea here. 
   
   In other words, why make it optional in the first place for views on the 
write side when going through the Iceberg provided metadata writer?
   
   I can see the argument for optional reads since there's no point failing 
reading view metadata if it's missing properties, in case somebody messed with 
metadata or has their own metadata writer (so it's not really dead code in this 
case) 
   . That's OK to me because properties are optional from a spec perspective. I 
think that on the write side the Iceberg library metadata writer should just 
write out the properties.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347395043


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   @ajantha-bhat Right, my point is I think what TableMetadata is doing already 
is fine and we should be able to follow the same idea here. 
   
   In other words, why make it optional in the first place for views on the 
write side when going through the Iceberg provided metadata writer?
   
   I can see the argument for optional reads since there's no point failing 
reading view metadata if it's missing properties, in case somebody messed with 
metadata or has their own metadata writer (so the else block mentioned is not 
really dead code in this case) 
   . That's OK to me because properties are optional from a spec perspective. I 
think that on the write side the Iceberg library metadata writer should just 
write out the properties.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347395043


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   @ajantha-bhat Right, my point is I think what TableMetadata is doing already 
is fine and we should be able to follow the same idea here. 
   
   In other words, why make it optional in the first place for views on the 
write side when going through the Iceberg provided metadata writer?
   
   I can see the argument for optional reads since there's no point failing 
reading view metadata if it's missing properties, in case somebody messed with 
metadata or has their own metadata writer (so the else block mentioned is not 
really dead code in this case) 
   . That's OK to me because properties are optional from a spec 
perspective.However, I think that on the write side the Iceberg library 
metadata writer should just write out the properties.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347411761


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   > We could still send an empty collection here for views, but we still need 
to adjust the read-side to not require properties
   
   Yeah I'm on board with this @nastra. Checking on the read side is expected 
since properties is optional as per the spec. I think any Iceberg library 
parser should write out "richer" metadata even if it's empty. It makes it easy 
for someone to look at the metadata file and see the state (rather than 
determine a field is missing). Also minor point in this case, it simplifies the 
code to just write out the field.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347411761


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   > We could still send an empty collection here for views, but we still need 
to adjust the read-side to not require properties
   
   Yeah I'm on board with this @nastra. Checking on the read side is expected 
since properties is optional as per the spec. I think any Iceberg library 
parser should write out "richer" metadata even if it's empty. It makes it easy 
for someone to look at the metadata file and see the state (rather than 
determine a field is missing by having to have the spec side by side). Also 
minor point in this case, it simplifies the code to just write out the field.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Upgrade to gradle 8.4 [iceberg]

2023-10-05 Thread via GitHub



jbonofre commented on issue #8485:
URL: https://github.com/apache/iceberg/issues/8485#issuecomment-1748905073

   As gradle 8.4 is now available, I'm updating the PR (I'm also re-testing 
`revapi` but I'm pretty sure it will be the same as in Gradle 8.4 :) ).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Make view metadata properties optional in JSON parser [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8723:
URL: https://github.com/apache/iceberg/pull/8723#discussion_r1347436052


##
core/src/main/java/org/apache/iceberg/view/ViewMetadataParser.java:
##
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) 
throws IOException
 gen.writeStringField(VIEW_UUID, metadata.uuid());
 gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
 gen.writeStringField(LOCATION, metadata.location());
-JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
+if (!metadata.properties().isEmpty()) {

Review Comment:
   Just discussed with @nastra , I also see the other argument that writing 
empty collections is a bit wasteful. Considering that, I'm okay either way.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Construct a writer tree [iceberg-python]

2023-10-05 Thread via GitHub



Fokko commented on code in PR #40:
URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1345991299


##
pyiceberg/manifest.py:
##
@@ -262,15 +346,13 @@ class DataFile(Record):
 "split_offsets",
 "equality_ids",
 "sort_order_id",
-"spec_id",
 )
 content: DataFileContent
 file_path: str
 file_format: FileFormat
 partition: Record
 record_count: int
 file_size_in_bytes: int
-block_size_in_bytes: Optional[int]

Review Comment:
   I've removed this since we shouldn't use it. It is still part of the write 
schema itself.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Upgrade to gradle 8.4 [iceberg]

2023-10-05 Thread via GitHub



jbonofre commented on PR #8486:
URL: https://github.com/apache/iceberg/pull/8486#issuecomment-1748930989

   Re-testing `revapi` on Gradle 8.4 (it should be the same as with Gradle 8.3).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1347419881


##
core/src/main/java/org/apache/iceberg/GenericManifestFile.java:
##
@@ -92,7 +92,7 @@ public GenericManifestFile(Schema avroSchema) {
 this.specId = specId;
 this.sequenceNumber = 0;
 this.minSequenceNumber = 0;
-this.snapshotId = null;
+this.snapshotId = 0L;

Review Comment:
   This seems a bit dangerous to me. I get we're saying it's required, so we 
don't need the null value but I looked around to see where 
manifestFile.snapshotId() == null checks are performed, and there's quite a 
few. For example `RewriteDataFiles`, `AppendFiles`. This may break some of the 
existing logic there by exercising the other branch in the code? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



Fokko commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1347459324


##
core/src/main/java/org/apache/iceberg/GenericManifestFile.java:
##
@@ -92,7 +92,7 @@ public GenericManifestFile(Schema avroSchema) {
 this.specId = specId;
 this.sequenceNumber = 0;
 this.minSequenceNumber = 0;
-this.snapshotId = null;
+this.snapshotId = 0L;

Review Comment:
   Hmm, I thought that this was used only by tests, but there is one path from 
actual code that also uses this. Let me check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



Fokko commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1347466602


##
core/src/main/java/org/apache/iceberg/GenericManifestFile.java:
##
@@ -92,7 +92,7 @@ public GenericManifestFile(Schema avroSchema) {
 this.specId = specId;
 this.sequenceNumber = 0;
 this.minSequenceNumber = 0;
-this.snapshotId = null;
+this.snapshotId = 0L;

Review Comment:
   This constructor is used when there is no manifest-list (v1 tables). That's 
quite rare nowadays.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



Fokko commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1347466602


##
core/src/main/java/org/apache/iceberg/GenericManifestFile.java:
##
@@ -92,7 +92,7 @@ public GenericManifestFile(Schema avroSchema) {
 this.specId = specId;
 this.sequenceNumber = 0;
 this.minSequenceNumber = 0;
-this.snapshotId = null;
+this.snapshotId = 0L;

Review Comment:
   This constructor is used when there is no manifest-list (v1 tables). That's 
quite rare nowadays, but we should still support it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Add View support for REST catalog [iceberg]

2023-10-05 Thread via GitHub



amogh-jahagirdar commented on code in PR #7913:
URL: https://github.com/apache/iceberg/pull/7913#discussion_r1347539389


##
core/src/main/java/org/apache/iceberg/rest/CatalogHandlers.java:
##
@@ -374,4 +385,107 @@ static TableMetadata commit(TableOperations ops, 
UpdateTableRequest request) {
 
 return ops.current();
   }
+
+  public static BaseView baseView(View view) {
+Preconditions.checkArgument(view instanceof BaseView, "View must be a 
BaseView");
+return (BaseView) view;
+  }
+
+  public static ListTablesResponse listViews(ViewCatalog catalog, Namespace 
namespace) {
+return 
ListTablesResponse.builder().addAll(catalog.listViews(namespace)).build();
+  }
+
+  public static LoadViewResponse createView(
+  ViewCatalog catalog, Namespace namespace, CreateViewRequest request) {
+request.validate();
+
+ViewMetadata viewMetadata = request.metadata();
+ViewBuilder viewBuilder =
+catalog
+.buildView(TableIdentifier.of(namespace, request.name()))
+.withSchema(viewMetadata.schema())
+.withProperties(viewMetadata.properties())
+
.withDefaultNamespace(viewMetadata.currentVersion().defaultNamespace())
+
.withDefaultCatalog(viewMetadata.currentVersion().defaultCatalog());
+viewMetadata.currentVersion().representations().stream()
+.filter(r -> r instanceof SQLViewRepresentation)
+.map(r -> (SQLViewRepresentation) r)
+.forEach(r -> viewBuilder.withQuery(r.dialect(), r.sql()));
+View view = viewBuilder.create();
+
+return viewResponse(view);
+  }
+
+  private static LoadViewResponse viewResponse(View view) {
+ViewMetadata metadata = baseView(view).operations().current();
+return ImmutableLoadViewResponse.builder()
+.metadata(metadata)
+.metadataLocation(metadata.metadataFileLocation())
+.build();
+  }
+
+  public static LoadViewResponse loadView(ViewCatalog catalog, TableIdentifier 
viewIdentifier) {
+View view = catalog.loadView(viewIdentifier);
+return viewResponse(view);
+  }
+
+  public static LoadViewResponse updateView(
+  ViewCatalog catalog, TableIdentifier ident, UpdateTableRequest request) {
+View view = catalog.loadView(ident);
+ViewMetadata metadata = commit(baseView(view).operations(), request);
+
+return ImmutableLoadViewResponse.builder()
+.metadata(metadata)
+.metadataLocation(metadata.metadataFileLocation())
+.build();
+  }
+
+  public static void renameView(ViewCatalog catalog, RenameTableRequest 
request) {
+catalog.renameView(request.source(), request.destination());
+  }
+
+  public static void dropView(ViewCatalog catalog, TableIdentifier 
viewIdentifier) {
+boolean dropped = catalog.dropView(viewIdentifier);
+if (!dropped) {
+  throw new NoSuchViewException("View does not exist: %s", viewIdentifier);
+}
+  }
+
+  static ViewMetadata commit(ViewOperations ops, UpdateTableRequest request) {
+AtomicBoolean isRetry = new AtomicBoolean(false);
+try {
+  Tasks.foreach(ops)
+  .retry(COMMIT_NUM_RETRIES_DEFAULT)
+  .exponentialBackoff(
+  COMMIT_MIN_RETRY_WAIT_MS_DEFAULT,
+  COMMIT_MAX_RETRY_WAIT_MS_DEFAULT,
+  COMMIT_TOTAL_RETRY_TIME_MS_DEFAULT,
+  2.0 /* exponential */)
+  .onlyRetryOn(CommitFailedException.class)
+  .run(
+  taskOps -> {
+ViewMetadata base = isRetry.get() ? taskOps.refresh() : 
taskOps.current();
+isRetry.set(true);
+
+// apply changes
+ViewMetadata.Builder metadataBuilder = 
ViewMetadata.buildFrom(base);
+request.updates().forEach(update -> 
update.applyTo(metadataBuilder));

Review Comment:
   I see we're currently running through the validations on line 486 but if I'm 
not mistaken right now they're all just no-ops, is that right @nastra ? 
   
   Few requirements that come to mind:
   
   1.) For creating a view, there probably needs to be a requirement similar to 
table's `AssertTableDoesNotExist` but for views, `AssertViewDoesNotExist`. 
Similar principle for ReplaceView and verifying the UUID as expected.
   
   
   2.) I don't see a metadata update for AddViewRepresentation but I'd imagine 
that should validate that the SQL dialect does not already exist for the 
specified version. I forgot where we landed on that discussion, We do want to 
prevent that right?
   
   3.) A requirement for the schema ID to make sure that is unchanged from the 
base metadata? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



shreyanshR7 commented on PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#issuecomment-1749145628

   I tried to fix the issues :) @nastra 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



nastra commented on PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#issuecomment-1749152592

   @shreyanshR7 there are still many unrelated changes to files. Can you please 
fix all of those?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] feat: add builder to TableMetadata interface [iceberg-rust]

2023-10-05 Thread via GitHub



y0psolo commented on PR #62:
URL: https://github.com/apache/iceberg-rust/pull/62#issuecomment-1749197704

   Ok i added some validation steps. Tell me if its ok
   i don't know if we need to have all fields visible on the builder. Tell me. 
We can choose to open and restrict later or do the reverse. But for 
compatibility reason the later will be easier.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



shreyanshR7 commented on PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#issuecomment-1749205309

   Let me check, I will try my best to undo these unrelated changes, these 
changes may be occurred due to adding awaitility method. sorry for 
inconvenience @nastra 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Thread.sleep() method is replaced with Awaitility [iceberg]

2023-10-05 Thread via GitHub



shreyanshR7 commented on PR #8715:
URL: https://github.com/apache/iceberg/pull/8715#issuecomment-1749406399

   @nastra I made another pull request addressing the issue with some more 
clean changes, please consider that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Support get partition table with filter [iceberg-python]

2023-10-05 Thread via GitHub



puchengy commented on issue #24:
URL: https://github.com/apache/iceberg-python/issues/24#issuecomment-1749412122

   @Fokko gentle ping, thanks ^


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Write data to iceberg with using java api multi thread, cause data lost [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on issue #8610:
URL: https://github.com/apache/iceberg/issues/8610#issuecomment-1749415259

   Yes, I agree. The safety of the HadoopCatalog depends on the file system, 
which is one reason why we don't recommend using that catalog.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Write data to iceberg with using java api multi thread, cause data lost [iceberg]

2023-10-05 Thread via GitHub



rdblue closed issue #8610: Write data to iceberg with using java api multi 
thread, cause data lost
URL: https://github.com/apache/iceberg/issues/8610


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



Fokko commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1347814559


##
core/src/main/java/org/apache/iceberg/GenericManifestFile.java:
##
@@ -92,7 +92,7 @@ public GenericManifestFile(Schema avroSchema) {
 this.specId = specId;
 this.sequenceNumber = 0;
 this.minSequenceNumber = 0;
-this.snapshotId = null;
+this.snapshotId = 0L;

Review Comment:
   Hey @amogh-jahagirdar I looked at it once more, and it looks like the 
Snapshot-ID is available, so no need in setting it to `0L`. Thanks for pointing 
this out



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#discussion_r1347814726


##
.github/workflows/python-release.yml:
##
@@ -51,7 +51,7 @@ jobs:
   - name: Set version
 run: python -m poetry version "${{ inputs.version }}"
 working-directory: ./python
-if: "${{ github.event.inputs.version != 'master' }}"
+if: "${{ github.event.inputs.version != 'main' }}"

Review Comment:
   This is going to break CI? Is there a way one of the committers can commit 
this and change the main branch themselves to avoid affecting ongoing PRs?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#issuecomment-1749425255

   @jbonofre, how will this affect existing PRs? Do all of them need to be 
reopened against main instead of master?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Support get partition table with filter [iceberg-python]

2023-10-05 Thread via GitHub



Fokko commented on issue #24:
URL: https://github.com/apache/iceberg-python/issues/24#issuecomment-1749431518

   @puchengy Yes, certainly. Would this be something that you're interested in 
working on? From the snapshot, we can load the manifest list, and from there 
the manifests themselves, which contain the partition information


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables [iceberg]

2023-10-05 Thread via GitHub



hankfanchiu commented on PR #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-1749443748

   @mderoy, thanks for the interest! I don't plan on following through with 
apache/hive#2484, #2751, and this PR. Please feel free to take over.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Rename master branch to main [iceberg]

2023-10-05 Thread via GitHub



jbonofre commented on PR #8722:
URL: https://github.com/apache/iceberg/pull/8722#issuecomment-1749444235

   @rdblue no impact on the existing PRs: they will be "attached" to main 
instead of master. The only impact is for the remotes for our users. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: Clarify missing fields when writing [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8672:
URL: https://github.com/apache/iceberg/pull/8672#discussion_r1347851058


##
format/spec.md:
##
@@ -128,13 +128,13 @@ Tables do not require rename, except for tables that use 
atomic rename to implem
 
  Writer requirements
 
-Some tables in this spec have columns that specify requirements for v1 and v2 
tables. These requirements are intended for writers when adding metadata files 
to a table with the given version.
+Some tables in this spec have columns that specify requirements for v1 and v2 
tables. These requirements are intended for writers when adding metadata, 
manifest files, or manifest lists to a table with the given version.
 
-| Requirement | Write behavior |
-|-||
-| (blank) | The field should be omitted |
-| _optional_  | The field can be written |
-| _required_  | The field must be written |
+| Requirement | Write behavior  |
+|-|-|
+| (blank) | The field is not present in the schema  |
+| _optional_  | The field is part of the schema, and can be written |

Review Comment:
   can be written or omitted?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: Clarify missing fields when writing [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8672:
URL: https://github.com/apache/iceberg/pull/8672#discussion_r1347851474


##
format/spec.md:
##
@@ -128,13 +128,13 @@ Tables do not require rename, except for tables that use 
atomic rename to implem
 
  Writer requirements
 
-Some tables in this spec have columns that specify requirements for v1 and v2 
tables. These requirements are intended for writers when adding metadata files 
to a table with the given version.
+Some tables in this spec have columns that specify requirements for v1 and v2 
tables. These requirements are intended for writers when adding metadata, 
manifest files, or manifest lists to a table with the given version.
 
-| Requirement | Write behavior |
-|-||
-| (blank) | The field should be omitted |
-| _optional_  | The field can be written |
-| _required_  | The field must be written |
+| Requirement | Write behavior  |
+|-|-|
+| (blank) | The field is not present in the schema  |

Review Comment:
   I like having "Should be omitted" still, even if it isn't present in the 
schema and that means the same thing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: Clarify missing fields when writing [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8672:
URL: https://github.com/apache/iceberg/pull/8672#discussion_r1347854144


##
format/spec.md:
##
@@ -128,13 +128,13 @@ Tables do not require rename, except for tables that use 
atomic rename to implem
 
  Writer requirements
 
-Some tables in this spec have columns that specify requirements for v1 and v2 
tables. These requirements are intended for writers when adding metadata files 
to a table with the given version.
+Some tables in this spec have columns that specify requirements for v1 and v2 
tables. These requirements are intended for writers when adding metadata, 
manifest files, or manifest lists to a table with the given version.

Review Comment:
   Alternative: "metadata (including manifests files and manifest lists)"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Revert "Spec: Mark added_snapshot_id as optional (#8600)" [iceberg]

2023-10-05 Thread via GitHub



Fokko opened a new pull request, #8726:
URL: https://github.com/apache/iceberg/pull/8726

   This reverts commit f7a7eb2c10cb4a9b6b3ea5bfdfc5d085be8b9c31.
   
   This should stay required.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Revert "Spec: Mark added_snapshot_id as optional (#8600)" [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on PR #8726:
URL: https://github.com/apache/iceberg/pull/8726#issuecomment-1749494924

   Thank you! Good to get this fixed quickly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: Inconsistency around files_count [iceberg]

2023-10-05 Thread via GitHub



Fokko commented on PR #5338:
URL: https://github.com/apache/iceberg/pull/5338#issuecomment-1749550427

   Thanks for the review @rdblue 🙌 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Spec disagrees with implementation on field names [iceberg]

2023-10-05 Thread via GitHub



Fokko closed issue #8684: Spec disagrees with implementation on field names
URL: https://github.com/apache/iceberg/issues/8684


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: Inconsistency around files_count [iceberg]

2023-10-05 Thread via GitHub



Fokko merged PR #5338:
URL: https://github.com/apache/iceberg/pull/5338


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Revert "Spec: Mark added_snapshot_id as optional (#8600)" [iceberg]

2023-10-05 Thread via GitHub



rdblue merged PR #8726:
URL: https://github.com/apache/iceberg/pull/8726


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



Fokko commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1347949961


##
api/src/main/java/org/apache/iceberg/ManifestFile.java:
##
@@ -49,7 +49,7 @@ public interface ManifestFile {
   Types.LongType.get(),
   "Lowest sequence number in the manifest");
   Types.NestedField SNAPSHOT_ID =
-  optional(
+  required(

Review Comment:
   I talked with Ryan to get historical context, and this should be reverted 
and created a separate PR https://github.com/apache/iceberg/pull/8726 so we can 
already revert the spec itself.
   
   As in, reading an optional field using a required read schema; this is 
allowed. I've added a test as well (but probably this is already tested 
somewhere).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



bitsondatadev commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348043125


##
docs-new/mkdocs.yml:
##
@@ -0,0 +1,96 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+site_name: Apache Iceberg
+docs_dir: home
+
+theme:
+  name: material
+  language: en
+  logo: assets/images/iceberg-logo-icon.png
+  favicon: assets/images/favicon-96x96.png
+  features:
+- navigation.tabs
+- navigation.path
+- navigation.top
+- navigation.tracking
+- toc.follow
+- search.suggest
+- search.highlight
+- content.tabs.link
+- content.code.copy
+- content.code.annotate
+
+plugins:
+  - search
+  - macros:
+  include_yaml:
+- variables.yml
+  - monorepo
+
+nav:
+  - Quickstart:
+- Hive: hive-quickstart.md
+- Spark: spark-quickstart.md
+  - Docs: 
+- latest: '!include home/docs/latest/mkdocs.yml'
+- 1.3.1: '!include home/docs/1.3.1/mkdocs.yml'
+#- 1.3.0: '!include home/docs/1.3.0/mkdocs.yml'
+  - Releases: releases.md
+  - Roadmap: roadmap.md
+  - Blogs: blogs.md
+  - Talks: talks.md
+  - Vendors: vendors.md
+  - Project: 
+- Join: community.md
+- Spec: spec.md
+- View spec: view-spec.md
+- Puffin spec: puffin-spec.md
+- Multi-engine support: multi-engine-support.md
+- How to release: how-to-release.md
+- Terms: terms.md
+  - ASF:
+- ASF: ASF.md

Review Comment:
   So ultimately there's an issue with how mkdocs renders.
   
   https://github.com/squidfunk/mkdocs-material/issues/868
   
   The solution is to provide an 
[override](https://squidfunk.github.io/mkdocs-material/customization/#overriding-partials)
 to the materials theme. I'm going to dig into that later though. I've removed 
the ASF.md file for now. The current functionality is that the "ASF" nav link 
just links directly to "Sponsorship" (the first sub nav item in the index) so 
for now I'm just linking to community.md so that we'll at least be able to open 
a local page to see the nav.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



bitsondatadev commented on PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#issuecomment-1749722304

   I've addressed the comments I intend to hit this review round, but have some 
open questions:
   - https://github.com/apache/iceberg/pull/8659#discussion_r1343890396
   - https://github.com/apache/iceberg/pull/8659#discussion_r1341799374
   - https://github.com/apache/iceberg/pull/8659#discussion_r1341800289
   
   Anton:
   - https://github.com/apache/iceberg/pull/8659#issuecomment-1747804852
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] add support of before and after in preorderschema visitor for traversal [iceberg-python]

2023-10-05 Thread via GitHub



MehulBatra opened a new pull request, #42:
URL: https://github.com/apache/iceberg-python/pull/42

   Add support to perform an action before and after field traversal in 
preorder fashion for schema 
   for object: structtype, maptype & listtype


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348075495


##
docs-new/mkdocs.yml:
##
@@ -0,0 +1,96 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+site_name: Apache Iceberg
+docs_dir: home
+
+theme:
+  name: material
+  language: en
+  logo: assets/images/iceberg-logo-icon.png
+  favicon: assets/images/favicon-96x96.png
+  features:
+- navigation.tabs
+- navigation.path
+- navigation.top
+- navigation.tracking
+- toc.follow
+- search.suggest
+- search.highlight
+- content.tabs.link
+- content.code.copy
+- content.code.annotate
+
+plugins:
+  - search
+  - macros:
+  include_yaml:
+- variables.yml
+  - monorepo
+
+nav:
+  - Quickstart:
+- Hive: hive-quickstart.md
+- Spark: spark-quickstart.md
+  - Docs: 
+- latest: '!include home/docs/latest/mkdocs.yml'
+- 1.3.1: '!include home/docs/1.3.1/mkdocs.yml'

Review Comment:
   Let's open issues and discuss it with the mkdocs monorepo author. I don't 
think ordering is something we should be distracted by.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348076686


##
docs-new/mkdocs.yml:
##
@@ -0,0 +1,96 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+site_name: Apache Iceberg
+docs_dir: home
+
+theme:
+  name: material
+  language: en
+  logo: assets/images/iceberg-logo-icon.png
+  favicon: assets/images/favicon-96x96.png
+  features:
+- navigation.tabs
+- navigation.path
+- navigation.top
+- navigation.tracking
+- toc.follow
+- search.suggest
+- search.highlight
+- content.tabs.link
+- content.code.copy
+- content.code.annotate
+
+plugins:
+  - search
+  - macros:
+  include_yaml:
+- variables.yml
+  - monorepo
+
+nav:
+  - Quickstart:
+- Hive: hive-quickstart.md
+- Spark: spark-quickstart.md
+  - Docs: 
+- latest: '!include home/docs/latest/mkdocs.yml'

Review Comment:
   I think `nightly` is a more obvious name that it is unreleased. I think 
`dev` is pretty generic and may be misinterpreted.
   
   I'd also keep it simple with `latest`. It's better to have `latest` pointing 
to old docs than to have it break because of some automated change. I'd point 
it to the right branch specifically.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348077823


##
docs-new/mkdocs.yml:
##
@@ -0,0 +1,96 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+site_name: Apache Iceberg
+docs_dir: home

Review Comment:
   I don't understand what you're proposing, but I think that we should go with 
the defaults inside the mkdocs area. That is, `docs` should contain the source 
markdown files and is located in the same folder as the top-level 
`mkdocs.yaml`. `site` also lives in that folder when is it auto-generated. The 
folder containing all 3 can be called `site` or `docs`, I don't really have a 
preference there.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348077823


##
docs-new/mkdocs.yml:
##
@@ -0,0 +1,96 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+site_name: Apache Iceberg
+docs_dir: home

Review Comment:
   I don't understand what you're proposing, but I think that we should go with 
the defaults inside the mkdocs area. That is, `docs` should contain the source 
markdown files and is located in the same folder as the top-level 
`mkdocs.yaml`. `site` also lives in that folder when is it auto-generated. The 
folder containing all 3 can be called `site` or `docs`, I don't really have a 
preference there, but I'd lean toward sticking with what we have today.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348087050


##
docs-new/.github/bin/deploy_docs.sh:
##
@@ -0,0 +1,39 @@
+#!/bin/bash
+
+while [[ "$#" -gt 0 ]]; do
+case $1 in
+-v|--version) ICEBERG_VERSION="$2"; shift ;;
+*) echo "Unknown parameter passed: $1"; exit 1 ;;
+esac
+shift
+done
+
+GIT_BRANCH="docs-${ICEBERG_VERSION}"
+
+# change to branch
+git checkout -b $GIT_BRANCH
+
+#remove all local files and leave site/ dir
+rm ./*

Review Comment:
   How does this leave the `site` dir?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348087991


##
docs-new/README.md:
##
@@ -0,0 +1,123 @@
+
+
+# making-iceberg-docs
+
+Testbed for the Iceberg docs.
+
+## Requirements 
+
+* Python >=3.9
+* pip
+
+
+## Build process
+
+The directory structure is intended to mimic the tree hierarchy of the 
website. This will enable contributors to find the documentation they need to 
update easier. The static and documentation will reside in the same location. 
+
+All static pages are all the `./home/.*md` files and the docs are the 
`.home/docs//*.md` files. Notice the location of the `mkdocs.yml`.
+
+Looking at this though, you may ask where the older versions and javadocs are.
+
+```
+.
+├── home
+│   ├── assets
+│   │   └── images
+│   ├── docs
+│   │   └── latest
+│   │   ├── assets
+│   │   ├── api.md
+│   │   ├── aws.md
+│   │   ├── branching.md
+│   │   ├── ...
+│   │   ├── **mkdocs.yml(docs)** 
+│   │   ├── ...
+│   │   ├── spark-structured-streaming.md
+│   │   ├── spark-writes.md
+│   │   └── table-migration.md
+│   ├── ASF.md
+│   ├── about.md
+│   ├── ...
+│   ├── vendors.md
+│   └── view-spec.md
+├── LICENSE
+├── README.md
+├── **mkdocs.yml(static)**
+├── requirements.txt
+└── variables.yml
+```
+
+All of the documentation versions are saved in special `docs-` tags 
that only contain the root of the docs version they contian. There is also a 
`javadoc` tag that contains all prior versions of the javadocs in a single tag. 
These are generated and loaded only at build time using the 
[git-worktree](https://git-scm.com/docs/git-worktree) docs.
+
+```

Review Comment:
   Can you add some description so other readers know what this is?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348089146


##
docs-new/README.md:
##
@@ -0,0 +1,175 @@
+
+
+# Iceberg site and documentation
+
+This holds the static files that define and build the documentation site for 
Apache Iceberg.
+
+## Requirements 
+
+* Python >=3.9
+* pip
+
+
+## Usage
+
+The directory structure is intended to mimic the tree hierarchy of the 
website. This will enable contributors to find the documentation they need to 
update easier. The static and documentation will reside in the same location. 
+
+All static pages are all the `./site/.*md` files and the docs are the 
`.site/docs//*.md` files. Notice the location of the `mkdocs.yml`. 
Looking at this though, you may ask where the older versions and javadocs are.
+
+```
+.
+├── site
+│   ├── assets
+│   │   └── images
+│   ├── docs
+│   │   └── latest
+│   │   ├── assets
+│   │   ├── api.md
+│   │   ├── aws.md
+│   │   ├── branching.md
+│   │   ├── ...
+│   │   ├── **mkdocs.yml(docs)** 
+│   │   ├── ...
+│   │   ├── spark-structured-streaming.md
+│   │   ├── spark-writes.md
+│   │   └── table-migration.md
+│   ├── ASF.md
+│   ├── about.md
+│   ├── ...
+│   ├── vendors.md
+│   └── view-spec.md
+├── LICENSE
+├── README.md
+├── **mkdocs.yml(static)**
+├── requirements.txt
+└── variables.yml
+```
+
+All of the documentation versions are saved in special `docs-` 
branches that only contain the root of the docs version. There is also a 
`javadoc` tag that contains all prior versions of the javadocs in a single tag. 
These are generated and loaded only at build time using the 
[git-worktree](https://git-scm.com/docs/git-worktree) docs.
+
+```
+.
+└── site
+    ├── docs
+    │   ├── latest
+    │   ├── 1.3.1
+    │   ├── 1.3.0
+    │   └── ...
+    └── javadoc
+    ├── latest
+    ├── 1.3.1
+    ├── 1.3.0
+        └── ...
+```
+
+### Install
+
+1. (Optional) Set up venv
+```
+python -m venv mkdocs_env
+source mkdocs_env/bin/activate
+```
+
+1. Install required Python libraries
+```
+pip install -r requirements.txt
+```
+
+ Adding additional versioned documentation
+
+To build locally with additional docs versions, add them to your working tree.
+For now, I'm just adding a single version, and the javadocs directory.
+
+```
+git worktree add site/docs/1.3.1 docs-1.3.1
+git worktree add site/javadoc javadoc
+```
+
+## Build
+
+Run the build command in the root directory, and optionally add `--clean` to 
force MkDocs to clear previously generated pages.
+
+```
+mkdocs build [--clean]
+```
+
+## Run
+
+Start MkDocs server locally to verify the site looks good.
+
+```
+mkdocs serve
+```
+
+### Release process
+
+Deploying a version of the docs is a two step process:
+ 1. Cut a new release from the `latest` documentation which creates a new 
branch `docs-`.
+
+```
+.github/bin/deploy_docs.sh -v 1.4.0
+```
+
+See [deploy_docs.sh](.github/bin/deploy_docs.sh) for more details.
+
+ 1. Make sure to add the new version to the list of versions to pull into git 
worktree.
+ 1. Follow the steps in [the build process](#build).
+ 1. Push the generated site to `gh-pages`.

Review Comment:
   This will eventually need to be the ASF branch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] add support of before and after actions in preorderschema traversal [iceberg-python]

2023-10-05 Thread via GitHub



rdblue commented on PR #42:
URL: https://github.com/apache/iceberg-python/pull/42#issuecomment-1749783634

   I don't think that this is necessary and I think it also breaks future uses 
of the visitor.
   
   Despite the name, the visitor this updates doesn't actually perform 
pre-order traversal. It creates futures that can be called to produce the 
result of further processing. In Java, we call this the "custom order" visitor 
because you can use it for any order you want by calling the future at a 
different point.
   
   This PR changes to actual pre-order traversal. That's fine, but it means 
that you can't actually customize as much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1348099841


##
core/src/main/java/org/apache/iceberg/V2Metadata.java:
##
@@ -39,7 +39,7 @@ private V2Metadata() {}
   ManifestFile.MANIFEST_CONTENT.asRequired(),
   ManifestFile.SEQUENCE_NUMBER.asRequired(),
   ManifestFile.MIN_SEQUENCE_NUMBER.asRequired(),
-  ManifestFile.SNAPSHOT_ID.asRequired(),
+  ManifestFile.SNAPSHOT_ID,

Review Comment:
   I believe that this is actually required because if the manifest isn't 
written with a snapshot ID, the manifest list's snapshot ID is inherited. And 
that _must_ be present.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Core: Mark `503: added_snapshot_id` as required [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8673:
URL: https://github.com/apache/iceberg/pull/8673#discussion_r1348100841


##
core/src/main/java/org/apache/iceberg/GenericManifestFile.java:
##
@@ -84,15 +84,24 @@ public GenericManifestFile(Schema avroSchema) {
 }
   }
 
+  /**
+   * @deprecated since 1.5.0, will be removed in 1.6.0. Use {@link
+   * GenericManifestFile#GenericManifestFile(InputFile, int, Long)} with a 
snapshot-id
+   */
+  @Deprecated
   GenericManifestFile(InputFile file, int specId) {
+this(file, specId, null);
+  }
+
+  GenericManifestFile(InputFile file, int specId, Long snapshotId) {

Review Comment:
   I think I understand the rationale for adding this, but I'm pretty sure that 
we didn't do it this way on purpose. By adding this option, you can easily pass 
in the snapshot ID for a manfiest file so that you're deciding when it is 
created what snapshot it has.
   
   This is something that we wanted to avoid so that people didn't accidentally 
corrupt metadata by setting the snapshot ID directly. Instead, callers can only 
construct manifests with snapshot IDs that will be inherited from manifest list 
metadata. At least, that's the idea. It's probably possible to work around it, 
but I think it is more correct for people to not have the option when creating 
the manifest metadata.
   
   Also, I think this is something we use so that you can write a manifest of 
files (like how Flink stores checkpoint state) and later add it to the table 
once the snapshot ID is known.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: add types timstamp_ns and timestamptz_ns [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on PR #8683:
URL: https://github.com/apache/iceberg/pull/8683#issuecomment-1749792874

   @jacobmarble, I was talking with @danielcweeks a couple days ago about v3 
additions and we came up with a good reason to add `timestamp_ms` and 
`timestamptz_ms`. Currently, type promotion is limited to very specific 
changes, but we think that we will need to open that up in v3. As long as we 
are open to promoting types, it makes sense to let people promote `long` to 
`timestamp` of some kind.
   
   Now that we are adding timestamps with different units, this gives us a way 
to fix a long-standing problem: older data that used millisecond timestamps as 
longs couldn't be read as timestamps because we had no way to know the unit of 
the `long` (millis or micros). Now that we have separate types, we can use the 
type to carry that information, so that you can promote a `long` to 
`timestamp_ms` or `timestamp` and we know how to interpret the value.
   
   What do you think? If you agree, then we should probably add `ms` types at 
the same time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Spec: add types timstamp_ns and timestamptz_ns [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8683:
URL: https://github.com/apache/iceberg/pull/8683#discussion_r1348103716


##
format/spec.md:
##
@@ -167,30 +167,34 @@ A **`map`** is a collection of key-value pairs with a key 
type and a value type.
 
  Primitive Types
 
-| Primitive type | Description 
 | Requirements |
-||--|--|
-| **`boolean`**  | True or false   
 |  |
-| **`int`**  | 32-bit signed integers  
 | Can promote to `long`|
-| **`long`** | 64-bit signed integers  
 |  |
-| **`float`**| [32-bit IEEE 
754](https://en.wikipedia.org/wiki/IEEE_754) floating point | Can promote to 
double|
-| **`double`**   | [64-bit IEEE 
754](https://en.wikipedia.org/wiki/IEEE_754) floating point |   
   |
-| **`decimal(P,S)`** | Fixed-point decimal; precision P, scale S   
 | Scale is fixed [1], precision must be 38 or less |
-| **`date`** | Calendar date without timezone or time  
 |  |
-| **`time`** | Time of day without date, timezone  
 | Microsecond precision [2]|
-| **`timestamp`**| Timestamp without timezone  
 | Microsecond precision [2]|
-| **`timestamptz`**  | Timestamp with timezone 
 | Stored as UTC [2]|
-| **`string`**   | Arbitrary-length character sequences
 | Encoded with UTF-8 [3]   |
-| **`uuid`** | Universally unique identifiers  
 | Should use 16-byte fixed |
-| **`fixed(L)`** | Fixed-length byte array of length L 
 |  |
-| **`binary`**   | Arbitrary-length byte array 
 |  |
+| Primitive type   | Description   
   | Requirements |
+|--|--|--|
+| **`boolean`**| True or false 
   |  |
+| **`int`**| 32-bit signed integers
   | Can promote to `long`|
+| **`long`**   | 64-bit signed integers
   |  |
+| **`float`**  | [32-bit IEEE 
754](https://en.wikipedia.org/wiki/IEEE_754) floating point | Can promote to 
double|
+| **`double`** | [64-bit IEEE 
754](https://en.wikipedia.org/wiki/IEEE_754) floating point |   
   |
+| **`decimal(P,S)`**   | Fixed-point decimal; precision P, scale S 
   | Scale is fixed [1], precision must be 38 or less |
+| **`string`** | Arbitrary-length character sequences  
   | Encoded with UTF-8 [2]   |
+| **`uuid`**   | Universally unique identifiers
   | Should use 16-byte fixed |
+| **`fixed(L)`**   | Fixed-length byte array of length L   
   |  |
+| **`binary`** | Arbitrary-length byte array   
   |  |
+| **`date`**   | Calendar date, without time, without timezone 
   | [3]  |
+| **`time`**   | Time of day, without date, without timezone   
   | [3], [4] |
+| **`timestamp`**  | Timestamp, with microsecond precision, without 
timezone  | [4], [6] |
+| *

Re: [PR] Spec: add types timstamp_ns and timestamptz_ns [iceberg]

2023-10-05 Thread via GitHub



jacobmarble commented on PR #8683:
URL: https://github.com/apache/iceberg/pull/8683#issuecomment-1749799494

   > What do you think? If you agree, then we should probably add `ms` types at 
the same time.
   
   Sounds good to me. I'll add milliseconds to this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8701:
URL: https://github.com/apache/iceberg/pull/8701#discussion_r1348114157


##
kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/CommitCompletePayload.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.connect.events;
+
+import java.util.UUID;
+import org.apache.avro.Schema;
+import org.apache.avro.SchemaBuilder;
+
+public class CommitCompletePayload implements Payload {
+
+  private UUID commitId;
+  private Long vtts;
+  private final Schema avroSchema;
+
+  private static final Schema AVRO_SCHEMA =

Review Comment:
   We typically prefer to define Iceberg schemas with field IDs and convert 
them to Avro schemas. I think those schemas are easier to read and it ensures 
that we assign field IDs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Kafka Connect: Initial project setup and event data structures [iceberg]

2023-10-05 Thread via GitHub



rdblue commented on code in PR #8701:
URL: https://github.com/apache/iceberg/pull/8701#discussion_r1348115184


##
kafka-connect/kafka-connect-events/src/main/java/org/apache/iceberg/connect/events/Element.java:
##
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.connect.events;
+
+import org.apache.avro.LogicalTypes;
+import org.apache.avro.Schema;
+import org.apache.avro.SchemaBuilder;
+import org.apache.avro.generic.IndexedRecord;
+import org.apache.avro.specific.SpecificData.SchemaConstructable;
+import org.apache.iceberg.avro.AvroSchemaUtil;
+
+public interface Element extends IndexedRecord, SchemaConstructable {

Review Comment:
   Rather than using `SchemaConstructable`, Iceberg just checks for a `Schema` 
constructor first.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



bitsondatadev commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348126895


##
docs-new/.github/bin/deploy_docs.sh:
##
@@ -0,0 +1,39 @@
+#!/bin/bash
+
+while [[ "$#" -gt 0 ]]; do
+case $1 in
+-v|--version) ICEBERG_VERSION="$2"; shift ;;
+*) echo "Unknown parameter passed: $1"; exit 1 ;;
+esac
+shift
+done
+
+GIT_BRANCH="docs-${ICEBERG_VERSION}"
+
+# change to branch
+git checkout -b $GIT_BRANCH
+
+#remove all local files and leave site/ dir
+rm ./*

Review Comment:
   Won't remove directories without -R



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Python: Remove python directory and references [iceberg]

2023-10-05 Thread via GitHub



ajantha-bhat commented on PR #8695:
URL: https://github.com/apache/iceberg/pull/8695#issuecomment-1749882115

   Shall we merge this if no comments? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [I] Support get partition table with filter [iceberg-python]

2023-10-05 Thread via GitHub



puchengy commented on issue #24:
URL: https://github.com/apache/iceberg-python/issues/24#issuecomment-1749897984

   @Fokko Yes, I can help. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Phase 1 - New Docs Deployment [iceberg]

2023-10-05 Thread via GitHub



bitsondatadev commented on code in PR #8659:
URL: https://github.com/apache/iceberg/pull/8659#discussion_r1348198680


##
docs-new/mkdocs.yml:
##
@@ -0,0 +1,96 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+site_name: Apache Iceberg
+docs_dir: home

Review Comment:
   This causes warnings with the build expecting to see files in the relative 
locations across docs sites. I am using the current setup to try and minimize 
these errors so that it can be a notification that a link has broken.
   
   Further, I'd like to continue moving towards a 1:1 relationship between the 
directory structure, and the sitemap, so newcomers can use the site layout to 
guide where they will find the file.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Build: Add 1.4.0 to issue template [iceberg]

2023-10-05 Thread via GitHub



aokolnychyi opened a new pull request, #8728:
URL: https://github.com/apache/iceberg/pull/8728

   This PR adds 1.4.0 to our issue template.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Add 1.4.0 to issue template [iceberg]

2023-10-05 Thread via GitHub



aokolnychyi commented on PR #8728:
URL: https://github.com/apache/iceberg/pull/8728#issuecomment-1749928447

   @Fokko @nastra @rdblue @RussellSpitzer 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Build: Let revapi compare against 1.4.0 [iceberg]

2023-10-05 Thread via GitHub



aokolnychyi commented on PR #8727:
URL: https://github.com/apache/iceberg/pull/8727#issuecomment-1749928311

   @Fokko @nastra @rdblue @RussellSpitzer 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[PR] Common docs for 1.4.0 [iceberg-docs]

2023-10-05 Thread via GitHub



aokolnychyi opened a new pull request, #276:
URL: https://github.com/apache/iceberg-docs/pull/276

   This PR is the first step to update our docs as described 
[here](https://iceberg.apache.org/how-to-release/#common-documentation-update).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

1 2 >

1 - 100 of 117 matches

Mail list logo