nickdelnano commented on code in PR #11923:
URL: https://github.com/apache/iceberg/pull/11923#discussion_r1911517154
##########
docs/docs/spark-getting-started.md:
##########

````diff
@@ -77,21 +77,24 @@ Once your table is created, insert data using [`INSERT INTO`](spark-writes.md#in
 ```sql
 INSERT INTO local.db.table VALUES (1, 'a'), (2, 'b'), (3, 'c');
-INSERT INTO local.db.table SELECT id, data FROM source WHERE length(data) = 1;
 ```
 
 Iceberg also adds row-level SQL updates to Spark, [`MERGE INTO`](spark-writes.md#merge-into) and [`DELETE FROM`](spark-writes.md#delete-from):
 
 ```sql
-MERGE INTO local.db.target t USING (SELECT * FROM updates) u ON t.id = u.id
-WHEN MATCHED THEN UPDATE SET t.count = t.count + u.count
+CREATE TABLE local.db.updates (id bigint, data string) USING iceberg;
+INSERT INTO local.db.updates VALUES (1, 'x'), (2, 'y'), (4, 'z');
````

Review Comment:
About merge branches, commented here: https://github.com/apache/iceberg/pull/11923/files?diff=unified&w=0#r1906122418

The table states are straightforward until after the MERGE query (one insert per table), so I have added the table state as a comment after the MERGE only. Otherwise there is a lot of duplication. Let me know your thoughts.

##########
docs/docs/spark-getting-started.md:
##########

```diff
@@ -160,7 +163,7 @@ This type conversion table describes how Spark types are converted to the Iceber
 | map | map |
 
 !!! info
-    The table is based on representing conversion during creating table. In fact, broader supports are applied on write. Here're some points on write:
+    The table is based on type conversions during table creation. Broader type conversions are applied on write:
```

Review Comment:
Thanks, updated.

##########
spark/v3.3/spark-runtime/src/integration/java/org/apache/iceberg/spark/SmokeTest.java:
##########

```diff
@@ -66,25 +64,25 @@ public void testGettingStarted() throws IOException {
     sql(
         "CREATE TABLE updates (id bigint, data string) USING parquet LOCATION '%s'",
         temp.newFolder());
-    sql("INSERT INTO updates VALUES (1, 'x'), (2, 'x'), (4, 'z')");
+    sql("INSERT INTO updates VALUES (1, 'x'), (2, 'y'), (4, 'z')");
```

Review Comment:
The example still hits all branches of the merge:

- ids 1 and 2 are updated
- ids 3, 10, and 11 are unchanged
- id 4 does not match and is inserted

The change here is to provide a unique `data` value per row, as that helps explain the example in the docs better. A sketch of how the full example plays out follows at the end of this message.

##########
spark/v3.3/spark-runtime/src/integration/java/org/apache/iceberg/spark/SmokeTest.java:
##########

```diff
@@ -66,25 +64,25 @@ public void testGettingStarted() throws IOException {
     sql(
         "CREATE TABLE updates (id bigint, data string) USING parquet LOCATION '%s'",
```

Review Comment:
I tried this, and the change breaks tests that run SmokeTest with the Hadoop and Hive catalogs.
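For readers following along, here is a rough sketch of how the getting-started MERGE example discussed above plays out. The PR's exact `WHEN` clauses are not shown in this excerpt, so the statement below is an assumed form; it also assumes `local.db.table` still holds the three rows inserted earlier on the getting-started page.

```sql
-- Sketch only: the PR's exact WHEN clauses are not shown in this excerpt.
-- Assumes local.db.table currently holds (1, 'a'), (2, 'b'), (3, 'c').
MERGE INTO local.db.table t
USING (SELECT * FROM local.db.updates) u
ON t.id = u.id
WHEN MATCHED THEN UPDATE SET t.data = u.data  -- ids 1 and 2 match and are updated
WHEN NOT MATCHED THEN INSERT *;               -- id 4 has no match and is inserted

-- Table state after the MERGE (id 3 is untouched):
-- (1, 'x'), (2, 'y'), (3, 'c'), (4, 'z')
```

This matches the reviewer's point that one MERGE against a three-row source exercises every branch: matched rows updated, unmatched source rows inserted, and untouched target rows left as-is.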