Re: [PR] Spark: Support Trigger AvailableNow in SS [iceberg]

via GitHub Wed, 03 Sep 2025 11:26:37 -0700


alexprosak commented on code in PR #13824:
URL: https://github.com/apache/iceberg/pull/13824#discussion_r2319838761



##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java:
##########
@@ -523,6 +530,16 @@ public ReadLimit getDefaultReadLimit() {
     }
   }
 
+  @Override
+  public void prepareForTriggerAvailableNow() {
+    LOG.info("The streaming query reports to use Trigger.AvailableNow");
+
+    lastOffsetForTriggerAvailableNow =
+        (StreamingOffset) latestOffset(initialOffset, ReadLimit.allAvailable());
+
+    LOG.info("lastOffset for Trigger.AvailableNow is {}", lastOffsetForTriggerAvailableNow.json());

Review Comment:
   I thought the log output was a little more verbose with `.json()`:
   ```
   lastOffset for Trigger.AvailableNow is {"version":1,"snapshot_id":9219094338079637662,"position":2,"scan_all_files":false}
   ```
   
   versus the existing toString:
   ```
   lastOffset for Trigger.AvailableNow is Streaming Offset[8152464158888717084: position (2) scan_all_files (false)]
   ```
   
   But if we don't want the log to depend on the JSON generator, that's understandable. I can switch to `toString` instead if that's what you recommend.
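
   For comparison, here is a minimal, self-contained sketch of the two renderings side by side. `DemoOffset` is a hypothetical stand-in written only for illustration, not the real `StreamingOffset`; its field names and both output formats are assumptions copied from the two log samples above.

   ```java
   import java.util.Locale;

   public class OffsetLogDemo {
     // Hypothetical stand-in for StreamingOffset (NOT the Iceberg class).
     static final class DemoOffset {
       private final long snapshotId;
       private final long position;
       private final boolean scanAllFiles;

       DemoOffset(long snapshotId, long position, boolean scanAllFiles) {
         this.snapshotId = snapshotId;
         this.position = position;
         this.scanAllFiles = scanAllFiles;
       }

       // JSON-style rendering; the layout is assumed from the first log sample.
       String json() {
         return String.format(
             Locale.ROOT,
             "{\"version\":1,\"snapshot_id\":%d,\"position\":%d,\"scan_all_files\":%b}",
             snapshotId, position, scanAllFiles);
       }

       // Compact rendering; the layout is assumed from the second log sample.
       @Override
       public String toString() {
         return String.format(
             Locale.ROOT,
             "Streaming Offset[%d: position (%d) scan_all_files (%b)]",
             snapshotId, position, scanAllFiles);
       }
     }

     public static void main(String[] args) {
       DemoOffset offset = new DemoOffset(9219094338079637662L, 2, false);
       // Same message prefix as in the PR, rendered both ways.
       System.out.println("lastOffset for Trigger.AvailableNow is " + offset.json());
       System.out.println("lastOffset for Trigger.AvailableNow is " + offset);
     }
   }
   ```

   In this sketch the only extra information in the JSON form is the `version` field; the `toString` form needs no serializer on the logging path.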



