snleee commented on code in PR #10045:
URL: https://github.com/apache/pinot/pull/10045#discussion_r1062679718


##########
pinot-connectors/pinot-flink-connector/src/main/java/org/apache/pinot/connector/flink/sink/PinotSinkFunction.java:
##########
@@ -151,4 +148,18 @@ private void flush()
       LOG.info("Pinot segment uploaded to {}", segmentURI);
     });
   }
+
+  @Override
+  public List<GenericRow> snapshotState(long checkpointId, long timestamp) {

Review Comment:
   I see. Have you thought of using Kafka Sink on Flink side and let Pinot 
directly consume from the output kafka topic? Pinot's real-time consumption 
handles the offset commit by itself and it handles fault tolerance by resuming 
to consume from the offset that has been committed recently on the Pinot side.
   
   For this PR, let's at least go with your 2nd approach using Flink state to 
unblock you. In case of Flink state, does Flink take care of cleaning up the 
RocksDB data? If not, we should handle the data clean up when exiting the 
`flush()`.



##########
pinot-connectors/pinot-flink-connector/src/main/java/org/apache/pinot/connector/flink/sink/PinotSinkFunction.java:
##########
@@ -151,4 +148,18 @@ private void flush()
       LOG.info("Pinot segment uploaded to {}", segmentURI);
     });
   }
+
+  @Override
+  public List<GenericRow> snapshotState(long checkpointId, long timestamp) {

Review Comment:
   I see. Have you thought of using Kafka Sink on the Flink side and let Pinot 
directly consume from the output kafka topic? Pinot's real-time consumption 
handles the offset commit by itself and it handles fault tolerance by resuming 
to consume from the offset that has been committed recently on the Pinot side.
   
   For this PR, let's at least go with your 2nd approach using Flink state to 
unblock you. In case of Flink state, does Flink take care of cleaning up the 
RocksDB data? If not, we should handle the data clean up when exiting the 
`flush()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to