dejii commented on PR #49768:
URL: https://github.com/apache/airflow/pull/49768#issuecomment-3168176325

   > Your suggested solution does not cover my use case, where files are 
always copied to the same prefix. This implies that I also need to check the 
creation time of the files. 
   
   I don’t think checking the creation time is necessary. The list of objects 
to be copied is determined by `S3ListOperator.execute`, so as long as the same 
prefix is used in the `S3ToGCSOperator`, you should get the same result across 
subsequent tasks.
   
   
https://github.com/apache/airflow/blob/aa6615352d98bc0f4b42a8e3accbe1f455e54ba8/providers/google/src/airflow/providers/google/cloud/transfers/s3_to_gcs.py#L185-L186
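   As a toy illustration of that point, here is a minimal sketch that uses an 
in-memory set as a stand-in for the bucket. `list_keys` and `bucket_contents` 
are hypothetical names for this sketch, not Airflow or boto3 APIs, and it 
assumes the bucket contents do not change between the two calls.

```python
def list_keys(bucket_contents, prefix):
    """Return the keys that start with prefix, in a stable sorted order."""
    return sorted(key for key in bucket_contents if key.startswith(prefix))


# Hypothetical bucket contents; a real listing would come from S3.
bucket_contents = {"incoming/a.csv", "incoming/b.csv", "archive/old.csv"}

# Two tasks that list with the same prefix see the same set of objects.
first_run = list_keys(bucket_contents, "incoming/")
second_run = list_keys(bucket_contents, "incoming/")
print(first_run == second_run)
```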
   
   > Actually, I don't know how deferred operators behave if the triggerer 
is restarted during the deferral.
   
   The triggerer is not restarted during deferral, but it’s designed to be 
stateless and resilient to restarts. To preserve that statelessness with your 
proposed solution, you'd need to serialize the list of objects, which might not 
be ideal because it could consume significant space in the metadata database.
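   To get a rough feel for that cost, here is a minimal sketch that measures 
how large a key list becomes once JSON-encoded. It assumes the list would be 
stored in a JSON-like form; the key pattern, the count of 100,000 objects, and 
the `serialized_size_bytes` helper are all made up for illustration and are 
not taken from the PR.

```python
import json


def serialized_size_bytes(keys):
    """Return the size in bytes of the key list once JSON-encoded as UTF-8."""
    return len(json.dumps(keys).encode("utf-8"))


# Hypothetical object keys under one prefix; real names and counts will differ.
keys = [f"data/2025/08/part-{i:06d}.parquet" for i in range(100_000)]

size = serialized_size_bytes(keys)
print(f"{len(keys)} keys -> {size / 1024:.0f} KiB when serialized")
```

   The size grows linearly with the number of keys and their length, so a 
prefix holding many objects would add a correspondingly large payload for each 
deferred task.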
   
   

