syun64 commented on issue #368:
URL: https://github.com/apache/iceberg-python/issues/368#issuecomment-2021532558

   > In order to implement this with snapshot properties I want my writer to do
   > the following transactionally:
   >
   > 1. Fetch the current snapshot's dateranges property.
   > 2. Modify that dateranges value to include the dates which are about to be
   >    written.
   > 3. Merge the new data and update the dateranges snapshot property, in the
   >    same new snapshot.
   >
   > If another concurrent writer were to write its own new snapshot between
   > step 1 and 3, I would want my writer to throw an exception and then I'll
   > try again at step 1 starting from the latest snapshot.
   
   > Another approach I had in mind was to be able to read and write snapshot
   > properties from a PySpark SQL query. That is appealing because it would be
   > a single-client solution which would also allow my non-python clients to
   > perform writes that honor this dateranges property.
   
   I think you should be able to do this today. Keep track of the ID of the
   table snapshot you read from in step (1). Then write with a snapshot
   property and an isolation level that is validated from that starting
   snapshot, so that your commit fails if a concurrent commit has been made
   since then.
   
   https://iceberg.apache.org/docs/1.5.0/spark-configuration/#write-options
   
   "isolation-level", "validate-from-snapshot-id" and "snapshot-property" are
   probably the write options you want to use to achieve your goal in PySpark.
   Let me know if that works for you!
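
As a rough sketch of how those three options might be combined from PySpark: the helper names, the table name and the JSON encoding of `dateranges` below are illustrative assumptions, not part of Iceberg. Only the option keys come from the write-options documentation linked above, which describes `isolation-level` as applying to DataFrame overwrite operations, hence the use of `overwritePartitions()`.

```python
import json


def build_write_options(base_snapshot_id, dateranges, isolation_level="serializable"):
    """Build Iceberg Spark write options for a commit validated against a base snapshot.

    Hypothetical helper: `dateranges` is assumed to be a JSON-serializable list.
    """
    return {
        # Ask Iceberg to check for conflicting concurrent commits.
        "isolation-level": isolation_level,
        # Check for conflicts starting from the snapshot we originally read.
        "validate-from-snapshot-id": str(base_snapshot_id),
        # Custom entry added to the new snapshot's summary as "dateranges".
        "snapshot-property.dateranges": json.dumps(dateranges),
    }


def overwrite_with_validation(df, table_name, base_snapshot_id, dateranges):
    """Overwrite the partitions covered by `df` using the options above.

    `df` is expected to be a PySpark DataFrame; `table_name` is a placeholder.
    """
    writer = df.writeTo(table_name)  # returns a DataFrameWriterV2
    for key, value in build_write_options(base_snapshot_id, dateranges).items():
        writer = writer.option(key, value)
    writer.overwritePartitions()
```

The base snapshot ID and the previous `dateranges` value can presumably be read from the table's `snapshots` metadata table (its `snapshot_id` and `summary` columns) before the write. If the commit is rejected because of a concurrent write, the writer would re-read the latest snapshot and retry from step 1; how that failure surfaces as a Python exception is not covered here.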
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

