smaheshwar-pltr commented on PR #1509:
URL: https://github.com/apache/iceberg-python/pull/1509#issuecomment-2585391350

   > In terms of the default value for WRITE_OBJECT_STORE_PARTITIONED_PATHS, here are the four different scenarios. ([source](https://github.com/apache/iceberg-python/blob/cad0ad7d9358315abe1315de2a64227d91acceaa/pyiceberg/table/locations.py#L79-L90))
   > 1. WRITE_OBJECT_STORE_PARTITIONED_PATHS=True for Partitioned Table: f"{partition_key.to_path()}/{data_file_name}"
   
   Small correction - the function *recurses* on [this line](https://github.com/apache/iceberg-python/blob/cad0ad7d9358315abe1315de2a64227d91acceaa/pyiceberg/table/locations.py#L81),
so a hash is still inserted after the prefix, ahead of the partition path and 
file name, and the result is *not* the same as when `SimpleLocationProvider` 
is used. 
   
   To clarify, this isn't a bug. The `ObjectStoreLocationProvider` *should* 
include entropy to reduce prefix collisions - that shouldn't change just 
because data is partitioned. I realise the code was written confusingly, sorry 
about that. I matched the [Java implementation 
here](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/LocationProviders.java#L154-L155),
 and FWIW I suspect this confusing recursive implementation is what led to what 
I've described 
[here](https://github.com/apache/iceberg-python/pull/1452/files#r1893981856) as 
a potential bug (or unintended result) on their part.
   
   I'm realising maybe we should add a unit test for this case (Case 1), even 
though the Java implementation doesn't have one. All the other cases are 
already covered.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

