danhphan commented on issue #1279:
URL: 
https://github.com/apache/iceberg-python/issues/1279#issuecomment-2466676408

   Thanks @kevinjqliu, I'm reading the code base.
   
   Can you please give me an example of the expected unit tests for the 
feature, if possible? For instance, suppose we create the following `s3_fileio` 
with "s3.region": "us-east-1" in the `session_properties`, then create an 
`input_file` in the S3 bucket `warehouse`, which is actually located in the 
"eu-central-1" region. What should the expected results be?
   
   ```
   session_properties: Properties = {
       "s3.endpoint": "http://localhost:9000",
       "s3.access-key-id": "admin",
       "s3.secret-access-key": "password",
       "s3.region": "us-east-1",
       "s3.session-token": "s3.session-token",
       **UNIFIED_AWS_SESSION_PROPERTIES,
   }
   
   s3_fileio = PyArrowFileIO(properties=session_properties)
   print(s3_fileio.properties['s3.region']) #--> us-east-1
   
   filename = str(uuid.uuid4())
   input_file = s3_fileio.new_input(location=f"s3://warehouse/{filename}")
   print(pyarrow.fs.resolve_s3_region('warehouse')) #--> eu-central-1
   
   output_file = s3_fileio.new_output(location=f"s3://foo/{filename}")
   print(pyarrow.fs.resolve_s3_region('foo')) #--> us-east-1
   ```
   
   I'm thinking that in `def _initialize_fs(self, scheme: str, netloc: 
Optional[str] = None) -> FileSystem` from your comment above, we could set the 
value of "region" in `client_kwargs` based on `netloc` (the S3 bucket), but 
I'm not sure if that is the right direction.
   
   Like: `"region": pyarrow.fs.resolve_s3_region(netloc), `
   
   ```
   def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSystem:
       if scheme in {"s3", "s3a", "s3n"}:
           from pyarrow.fs import S3FileSystem

           client_kwargs: Dict[str, Any] = {
               "endpoint_override": self.properties.get(S3_ENDPOINT),
               "access_key": get_first_property_value(self.properties, S3_ACCESS_KEY_ID, AWS_ACCESS_KEY_ID),
               "secret_key": get_first_property_value(self.properties, S3_SECRET_ACCESS_KEY, AWS_SECRET_ACCESS_KEY),
               "session_token": get_first_property_value(self.properties, S3_SESSION_TOKEN, AWS_SESSION_TOKEN),
               "region": get_first_property_value(self.properties, S3_REGION, AWS_REGION),
           }
   ```
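
To make the idea concrete, here is a rough sketch of how the region could be chosen per bucket. It is only one possible direction, not the agreed behaviour: `resolve_region` and its `resolver` parameter are hypothetical names that do not exist in pyiceberg, and falling back to the configured "s3.region" when the lookup fails is my assumption. The only real API used is `pyarrow.fs.resolve_s3_region`, which I assume reports a failed lookup as an `OSError`.

```python
from typing import Callable, Optional


def resolve_region(
    netloc: Optional[str],
    configured_region: Optional[str],
    resolver: Optional[Callable[[str], str]] = None,
) -> Optional[str]:
    """Pick a region for a bucket (hypothetical helper, not part of pyiceberg).

    Prefers the region reported for the bucket; falls back to the
    configured region when there is no bucket or the lookup fails.
    """
    if not netloc:
        return configured_region
    if resolver is None:
        # Imported lazily so the helper can be tested without pyarrow installed.
        from pyarrow.fs import resolve_s3_region

        resolver = resolve_s3_region
    try:
        return resolver(netloc)
    except OSError:
        # Assumption: a failed lookup (no network, custom endpoint, unknown
        # bucket) keeps the value configured under "s3.region".
        return configured_region


# A stub resolver standing in for pyarrow.fs.resolve_s3_region, so the two
# cases from the example above can be checked without touching S3.
FAKE_BUCKET_REGIONS = {"warehouse": "eu-central-1", "foo": "us-east-1"}


def fake_resolver(bucket: str) -> str:
    if bucket not in FAKE_BUCKET_REGIONS:
        raise OSError(f"cannot resolve region for bucket {bucket!r}")
    return FAKE_BUCKET_REGIONS[bucket]


warehouse_region = resolve_region("warehouse", "us-east-1", fake_resolver)
foo_region = resolve_region("foo", "us-east-1", fake_resolver)
unknown_region = resolve_region("missing-bucket", "us-east-1", fake_resolver)
```

Under these assumptions, a unit test could patch the resolver in the same way and assert that the `warehouse` bucket resolves to "eu-central-1" even though "s3.region" is "us-east-1". Whether the bucket's region should override an explicitly configured one is exactly the part I'm unsure about.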
   Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

