mccormickt12 commented on PR #2291:
URL: https://github.com/apache/iceberg-python/pull/2291#issuecomment-3180515764

   > What is the proper way to address an absolute path in HadoopFileSystem? Your example shows that `/path/to/file/` works but `{host}/path/to/file` does not work. Should `{host}/path/to/file` also work?
   > 
   > I'm trying to see what the requirements are here. I only found examples with [`hdfs://`](https://github.com/apache/iceberg/blob/0be91dce702de8707fdecfa6fd909cf0d8dae8c9/core/src/main/java/org/apache/iceberg/hadoop/HadoopTables.java#L80)
   > 
   > Also, I'm curious whether [`HadoopFileSystem.from_uri`](https://arrow.apache.org/docs/python/generated/pyarrow.fs.HadoopFileSystem.html#pyarrow.fs.HadoopFileSystem.from_uri) will work for `long_table_base`
   
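   `from_uri` accepts the `hdfs://` URI, but `get_file_info` on the resulting filesystem rejects a full URI and only works with the bare path: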
   
   >>> fs = HadoopFileSystem.from_uri("hdfs://ltx1-yugioh-cluster01.linkfs.prod-ltx1.atd.prod.linkedin.com:9000")
   >>> path = "/user/tmccormi"
   >>> fs.get_file_info(fs.FileSelector(path, recursive=False))
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   AttributeError: 'pyarrow._hdfs.HadoopFileSystem' object has no attribute 'FileSelector'
   >>> fs.get_file_info(path)
   25/08/12 18:16:38 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   <FileInfo for '/user/tmccormi': type=FileType.Directory>
   >>> print(fs.get_file_info(path))
   <FileInfo for '/user/tmccormi': type=FileType.Directory>
   >>> path = "hdfs://ltx1-yugioh-cluster01.linkfs.prod-ltx1.atd.prod.linkedin.com:9000/user/tmccormi"
   >>> print(fs.get_file_info(path))
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "pyarrow/_fs.pyx", line 590, in pyarrow._fs.FileSystem.get_file_info
     File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
   pyarrow.lib.ArrowInvalid: GetFileInfo must not be passed a URI, got: hdfs://ltx1-yugioh-cluster01.linkfs.prod-ltx1.atd.prod.linkedin.com:9000/user/tmccormi
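
Going by the session above, one possible workaround is to split the URI before touching the filesystem: use the `scheme://host:port` part to build the filesystem and pass only the path to `get_file_info`. The following is a rough sketch, not what the PR does. The helper names `split_uri` and `list_dir` and the host `namenode.example.com` are hypothetical. Note also that `FileSelector` is imported from `pyarrow.fs` rather than looked up on the filesystem object, which appears to be the cause of the `AttributeError` above.

```python
from urllib.parse import urlparse


def split_uri(uri):
    """Split a location into (connection URI, bare path).

    Hypothetical helper: returns (None, uri) when the input has no
    scheme and host, i.e. when it is already a bare path.
    """
    parsed = urlparse(uri)
    if parsed.scheme and parsed.netloc:
        # e.g. "hdfs://host:9000" for from_uri, "/user/..." for get_file_info
        return f"{parsed.scheme}://{parsed.netloc}", parsed.path or "/"
    return None, uri


def list_dir(uri):
    """List a directory given a full hdfs:// URI (needs pyarrow and libhdfs)."""
    # Imported lazily so split_uri stays usable without pyarrow installed.
    from pyarrow.fs import FileSelector, HadoopFileSystem

    base, path = split_uri(uri)
    if base is None:
        raise ValueError(f"expected a full URI with scheme and host, got: {uri}")
    fs = HadoopFileSystem.from_uri(base)
    # FileSelector comes from pyarrow.fs; only the bare path is passed on.
    return fs.get_file_info(FileSelector(path, recursive=False))


if __name__ == "__main__":
    # Placeholder host, not a real cluster.
    print(split_uri("hdfs://namenode.example.com:9000/user/tmccormi"))
    print(split_uri("/user/tmccormi"))
```

`split_uri` uses only the standard library, so it can be checked without a cluster. `list_dir` is untested here and assumes a working Hadoop client setup.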


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

