HyunWooZZ commented on issue #1952:
URL: 
https://github.com/apache/iceberg-python/issues/1952#issuecomment-2851040172

   Hi, @kevinjqliu :) Thank you for summarizing.
   
   > Looking at the screenshot. It looks like you're running create_table with 
a GCS table location. On table creation, the metadata json file write failed 
with FileExistsError.
   
   You're right! Before creating a file such as the metadata JSON, the create method checks whether the file already exists in the filesystem:
   ``` python
           try:
               if not overwrite and self.exists() is True:
                   raise FileExistsError(f"Cannot create file, already exists: {self.location}")
   ```
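
To make the failure mode concrete, here is a minimal sketch of the same guard in isolation. `SketchOutputFile` and its in-memory `store` are hypothetical stand-ins, not the PyIceberg class or a real filesystem; the point is only that if `exists()` reports True, `create()` raises before anything is written.

```python
class SketchOutputFile:
    """Hypothetical stand-in for an output file; not the PyIceberg implementation."""

    def __init__(self, location: str, store: dict):
        self.location = location
        self._store = store  # in-memory dict standing in for the filesystem

    def exists(self) -> bool:
        # If this check wrongly reports True, create() fails
        # even though no file was ever written.
        return self.location in self._store

    def create(self, overwrite: bool = False) -> None:
        # Same guard as the excerpt above: refuse to clobber an existing file.
        if not overwrite and self.exists() is True:
            raise FileExistsError(f"Cannot create file, already exists: {self.location}")
        self._store[self.location] = b""
```

Calling `create()` twice on the same location raises `FileExistsError` on the second call unless `overwrite=True` is passed.
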
   >do you know how this directory was created? Seems like this might be the 
real problem. GCS created a directory named gs://.../....metadata.json and this 
conflicts with the metadata json write above.
   
   No, there was nothing there. Also, as you know, an object storage system has no concept of a directory.
   So I think the Arrow GCS filesystem makes up the object information it returns, because gcsfs cannot check GCS objects without credentials.
   
   >this implies that there was a permission issue. And additional credential 
resolved it. when you created the catalog, did you provide gcs credentials like 
so https://py.iceberg.apache.org/configuration/#google-cloud-storage
   
   I didn’t include the OAuth token. I just added the project ID and locations.
   If I had generated an OAuth token and injected it into the config kwargs, it 
probably would have worked.
   However, I think OAuth tokens are generally not suitable for server-side jobs such as event-driven or scheduled jobs, because a fresh token would have to be generated and injected before every run.
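
For reference, a minimal sketch of what injecting a token into the config kwargs might look like. The helper `gcs_catalog_properties` is hypothetical; the property keys are the ones I understand the linked configuration page to list for GCS, and I am assuming the expiry is given in milliseconds since the epoch. How the token itself is obtained is left out on purpose.

```python
from datetime import datetime, timezone


def gcs_catalog_properties(project_id: str, location: str, token: str, expires_at: datetime) -> dict:
    """Build GCS-related catalog properties (hypothetical helper, not part of PyIceberg)."""
    return {
        "gcs.project-id": project_id,
        "gcs.default-bucket-location": location,
        "gcs.oauth2.token": token,
        # Assumption: expiry is expressed as milliseconds since the epoch.
        "gcs.oauth2.token-expires-at": str(int(expires_at.timestamp() * 1000)),
    }


# Placeholder values only; a real job would have to fetch a fresh token before each run.
props = gcs_catalog_properties(
    project_id="my-project",
    location="US",
    token="<access-token>",
    expires_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
```

The resulting dict would be merged into the kwargs passed when loading the catalog, which is exactly the per-run step that makes this awkward for scheduled jobs.
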
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

