Re: [I] Questions around Iceberg-rust [iceberg-rust]

via GitHub Wed, 10 Jul 2024 06:39:36 -0700


liurenjie1024 commented on issue #450:
URL: https://github.com/apache/iceberg-rust/issues/450#issuecomment-2220534721


   > How does connecting an Iceberg catalog with a specific S3 bucket work? I 
understand the structure on S3 with dividing a table into parquet data files 
and avro metadata files, but I am not sure how the relationship between this 
file organization and a deployed catalog works, and how to configure that 
exactly.
   
   It depends on which catalog you use. For the hms or glue catalog, which can 
be classified as client-side catalogs, you need to set up a Hive metastore or 
Glue server and pass the `warehouse` configuration to the catalog builder. For 
the rest catalog, managing the location is the rest catalog server's 
responsibility. 
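   
   As a rough illustration of what the `warehouse` setting controls for a 
client-side catalog, here is a minimal sketch. It assumes the common Hive-style 
convention of `<warehouse>/<namespace>.db/<table>` as the default table 
location, with `metadata` and `data` directories underneath. The helper names 
and the bucket path are hypothetical and are not part of the iceberg-rust API; 
the actual layout depends on the catalog and its configuration.

```rust
/// Hypothetical helper (not an iceberg-rust API): derive a default table
/// location from a warehouse root, assuming the Hive-style
/// `<warehouse>/<namespace>.db/<table>` convention.
fn default_table_location(warehouse: &str, namespace: &str, table: &str) -> String {
    // Drop any trailing slash so the joined path has no double separator.
    let root = warehouse.trim_end_matches('/');
    format!("{}/{}.db/{}", root, namespace, table)
}

/// Hypothetical helper: directory holding the table's metadata files.
fn metadata_dir(table_location: &str) -> String {
    format!("{}/metadata", table_location.trim_end_matches('/'))
}

/// Hypothetical helper: directory holding the table's data files.
fn data_dir(table_location: &str) -> String {
    format!("{}/data", table_location.trim_end_matches('/'))
}
```

   Under these assumptions, a `warehouse` of `s3://my-bucket/warehouse` with 
namespace `sales` and table `orders` would place the table under 
`s3://my-bucket/warehouse/sales.db/orders`. With a rest catalog, the server 
chooses this location instead of the client.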
   
   > Where does Pyiceberg fit into Iceberg-rust? Would it be possible to deploy 
Iceberg-rust on the server side, and interact with the rest catalog through 
Pyiceberg? I like python as a nice interface for data consumers to interact 
with a catalog, and for basic management of tables.
   
   Currently there is no relationship between these two libraries; they are 
just Iceberg implementations in different languages. iceberg-rust is a library, 
so you can use it in a server, but you need to write the server code yourself. 
Since pyiceberg and iceberg-rust both implement the Iceberg spec, you can in 
theory use iceberg-rust to write data into an Iceberg table and use pyiceberg 
to read it, and vice versa.
   
   > What are the table write options with Iceberg-rust? As of now, is it 
only possible with a distributed engine like Spark or Trino? What would be the 
bottlenecks to duckdb, polars, or Ibis+backend writes? The vast majority of my 
datasets are less than 50Gb currently, and most workloads a fraction of that. I 
would like to use Iceberg for its superior data management vs files, but 
initially for use cases that can mostly be done on a single node and don't 
really need the power of distributed engines.
   
   iceberg-rust does not support writing to tables yet. The community has 
focused on read support in recent releases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


