GitHub user MisterRaindrop added a comment to the discussion: [Proposal] Iceberg subsystem for datalake_fdw — design proposal
Thanks for the detailed feedback, @leborchuk. Responding point by point:

1. RPC interface. Yes: datalake_agent will expose a Protobuf + gRPC interface, treated as a stable, versioned contract so that the QD and the agent can evolve independently. (A rough client-side sketch of such a contract is appended below.)

2. Primary motivation. Our main motivation matches your scenario (1): cross-cluster data sharing, together with the storage/compute separation that the Iceberg architecture naturally enables. We are very optimistic about this direction overall.

3. On scenario (2), archival. A genuine question back: if the end state is data sitting on object storage with Iceberg metadata, why not write directly to object storage from day one, rather than landing the data in GP first and archiving it later? That would collapse the archive case into the same code path as data sharing.

4. Schema import / view-like tables. Because we are taking the Table AM approach, every Iceberg table must have a corresponding relation in the catalog, so a CREATE TABLE is unavoidable. That said, making the column set dynamic (tracking Iceberg schema evolution at read time) is entirely feasible and not particularly hard, and we plan to support it. (A schema-resolution sketch is appended below.)

5. Caching. We do believe caching is effective. The first pull from remote storage is unavoidably slow, but once blocks are cached on local disk, reads are essentially indistinguishable from local files. On the cache side, prefetching and parallel download are both worth considering (a prefetch sketch is appended below). In our view, the common reasons a cache appears to underperform are:
   - cache capacity too small, so the hit rate stays low;
   - a network bottleneck during background fetch;
   - cache block size too large, which hurts efficiency;
   - insufficient concurrency, so the fetcher cannot keep up with the consumer.

   In principle, with proper sizing and tuning, a well-configured cache can reach near-local performance.

6. Polaris. Polaris is not a blocker. Cloudberry will manage all Iceberg metadata internally; Polaris is only consulted at read time to fetch the latest Iceberg metadata pointer. Even if Polaris goes down, we can still read the Iceberg data. (A fallback sketch is appended below.)

7. One caveat on performance expectations. We do want this to be fast, but the realistic baseline for comparison is GP itself, not a columnar engine. Cloudberry is a PG-based row engine that returns data row by row rather than in batches the way a columnar engine does, so it will naturally be somewhat slower than columnar systems. Our plan is to complete the functional surface first and then optimize performance as a dedicated follow-up.

GitHub link: https://github.com/apache/cloudberry/discussions/1683#discussioncomment-16694368
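To make point 1 concrete, here is a minimal sketch of what a QD-side call against a versioned Protobuf + gRPC contract could look like. The service, message, and field names (`DatalakeAgentStub`, `PlanScanRequest`, `api_version`, and the generated `datalake_agent_pb2*` modules) are hypothetical stand-ins, not the actual .proto from the proposal.

```python
# Hypothetical sketch only: assumes stubs generated from a datalake_agent.proto;
# datalake_agent_pb2 / datalake_agent_pb2_grpc are placeholder module names.
import grpc

import datalake_agent_pb2 as pb
import datalake_agent_pb2_grpc as pb_grpc


def plan_iceberg_scan(agent_addr: str, table_ident: str) -> list[str]:
    """Ask the agent to plan a scan and return the data files to read."""
    with grpc.insecure_channel(agent_addr) as channel:
        stub = pb_grpc.DatalakeAgentStub(channel)
        request = pb.PlanScanRequest(
            api_version="v1",            # versioned contract: the QD states the version it speaks
            table_identifier=table_ident,
            snapshot_id=0,               # 0 = latest snapshot in this sketch
        )
        response = stub.PlanScan(request, timeout=30)
        return list(response.data_files)


if __name__ == "__main__":
    print(plan_iceberg_scan("localhost:9090", "warehouse.sales.orders"))
```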
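For point 4, one way to keep the column set dynamic is to resolve the current schema from the Iceberg table-metadata file on each scan rather than relying on the columns declared at CREATE TABLE time. A rough illustration against the Iceberg v2 table-metadata JSON layout; the file path is illustrative and error handling is omitted.

```python
import json


def current_columns(metadata_path: str) -> list[tuple[str, object]]:
    """Return (name, type) pairs for the table's current Iceberg schema.

    Reads an Iceberg table-metadata JSON file (format v2) and resolves the
    schema referenced by "current-schema-id", so columns added, dropped, or
    renamed by Iceberg schema evolution are picked up at read time without
    any ALTER TABLE on the Cloudberry side. Primitive types appear as
    strings; nested types (struct/list/map) appear as dicts.
    """
    with open(metadata_path) as f:
        meta = json.load(f)

    current_id = meta["current-schema-id"]
    schema = next(s for s in meta["schemas"] if s["schema-id"] == current_id)
    return [(field["name"], field["type"]) for field in schema["fields"]]


if __name__ == "__main__":
    for name, typ in current_columns("/tmp/00003-example.metadata.json"):
        print(name, typ)
```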
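For point 5, a toy illustration of the prefetch/parallel-download idea: split a remote file into fixed-size blocks and pull several blocks concurrently into a local cache directory, with block size and concurrency as the two knobs called out in the list above. `fetch_range` is a placeholder for whatever object-store client is actually used.

```python
import os
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4 * 1024 * 1024   # cache block size: too large hurts efficiency
CONCURRENCY = 8                # parallel downloads: too few can't keep up with the scan


def fetch_range(remote_path: str, offset: int, length: int) -> bytes:
    """Placeholder for the real object-store range read (hypothetical)."""
    raise NotImplementedError("wire this to the actual S3/HDFS client")


def prefetch(remote_path: str, file_size: int, cache_dir: str) -> None:
    """Download all blocks of remote_path into cache_dir in parallel."""
    os.makedirs(cache_dir, exist_ok=True)

    def pull(offset: int) -> None:
        block_file = os.path.join(cache_dir, f"{offset:020d}.blk")
        if os.path.exists(block_file):   # already cached: cheap local hit
            return
        data = fetch_range(remote_path, offset, min(BLOCK_SIZE, file_size - offset))
        with open(block_file, "wb") as f:
            f.write(data)

    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        # list() drains the iterator so any fetch error surfaces here
        list(pool.map(pull, range(0, file_size, BLOCK_SIZE)))
```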
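And for point 6, the read-path behaviour described above (Polaris consulted only for the latest metadata pointer, with reads surviving a Polaris outage) is essentially a fetch-with-fallback. A hedged sketch; the REST endpoint path, timeout, and the locally persisted pointer are all placeholders, not the actual implementation.

```python
import json
import urllib.error
import urllib.request

# Last metadata pointer Cloudberry persisted in its own catalog (illustrative value).
LAST_KNOWN_METADATA = "s3://lake/warehouse/sales/orders/metadata/00002-example.metadata.json"


def resolve_metadata_location(polaris_url: str, table: str) -> str:
    """Return the Iceberg metadata file to use for this scan.

    Prefer the latest pointer from Polaris; if Polaris is unreachable,
    fall back to the last pointer stored in Cloudberry's own catalog so
    the read can still proceed.
    """
    try:
        # Illustrative REST-catalog-style lookup; the real path/prefix may differ.
        url = f"{polaris_url}/v1/catalog/tables/{table}"
        with urllib.request.urlopen(url, timeout=5) as resp:
            payload = json.load(resp)
        return payload["metadata-location"]
    except (urllib.error.URLError, TimeoutError, KeyError):
        return LAST_KNOWN_METADATA


if __name__ == "__main__":
    print(resolve_metadata_location("http://polaris:8181/api", "sales.orders"))
```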
