jhump commented on PR #415:
URL: https://github.com/apache/iceberg-go/pull/415#issuecomment-2848175869

   Sorry for the back and forth. I added a comment on the issue retracting my 
API request about getting the metadata separately from the entries. I 
realized that it would ideally be possible to get the metadata _and_ the 
entries using a single `io.Reader`, without needing two object store calls 
if the application chooses to get the entries, too. (Object store API ops are 
a non-trivial cost component of my application, so I try to minimize them 
where possible.)
   
   I think what I was really hoping for, in an ideal world, would be a 
top-level function more like so:
   ```go
   func ReadManifest(in io.Reader, file ManifestFile, discardDeleted bool) (*ManifestContent, error)
   
   type ManifestContent struct {
       // ... can be all unexported
   }
   
   // Accessors can lazily parse (and memoize) metadata and the remainder of
   // the file, which is why they all can return an error
   
   func (mc *ManifestContent) Schema() (Schema, error)
   
   func (mc *ManifestContent) SchemaID() (int, error)
   
   func (mc *ManifestContent) PartitionSpec() (PartitionSpec, error)
   
   func (mc *ManifestContent) PartitionSpecID() (int, error)
   
   func (mc *ManifestContent) Entries() ([]ManifestEntry, error)
   ```
   
   The existing `ManifestFile.FetchEntries` can trivially be wired up to this 
for backwards compatibility.
   
   Providing a reader-based API would make it much easier to consume from my 
app. (Metadata and manifest list files can already be consumed via `io.Reader`; 
manifest files/entries are the only thing not exposed this way.) Not needing 
two object store calls to get both metadata and data is also ideal. I made the 
accessors all lazy in the above proposal to avoid unnecessary processing: 
metadata that the caller doesn't need or use is never parsed.
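   
   To make the lazy, memoized accessor idea concrete, here is a rough sketch. 
It is not the real implementation: it assumes a toy line-based format (first 
line is the schema ID, each following line is one entry) instead of the actual 
Avro encoding, drops the `file` and `discardDeleted` parameters, and returns 
entries as plain strings. The unexported fields and the `header` helper are 
made up for illustration.
   
   ```go
   package main
   
   import (
       "bufio"
       "fmt"
       "io"
       "strconv"
       "strings"
   )
   
   // ManifestContent is a hypothetical stand-in for the proposed type.
   // It is not safe for concurrent use; a real version might need a mutex.
   type ManifestContent struct {
       r *bufio.Reader
   
       headerDone bool // true once the header has been parsed (or failed)
       headerErr  error
       schemaID   int
   
       entriesDone bool // true once the remainder has been read
       entriesErr  error
       entries     []string
   }
   
   // ReadManifest only wraps the reader; nothing is consumed until an
   // accessor is called.
   func ReadManifest(in io.Reader) *ManifestContent {
       return &ManifestContent{r: bufio.NewReader(in)}
   }
   
   // header parses the leading metadata once and memoizes the outcome.
   func (mc *ManifestContent) header() error {
       if mc.headerDone {
           return mc.headerErr
       }
       mc.headerDone = true
       line, err := mc.r.ReadString('\n')
       if err != nil && (err != io.EOF || line == "") {
           mc.headerErr = fmt.Errorf("reading header: %w", err)
           return mc.headerErr
       }
       mc.schemaID, mc.headerErr = strconv.Atoi(strings.TrimSpace(line))
       return mc.headerErr
   }
   
   // SchemaID reads only as far as the header.
   func (mc *ManifestContent) SchemaID() (int, error) {
       if err := mc.header(); err != nil {
           return 0, err
       }
       return mc.schemaID, nil
   }
   
   // Entries reads the rest of the same reader, after the header.
   func (mc *ManifestContent) Entries() ([]string, error) {
       if mc.entriesDone {
           return mc.entries, mc.entriesErr
       }
       if err := mc.header(); err != nil {
           return nil, err
       }
       mc.entriesDone = true
       sc := bufio.NewScanner(mc.r)
       for sc.Scan() {
           if line := sc.Text(); line != "" {
               mc.entries = append(mc.entries, line)
           }
       }
       if err := sc.Err(); err != nil {
           mc.entries, mc.entriesErr = nil, err
       }
       return mc.entries, mc.entriesErr
   }
   
   func main() {
       // One reader (i.e. one object store GET) serves both calls.
       mc := ReadManifest(strings.NewReader("7\nentry-a\nentry-b\n"))
       id, err := mc.SchemaID()
       fmt.Println(id, err)
       entries, err := mc.Entries()
       fmt.Println(entries, err)
   }
   ```
   
   A caller that only wants `SchemaID` never pays for parsing the entries, and 
a caller that wants both still reads the underlying stream exactly once.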
   
   BTW, I am happy to float a PR; I'm not trying to put extra work on you or 
any other maintainers (especially since it sounds like I may be in a very small 
minority of users who care about accessing this metadata).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

