jackye1995 commented on PR #7105:
URL: https://github.com/apache/iceberg/pull/7105#issuecomment-1482286193

   Sorry, I totally overlooked that. I would also -1 using a specific
external dependency like RocksDB or HBase, which is probably why I skipped
over those options so quickly...
   
   But I feel the semantics required for partition stats just do not fit a 
file storage system. As you said, it ends up having to choose between CoW and 
MoR, which seems like too much complexity just to manage some additional stats.
   
   I think we can start from a file storage (FileIO) based solution, but the 
spec should be at a higher level so that it could be backed by more efficient 
solutions.
   
   I guess the same argument also applies to things like the manifest list. 
Today, rolling up the manifest list is a bottleneck for write operations, and 
some kind of design backed by a key-value store could solve that bottleneck. 
Maybe we should think about that and try to solve these cases together? Just 
like we have `FileIO`, which works really well with object storage semantics, 
we can have something like `VersionedListStore` that works well with any 
mutable but versioned list.
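
   To make the `VersionedListStore` idea a bit more concrete, here is a rough 
sketch of what such an abstraction might look like. Everything in it is 
hypothetical: the method names (`append`, `read`, `latest_version`) and the 
in-memory backend are assumptions for illustration only, not an existing 
Iceberg API, and it is written in Python just to keep it short.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class VersionedListStore(ABC):
    """Hypothetical interface: a mutable list where each change yields a new version."""

    @abstractmethod
    def append(self, list_id: str, entries: List[str]) -> int:
        """Append entries and return the new version number."""

    @abstractmethod
    def read(self, list_id: str, version: int) -> List[str]:
        """Return the list contents as of the given version."""

    @abstractmethod
    def latest_version(self, list_id: str) -> int:
        """Return the most recent version number (0 means empty)."""


class InMemoryVersionedListStore(VersionedListStore):
    """Toy backend (an assumption, for illustration): one shared log per list,
    plus the log length recorded at each version."""

    def __init__(self) -> None:
        self._log: Dict[str, List[str]] = {}
        self._lengths: Dict[str, List[int]] = {}  # lengths[v] = log length at version v

    def append(self, list_id: str, entries: List[str]) -> int:
        log = self._log.setdefault(list_id, [])
        lengths = self._lengths.setdefault(list_id, [0])  # version 0 is the empty list
        log.extend(entries)          # only the new entries are written
        lengths.append(len(log))     # a new version is just a new length marker
        return len(lengths) - 1

    def read(self, list_id: str, version: int) -> List[str]:
        lengths = self._lengths.get(list_id, [0])
        if version < 0 or version >= len(lengths):
            raise KeyError(f"unknown version {version} for list {list_id!r}")
        return list(self._log.get(list_id, [])[: lengths[version]])

    def latest_version(self, list_id: str) -> int:
        return len(self._lengths.get(list_id, [0])) - 1


store = InMemoryVersionedListStore()
v1 = store.append("manifests", ["m1.avro"])
v2 = store.append("manifests", ["m2.avro", "m3.avro"])
old_view = store.read("manifests", v1)   # older version stays readable
new_view = store.read("manifests", v2)
```

   The point of the sketch is only that an append adds the new entries and a 
version marker instead of rewriting the whole list, while older versions stay 
readable. A FileIO-backed implementation could sit behind the same interface.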


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

