RussellSpitzer opened a new issue, #8856: URL: https://github.com/apache/iceberg/issues/8856
### Feature Request / Improvement Currently all versions of metadata tables have the exact same schema as their not "all" versions. This is actually not very useful if you are attempting to locate the state of a particular entry at a specific time because the `snapshot_id` always just shows the file's original snapshot. For example the entries table looks like ``` scala> spark.sql("SELECT * FROM db.timezoned.entries").show warning: 1 deprecation (since 2.13.3); for details, enable `:setting -deprecation` or `:replay -deprecation` +------+-------------------+---------------+--------------------+--------------------+--------------------+ |status| snapshot_id|sequence_number|file_sequence_number| data_file| readable_metrics| +------+-------------------+---------------+--------------------+--------------------+--------------------+ | 1|6561920950175488866| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|5535987506380389562| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|2517256618694516958| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| ``` And `all_entries` looks like ``` scala> spark.sql("SELECT * FROM db.timezoned.all_entries").show warning: 1 deprecation (since 2.13.3); for details, enable `:setting -deprecation` or `:replay -deprecation` +------+-------------------+---------------+--------------------+--------------------+--------------------+ |status| snapshot_id|sequence_number|file_sequence_number| data_file| readable_metrics| +------+-------------------+---------------+--------------------+--------------------+--------------------+ | 1|2517256618694516958| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|6561920950175488866| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 1|5535987506380389562| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 0|6561920950175488866| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 0|5535987506380389562| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 0|2517256618694516958| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 0|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| | 0|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| +------+-------------------+---------------+--------------------+--------------------+--------------------+ ```` By looking at all_entries it is impossible for me to determine whether or not all the manifests were rewritten at once, or if they were rewritten in groups. Ideally we would see something like ``` +-------------------+-------------+------+-------------------+---------------+--------------------+--------------------+--------------------+ | as_of| time|status| snapshot_id|sequence_number|file_sequence_number| data_file| readable_metrics| +-------------------+-------------+------+-------------------+---------------+--------------------+--------------------+--------------------+ |7179885233531513409|1697493267302| 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2750236691316126600|1697493268363| 1|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2750236691316126600|1697493268363| 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2517256618694516958|1697493269568| 1|2517256618694516958| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2517256618694516958|1697493269568| 1|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2517256618694516958|1697493269568| 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |5535987506380389562|1697493270419| 1|5535987506380389562| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |5535987506380389562|1697493270419| 1|2517256618694516958| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |5535987506380389562|1697493270419| 1|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |5535987506380389562|1697493270419| 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |6561920950175488866|1697493271193| 1|6561920950175488866| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |6561920950175488866|1697493271193| 1|5535987506380389562| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |6561920950175488866|1697493271193| 1|2517256618694516958| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |6561920950175488866|1697493271193| 1|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |6561920950175488866|1697493271193| 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2315221993819944328|1697497673027| 1|6561920950175488866| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2315221993819944328|1697497673027| 1|5535987506380389562| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2315221993819944328|1697497673027| 1|2517256618694516958| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2315221993819944328|1697497673027| 1|2750236691316126600| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| |2315221993819944328|1697497673027| 1|7179885233531513409| 0| 0|{0, /Users/russel...|{{51, 1, 0, null,...| +-------------------+-------------+------+-------------------+---------------+--------------------+--------------------+--------------------+ ``` Apologies that my code is not getting status status correctly but basically the idea being that we add on two columns (names are still up for debate) "as_of_snapshot", "as_of_time" These would allow us to be able to analyze the actual history in all_entires and all_manifests ### Query engine None -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org