RussellSpitzer opened a new issue, #8856:
URL: https://github.com/apache/iceberg/issues/8856

   ### Feature Request / Improvement
   
   Currently all versions of metadata tables have the exact same schema as 
their not "all" versions. This is actually not very useful if you are 
attempting to locate the state of a particular entry at a specific time because 
the `snapshot_id` always just shows the file's original snapshot.
   
   For example the entries table looks like
   
   ```
   scala> spark.sql("SELECT * FROM db.timezoned.entries").show
   warning: 1 deprecation (since 2.13.3); for details, enable `:setting 
-deprecation` or `:replay -deprecation`
   
+------+-------------------+---------------+--------------------+--------------------+--------------------+
   |status|        snapshot_id|sequence_number|file_sequence_number|           
data_file|    readable_metrics|
   
+------+-------------------+---------------+--------------------+--------------------+--------------------+
   |     1|6561920950175488866|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|5535987506380389562|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|2517256618694516958|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|2750236691316126600|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|7179885233531513409|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   ```
   
   And `all_entries` looks like
   
   ```
   scala> spark.sql("SELECT * FROM db.timezoned.all_entries").show
   warning: 1 deprecation (since 2.13.3); for details, enable `:setting 
-deprecation` or `:replay -deprecation`
   
+------+-------------------+---------------+--------------------+--------------------+--------------------+
   |status|        snapshot_id|sequence_number|file_sequence_number|           
data_file|    readable_metrics|
   
+------+-------------------+---------------+--------------------+--------------------+--------------------+
   |     1|2517256618694516958|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|6561920950175488866|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|2750236691316126600|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|7179885233531513409|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     1|5535987506380389562|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     0|6561920950175488866|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     0|5535987506380389562|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     0|2517256618694516958|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     0|2750236691316126600|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   |     0|7179885233531513409|              0|                   0|{0, 
/Users/russel...|{{51, 1, 0, null,...|
   
+------+-------------------+---------------+--------------------+--------------------+--------------------+
   ````
   
   By looking at all_entries it is impossible for me to determine whether or 
not all the manifests were rewritten at once, or if they were rewritten in 
groups. Ideally we would see something like 
   
   ```
   
+-------------------+-------------+------+-------------------+---------------+--------------------+--------------------+--------------------+
   |              as_of|         time|status|        
snapshot_id|sequence_number|file_sequence_number|           data_file|    
readable_metrics|
   
+-------------------+-------------+------+-------------------+---------------+--------------------+--------------------+--------------------+
   |7179885233531513409|1697493267302|     1|7179885233531513409|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2750236691316126600|1697493268363|     1|2750236691316126600|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2750236691316126600|1697493268363|     1|7179885233531513409|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2517256618694516958|1697493269568|     1|2517256618694516958|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2517256618694516958|1697493269568|     1|2750236691316126600|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2517256618694516958|1697493269568|     1|7179885233531513409|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |5535987506380389562|1697493270419|     1|5535987506380389562|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |5535987506380389562|1697493270419|     1|2517256618694516958|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |5535987506380389562|1697493270419|     1|2750236691316126600|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |5535987506380389562|1697493270419|     1|7179885233531513409|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |6561920950175488866|1697493271193|     1|6561920950175488866|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |6561920950175488866|1697493271193|     1|5535987506380389562|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |6561920950175488866|1697493271193|     1|2517256618694516958|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |6561920950175488866|1697493271193|     1|2750236691316126600|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |6561920950175488866|1697493271193|     1|7179885233531513409|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2315221993819944328|1697497673027|     1|6561920950175488866|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2315221993819944328|1697497673027|     1|5535987506380389562|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2315221993819944328|1697497673027|     1|2517256618694516958|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2315221993819944328|1697497673027|     1|2750236691316126600|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   |2315221993819944328|1697497673027|     1|7179885233531513409|              
0|                   0|{0, /Users/russel...|{{51, 1, 0, null,...|
   
+-------------------+-------------+------+-------------------+---------------+--------------------+--------------------+--------------------+
   ```
   
   Apologies that my code is not getting status status correctly but basically 
the idea being that we add on two columns (names are still up for debate)
   "as_of_snapshot", "as_of_time"
   
   These would allow us to be able to analyze the actual history in all_entires 
and all_manifests
   
   ### Query engine
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to