Re: Getting partition min/max timestamp

2018-01-14 Thread Benedict Elliott Smith
It's a long time since I looked at the code, but I'm pretty sure that comment is explaining why we translate *no* timestamp to *epoch*, to save space when serializing the encoding stats. Not stipulating that the data may be inaccurate. However, being such a long time since I looked, I forgot we s

Re: Getting partition min/max timestamp

2018-01-14 Thread Jeremiah Jordan
Finding the max timestamp of a partition is an aggregation. Doing that calculation purely on the replica (whether pre-calculated or not) is problematic for any CL > 1 in the face of deletions or updates that are missing. As the contents of the partition on a given replica are different than what
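To illustrate the point about replica-local aggregation, here is a minimal sketch, not Cassandra code. The cell model, the `reconcile` and `max_live_timestamp` helpers, and the sample data are all hypothetical; the only assumption carried over from the thread is that replicas can disagree when one of them has missed a deletion.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model: a cell is (write_timestamp, value);
# a value of None marks a tombstone (deletion).
Cell = Tuple[int, Optional[str]]
Replica = Dict[str, Cell]  # column name -> newest cell this replica has seen


def reconcile(*replicas: Replica) -> Replica:
    """Last-write-wins merge: per column, keep the cell with the highest timestamp."""
    merged: Replica = {}
    for replica in replicas:
        for column, cell in replica.items():
            if column not in merged or cell[0] > merged[column][0]:
                merged[column] = cell
    return merged


def max_live_timestamp(replica: Replica) -> Optional[int]:
    """Max write timestamp over cells that are not tombstones."""
    live = [ts for ts, value in replica.values() if value is not None]
    return max(live) if live else None


# Replica A saw the delete of column "b" at timestamp 30; replica B missed it.
replica_a: Replica = {"a": (10, "x"), "b": (30, None)}
replica_b: Replica = {"a": (10, "x"), "b": (20, "y")}

# Aggregating on each replica separately gives two different answers.
per_replica = [max_live_timestamp(replica_a), max_live_timestamp(replica_b)]

# Merging the replica contents first, then aggregating, gives the reconciled answer.
merged_answer = max_live_timestamp(reconcile(replica_a, replica_b))
```

In this toy data the per-replica answers are 10 and 20, while the reconciled answer is 10: taking the larger of the two replica-local maxima would report a timestamp belonging to a cell that has since been deleted.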

Re: Getting partition min/max timestamp

2018-01-14 Thread arhel...@gmail.com
First of all, thx for all the ideas. Benedict Elliott Smith, in code comments I found a notice that data in EncodingStats can be wrong, not sure that it's a good idea to use it for accurate results. As I understand incorrect data is not a problem for the current use case of it, but not for my one

Re: Getting partition min/max timestamp

2018-01-14 Thread Benedict Elliott Smith
(Obviously, not to detract from the points that Jon and Jeremiah make, i.e. that if TTLs or tombstones are involved the metadata we have, or can add, is going to be worthless in most cases anyway) On 14 January 2018 at 16:11, Benedict Elliott Smith wrote: > We already store the minimum timestamp

Re: Getting partition min/max timestamp

2018-01-14 Thread Benedict Elliott Smith
We already store the minimum timestamp in the EncodingStats of each partition, to support more efficient encoding of atom timestamps. This just isn't exposed beyond UnfilteredRowIterator, though it probably could be. Storing the max alongside would still require justification, though its cost wou
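As a rough illustration of why a stored minimum timestamp makes encoding cheaper, here is a generic delta-plus-varint sketch. It is not Cassandra's actual serialization format: the LEB128-style size calculation, the helper names, and the sample microsecond timestamps are all assumptions made for the example.

```python
from typing import List, Tuple


def varint_size(n: int) -> int:
    """Bytes needed for an unsigned varint with 7 payload bits per byte (assumed format)."""
    if n < 0:
        raise ValueError("only non-negative values are supported")
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size


def encode_deltas(timestamps: List[int]) -> Tuple[int, List[int]]:
    """Store the minimum once, then each timestamp as an offset from it."""
    base = min(timestamps)
    return base, [ts - base for ts in timestamps]


def decode_deltas(base: int, deltas: List[int]) -> List[int]:
    """Recover the original timestamps from the minimum and the offsets."""
    return [base + d for d in deltas]


# Hypothetical microsecond write timestamps within one partition.
timestamps = [1515945600000000, 1515945600000250, 1515945600004000]

base, deltas = encode_deltas(timestamps)
raw_bytes = sum(varint_size(ts) for ts in timestamps)
delta_bytes = varint_size(base) + sum(varint_size(d) for d in deltas)
```

With these sample values the three full timestamps need 24 bytes as varints, against 13 bytes for the minimum plus three small offsets. Recovering a maximum from such a layout would still mean scanning the offsets, which is consistent with the remark that storing the max alongside is a separate decision.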

Re: Getting partition min/max timestamp

2018-01-14 Thread Jeremiah Jordan
Don’t forget about deleted and missing data. The bane of all on-replica aggregation optimizations. > On Jan 14, 2018, at 12:07 AM, Jeff Jirsa wrote: > > > You’re right it’s not stored in metadata now. Adding this to metadata isn’t > hard, it’s just hard to do it right where it’s useful to p