satnash opened a new issue, #16913:
URL: https://github.com/apache/pinot/issues/16913

   Question about retention manager's behavior: 
   
   I have a table whose dateTime field is defined as below: 
   
   `"dateTimeFieldSpecs": [
       { "name": "collection_time", "dataType": "LONG", "format": "1:SECONDS:EPOCH", "granularity": "1:MILLISECONDS" }
     ]`
   
   The segment config is below: 
   
   `"segmentsConfig": {
       "timeColumnName": "collection_time",
       "timeType": "SECONDS",
       "segmentPushType": "APPEND",
       "replication": "2",
       "retentionTimeUnit": "DAYS",
       "retentionTimeValue": "7",
       "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
       "peerSegmentDownloadScheme": "http"
     },`
   
   I see that segments are not being removed after 7 days. One issue I have is that the incoming data sometimes comes from sources where NTP is not synced, so they report timestamps in seconds such as 147, 243, etc., all on 1970-01-01. As a result, the segment metadata has snapshots like the one below:
   
   `"stats__6__87__20250925T2024Z" : {
       "segmentName" : "stats__6__87__20250925T2024Z",
       "schemaName" : null,
       "crc" : 2382557963,
       "creationTimeMillis" : 1758837918668,
       "creationTimeReadable" : "2025-09-25T22:05:18:668 UTC",
       "timeColumn" : "collection_time",
       "timeUnit" : "SECONDS",
       "timeGranularitySec" : 1,
       "startTimeMillis" : 79000,
       "startTimeReadable" : "1970-01-01T00:01:19.000Z",
       "endTimeMillis" : 1758983934000,
       "endTimeReadable" : "2025-09-27T14:38:54.000Z",
       "segmentVersion" : "v3",
       "creatorName" : null,
       "totalDocs" : 2652778,
       "custom" : { },
       "startOffset" : "560677999",
       "endOffset" : "563330777",
       "columns" : [ ],
       "indexes" : { },
       "star-tree-index" : null
     },
   `
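To make the time range above concrete, here is a minimal Python sketch that decodes the reported values and compares each segment bound against a 7-day window. The helper names (`epoch_seconds_to_utc`, `epoch_millis_to_utc`, `older_than`) and the reference time `now` are hypothetical and exist only for illustration. The sketch makes no claim about which bound the retention manager actually consults; that is the open question here.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_seconds_to_utc(seconds: int) -> datetime:
    """Interpret a value as seconds since the Unix epoch (UTC)."""
    return EPOCH + timedelta(seconds=seconds)


def epoch_millis_to_utc(millis: int) -> datetime:
    """Interpret a value as milliseconds since the Unix epoch (UTC)."""
    return EPOCH + timedelta(milliseconds=millis)


def older_than(reference: datetime, now: datetime, retention: timedelta) -> bool:
    """Hypothetical check: is `reference` further in the past than `retention`?"""
    return now - reference > retention


if __name__ == "__main__":
    # Values reported by sources without NTP sync land on 1970-01-01.
    for seconds in (147, 243):
        print(seconds, "->", epoch_seconds_to_utc(seconds).isoformat())

    # Bounds copied from the segment metadata above.
    start = epoch_millis_to_utc(79000)
    end = epoch_millis_to_utc(1758983934000)

    # Assumed reference time, chosen a few days after the segment's end time.
    now = datetime(2025, 10, 1, tzinfo=timezone.utc)
    retention = timedelta(days=7)

    print("start:", start.isoformat(), "older than 7 days:", older_than(start, now, retention))
    print("end:  ", end.isoformat(), "older than 7 days:", older_than(end, now, retention))
```

With the assumed `now`, the start time is decades outside the window while the end time is still inside it, so the two bounds give opposite answers for the same segment.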
   
   1. Do I need to avoid using timestamps from the sources? Or does the retention manager only care about the end time, in which case what else should I be looking at?
   2. I read in the docs that granularity is for documentation purposes only. Technically it should also be 1:SECONDS, since I wanted each row to be treated separately.
   
   Any guidance here is much appreciated. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

