tibrewalpratik17 opened a new issue, #12458:
URL: https://github.com/apache/pinot/issues/12458

   Currently, minion jobs always use the zkMetadata downloadURI to download 
segments from deepstore. 
   
   I want to get the community's opinion on adding a new optional task-level 
config `allowPeerDownload` that lets a minion task, during execution, try 
downloading the segment directly from a server peer once the deepstore retries 
fail. Currently, the job fails and does not move forward. This also causes 
head-of-line blocking for subsequent task runs if `tableMaxNumTasks` is 
specified. 
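
   To make the proposal concrete, here is a minimal sketch of the fallback 
order described above: retry deepstore first and go to server peers only when 
the flag is set. Everything in it is hypothetical and not an existing Pinot 
API: the `Fetcher` interface, the method name, and the way peer URIs are 
passed in are assumptions for illustration only.

```java
import java.io.IOException;
import java.util.List;

/** Illustrative sketch only; none of these names are existing Pinot APIs. */
final class SegmentFetchSketch {

  /** Hypothetical stand-in for whatever actually downloads a segment from a URI. */
  interface Fetcher {
    byte[] fetch(String uri) throws IOException;
  }

  static byte[] fetchWithPeerFallback(Fetcher fetcher, String deepStoreUri, List<String> peerUris,
      int maxDeepStoreAttempts, boolean allowPeerDownload) throws IOException {
    IOException lastFailure = null;

    // 1. Try deepstore up to maxDeepStoreAttempts times.
    for (int attempt = 1; attempt <= maxDeepStoreAttempts; attempt++) {
      try {
        return fetcher.fetch(deepStoreUri);
      } catch (IOException e) {
        lastFailure = e;
      }
    }

    // 2. Without the opt-in flag, fail the task, which matches the current behaviour.
    if (!allowPeerDownload) {
      throw new IOException("Deepstore download failed for " + deepStoreUri, lastFailure);
    }

    // 3. With the flag set, try each server peer in turn.
    for (String peerUri : peerUris) {
      try {
        return fetcher.fetch(peerUri);
      } catch (IOException e) {
        lastFailure = e;
      }
    }
    throw new IOException("Deepstore and peer downloads all failed for " + deepStoreUri, lastFailure);
  }
}
```

   Keeping the flag off by default would leave existing tasks unchanged; only 
tables that opt in would ever read from a peer.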
   
   PS: this issue specifically covers the case where zkMetadata has a 
deepstore URI available, i.e. it is not "" (empty). 
   There can be multiple reasons for deepstore URI download failures:
   - Issues with the deepstore connection / timeouts 
   - The deepstore copy of the segment getting TTL'ed from deepstore by other 
non-Pinot frameworks (we are seeing this for some of our clusters).
   
   For example, we have an upsert-compaction task enabled for a table with 
the following configs:
   ```
   "task": {
         "taskTypeConfigsMap": {
           "UpsertCompactionTask": {
             "invalidRecordsThresholdPercent": "30",
             "bufferTimePeriod": "0d",
             "schedule": "0 */5 * * * ?",
             "tableMaxNumTasks": "5"
           }
         }
       },
   ```
   The table has data for more than 60 days. A TTL of 7 days was enforced on 
the deepstore path. 
   
   The following graph shows a drop (circled in red) when UpsertCompaction 
first kicked off (at that time I had removed the `"tableMaxNumTasks": "5"` 
config). But once I added that config back, no task gets executed (not even 
for newer segments), because the task is blocked by a `FileNotFoundException` 
while downloading older segments from deepstore.
   
   <img width="1471" alt="Screenshot 2024-02-21 at 3 34 40 PM" 
src="https://github.com/apache/pinot/assets/23629228/bc56768f-882b-43eb-9820-707018a62f5a">
   
   The table has huge potential for cost-savings in terms of compaction and it 
seems we can use `peerDownload` to unblock the task.
   
   Another parallel discussion:
   There is a framework that periodically checks whether the zkMetadata URI 
is empty and, if so, uploads the jar to deepstore, but there is no framework 
that checks whether the path the zkMetadata URI points to is actually present.
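
   As a rough illustration of the missing check, the sketch below lists 
segments whose zkMetadata URI is set but no longer resolves in deepstore. 
The `DeepStore` interface and the segment-name-to-URI map are hypothetical 
stand-ins, not existing Pinot classes; a real check would read the URIs from 
zkMetadata and use the cluster's own deepstore filesystem client.

```java
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these names are existing Pinot APIs. */
final class DeepStoreAuditSketch {

  /** Hypothetical stand-in for a deepstore filesystem client. */
  interface DeepStore {
    boolean exists(URI uri) throws IOException;
  }

  /** Returns the names of segments whose non-empty download URI is missing from deepstore. */
  static List<String> findDanglingSegments(Map<String, String> downloadUriBySegment, DeepStore deepStore)
      throws IOException {
    List<String> dangling = new ArrayList<>();
    for (Map.Entry<String, String> entry : downloadUriBySegment.entrySet()) {
      String uri = entry.getValue();
      // Empty URIs are the case the existing periodic framework already handles.
      if (uri == null || uri.isEmpty()) {
        continue;
      }
      if (!deepStore.exists(URI.create(uri))) {
        dangling.add(entry.getKey());
      }
    }
    return dangling;
  }
}
```

   What to do with the resulting list (re-upload from a server, alert, or skip 
the segment in task generation) is a separate decision and is left open here.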


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

