mcvsubbu commented on PR #9058:
URL: https://github.com/apache/pinot/pull/9058#issuecomment-1197352989

   > > @mcvsubbu
   > > 
   > > 1. @npawar @Jackie-Jiang ? I might have just very rough, possibly 
inaccurate numbers.
   > > 2. I feel that a control plane level API within pinot to give an 
overall view into the current and past state of minion tasks is important to 
us, the task generator being a key part of the entire minion task flow. While 
metrics can help to some extent, getting details like failure stack traces from 
them might be difficult. This API avoids having to tally metrics and debug logs 
from a separate log processing system.
   > > 3. I suppose it could. But integrating the log processing framework into 
pinot APIs themselves might be a bit of a challenge. Having a system table for 
these kinds of use cases might be the right way forward, such that pinot itself 
can store and serve debug data and status metrics for each of the components / 
flows. Essentially, move from in-memory storage of logs and metrics to the 
system table.
   > > 
   > > @npawar @Jackie-Jiang to add more
   > 
   > for 1: typically for users of RealtimeToOfflineTask, tasks get generated 
hourly. For SegmentGenerationAndPushTask, it can be far more frequent, 
depending on the number of times files are generated in the source dir. 
MergeRollupTasks are less frequent, but still several a day. There might be 
others, but this is what we see most commonly set up by users in oss. In a 
typical setup, all of these would be configured. It becomes quite confusing for 
users to have to find the exact exception in the logs, especially because some 
logs are in controller (scheduler related) and some in minion (task execution 
related). This API will help us make the feedback loop quicker, especially when 
we add this to the new Minion tab on the Pinot Admin UI.
   > 
   > Regarding info already in Helix, these scheduler related exceptions are 
not present in the Helix generated metadata.
   
   I still think log processing should be the answer to this (and other similar 
PRs that may come up in the future). We should not be adding a new API for 
every error condition we may encounter in the system (and log something).
   
   @siddharthteotia , @npawar , @Jackie-Jiang , @snleee, @kishoreg, 
@mayankshriv  what do you think? If the PMCs don't have any objection to this 
then I can live with this, but I am willing to bet that more such PRs will come 
up because logs are difficult to read. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

