[ https://issues.apache.org/jira/browse/SOLR-13942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051369#comment-17051369 ]
Ishan Chattopadhyaya edited comment on SOLR-13942 at 3/4/20, 4:13 PM:
----------------------------------------------------------------------
Jason, I think you lack some perspective from the point of view of experts who are brought in to resolve problems, and also of operations engineers who run and maintain Solr. Here's my perspective.

1. In the past 4+ years of consulting, I've encountered several clients who have brought me on to help resolve a production issue. Often this help happens over conference calls, so I don't have SSH access to their instances (sometimes I did). I am at their mercy to browse or capture screenshots of the Solr admin UI in order to get a fair idea of the ZK data. Here's a recent example, as [~munendrasn] can testify: recently I had to help out with a situation where the overseer wasn't getting elected and the OVERSEERSTATUS API wasn't working. I needed a way for the client to dump the entire ZK data quickly and pass it to me for further analysis. (In this case, I made a recommendation without having access to the ZK data, and still saved the day.) Asking such clients, in times of such crisis, to install clients or fight with our ZK client etc. is unreasonable, because there may be policy restrictions on their side that make this process lengthy.

2. Most often, Solr is just one part of a very large distributed system comprising several components and microservices. Expecting dev-ops to install and maintain additional tools just for Solr is unreasonable. Also, since we treat ZK as an implementation detail of Solr, it is unreasonable to expect dev-ops to now start setting up proxies etc. for ZK as a way of monitoring Solr. Solr should be able to let expert users peek into the data that Solr puts into ZK.

As you've rightly identified, the cost of maintaining additional tools is a factor. Another factor is the complexity of monitoring such additional tools. Imagine the situation when there's a crisis and an outage, and the nginx proxy isn't working (and no alerting was set up to warn that it had gone down). Having Solr let you peek into Solr's own internal state data reduces the moving parts needed to debug Solr problems.

Hope this helps.
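For illustration, here is a minimal, hypothetical sketch (Python, standard library only) of what that "dump ZK data and pass it along" workflow could look like once an endpoint like the proposed {{/api/cluster/zk/*}} exists. The base URL and the two paths are taken from the issue description quoted below; the output file names are made up for the example, and this is a sketch rather than anything shipped with Solr.

{code}
# Hypothetical client-side sketch: capture a few ZK-backed paths through the
# proposed Solr /api/cluster/zk endpoint and save them locally, so they can be
# sent to an expert for offline analysis. Base URL and paths are assumptions.
import urllib.request

BASE = "http://localhost:8983/api/cluster/zk"  # assumed Solr node address

# Paths mentioned in the issue description; extend as needed.
PATHS = [
    "/collections/gettingstarted/state.json",
    "/live_nodes",
]

for path in PATHS:
    url = BASE + path
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    # Write each response to a local file named after the ZK path.
    out = "zkdump" + path.replace("/", "_")
    with open(out, "wb") as f:
        f.write(data)
    print(f"saved {url} -> {out}")
{code}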
> /api/cluster/zk/* to fetch raw ZK data
> --------------------------------------
>
>                 Key: SOLR-13942
>                 URL: https://issues.apache.org/jira/browse/SOLR-13942
>             Project: Solr
>          Issue Type: New Feature
>          Components: v2 API
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Blocker
>             Fix For: 8.5
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Example: download the {{state.json}} of the {{gettingstarted}} collection
> {code}
> GET http://localhost:8983/api/cluster/zk/collections/gettingstarted/state.json
> {code}
> Get a list of all children under {{/live_nodes}}
> {code}
> GET http://localhost:8983/api/cluster/zk/live_nodes
> {code}
> If the requested path is a node with children, show the list of child nodes and their metadata.
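As a rough sketch of how the "node with children" behaviour could be consumed, the snippet below walks a subtree recursively through the proposed endpoint. The issue does not specify the response format for a children listing, so {{extract_child_names}} is a hypothetical placeholder that would need to be adapted to the real format; the base URL and starting path are assumptions taken from the examples above.

{code}
# Hypothetical recursive walk of a ZK subtree via the proposed endpoint.
# NOTE: the JSON shape returned for a node with children is an assumption;
# adapt extract_child_names() to whatever the real response format turns out to be.
import json
import urllib.request

BASE = "http://localhost:8983/api/cluster/zk"  # assumed Solr node address


def extract_child_names(payload: bytes):
    """Hypothetical helper: return child node names from a children listing,
    or an empty list if the payload does not look like such a listing."""
    try:
        doc = json.loads(payload)
    except ValueError:
        return []
    # Assumed format: a JSON object whose keys are the child node names.
    return list(doc) if isinstance(doc, dict) else []


def dump(path: str, depth: int = 0):
    with urllib.request.urlopen(BASE + path) as resp:
        payload = resp.read()
    print("  " * depth + path)
    for child in extract_child_names(payload):
        dump(path.rstrip("/") + "/" + child, depth + 1)


if __name__ == "__main__":
    dump("/live_nodes")
{code}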