I've wondered about this as well. Recall that the proper architecture for Solr, as well as ZooKeeper, is as a back-end service, part of a tiered architecture, with web application servers in front. Solr and other search engines should fit in at the same layer as RDBMS and NoSQL, with the web applications in front of them. In some larger systems, there is even an Enterprise SOA layer in between, but I've never worked on a project where I felt that was truly justified. It is probably a matter of scale, however.
The common-case solution relies on this architecture: Solr and ZooKeeper can be protected by IP address firewalls, both off system and on system. The network firewalls (AWS security policy) allow only certain IP addresses/networks to connect to Solr and ZooKeeper, and the local system firewalls act as a back-up to the network firewalls. The SHA1 digest within ZooKeeper and the Basic Authentication within SolrCloud then act as a way to fine-tune access control; they are there not so much to protect Solr and ZooKeeper as to allow a division of privileges.

Some sites will find this insufficient:

- Solr supports SSL - https://cwiki.apache.org/confluence/display/solr/Enabling+SSL
- ZooKeeper supports SSL - https://cwiki.apache.org/confluence/display/ZOOKEEPER/ZooKeeper+SSL+User+Guide

Both also at this point support custom authentication providers.

My Solr is less protected than it should be, but I have mod_auth_cas protecting the Solr admin interface, and certain request handlers can be accessed without this security through hand-built Apache httpd conf.d files for each core. There is a load balancer (like Amazon Elastic Load Balancer (ELB)) in front of all Solr nodes, and since fault tolerance is needed only for search, not for indexing, this is adequate. In other words, my Solr clients would not operate in SolrCloud mode, even if I made the Solr instance itself SolrCloud for ease of management.

I'm having a little bit of a problem justifying this setup - the Role Based Authorization Plugin for Solr Basic Auth only scales to Enterprise use if you have a web front-end to manage the users, passwords, groups, and roles.

Does this help?

P.S. - Generally, one cross-posts to another list only when one does not receive a good reply on the first list. I can see how both u...@zookeeper.apache.org and solr-user@lucene.apache.org may be justified, but I don't see how you can justify more lists than this.
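P.P.S. - To make the point about the SHA1-based digest in ZooKeeper ACLs concrete, here is a minimal Python sketch of how I understand the "digest" scheme to derive the ACL id: the user name, a colon, and the Base64-encoded SHA1 of "user:password", with no salt. The function names zk_digest_id and guess_matches and the sample credentials are hypothetical, made up for illustration only; this is not ZooKeeper's own code.

```python
import base64
import hashlib


def zk_digest_id(user: str, password: str) -> str:
    """Build a digest-scheme ACL id of the form user:base64(sha1("user:password")).

    Assumption: this mirrors ZooKeeper's digest scheme as I understand it;
    the function name is hypothetical.
    """
    raw = f"{user}:{password}".encode("utf-8")
    encoded = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{user}:{encoded}"


def guess_matches(acl_id: str, candidate_password: str) -> bool:
    """Check a candidate password against an ACL id, entirely offline."""
    user = acl_id.split(":", 1)[0]
    return zk_digest_id(user, candidate_password) == acl_id


if __name__ == "__main__":
    # Sample credentials are invented for this example.
    acl_id = zk_digest_id("alice", "secret")
    print(acl_id)
    print(guess_matches(acl_id, "secret"))
    print(guess_matches(acl_id, "wrong"))
```

Because the digest is unsalted and the ACL itself is readable, anyone who can connect can test guesses offline, which is why I treat the firewalls as the real protection and the digest only as a way to divide privileges.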
-----Original Message-----
From: Zara Parst [mailto:edotserv...@gmail.com]
Sent: Wednesday, February 24, 2016 3:27 AM
To: zookeeper-u...@hadoop.apache.org; f...@apache.org; AALSIHE <aali...@gmail.com>; u...@zookeeper.apache.org; solr-user@lucene.apache.org; d...@nutch.apache.org; u...@nutch.apache.org; comm...@lucene.apache.org; u...@lucene.apache.org
Subject: I have one small question that always intrigue me

Hi everyone,

I am really need your help, please read below

If we have to run solr in cloud mode, we are going to use zookeeper, now any zookeeper client can connect to zookeeper server, Zookeeper has facility to protect znode however any one can see znode acl however password could be encrypted. Decrypting password or guessing password is not a big deal. As we know password is SHA encrypted also there is no limitation of number of try to authorize with ACL. So my point is how to safegard zookeeper.

I can guess few things

a. Don't reveal ip of your zookeeper ( security with obscurity )
b. ip table which is also not a very good idea
c. what else ??

My guess was if some how we can protect zookeeper server itself by asking client to authorize them self before it can make connection to ensemble even at root ( /) znode.

Please please at least comment on this , I really need your help.