Dear all, I started learning Solr three months ago, so my experience is still limited.
At the moment I crawl Web pages with my own crawler and send the data to a single Solr server. That runs fine. Since the number of potential users is large, I have decided to scale Solr.

With replication configured, a single index can be copied to multiple servers. I think sharding is also required, so I plan to split the index by data category and priority. After that, I will apply the same replication setup to the shards to get high performance. The remaining work does not seem too difficult.

I have also come across some new terms, such as SolrCloud, Katta and ZooKeeper. As far as I understand at the moment, it seems that I can ignore them. Am I right? What benefits would I get from using them?

Thanks so much!
LB
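
P.S. To make the sharding plan concrete, here is a minimal sketch of how I imagine querying an index that has been split by category. It only builds the request URL, using Solr's `shards` request parameter for distributed search. The host names, the category names and the `build_query_url` helper are all made up for illustration, so please read it as a rough idea and not as a tested setup.

```python
from urllib.parse import urlencode

# Hypothetical shard hosts, one per data category (all names are invented).
SHARDS_BY_CATEGORY = {
    "news": "solr-news.example.com:8983/solr",
    "blogs": "solr-blogs.example.com:8983/solr",
}


def build_query_url(entry_host, query, categories):
    """Build a select URL that fans the query out to the chosen shards.

    entry_host is the Solr server that receives the request and merges
    the results from the shards listed in the `shards` parameter.
    """
    shards = ",".join(SHARDS_BY_CATEGORY[c] for c in categories)
    params = urlencode({"q": query, "shards": shards, "wt": "json"})
    return f"http://{entry_host}/select?{params}"


if __name__ == "__main__":
    # Search both categories, sending the request to the "news" server.
    print(build_query_url(SHARDS_BY_CATEGORY["news"], "title:solr", ["news", "blogs"]))
```

With this layout the application itself has to know which shard holds which category, and every shard would be replicated separately.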