On Sun, Nov 29, 2015, at 07:38 PM, William Bell wrote:
> OK. Been using Cores for 4 years. Want to migrate to collections / Cloud.
>
> Do we have to change our queries?
>
> http://loadbalancer:8983/solr/corename/select?q=*:*
>
> What does this become once we have the collection sharded? Do we need a
> Load Balancer or just point to one box and run the new query? Or would it
> be better to hit the LB in case one machine is no longer good to go?
>
> http://loadbalancer:8983/solr/collectionname/select?q=*:*
>
> What features would not yet be ready for sharded setups with SolrCloud? In
> the past, facet counts were an issue, grouping? stats? as well as IDF for
> sorting by scores. i.e. facet.field=specialties. We want the Cardiologist
> specialty to have unique numbers across shards. So if shard1 has 4 people
> with Cardiology, and shard2 has 2 people with Cardiology, we would want the
> number to be 6. We would want facet.sort to work on counts... I guess we
> could index another collection for facets and just use 1 machine for that?
> But doesn't that defeat the purpose?
>
> What is the best walk thru for SOLR 5.3.1 ?
>
> Looking at https://wiki.apache.org/solr/SolrCloud

1. Your queries should stay (more or less) the same.

2. If you give the collection the same name as the core you are using now,
   your base URL will remain the same.

3. If you use SolrJ, then you would change to CloudSolrClient, which would
   feel quite different, but the SolrQuery objects should be
   interchangeable.

4. If you use SolrJ, then you don't need a load balancer - SolrJ will do
   round robin against the Solr nodes for that collection. It will respond
   to failures far faster than an LB ever could (I've seen downed machines
   pulled out in <200ms).

5. Regarding sharded setups, there are two scenarios to consider -
   distributed search in general, and SolrCloud in particular. Every search
   component (faceting, highlighting, grouping, etc.) must be enabled for
   distributed search. Some of the newer ones may not have had distributed
   support implemented yet. Others, such as joining, need particular care
   and will work only in a subset of conditions.

6. IDF mostly balances itself across the shards. If it doesn't, then
   distributed IDF is available, but that has a cost in terms of additional
   network traffic.

7. Faceting should work just fine (as you describe) across shards. I would
   check specifically on newer faceting features though before assuming
   anything.

8. facet.sort on counts: have you tried it?

9. I would consider this to be a more up-to-date place to go:
   https://cwiki.apache.org/confluence/display/solr/SolrCloud

Upayavira
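
To make points 2 and 7 a bit more concrete, here is a rough Python sketch (standard library only). It is not how Solr implements distributed faceting internally, and Solr does the merging for you; it only shows that the select URL has the same shape for a core and a collection, and what the merged, count-sorted facet result in your Cardiology example should look like. The helper names `select_url` and `merge_facet_counts` and the Dermatology numbers are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlencode


def select_url(base, name, params):
    """Build a /select URL; `name` can be a core name or a collection name."""
    return "{}/solr/{}/select?{}".format(base, name, urlencode(params))


def merge_facet_counts(per_shard_counts):
    """Sum per-shard facet counts and return (value, count) pairs, highest count first."""
    total = Counter()
    for counts in per_shard_counts:
        total.update(counts)
    # Descending count, then value, so ties come out in a stable order.
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", "specialties"),
        ("facet.sort", "count"),
    ]
    # Same URL shape whether "collectionname" is a core or a collection.
    print(select_url("http://loadbalancer:8983", "collectionname", params))

    # Hypothetical per-shard counts: 4 + 2 Cardiology, as in the question.
    shard1 = {"Cardiology": 4, "Dermatology": 1}
    shard2 = {"Cardiology": 2, "Dermatology": 3}
    print(merge_facet_counts([shard1, shard2]))
```

If the facet counts you get back from a sharded collection do not match a sum like this, that would be worth reporting with the exact query.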