OK. What about using the DIH handler? Does it index in a SolrCloud setup? Or how would I convert a query to use SolrJ?

On Mon, Nov 30, 2015 at 5:36 AM, Upayavira <u...@odoko.co.uk> wrote:

> On Sun, Nov 29, 2015, at 07:38 PM, William Bell wrote:
> > OK. Been using Cores for 4 years. Want to migrate to collections / Cloud.
> >
> > Do we have to change our queries?
> >
> > http://loadbalancer:8983/solr/corename/select?q=*:*
> >
> > What does this become once we have the collection sharded? Do we need a
> > Load Balancer or just point to one box and run the new query? Or would it
> > be better to hit the LB in case one machine is no longer good to go?
> >
> > http://loadbalancer:8983/solr/collectionname/select?q=*:*
> >
> > What features would not yet be ready for sharded setups with SolrCloud?
> > In the past, facet counts were an issue, grouping? stats? as well as IDF
> > for sorting by scores. i.e. facet.field=specialties. We want the
> > Cardiologist specialty to have unique numbers across shards. So if shard1
> > has 4 people with Cardiology, and shard2 has 2 people with Cardiology, we
> > would want the number to be 6. We would want facet.sort to work on
> > counts... I guess we could index another collection for facets and just
> > use 1 machine for that? But doesn't that defeat the purpose?
> >
> > What is the best walk thru for SOLR 5.3.1 ?
> >
> > Looking at https://wiki.apache.org/solr/SolrCloud
>
> 1. Your queries should stay (more or less) the same
> 2. If you name a collection the same as what you are using for a core,
> your base URL will remain the same
> 3. If you use SolrJ, then you would change to CloudSolrClient, which
> would feel quite different, but the SolrQuery objects should be
> interchangeable
> 4. If you use SolrJ, then you don't need a load balancer - SolrJ will do
> round robin against the Solr nodes for that collection. It will respond
> to failures far faster than an LB ever could (I've seen downed machines
> pulled in <200ms)
> 5. Regarding sharded setups, there's two scenarios to consider -
> distributed in general, and solrcloud in particular. Every search
> component must be enabled for distributed search (faceting,
> highlighting, grouping, etc, etc). Some of the newer ones may not have
> had distributed support implemented yet. Others, such as Joining, will
> require particular concern, and will work in only a subset of
> conditions.
> 6. For IDF, mostly, IDF balances itself across the shards. If it
> doesn't, then distributed IDF is available, but that has a cost in terms
> of additional network traffic.
> 7. Faceting should work just fine (as you describe) across shards. I
> would check specifically on newer faceting features though before
> assuming anything.
> 8. facet.sort+counts, have you tried it?
> 9. I would consider this to be a more up-to-date place to go:
> https://cwiki.apache.org/confluence/display/solr/SolrCloud
>
> Upayavira
>

--
Bill Bell
billnb...@gmail.com
cell 720-256-8076
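
To make the facet arithmetic from the thread concrete (shard1 has 4 Cardiology, shard2 has 2, expected total 6), here is a minimal Java sketch of summing per-shard facet counts and then ordering the merged values by count. It only illustrates the expected result. It is not Solr's implementation, and `FacetMerge`, `merge`, and `sortByCount` are hypothetical names made up for this example. The Dermatology and Neurology counts are invented sample data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: FacetMerge is a hypothetical class, not part of Solr.
// Solr does the equivalent merge internally for distributed facet requests.
class FacetMerge {

    // Sum the per-shard counts for each facet value.
    static Map<String, Long> merge(List<Map<String, Long>> perShardCounts) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> shard : perShardCounts) {
            for (Map.Entry<String, Long> e : shard.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }

    // Order merged values by count, highest first. Ties are broken by value
    // name here only so the output is deterministic (an assumption of this
    // sketch, not a statement about Solr's tie-breaking).
    static List<Map.Entry<String, Long>> sortByCount(Map<String, Long> merged) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(merged.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Long> shard1 = new HashMap<>();
        shard1.put("Cardiology", 4L);   // from the example in the thread
        shard1.put("Dermatology", 1L);  // invented sample data
        Map<String, Long> shard2 = new HashMap<>();
        shard2.put("Cardiology", 2L);   // from the example in the thread
        shard2.put("Neurology", 3L);    // invented sample data

        List<Map<String, Long>> shards = new ArrayList<>();
        shards.add(shard1);
        shards.add(shard2);

        for (Map.Entry<String, Long> e : sortByCount(merge(shards))) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Run as written, this prints Cardiology=6, then Neurology=3, then Dermatology=1, which is the combined, count-sorted view the original question asks for from a single query against the collection.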