Hmmm, that does cut the problem down to size... I guess I wonder how often the same query actually comes into the system but that's a nit. I can certainly see how such a thing would improve those situations that do have lots of repeats....
Erick On Mon, Jun 10, 2013 at 6:15 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote: > Actually, it doesn't really have to be messy. > Couldn't one have a custom handler that know how to compute a query > hash and map it to a specific node? > When the same query comes in again, the same computation will be doen > and the same node will be selected to execute the query. > No need for any nodes to have any "physical" query->node mapping and > no need for nodes to broadcast this sort of info around. > > Sounds doable to me. > > Otis > -- > Solr & ElasticSearch Support > http://sematext.com/ > > > > > > On Mon, Jun 10, 2013 at 5:44 PM, Otis Gospodnetic > <otis.gospodne...@gmail.com> wrote: >> Yeah, that sounds complique and messy. >> Just the other day I was looking at performance metrics for a customer >> using master-slave setup and this sort of query->slave mapping behing >> the load balancer. >> After switching from such a setup to a round-robin setup their >> performance noticeably suffered... >> >> Otis >> -- >> Solr & ElasticSearch Support >> http://sematext.com/ >> >> >> >> >> >> On Mon, Jun 10, 2013 at 8:06 AM, Erick Erickson <erickerick...@gmail.com> >> wrote: >>> Nothing I've seen. It would get really tricky >>> though. Each node in the cluster would have >>> to have a copy of all queries received by >>> _any_ node which would result in all >>> queries being sent to all nodes along with >>> an indication of what node that query was >>> actually supposed to be serviced by. >>> >>> And now suppose there were 100 shards, >>> then the list of the correct node would get >>> quite large. >>> >>> Seems overly complex for the benefit, but >>> what do I know? >>> >>> FWIW >>> Erick >>> >>> >>> >>> On Sat, Jun 8, 2013 at 10:38 PM, Otis Gospodnetic >>> <otis.gospodne...@gmail.com> wrote: >>>> Hi, >>>> >>>> Is there anything in SolrCloud that would support query-node/shard >>>> affinity/stickiness? >>>> >>>> What I mean by that is a mechanism that is smart enough to keep >>>> sending the same query X to the same node(s)+shard(s)... with the goal >>>> being better utilization of Solr and OS caches? >>>> >>>> Example: >>>> * Imagine a Collection with 2 shards and 3 replicas: s1r1, s1r2, s1r3, >>>> s2r1, s2r2, s2r3 >>>> * Query for "Foo Bar" comes in and hits one of the nodes, say s1r1 >>>> * Since shard 2 needs to be queried, too, one of its 3 replicas needs >>>> to be searched. Say s2r1 gets searched >>>> * 5 minutes later the same query for "Foo Bar" comes in, say it hits s1r1 >>>> again >>>> * Again shard 2 needs to be searched. But which of the 3 replicas >>>> should be searched? >>>> * Ideally that same s2r1 would be searched >>>> >>>> Is there anything in SolrCloud that can accomplish this? >>>> Or if there a place in SolrCloud where such "query hash ==> >>>> node/shard" mapping could be implemented? >>>> >>>> Thanks, >>>> Otis >>>> -- >>>> Solr & ElasticSearch Support >>>> http://sematext.com/