bq: One thing that still feels a bit odd though is that the health check query was referencing a collection that no longer existed in the cluster. So it seems like it was downloading the state for ALL non-hosted collections, not a requested one.
This is a bit odd; I don't know whether there's fallback logic like
"if there's no such collection, look at them all". If you're _sure_
this is what's happening, and especially if you can provide a test case,
this is worth a JIRA to at least ensure that it's intended behavior.

Best,
Erick

On Mon, May 16, 2016 at 9:28 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
> Ah, I tracked this down to an haproxy that was set up on a load server during
> development and still running. It was configured with a health check every 10
> seconds, so that’s pretty clearly the cause. Thanks for the pointer.
>
> One thing that still feels a bit odd though is that the health check query
> was referencing a collection that no longer existed in the cluster. So it
> seems like it was downloading the state for ALL non-hosted collections, not a
> requested one.
>
> This touches a bit on a sore point with me. I dislike that those
> collection-not-here proxy requests aren’t logged on the server doing the
> proxy, because you end up with traffic visible at the http interface but not
> the solr level. Honestly, I dislike that transparent proxy approach in
> general, because it means I lose the ability to dedicate entire nodes to the
> fan-out and shard-aggregation process like I could pre-solrcloud.
>
>
> On 5/16/16, 8:56 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
>>With the per-collection state.json, if "something" goes to a node that doesn't
>>host a replica for a node, it downloads the state for the "other"
>>collection then throws it away.
>>
>>In this case, "something" is apparently asking the nodes hosting collectionA
>>to do "something" with collections B and/or C. Some support for this would
>>be if further investigation shows that the nodes that _do_ re-download the
>>info did _not_ have replicas B and C.
>>
>>What the "something" is that sends requests I'm not quite sure, but
>>that's a place to start.
>>
>>Best,
>>Erick
>>
>>On Mon, May 16, 2016 at 11:08 AM, Jeff Wartes <jwar...@whitepages.com> wrote:
>>>
>>> I have a solr 5.4 cluster with three collections, A, B, C.
>>> Nodes either host replicas for collection A, or B and C. Collections B and
>>> C are not currently used - no inserts or queries. Collection A is getting
>>> significant query traffic, but no insert traffic, and queries are only
>>> directed to nodes hosting replicas for collection A. ZK timeout is set to
>>> 15 seconds.
>>>
>>> I’ve noticed via tcpdump that, every 10 seconds exactly, several of the
>>> nodes (but not all) hosting collection A re-download the state.json for
>>> collections B and C. This behavior survives JVM restart.
>>>
>>> This isn’t a huge deal, the extra traffic isn’t very meaningful, but it’s
>>> odd and smells like a bug somewhere. Anyone seen something like this?
>>>