: What I mean is that when you have publicly exposed search that bots crawl,
: they issue all kinds of crazy "queries" that result in errors, that add
: noise to Solr caches, increase Solr cache evictions, etc. etc.
I dealt with this type of thing a few years back by having my front-end app execute queries against different Solr tiers based on the User-Agent: typical users to the main tier, known partner bots to their own alt tier, known public crawlers to a third alt tier.

In some cases these alternate tiers had the same configs as my normal search tier, but by being distinct, the unusual and erratic query volume and number of unique queries didn't screw up the cache hit rates, or the user stats generated by log parsing, on my regular search tier. In other cases the tiers had slightly different configs, i.e.: the bots of my known partners ran twice a day at predictable times, didn't do any faceting, and used a very predictable set of filters -- so I did snappulling only twice a day, and force-warmed those filters.

I advocate this kind of distinct search tier per "user base" even for human users -- assuming your volume is high enough and you have the budget for the hardware. Users who do similar queries on a certain subset of documents (with tons of faceting on a certain subset of fields) should all use the same set of query servers -- but if a different group of users tends to issue different types of queries (and facet on different fields), and you know this in advance, you might as well have that second group of people query different boxes.

It's essentially "session affinity", except it's not about sessions -- it's about expected behavior based on what you know about the user.

-Hoss
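The User-Agent routing Hoss describes could be sketched roughly like this in the front-end app. This is a hypothetical illustration, not his actual code -- the tier URLs, bot substrings, and function names are all invented for the example:

```python
# Hypothetical sketch: route each search request to a Solr tier
# based on its User-Agent header. All names/URLs are illustrative.

PARTNER_BOTS = ("partnerbot", "acme-crawler")        # known partner bots
PUBLIC_CRAWLERS = ("googlebot", "bingbot", "slurp")  # known public crawlers

TIERS = {
    "main": "http://solr-main:8983/solr/select",        # typical users
    "partner": "http://solr-partner:8983/solr/select",  # partner bot tier
    "crawler": "http://solr-crawler:8983/solr/select",  # public crawler tier
}

def pick_tier(user_agent: str) -> str:
    """Return the Solr tier URL to query for a given User-Agent."""
    ua = (user_agent or "").lower()
    if any(bot in ua for bot in PARTNER_BOTS):
        return TIERS["partner"]
    if any(bot in ua for bot in PUBLIC_CRAWLERS):
        return TIERS["crawler"]
    return TIERS["main"]
```

The point isn't the string matching -- it's that each tier can then carry its own configs (warming, replication schedule, cache sizes) tuned to that user base's query behavior.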