On 12/29/2014 08:08 PM, ralph tice wrote:
> Like all things, it really depends on your use case. We have >160B documents in our largest SolrCloud, and doing a *:* query to get that count takes ~13-14 seconds. A text:happy query only takes ~3.5-3.6 seconds cold; subsequent queries for the same terms take <500ms.
That seems perfectly reasonable.
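For anyone following along, this SolrJ sketch is roughly how I'd reproduce those two query shapes; the ZooKeeper hosts and collection name are made up:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class CountVsTerm {
        public static void main(String[] args) throws Exception {
            // Made-up ZooKeeper ensemble and collection name.
            CloudSolrServer solr = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
            solr.setDefaultCollection("docs");

            // Match-all count with rows=0: we only pay for the hit count,
            // not for fetching stored fields.
            SolrQuery countAll = new SolrQuery("*:*");
            countAll.setRows(0);
            QueryResponse rsp = solr.query(countAll);
            System.out.println("numFound=" + rsp.getResults().getNumFound());

            // Single-term query; repeating it should hit warmed caches.
            rsp = solr.query(new SolrQuery("text:happy"));
            System.out.println("QTime=" + rsp.getQTime() + "ms");

            solr.shutdown();
        }
    }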
> Facets over high-cardinality fields are going to be painful. We currently limit the range programmatically to around 1/12th or 1/13th of the data set for facet queries, but plan on evaluating Heliosearch (initial results didn't look promising) and Toke's sparse faceting patch (SOLR-5894) to help out there.
We had a look at Heliosearch a while ago and found it unsuitable: it relies on native x86_64 code and HotSpot-specific JVM features, which we can't use. Some of our clients run IBM's JVM, so we're limited to pure Java.
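If I understand the range-limiting approach correctly, it amounts to something like the snippet below (reusing the "solr" client from my earlier sketch; "timestamp" and "user_id" are hypothetical field names and the date range is made up):

    SolrQuery facets = new SolrQuery("*:*");
    facets.setRows(0);
    // Filter the facet down to one month out of a year of data, i.e.
    // roughly 1/12th of the corpus, before faceting.
    facets.addFilterQuery("timestamp:[2014-11-01T00:00:00Z TO 2014-12-01T00:00:00Z]");
    facets.addFacetField("user_id"); // hypothetical high-cardinality field
    facets.setFacetLimit(100);
    facets.setFacetMinCount(1);
    QueryResponse facetRsp = solr.query(facets);
    System.out.println(facetRsp.getFacetField("user_id").getValues());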
> There could be more support and ease-of-use enhancements for moving shards across SolrClouds, moving shards across physical nodes within a SolrCloud, and snapshotting/restoring a SolrCloud, but there has also been a lot of recent work in these areas that is starting to provide the underlying infrastructure for more advanced shard management.
That's reassuring to hear. If we run into these issues, we can probably donate some time to work on them, so I'm not too worried about that.
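As far as I can tell, moving a replica within a cluster is already possible with an ADDREPLICA/DELETEREPLICA pair from the Collections API (Solr 4.8+), roughly like the sketch below. Hosts, collection, and core names are made up, and in practice you'd wait for the new replica to go active before deleting the old one:

    import java.io.InputStream;
    import java.net.URL;

    public class MoveReplica {
        public static void main(String[] args) throws Exception {
            String solr = "http://solr1:8983/solr";

            // 1. Add a replica of shard1 on the destination node...
            call(solr + "/admin/collections?action=ADDREPLICA"
                      + "&collection=docs&shard=shard1&node=solr2:8983_solr");

            // 2. ...then, once it is active, drop the old replica.
            call(solr + "/admin/collections?action=DELETEREPLICA"
                      + "&collection=docs&shard=shard1&replica=core_node1");
        }

        static void call(String url) throws Exception {
            try (InputStream in = new URL(url).openStream()) {
                while (in.read() != -1) ; // drain the response
            }
        }
    }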
> I think more people are getting into the space of >100B documents, but I only ran into a handful during my time at Lucene/Solr Revolution this November. The majority of large-scale SolrCloud users seem to have many collections (a collection per logical user) rather than many documents in one or a few collections.
That's my understanding as well. Lucene Revolution is on the wrong side of the Atlantic for me, but there's an Open Source Search devroom at FOSDEM this year, which seems like a sensible place to discuss these things. I'll post to the relevant mailing lists about this after the holidays if anyone is interested.
Thanks for your detailed response!

- Bram