Hi guys, do you have any ideas?
Does it even make sense to paginate in facet searches if we require deep paging?

Dmitry

On Fri, Apr 26, 2013 at 11:09 PM, Dmitry Kan <solrexp...@gmail.com> wrote:

> Hi list,
>
> We have encountered a weird bug related to the facet.offset parameter. In
> short: the more general the query is (that is, the more hits it
> generates), the higher the risk that the facet.offset parameter stops
> working.
>
> In more detail:
>
> 1. Since getting all the facets we need (facet.limit=1000) from around 100
> shards didn't work for some broad query terms, like "the" (yes, we index
> and search those too), we decided to paginate.
>
> 2. The facet page size is set to 100 for all pages starting from the
> second one. We start with: facet.offset=0&facet.limit=30, then continue
> with facet.offset=30&facet.limit=100, then facet.offset=100&facet.limit=100
> and so on, until we get to facet.offset=900.
>
> All facets work just fine until we hit facet.offset=700.
>
> Debugging showed that in the class HttpCommComponent a static Executor
> instance is created with a setting to terminate idle threads after 5 sec.
> Our belief is that this setting is way too low for our billion-document
> scenario and broad searches. Setting this to 5 min seems to improve the
> situation a bit, but does not solve it fully. This same class is no longer
> used in 4.2.1 (can anyone tell what's used instead in distributed
> faceting?), so it isn't easy to compare these parts of the code.
>
> Anyhow, we are now playing with this value in the hope of seeing some
> light at the end of the tunnel (would be good if it is not the train).
>
> One more question: can this be related to RAM allocation on the router
> and/or shards? If RAM isn't enough for some operations, why wouldn't the
> router or shards just crash with OOM?
>
> If anyone has other ideas for what to try / look into, that'll be much
> appreciated.
>
> Dmitry
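
For reference, the paging scheme from the quoted message can be sketched roughly as below. This is only an illustration, not the code we actually run: the helper names (facet_pages, facet_params) and the field name "author" are made up, and the sketch assumes the pages are meant to be contiguous, so the offsets it produces (0, 30, 130, ...) differ from the ones listed in the message. Only the facet, facet.field, facet.offset and facet.limit parameters are real Solr parameters.

```python
from urllib.parse import urlencode


def facet_pages(first_limit, page_limit, total):
    """Yield (offset, limit) pairs covering `total` facet values.

    The first page uses `first_limit`, every later page `page_limit`.
    Assumption: pages are contiguous and do not overlap.
    """
    offset = 0
    limit = first_limit
    while offset < total:
        # Shrink the last page so we never ask beyond `total`.
        limit = min(limit, total - offset)
        yield offset, limit
        offset += limit
        limit = page_limit


def facet_params(field, offset, limit):
    """Build the faceting part of a Solr query string for one page."""
    return urlencode({
        "facet": "true",
        "facet.field": field,  # hypothetical field name supplied by the caller
        "facet.offset": offset,
        "facet.limit": limit,
    })


if __name__ == "__main__":
    # Small first page of 30, then pages of 100, up to 1000 values.
    for offset, limit in facet_pages(30, 100, 1000):
        print(facet_params("author", offset, limit))
```

Each printed line would be appended to the base query URL for one request. Deep pages stay expensive in a distributed setup, since each shard still has to produce its top offset+limit values for the router to merge.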