Hi list,

We have encountered a weird bug related to the facet.offset parameter. In
short: the more general the query is (i.e. the more hits it generates), the
higher the risk that the facet.offset parameter stops working.

In more detail:

1. Since fetching all the facet values we need (facet.limit=1000) from around
100 shards didn't work for some broad query terms, like "the" (yes, we index
and search those too), we decided to paginate.

2. The facet page size is set to 100 for all pages starting from the second
one. We start with facet.offset=0&facet.limit=30, then continue with
facet.offset=30&facet.limit=100, then facet.offset=100&facet.limit=100, and
so on, until we reach facet.offset=900 (a rough SolrJ sketch of these
requests follows below).
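
In case it helps to see the exact requests: a minimal SolrJ 4.x sketch of the
paging calls, assuming HttpSolrServer; the URL and the facet field name
"subject" are placeholders, not our actual setup.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetPaging {

  // Request one facet "page" of the given field for a broad query term.
  static FacetField facetPage(HttpSolrServer server, String field,
                              int offset, int limit) throws SolrServerException {
    SolrQuery q = new SolrQuery("the");   // broad query term
    q.setRows(0);                         // we only need the facet counts
    q.setFacet(true);
    q.addFacetField(field);
    q.set("facet.offset", offset);
    q.set("facet.limit", limit);
    QueryResponse rsp = server.query(q);
    return rsp.getFacetField(field);
  }

  public static void main(String[] args) throws SolrServerException {
    HttpSolrServer server = new HttpSolrServer("http://router:8983/solr");
    facetPage(server, "subject", 0, 30);     // first page
    facetPage(server, "subject", 30, 100);   // second page
    facetPage(server, "subject", 100, 100);  // third page
    // ... and so on, until facet.offset=900
    facetPage(server, "subject", 900, 100);
  }
}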

All facet pages work just fine until we hit facet.offset=700.

Debugging showed that in the HttpCommComponent class a static Executor
instance is created with a setting to terminate idle threads after 5 seconds.
Our belief is that this setting is way too low for our billion-document
scenario and broad searches. Setting it to 5 minutes seems to improve the
situation a bit, but does not solve it fully. This class is no longer used in
4.2.1 (can anyone tell what's used instead in distributed faceting?), so it
isn't easy to compare these parts of the code.
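
For reference, the pattern in question looks roughly like the sketch below. It
is paraphrased from memory, not the exact Solr source: a cached-style thread
pool whose idle threads are reclaimed after 5 seconds, plus the 5-minute
variant we are currently experimenting with.

import java.util.concurrent.Executor;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CommExecutorSketch {
  // Roughly the kind of static executor described above: no core threads,
  // effectively unbounded maximum size, tasks handed straight to a thread.
  static Executor commExecutor = new ThreadPoolExecutor(
      0, Integer.MAX_VALUE,
      5, TimeUnit.SECONDS,               // terminate idle threads after 5 sec
      new SynchronousQueue<Runnable>());

  // The variant we are trying: keep idle threads around for 5 minutes.
  static Executor patchedExecutor = new ThreadPoolExecutor(
      0, Integer.MAX_VALUE,
      5, TimeUnit.MINUTES,
      new SynchronousQueue<Runnable>());
}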

Anyhow, we are now playing with this value in the hope of seeing some light at
the end of the tunnel (would be good if it is not the train).

One more question: could this be related to RAM allocation on the router
and/or the shards? If there isn't enough RAM for some operations, why wouldn't
the router or the shards just crash with an OOM?

If anyone has other ideas for what to try / look into, that'll be much
appreciated.

Dmitry
