Thanks for the good suggestions on read traffic. I have been simulating
reads by parsing our ELB logs and replaying them with Siege
<https://www.joedog.org/siege-home/> from a fleet of test servers acting
as frontends. We are hoping to tune mostly for our exact use case, so this
seems the most effective route. I see why, for modeling the average user
experience, tracking 0-hit queries would provide better data. Our plan is
to start with exact user patterns and then branch out and refine our
metrics from there.
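
To make the log-replay step concrete, here is a minimal sketch of turning
ELB access log lines into a URL list for replay. It is an illustration
only, not the actual script: it assumes the classic ELB access log format
(where the request appears as a quoted "METHOD URL HTTP/x.y" field), keeps
only GET requests, and uses the hypothetical names elb_line_to_replay_url,
build_url_list, and target_base.

```python
import re
from urllib.parse import urlsplit

# Matches the quoted request field of a log line, e.g.
# "GET http://host:80/solr/core1/select?q=foo HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<url>\S+) HTTP/[\d.]+"')


def elb_line_to_replay_url(line, target_base):
    """Return the URL to replay against target_base, or None to skip the line.

    Assumption: only GET requests are replayed as read traffic.
    """
    match = REQUEST_RE.search(line)
    if match is None or match.group("method") != "GET":
        return None
    parts = urlsplit(match.group("url"))
    # Keep the original path and query string, but swap in the test host.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return target_base.rstrip("/") + path


def build_url_list(log_lines, target_base):
    """Collect replayable URLs from an iterable of log lines, in order."""
    urls = []
    for line in log_lines:
        url = elb_line_to_replay_url(line, target_base)
        if url is not None:
            urls.append(url)
    return urls
```

Writing the resulting URLs one per line to a file gives something that
can be handed to Siege with its -f (URL file) option.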

For writes, I am using an index rebuild job which we have written. We use
it for building a new index or refreshing an existing one after changes to
our data model, document structure, schema, etc. It was actually turning on
this rebuild against our main cluster that started edging us toward the
performance limits on writes.

Since I last wrote, we discovered that we were garbage-collection limited
in our current cluster. We noticed that during writes, especially the large
volume of writes our background rebuild generates, we generally do okay,
but eventually the GC would do a deep pass and we'd see 504 gateway
timeouts. We applied the settings from Shawn Heisey
<https://wiki.apache.org/solr/ShawnHeisey>'s page, and we have only seen
timeouts a couple of times since then (these don't kill the rebuild; they
simply get retried later). I see from your comments here and on another
current thread that GC seems to be an area of active discussion.
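
The "retried later" behavior for timed-out writes can be sketched roughly
as below. This is a guess at the general shape, not the rebuild's actual
code: send_batch is a hypothetical callable that posts one batch of
documents and returns the HTTP status code, max_attempts is an assumed
limit, and any status other than 504 is treated as done for simplicity.

```python
from collections import deque


def rebuild_with_retries(batches, send_batch, max_attempts=3):
    """Send every batch; re-queue batches that hit a 504 and retry them later.

    Returns the batches that still timed out after max_attempts tries.
    send_batch is a hypothetical callable returning an HTTP status code.
    """
    queue = deque((batch, 1) for batch in batches)
    failed = []
    while queue:
        batch, attempt = queue.popleft()
        status = send_batch(batch)
        if status == 504:
            if attempt < max_attempts:
                # Push the batch to the back so the rest of the rebuild
                # keeps going and the retry happens later.
                queue.append((batch, attempt + 1))
            else:
                failed.append(batch)
    return failed
```

Deferring the retry to the end of the queue, rather than retrying
immediately, gives a long GC pause some time to finish before the same
batch is sent again.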

Best,
Stephen

On Mon, May 2, 2016 at 9:20 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Bram:
>
> That works. I try to monitor the number of 0-hit
> queries when I generate a test set on the theory that
> those are _usually_ groups of random terms I've
> selected that aren't a good model. So it's often
> a sequence like "generate my list, see which
> ones give 0 results and remove them". Rinse,
> repeat.
>
> Like you said, imperfect but _loads_ better than
> trying to create them without real user queries
> as guidance...
>
> Best,
> Erick
>
> On Sat, Apr 30, 2016 at 4:19 AM, Bram Van Dam <bram.van...@intix.eu>
> wrote:
> >> If I'm reading this right, you have 420M docs on a single shard?
> >> Yep, you were reading it right.
> >
> > As Erick mentioned, it's hard to give concrete sizing advice, but we've
> > found 120M to be the magic number. When a shard contains more than 120M
> > documents, performance goes down rapidly & GC pauses grow a lot longer.
> > Up until 250M things remain acceptable. But then performance starts to
> > drop very quickly after that.
> >
> >  - Bram
> >
>



-- 
Stephen

(206)753-9320
stephen-lewis.net
