On Fri, 29 Feb 2008 13:02:21 -0500
"Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> On Fri, Feb 29, 2008 at 12:45 AM, Walter Underwood
> <[EMAIL PROTECTED]> wrote:
> > Good point. My numbers are from a full rebuild. Let's collect maximum
> > times, to keep it simple. --wunder
>
> You may see more variation than you expect since optimization is done
> in stages of mergeFactor segments. In the same environment, you could
> add a single extra doc, and then an optimize would be faster than a
> previous run because that add happened to force a bunch of merges.
Hi all,
Does providing "my optimise takes x minutes on this hardware on a data set this
big" actually tell us anything useful, beyond a rough idea of how long an
optimise operation could take for {those variables}? I mean, the data,
configuration, etc. you are working with make quite a bit of difference.
As Walter mentioned, there are enough variables at hand to get scary... but they
could be grouped as:
1) your SOLR setup (schema, # of docs, configuration)
2) your hardware and OS configuration.
I would guess that to get a proper understanding and provide the most useful
information, the two groups should be treated separately.
For example, for group 2), a sample SOLR configuration and data set could be
provided, along with a set of test scripts. Anyone could then report back how
fast their hardware / config runs SOLR-PERF-TEST_1 (optimised for overall speed)
vs SOLR-PERF-TEST_2 (optimised for commit times) vs ... whatever.
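To make the idea a bit more concrete, something along these lines could be a
starting point for the timing part of such a script. This is just a sketch,
assuming a stock Solr instance with the standard XML update handler at
http://localhost:8983/solr/update; the URL and run count are only placeholders:

    #!/usr/bin/env python
    # Sketch of a timing run for a hypothetical SOLR-PERF-TEST script.
    # Assumes a Solr instance at http://localhost:8983/solr with the
    # plain XML update handler; URL and RUNS are illustrative only.
    import time
    import urllib2

    SOLR_UPDATE_URL = "http://localhost:8983/solr/update"
    RUNS = 3  # repeat to capture variation from staged merges

    def post_xml(xml):
        # POST an XML update message (e.g. <commit/>, <optimize/>) to Solr.
        req = urllib2.Request(SOLR_UPDATE_URL, xml,
                              {"Content-Type": "text/xml; charset=utf-8"})
        return urllib2.urlopen(req).read()

    for run in range(RUNS):
        post_xml("<commit/>")      # flush any pending adds first
        start = time.time()
        post_xml("<optimize/>")    # the operation we are actually timing
        elapsed = time.time() - start
        print "run %d: optimize took %.1f s" % (run + 1, elapsed)

The commit before each timed optimise is just there so pending adds don't get
lumped into the optimise timing, and repeating the run a few times would help
show the variation from staged merges that Yonik mentioned.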
I am not too sure how to have a standard test for the first group... maybe the
data and configuration examples from 2) would be useful enough as fine-tuning
examples (similar to MySQL's 'large' and 'huge' sample configurations)...
just a thought...
B
_________________________
{Beto|Norberto|Numard} Meijome
Do not take away the camel's hump, you may be stopping him from being a camel.
I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.