On Fri, 29 Feb 2008 13:02:21 -0500 "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> On Fri, Feb 29, 2008 at 12:45 AM, Walter Underwood
> <[EMAIL PROTECTED]> wrote:
> > Good point. My numbers are from a full rebuild. Let's collect maximum
> > times, to keep it simple. --wunder
>
> You may see more variation than you expect since optimization is done
> in stages of mergeFactor segments. In the same environment, you could
> add a single extra doc, and then an optimize would be faster than a
> previous run because that add happened to force a bunch of merges.

Hi all,

Does providing "my optimise takes x minutes on this hardware on a data set
this big" actually tell us anything useful, other than a rough idea of how
long an optimise could take for {those variables}? The data, configuration,
etc. you are working with make quite a bit of difference.

As Walter mentioned, the number of variables at hand can get scary... but
they could be grouped as:

1) your SOLR setup (schema, number of docs, configuration)
2) your hardware and OS configuration

I would guess that to get a proper understanding, and to provide the most
useful information, the two groups should be treated separately. For
example, for 2), a sample SOLR configuration and data set could be
provided, along with a set of test scripts. Anyone could then report back
on how fast their hardware / OS runs SOLR-PERF-TEST_1 (optimised for
overall speed) vs SOLR-PERF-TEST_2 (optimised for commit times) vs ...
whatever.

I am not too sure how to build a standard test for the first group...
maybe the data and configuration examples from 2) would be useful enough
as starting points for fine-tuning, much like MySQL's 'large' and 'huge'
example configurations... just a thought.

B

_________________________
{Beto|Norberto|Numard} Meijome

Do not take away the camel's hump, you may be stopping him from being a camel.

I speak for myself, not my employer. Contents may be hot. Slippery when
wet. Reading disclaimers makes you go blind. Writing them is worse. You
have been Warned.
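
PS - just to make the idea a bit more concrete, below is a rough sketch of
what one of those test scripts could look like. It is only an illustration:
it assumes a stock Solr instance at http://localhost:8983/solr, a schema
with 'id' and 'text' fields, and a made-up batch size - the actual data
set, field names and solrconfig.xml settings (mergeFactor etc.) for each
SOLR-PERF-TEST_* profile would need to be agreed on first.

#!/usr/bin/env python
# Rough sketch of a timing harness for a hypothetical SOLR-PERF-TEST run.
# Assumes a Solr instance at http://localhost:8983/solr with 'id' and
# 'text' fields in the schema -- adjust the URL and fields to taste.

import time
import urllib.request

SOLR_UPDATE = "http://localhost:8983/solr/update"

def post_xml(xml):
    """POST an update message (add/commit/optimize) to Solr's XML update handler."""
    req = urllib.request.Request(
        SOLR_UPDATE,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def timed(label, xml):
    """Send one update message and report wall-clock time."""
    start = time.time()
    post_xml(xml)
    print("%-10s %.2f s" % (label, time.time() - start))

# Index a batch of synthetic docs, then measure commit and optimize times.
BATCH = 10000
docs = "".join(
    '<doc><field name="id">%d</field><field name="text">sample doc %d</field></doc>'
    % (i, i)
    for i in range(BATCH)
)
timed("add", "<add>%s</add>" % docs)
timed("commit", "<commit/>")
timed("optimize", "<optimize/>")

Running the same script against two instances that differ only in
solrconfig.xml would then give comparable numbers for the different
profiles, rather than the apples-to-oranges figures we get now.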