Hi,

On Sat, Nov 29, 2014 at 2:27 PM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:

> On 11/29/14 1:30 PM, Toke Eskildsen wrote:
>
>> Michael Sokolov [msoko...@safaribooksonline.com] wrote:
>>
>>> I wonder if there's any value in providing this metric (total index size
>>> - stored field size - term vector size) as part of the admin panel?  Is
>>> it meaningful?  It seems like there would be a lot of cases where it
>>> could give a good rule of thumb for memory sizing, and it would save
>>> having to root around in the index folder.
>>>
>> At Lucene/Solr Revolution, I talked with Alexandre Rafalovitch about
>> this. We know
>> (https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/)
>> that we cannot get the full picture of an index, but it is a weekly
>> occurrence on this mailing list that people ask questions where it helps
>> to have a gist of the index metrics and how the index is used.
>>
>> Some sort of "Copy the content of this concentrated metrics box, when you
>> need to talk with other people about your index"-functionality in the admin
>> panel might help with this. To get an idea of usage, it could also contain
>> a few non-filled fields, such as "peak queries per second" or "typical
>> queries".
>>
>> - Toke Eskildsen
>>
> Yes - the cautions about the need for prototyping are all very well, but
> even if you take that advice, and build a prototype, it's not clear how to
> tell whether your setup has enough memory or not. You can add more and
> measure response times, but even then you only have a gross measurement,
> and no way of knowing where, in detail, the memory is being used.  Also,
> you might be able to improve your system to make better use of memory with
> more precise information. It seems like we ought to be able to monitor a
> running system, observe its memory requirements over time, and report on
> those.
>

+1 to that!
I haven't been following this aspect of development closely, but I
believe there are memory/size estimators for various things at the Lucene
level that Elasticsearch exposes nicely via its stats API.  I don't know
the specifics of those estimators without digging in; otherwise I'd open a
JIRA, because I think this is valuable information. At Sematext we
regularly deal with hardware sizing, memory/CPU usage estimates, etc., so
the more of this info Solr surfaces, the easier it will be for people to
work with it.
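
Until something like that is surfaced, the metric Michael describes (total
index size - stored field size - term vector size) can be roughed out by
hand from the index folder. Below is a minimal sketch, not anything Solr
ships: the function name is hypothetical, and it assumes the usual Lucene
file extensions (.fdt/.fdx for stored fields, .tvd/.tvx/.tvf for term
vectors), which may differ by Lucene version and codec.

```python
import os

# Assumed Lucene file extensions; these may vary by version and codec.
STORED_FIELD_EXTS = {".fdt", ".fdx"}
TERM_VECTOR_EXTS = {".tvd", ".tvx", ".tvf"}


def index_size_breakdown(index_dir):
    """Sum file sizes (in bytes) in one index directory, split by file type.

    Hypothetical helper: "remainder" is total size minus stored fields
    minus term vectors, i.e. the rule-of-thumb figure from this thread.
    """
    total = stored = vectors = 0
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        size = os.path.getsize(path)
        total += size
        ext = os.path.splitext(name)[1].lower()
        if ext in STORED_FIELD_EXTS:
            stored += size
        elif ext in TERM_VECTOR_EXTS:
            vectors += size
    return {
        "total": total,
        "stored_fields": stored,
        "term_vectors": vectors,
        "remainder": total - stored - vectors,
    }
```

One caveat: segments written in compound file format pack everything into
.cfs files, so their stored fields and term vectors would be counted in the
remainder here. Treat the result as a rough guide only.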

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
