There are also docValues files, right? And they have different memory
requirements depending on how they are set up (for example, which
docValuesFormat the fields use). Not 100% sure exactly what I am trying
to say here, though.
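
For what it's worth, with the default codec the docValues data lives in its
own files on disk (.dvd/.dvm, if I remember the extensions right), so you
can fold them into the same kind of per-extension breakdown Mike describes
below. A rough sketch, with a hypothetical index path:

    import os

    index_dir = "/path/to/solr/collection/data/index"  # hypothetical path

    # .dvd/.dvm are the docValues data/metadata files in the current
    # default codec; adjust if your Lucene version or codec differs.
    dv_bytes = sum(
        os.path.getsize(os.path.join(index_dir, f))
        for f in os.listdir(index_dir)
        if f.endswith((".dvd", ".dvm"))
    )
    print("docValues on disk: %.1f GB" % (dv_bytes / 1024.0 ** 3))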

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 29 November 2014 at 13:16, Michael Sokolov
<msoko...@safaribooksonline.com> wrote:
> Of course testing is best, but you can also get an idea of the size of the
> non-storage part of your index by looking in the Solr index folder and
> subtracting the size of the files containing the stored fields from the
> total size of the index.  This depends on the internal storage strategy of
> Lucene and may change from release to release, but it is documented. The
> .fdt and .fdx files are the stored field files (currently, at least, and if
> you don't have everything in a compound file).  If you are indexing term
> vectors (.tvd and .tvf files) as well, I think these can usually be
> excluded from the index size as well when calculating the required memory,
> at least based on typical usage patterns for term vectors (i.e.
> highlighting).
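
A rough sketch of that calculation (the index path is hypothetical, and the
extensions are assumed from the current defaults: .fdt/.fdx for stored
fields, .tvd/.tvx/.tvf for term vectors):

    import os

    index_dir = "/path/to/solr/collection/data/index"  # hypothetical path

    stored_ext = (".fdt", ".fdx")           # stored fields data/index
    vector_ext = (".tvd", ".tvx", ".tvf")   # term vectors

    total = stored = vectors = 0
    for name in os.listdir(index_dir):
        size = os.path.getsize(os.path.join(index_dir, name))
        total += size
        if name.endswith(stored_ext):
            stored += size
        elif name.endswith(vector_ext):
            vectors += size

    gb = 1024.0 ** 3
    print("total index:      %6.1f GB" % (total / gb))
    print("stored fields:    %6.1f GB" % (stored / gb))
    print("term vectors:     %6.1f GB" % (vectors / gb))
    print("rough cache need: %6.1f GB" % ((total - stored - vectors) / gb))
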
>
> I wonder if there's any value in providing this metric (total index size -
> stored field size - term vector size) as part of the admin panel?  Is it
> meaningful?  It seems like there would be a lot of cases where it could give
> a good rule of thumb for memory sizing, and it would save having to root
> around in the index folder.
>
> -Mike
>
>
> On 11/29/14 12:16 PM, Erick Erickson wrote:
>>
>> bq: You should have memory to fit your whole database in disk cache and
>> then some more.
>>
>> I have to disagree here, if for no other reason than that stored data,
>> which is irrelevant for searching, may make up virtually none or
>> virtually all of your on-disk space. Saying it all needs to fit in the
>> disk cache is too broad-brush a statement; you have to test.
>>
>> In this case, though, I _do_ think there's not enough memory here;
>> Toke's comments are spot on.
>>
>> On Sat, Nov 29, 2014 at 2:02 AM, Toke Eskildsen <t...@statsbiblioteket.dk>
>> wrote:
>>>
>>> Po-Yu Chuang [ratbert.chu...@gmail.com] wrote:
>>>>
>>>> [...] Everything works fine now, but I noticed that the load
>>>> average of the server is high because there is constant
>>>> heavy disk read access. Please point me in the right direction.
>>>> RAM: 18GB
>>>> Solr home: 185GB
>>>> disk read access: constantly 40-60MB/s
>>>
>>> Solr search performance is tightly coupled to the speed of small random
>>> reads. There are two obvious ways of ensuring that these days:
>>>
>>> 1) Add more RAM to the server, so that the disk cache can hold a larger
>>> part of the index. If you add enough RAM (depends on your index, but 50-100%
>>> of the index size is a rule of thumb), you get "ideal" storage speed, by
>>> which I mean that the bottleneck moves away from storage. If you are using
>>> spinning drives, the 18GB of RAM is not a lot for a 185GB index.
>>>
>>> 2) Use SSDs instead of spinning drives (if you do not already do so). The
>>> speed-up depends a lot on what you are doing, but it is a cheap upgrade and
>>> it can later be coupled with extra RAM if it is not enough in itself.
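
A back-of-the-envelope version of that rule of thumb for the numbers in this
thread (just a sketch; the 4GB heap figure is an assumption for
illustration):

    # Rule of thumb: the OS disk cache should hold 50-100% of the index.
    index_gb = 185.0   # "Solr home" size from the original post
    ram_gb = 18.0      # physical RAM in the server
    heap_gb = 4.0      # assumed JVM heap; subtract whatever you actually use

    cache_gb = ram_gb - heap_gb
    print("disk cache available: ~%.0f GB" % cache_gb)
    print("rule-of-thumb target: %.0f-%.0f GB" % (0.5 * index_gb, index_gb))
    print("current coverage:     ~%.0f%% of the index" % (100 * cache_gb / index_gb))
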
>>>
>>> The Solr Wiki has this:
>>> https://wiki.apache.org/solr/SolrPerformanceProblems
>>> And I have this:
>>> http://sbdevel.wordpress.com/2013/06/06/memory-is-overrated/
>>>
>>> - Toke Eskildsen
>
>
