I thought docValues were per segment, so the price of un-inversion was
effectively paid on each commit for all the segments, as opposed to
just the updated one.

I admit I also find the story around docValues very confusing at
the moment, especially the interplay with "indexed=false". It would
make a VERY good article if people in the know could clarify this.

Regards,
   Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 9 November 2015 at 11:04, Yonik Seeley <[email protected]> wrote:
> On Mon, Nov 9, 2015 at 10:55 AM, Demian Katz <[email protected]> wrote:
>> I understand that by adding "docValues=true" to some of my fields, I can 
>> improve sorting/faceting performance.
>
> I don't think this is true in the general sense.
> docValues are built at index-time, so what you will save is initial
> un-inversion time (i.e. the first time a field is used after a new
> searcher is opened).
> After that point, docValues may be slightly slower.
>
> The other advantage of docValues is memory use... much/most of it is
> essentially "off-heap", being memory-mapped from disk.  This cuts down
> on memory issues and helps reduce longer GC pauses.
>
> docValues are good in general, and I think we should default to them
> more for Solr 6, but they are not better in all ways.
>
>> However, I have a couple of questions:
>>
>>
>> 1.)    Will Solr always take proper advantage of docValues when it is turned 
>> on
>
> Yes.
>
>> , or will I gain greater performance by turning off stored/indexed in 
>> situations where only docValues are necessary (e.g. a sort-only field)?
>>
>> 2.)    Will adding docValues to a field introduce significant performance 
>> penalties for non-docValues uses of that field, beyond the obvious fact that 
>> the additional data will consume more disk and memory?
>
> No, it's a separate part of the index.
>
> -Yonik
>
>
>> I'm asking this question because the existing schema has some multi-purpose 
>> fields, and I'm trying to determine whether I should just add 
>> "docValues=true" wherever it might help, or if I need to take a more 
>> thoughtful approach and potentially split some fields with copyFields, etc. 
>> This is particularly significant because my schema makes use of some dynamic 
>> field suffixes, and I'm not sure if I need to add new suffixes to 
>> differentiate docValues/non-docValues fields, or if it's okay to turn on 
>> docValues across the board "just in case."
>>
>> Apologies if these questions have already been answered - I couldn't find a 
>> totally clear answer in the places I searched.
>>
>> Thanks!
>>
>> - Demian
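
For readers weighing the "split some fields with copyFields" option raised
above, here is a minimal schema.xml sketch. It is not taken from the thread:
the field names (title, title_sort) and the *_sort_dv suffix are hypothetical,
and it assumes a "string" fieldType backed by solr.StrField, as in the stock
example schemas. It keeps a multi-purpose field unchanged and adds a
docValues-only companion for sorting:

```xml
<!-- Existing multi-purpose field, left unchanged (hypothetical name). -->
<field name="title" type="string" indexed="true" stored="true"/>

<!-- Sort-only companion: not indexed, not stored, docValues only. -->
<field name="title_sort" type="string" indexed="false" stored="false" docValues="true"/>

<!-- Populate the companion from the original field at index time. -->
<copyField source="title" dest="title_sort"/>

<!-- Hypothetical dynamic suffix for further docValues-only fields. -->
<dynamicField name="*_sort_dv" type="string" indexed="false" stored="false" docValues="true"/>
```

Whether the split is worth it depends on the answers given in the thread:
docValues live in a separate part of the index, so enabling them on the
original field may be enough. Either way, changing docValues on an existing
field requires reindexing the affected documents.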
