On 9/11/2017 9:06 PM, Aman Tandon wrote:
> We want to know about the indexing performance in the below mentioned
> scenarios, consider the total number of 10 string fields and total number
> of documents are 10 million.
>
> 1) indexed=true, stored=true
> 2) indexed=true, docValues=true
>
> Which one should we prefer in terms of indexing performance, please share
> your experience.

There are several settings in the schema for each field, things like indexed, stored, docValues, multiValued, and others. You should base your choices on what you need Solr to do. Choosing these settings based purely on desired indexing speed may result in Solr not doing what you want it to do.

When the indexing system sends data to Solr with several threads or processes, Solr is *usually* capable of indexing data faster than most systems can supply it. The more settings you disable on a field, the faster Solr will be able to index.

It is not possible to provide precise numbers, because performance depends on many factors, some of which you may not even know until you build a production system.

https://lucidworks.com/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

All that said ... docValues MIGHT be a little bit faster than stored, because stored data is compressed, and the compression takes CPU time. On a fully populated production system, that statement might turn out to be wrong. There may be factors that result in stored fields working better. The best way to decide is to try it both ways with all your data.

Thanks,
Shawn
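
A rough way to run the "try it both ways" comparison is sketched below. It is only an illustration, not a tuned benchmark. The Solr address, the collection names `test_stored` and `test_docvalues`, the field names `field_0` .. `field_9`, and the synthetic values are all assumptions made for the example; the two collections are assumed to have schemas that differ only in the stored/docValues settings. Real data will behave differently from synthetic data, so substitute your own documents where you can.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: a local Solr instance; adjust to your environment.
SOLR = "http://localhost:8983/solr"


def make_batch(start, size, num_fields=10):
    """Build `size` synthetic documents, each with `num_fields` string fields.

    The field names (field_0, field_1, ...) are hypothetical.
    """
    return [
        {"id": str(i), **{f"field_{n}": f"value-{i}-{n}" for n in range(num_fields)}}
        for i in range(start, start + size)
    ]


def post_json(collection, payload):
    """POST a JSON payload to the collection's /update handler."""
    req = urllib.request.Request(
        f"{SOLR}/{collection}/update",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def commit(collection):
    """Ask Solr to commit, using the JSON update command form."""
    post_json(collection, {"commit": {}})


def time_indexing(collection, total_docs, batch_size=1000, threads=8, send=post_json):
    """Send `total_docs` synthetic docs from several threads; return elapsed seconds.

    `send` is injectable so the batching logic can be exercised without Solr.
    """
    starts = range(0, total_docs, batch_size)

    def work(start):
        size = min(batch_size, total_docs - start)
        send(collection, make_batch(start, size))

    begin = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() waits for every batch and re-raises any worker exception.
        list(pool.map(work, starts))
    return time.perf_counter() - begin
```

Calling `time_indexing("test_stored", 10_000_000)` followed by `commit("test_stored")`, then doing the same for `test_docvalues`, gives two wall-clock numbers to compare. Repeat the runs a few times and vary `threads`, since a single client thread may not be able to supply data as fast as Solr can index it.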