On 2/26/2019 1:34 AM, Saurabh Sharma wrote:
Now we want to do partial updates. I went through the documentation and
found that all the fields should be stored or docValues for partial
updates. I have a few questions regarding this:

1) In case I am fetching only one field when making a query, what will the
performance impact be due to all fields being stored? Let's say I have an "id"
field with docValues set to true for it; will Solr use stored
fields in this case? Will it load the whole document in RAM?

I am not aware of any option to keep docValues in RAM. If you have enough memory in your system (memory that has NOT been assigned to any program), then the OS *might* keep some or all of your index data in memory. That functionality, present in all modern operating systems, is the secret to good performance.

The stored data is compressed. The docValues data is not compressed. Uncompressing stored data uses CPU cycles. Generally, if data must be read from disk, compressed data will be faster to read. But if the data has been cached by the OS and comes from memory, which you definitely want to happen if possible, uncompressed will likely be faster ... and it will definitely require less CPU.

If you have many fields but you're only fetching one, then docValues will almost certainly be faster than stored. All of the stored fields for one document are compressed together, so Solr will be reading data that it won't actually be using, in order to achieve decompression.
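Here is a toy illustration of that last point. It is not Lucene's actual on-disk format: zlib stands in for the real codec, and the field names and sizes are made up. The idea is only that when a document's stored fields are compressed together, getting one field out means decompressing all of them, while a column-style layout like docValues can hand back the single value directly.

```python
import json
import zlib

# Hypothetical documents: a small "id" field plus a large "body" field.
docs = [
    {"id": i, "title": f"title {i}", "body": "lorem ipsum " * 200}
    for i in range(100)
]

# Stored-field style: each document's fields are serialized and
# compressed together as one blob.
stored = [zlib.compress(json.dumps(d).encode("utf-8")) for d in docs]

# docValues style: one column per field, addressable by document number.
id_column = [d["id"] for d in docs]


def id_from_stored(doc_num):
    """Return (id, bytes decompressed) when reading from the stored blob."""
    raw = zlib.decompress(stored[doc_num])  # the whole document is decompressed
    return json.loads(raw)["id"], len(raw)


def id_from_column(doc_num):
    """Return the id straight from the column, with no decompression."""
    return id_column[doc_num]


value, bytes_touched = id_from_stored(42)
print(value, bytes_touched)   # the id, plus how much had to be decompressed
print(id_from_column(42))     # the same id, with only one value read
```

The second number printed on the first line is the amount of data that had to be decompressed just to get at the small "id" value.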

I believe that if you have both stored data and docValues for a field, Solr will use stored data for search results. I am not positive that this is the case, but I think it's what happens.

2) What's the impact of large stored fields (.fdt) on query-time
performance? Does query time even depend on the stored fields, or does it just
depend on the indexes?

The size of your stored data will have no *DIRECT* impact on query performance. Stored data is not consulted for the query part. It is consulted when document data is retrieved to return with the response.

A large amount of stored data can have an indirect impact on query performance. If there is insufficient memory available to the OS disk cache, then reading the stored data to return results to the client will push information out of the disk cache that is needed for queries. If that happens, then Solr will need to re-read that data off the disk to do a query. Because disks are glacially slow compared to memory, performance will be impacted.
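A rough simulation of that effect, under some loud assumptions: the OS disk cache is modeled as a plain LRU cache (real kernels are more sophisticated), and the page counts and the `run` helper are invented for illustration. Each simulated query reads the same set of index pages and then reads some fresh stored-data pages to build the response.

```python
from collections import OrderedDict


class PageCache:
    """Tiny LRU cache standing in for the OS disk cache (a simplification)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.misses = 0

    def read(self, page):
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return
        self.misses += 1  # a miss would be a real disk read
        self.pages[page] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page


def run(capacity, stored_pages_per_query, queries=50, index_pages=100):
    """Return how many index-page reads missed the cache over all queries."""
    cache = PageCache(capacity)
    index_misses = 0
    next_stored = 0
    for _ in range(queries):
        before = cache.misses
        for p in range(index_pages):  # the query part reads the index
            cache.read(("index", p))
        index_misses += cache.misses - before
        for _ in range(stored_pages_per_query):  # returning results reads stored data
            cache.read(("stored", next_stored))
            next_stored += 1
    return index_misses


print(run(capacity=150, stored_pages_per_query=20))  # cache with room to spare
print(run(capacity=110, stored_pages_per_query=20))  # cache too small for both
```

With the larger cache, the index pages only miss on the first (cold) query. With the smaller one, the stored-data reads keep evicting index pages, so later queries have to go back to disk for them.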

Here's a page about performance problems. Most of it is about memory, since that is usually the resource that has the biggest effect on performance:

https://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn
