On 4/18/2016 4:22 AM, Zheng Lin Edwin Yeo wrote:
> I have many collections in Solr, but with only 1 shard. I found that the
> index size across all the collections has passed the 1TB mark. Currently
> the query speed is still normal, but the indexing speed seems to have become
> slower.
>
> Will it affect the performance if I continue to increase the index size but
> stick to 1 shard?

I have noticed that overall *bulk* indexing speed slows down as the index gets bigger, but I suspect that a big part of the reason is that segment merging involves more *large* segments, tying up I/O resources.

The amount of time required to index a small number of documents should not be affected much by index size, but something that is likely to take longer with a large index is the *commit* operation -- especially if Solr's caches are configured to autowarm.

Running the index on SSD, or on a RAID10 volume with a lot of regular disks, can greatly speed up indexing. The parity-based RAID levels (primarily 5 and 6) have a fairly severe write penalty, so I do not recommend them for Solr unless indexing happens infrequently.

Installing plenty of memory is very helpful for *query* speed, but it can also *indirectly* speed up indexing. If the disk is not busy when queries are happening, there's more I/O bandwidth available for writes.

Thanks,
Shawn
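
To put a rough number on the RAID write penalty mentioned above, here is a minimal Python sketch. It assumes the commonly cited rule-of-thumb costs of 2, 4, and 6 back-end I/O operations per small random write for RAID10, RAID5, and RAID6 respectively. The disk count and per-disk IOPS are hypothetical values chosen only for illustration, and real arrays with controller caches or full-stripe writes will behave differently.

```python
# Rule-of-thumb back-end I/O operations per small random write.
# These are assumed planning figures, not measurements.
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}


def effective_write_iops(disks, iops_per_disk, level):
    """Rough estimate of random-write IOPS an array can sustain."""
    return disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Hypothetical array: 8 disks at 150 IOPS each.
    for level in WRITE_PENALTY:
        print(level, effective_write_iops(8, 150, level))
```

Under these assumptions the same eight disks sustain about twice the random-write rate in RAID10 as in RAID5, and three times the rate of RAID6, which is the gap that matters when large segment merges compete with indexing for I/O.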