Hi Shawn,

Thanks for your explanation.

I have set the maximum merged segment size to 20GB under the TieredMergePolicy:

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <double name="maxMergedSegmentMB">20480</double>
</mergePolicy>

Does this mean that segment merging will occur more often, since it will need to keep merging during indexing until a segment reaches 20GB?

I do have 192GB of RAM on the server that Solr is running on.

Regards,
Edwin

On 18 April 2016 at 21:35, Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/18/2016 4:22 AM, Zheng Lin Edwin Yeo wrote:
> > I have many collections in Solr, but with only 1 shard. I found that the
> > index size across all the collections has passed the 1TB mark. Currently
> > the query speed is still normal, but the indexing speed seems to be
> > becoming slower.
> >
> > Will it affect the performance if I continue to increase the index size
> > but stick to 1 shard?
>
> I have noticed overall *bulk* indexing speed slows down as the index
> gets bigger, but I suspect that a big part of the reason this happens is
> segment merging involves more *large* segments, tying up I/O resources.
>
> The amount of time required to index a small number of documents should
> not be affected much by index size, but something that is likely to take
> longer with a large index is the *commit* operation -- especially if
> Solr's caches are configured to autowarm.
>
> Running the index on SSD, or on a RAID10 volume with a lot of regular
> disks, can greatly speed up indexing. The parity-based RAID levels
> (primarily 5 and 6) have a fairly severe write penalty, so I do not
> recommend them for Solr, unless indexing happens infrequently.
>
> Installing plenty of memory is very helpful for *query* speed, but it
> can also *indirectly* speed up indexing. If the disk is not busy when
> queries are happening, there's more I/O bandwidth available for writes.
>
> Thanks,
> Shawn
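
P.S. To make my question about the 20GB setting concrete, here is a rough back-of-the-envelope sketch of how I picture the tiers. It is only a simplified model, not how Lucene actually selects merges: the function name `tier_sizes_mb` is my own, the 100MB flush size is an assumption (not measured on my server), and I assume each tier's segments are about segmentsPerTier times larger than the previous tier's, capped at maxMergedSegmentMB.

```python
def tier_sizes_mb(flush_mb, segments_per_tier, max_merged_mb):
    """Approximate segment size (in MB) at each tier of a simplified model.

    Assumption: merging `segments_per_tier` segments of one tier yields one
    segment of the next tier, and no merged segment exceeds `max_merged_mb`.
    """
    sizes = []
    size = flush_mb
    while size < max_merged_mb:
        sizes.append(size)
        size *= segments_per_tier
    sizes.append(max_merged_mb)  # the top tier is capped by the configured maximum
    return sizes


# Hypothetical flush size of 100 MB; segmentsPerTier=10 as in my config.
with_20gb_cap = tier_sizes_mb(100, 10, 20480)
with_5gb_cap = tier_sizes_mb(100, 10, 5120)  # 5120 MB is the Lucene default cap

# In this model, the number of steps between tiers is how many times a
# document gets rewritten by merging before it lands in a full-size segment.
print(with_20gb_cap, "rewrites:", len(with_20gb_cap) - 1)
print(with_5gb_cap, "rewrites:", len(with_5gb_cap) - 1)
```

If this model is anywhere near right, the 20GB cap adds one more level of (large) merges compared with the 5GB default, rather than making the small merges happen more often. Please correct me if I have misunderstood.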