Hi Shawn,

Thanks for your explanation.

I have set the maximum merged segment size to 20GB under TieredMergePolicy:

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <double name="maxMergedSegmentMB">20480</double>
</mergePolicy>

Does this mean that segment merging will occur more often, since Solr will
need to keep merging during indexing until the segments reach 20GB?
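
One rough way to reason about this question is a back-of-the-envelope model,
sketched below in Python. It is only an illustration, not Lucene's actual
merge selection logic: the function name merge_levels and the 100 MB flush
size are hypothetical, and the model ignores partial merges, deleted
documents, and uneven segment sizes. It counts how many full 10-way merges a
freshly flushed segment can pass through before the merged result would
exceed the cap, for the default 5120 MB cap and for the 20480 MB cap above.

```python
def merge_levels(flush_mb, segments_per_tier, max_merged_mb):
    """Count full tier merges a freshly flushed segment can go through.

    Simplified model (an assumption, not Lucene's actual algorithm): each
    merge combines `segments_per_tier` equal-sized segments into one, and a
    merge is skipped if its result would exceed `max_merged_mb`.
    """
    levels = 0
    size_mb = flush_mb
    while size_mb * segments_per_tier <= max_merged_mb:
        size_mb *= segments_per_tier
        levels += 1
    return levels, size_mb


# flush_mb=100 is a hypothetical flush size chosen only for illustration.
for cap_mb in (5120, 20480):  # 5120 MB is the TieredMergePolicy default cap
    levels, largest_mb = merge_levels(100, 10, cap_mb)
    print(f"cap={cap_mb} MB: {levels} full merge level(s), "
          f"largest full-tier segment about {largest_mb} MB")
```

Under these assumptions, the higher cap does not make the small merges run
more often. It allows one more level of merging, so the extra merges should
be infrequent but each one rewrites far more data.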

The server that Solr is running on has 192GB of RAM.

Regards,
Edwin


On 18 April 2016 at 21:35, Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/18/2016 4:22 AM, Zheng Lin Edwin Yeo wrote:
> > I have many collections in Solr, but with only 1 shard. I found that the
> > index size across all the collections has passed the 1TB mark. Currently
> > the query speed is still normal, but the indexing speed seems to have
> > become slower.
> >
> > Will it affect the performance if I continue to increase the index size
> > but stick to 1 shard?
>
> I have noticed overall *bulk* indexing speed slows down as the index
> gets bigger, but I suspect that a big part of the reason this happens is
> segment merging involves more *large* segments, tying up I/O resources.
>
> The amount of time required to index a small number of documents should
> not be affected much by index size, but something that is likely to take
> longer with a large index is the *commit* operation -- especially if
> Solr's caches are configured to autowarm.
>
> Running the index on SSD, or on a RAID10 volume with a lot of regular
> disks, can greatly speed up indexing.  The parity-based RAID levels
> (primarily 5 and 6) have a fairly severe write penalty, so I do not
> recommend them for Solr, unless indexing happens infrequently.
>
> Installing plenty of memory is very helpful for *query* speed, but it
> can also *indirectly* speed up indexing.  If the disk is not busy when
> queries are happening, there's more I/O bandwidth available for writes.
>
> Thanks,
> Shawn
>
>
