On 10/25/2020 11:22 PM, Moulay Hicham wrote:
> I am wondering about 3 other things:
>
> 1 - You mentioned that I need free disk space. Just to make sure that we
> are talking about disc space here. RAM can still remain at the same size?
> My current RAM size is Index size < RAM < 1.5 Index size

You must always have enough disk space available for your indexes to double in size. We recommend having enough disk space for your indexes to *triple* in size, because there is a real-world scenario that will require that much disk space.
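As a rough sketch of that arithmetic (the function names and the example path are hypothetical, not anything Solr provides): room for the index to double means free space of at least one times the current index size, and room for it to triple means at least two times.

```python
import os
import shutil


def index_size_bytes(index_dir):
    """Sum the sizes of all files under an index directory."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def disk_headroom(index_bytes, free_bytes, factor=3):
    """Bytes left over after reserving room for the index to grow to
    `factor` times its current size. A negative result means the disk
    is short by that many bytes."""
    extra_needed = index_bytes * (factor - 1)
    return free_bytes - extra_needed


# Example use (the path is an assumption; substitute your own data dir):
#   size = index_size_bytes("/var/solr/data/mycore/data/index")
#   free = shutil.disk_usage("/var/solr/data").free
#   print(disk_headroom(size, free, factor=3))
```

With factor=2 this checks the minimum requirement, and with factor=3 it checks the recommended one.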

> 2 - When the merge is happening, it happens in disc and when it's
> completed, then the data is sync'ed with RAM. I am just guessing here ;-).
> I couldn't find a good explanation online about this.

If you have enough free memory, then the OS will make sure that the data is available in RAM. All modern operating systems do this automatically. Note that I am talking about memory that is not allocated to programs. Any memory assigned to the Solr heap (or any other program) will NOT be available for caching index data.

If you want ideal performance in typical situations, you must have as much free memory as the space your indexes take up on disk. For ideal performance in ALL situations, you'll want enough free memory to be able to hold both the original and optimized copies of your index data at the same time. We have seen that good performance can be achieved without going to this extreme, but if you have little free memory, Solr performance will be terrible.
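To put numbers on that, here is a minimal sketch. It assumes the memory left for the OS disk cache is roughly total RAM minus the Solr heap minus whatever other programs use, and it treats the optimized copy as being about the same size as the original, which is an upper-bound simplification. The function names are made up for illustration.

```python
def cache_budget(total_ram, solr_heap, other_programs=0):
    """Memory not allocated to any program, i.e. what the OS can use
    to cache index data. Never negative."""
    return max(total_ram - solr_heap - other_programs, 0)


def cache_coverage(index_size, total_ram, solr_heap, other_programs=0):
    """Ratio of cacheable memory to on-disk index size."""
    return cache_budget(total_ram, solr_heap, other_programs) / index_size


def describe(coverage):
    """Map a coverage ratio onto the situations described above."""
    if coverage >= 2.0:
        return "room for original and optimized copies"
    if coverage >= 1.0:
        return "room for the whole index"
    return "index only partially cacheable"


# Example in GB: 100 GB index, 150 GB RAM, 30 GB heap, 20 GB other programs.
#   describe(cache_coverage(100, 150, 30, 20))
```

All sizes must be in the same unit. Note that RAM of 1.5 times the index size does not mean 1.5 times coverage once the heap is subtracted.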

I wrote a wiki page that covers this in some detail:

https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems

> 3 - Also I am wondering what recommendation you have for continuously
> purging deleted documents. optimize? expungeDeletes? Natural Merge?
> Here are more details about the need to purge documents.

The only way to guarantee that all deleted docs are purged is to optimize. You could use the expungeDeletes action, but this might not get rid of all the deleted documents, and depending on how those documents are distributed across the whole index, expungeDeletes might not do anything at all. Both operations are expensive (they require a lot of time and system resources) and will temporarily increase the size of your index, up to double the starting size.
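For reference, both actions can be requested through the update handler with URL parameters. The sketch below only builds the URLs; the base URL and the collection name "mycollection" are placeholders, and the helper names are hypothetical.

```python
from urllib.parse import urlencode


def update_url(base_url, collection, **params):
    """Build a URL for the update handler of one collection or core."""
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


def optimize_url(base_url, collection):
    """URL that asks Solr to optimize (forced merge of the whole index)."""
    return update_url(base_url, collection, optimize="true")


def expunge_deletes_url(base_url, collection):
    """URL that asks Solr to commit with expungeDeletes enabled."""
    return update_url(base_url, collection, commit="true", expungeDeletes="true")


# To actually send a request (this kicks off an expensive operation):
#   import urllib.request
#   urllib.request.urlopen(optimize_url("http://localhost:8983/solr", "mycollection"))
```

Try these on a test system first, since either request can run for a long time on a large index.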

Before you go down the road of optimizing regularly, you should determine whether freeing up the disk space for deleted documents actually makes a substantial difference in performance. In very old Solr versions, optimizing the index did produce major performance gains... but current versions have much better performance on indexes that have deleted documents. Because performance is typically drastically reduced while the optimize is happening, the tradeoff may not be worthwhile.
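One way to start that evaluation is to look at how much of the index is actually deleted documents. A minimal sketch, assuming you have the maxDoc and numDocs values for the core (the admin UI shows them, and I believe the Luke request handler reports them as well); the function name is made up:

```python
def deleted_fraction(max_doc, num_docs):
    """Fraction of the index occupied by deleted documents.
    maxDoc counts deleted docs that have not been merged away yet;
    numDocs counts only live docs."""
    if max_doc <= 0:
        return 0.0
    return (max_doc - num_docs) / max_doc


# Example: maxDoc=1000, numDocs=750 means a quarter of the docs are deleted.
#   deleted_fraction(1000, 750)
```

This only tells you how many deleted docs there are, not whether they hurt. You would still need to compare query times before and after an optimize to know if it is worth doing.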

Thanks,
Shawn
