On 10/25/2020 11:22 PM, Moulay Hicham wrote:
> I am wondering about 3 other things:
> 1 - You mentioned that I need free disk space. Just to make sure that we
> are talking about disc space here. RAM can still remain at the same size?
> My current RAM size is Index size < RAM < 1.5 Index size
You must always have enough disk space available for your indexes to
double in size. We recommend having enough disk space for your indexes
to *triple* in size, because there is a real-world scenario that will
require that much disk space.
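
As a rough illustration of that rule of thumb, here is a minimal Python
sketch that compares free disk space against the current index size.
The function names and the growth_factor parameter are hypothetical and
not part of Solr; a factor of 2 means "room to double" and 3 means
"room to triple".

```python
import os
import shutil


def index_size_bytes(index_dir):
    """Sum the sizes of all regular files under index_dir."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A segment file can disappear mid-walk when a merge finishes.
                pass
    return total


def required_free_bytes(index_bytes, growth_factor=3):
    """Free space needed so the index can grow to growth_factor times its size."""
    return index_bytes * (growth_factor - 1)


def has_headroom(index_dir, growth_factor=3):
    """True if the volume holding index_dir has enough free space."""
    free = shutil.disk_usage(index_dir).free
    return free >= required_free_bytes(index_size_bytes(index_dir), growth_factor)
```

For example, has_headroom("/var/solr/data", growth_factor=2) checks the
"double" minimum; the path is only a placeholder for wherever your
index actually lives.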
> 2 - When the merge is happening, it happens in disc and when it's
> completed, then the data is sync'ed with RAM. I am just guessing here ;-).
> I couldn't find a good explanation online about this.
If you have enough free memory, then the OS will make sure that the data
is available in RAM. All modern operating systems do this
automatically. Note that I am talking about memory that is not
allocated to programs. Any memory assigned to the Solr heap (or any
other program) will NOT be available for caching index data.
If you want ideal performance in typical situations, you must have as
much free memory as the space your indexes take up on disk. For ideal
performance in ALL situations, you'll want enough free memory to be able
to hold both the original and optimized copies of your index data at the
same time. We have seen that good performance can be achieved without
going to this extreme, but if you have little free memory, Solr
performance will be terrible.
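
The arithmetic behind that guidance can be sketched in a few lines of
Python. Everything here is illustrative: the function names are
hypothetical, and treating "original plus optimized copy" as twice the
index size is my simplifying assumption (the optimized copy is usually
no larger than the original).

```python
def page_cache_budget(total_ram, heap, other_programs=0):
    """Memory left for the OS disk cache after programs take their share."""
    return max(total_ram - heap - other_programs, 0)


def memory_assessment(total_ram, heap, index_size, other_programs=0):
    """Classify a machine against the rules of thumb above (sizes in bytes)."""
    cache = page_cache_budget(total_ram, heap, other_programs)
    if cache >= 2 * index_size:
        # Assumption: original + optimized copy is about 2x the index size.
        return "ideal in all situations"
    if cache >= index_size:
        return "ideal in typical situations"
    return "index only partially cached"


GIB = 1024 ** 3
# 32 GiB of RAM, an 8 GiB Solr heap, and a 20 GiB index leave 24 GiB for caching.
print(memory_assessment(32 * GIB, 8 * GIB, 20 * GIB))
```

The last category is not automatically bad, since good performance is
often possible below the ideal, but the further the cache budget falls
below the index size, the more likely performance problems become.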
I wrote a wiki page that covers this in some detail:
https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems
> 3 - Also I am wondering what recommendation you have for continuously
> purging deleted documents. optimize? expungeDeletes? Natural Merge?
> Here are more details about the need to purge documents.
The only way to guarantee that all deleted docs are purged is to
optimize. You could use the expungeDeletes action ... but this might
not get rid of all the deleted documents, and depending on how those
documents are distributed across the whole index, expungeDeletes might
not do anything at all. Both operations are expensive (they require a
lot of time and system resources) and will temporarily increase the
size of your index, up to double the starting size.
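
For reference, both operations can be triggered through the update
handler with request parameters. The Python sketch below only builds
the URLs and offers an optional helper to send them. The base URL and
collection name are placeholders, and the helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def update_url(base_url, collection, **params):
    """Build an update-handler URL with the given query parameters."""
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


def optimize_url(base_url, collection):
    """Full optimize: rewrites the index and purges all deleted docs."""
    return update_url(base_url, collection, optimize="true")


def expunge_deletes_url(base_url, collection):
    """Commit with expungeDeletes: may purge only some deleted docs, or none."""
    return update_url(base_url, collection, commit="true", expungeDeletes="true")


def send(url, timeout=3600):
    """Issue the request; these operations can run for a long time."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


# Placeholder host and collection; adjust for your installation.
print(optimize_url("http://localhost:8983/solr", "mycollection"))
print(expunge_deletes_url("http://localhost:8983/solr", "mycollection"))
```

Make sure the free disk space is there before calling send() on either
URL, since the index can temporarily grow while the operation runs.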
Before you go down the road of optimizing regularly, you should
determine whether freeing up the disk space for deleted documents
actually makes a substantial difference in performance. In very old
Solr versions, optimizing the index did produce major performance
gains, but current versions perform much better on indexes that
contain deleted documents. Because performance is typically
drastically reduced while the optimize is happening, the tradeoff may
not be worthwhile.
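
One input to that decision is how much of the index is actually made up
of deleted documents. The sketch below computes that fraction from the
numDocs and maxDoc counters. The fetch helper assumes the Luke request
handler is enabled at /admin/luke and returns those counters under an
"index" key in its JSON response; verify that against your Solr version.
All function names are hypothetical.

```python
import json
from urllib.request import urlopen


def deleted_fraction(num_docs, max_doc):
    """Share of the index occupied by deleted docs (maxDoc includes them)."""
    if max_doc <= 0:
        return 0.0
    return (max_doc - num_docs) / max_doc


def fetch_index_stats(base_url, core):
    """Read index-level counters from the Luke handler (assumed enabled)."""
    url = f"{base_url.rstrip('/')}/{core}/admin/luke?numTerms=0&wt=json"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)["index"]


# Example with made-up counters: 250 of 1000 document slots are deleted.
print(deleted_fraction(num_docs=750, max_doc=1000))
```

A high fraction alone does not prove that purging will help. The real
test is to compare query performance before and after on a copy of the
index.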
Thanks,
Shawn