Hi All,

I am wondering whether there is any recommendation or convention for planning and benchmarking a Solr node / SolrCloud cluster infrastructure. I am looking for a somewhat more structured approach than loading our forecast data volumes and then adding resources (CPU, RAM, disk, etc.) or nodes until performance is acceptable.
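To make the question concrete: the best I can do today is to replay a sample query set while growing the index and compare latency percentiles between runs, along the lines of the rough sketch below (the host, collection name, and queries are placeholders for our setup):

    import json
    import time
    import urllib.parse
    import urllib.request

    # Placeholders: a test collection on a local node loaded with forecast data.
    SOLR_URL = "http://localhost:8983/solr/testcollection/select"
    QUERIES = ["*:*", "title:solr", "body:benchmark"]  # sample of our query mix
    SAMPLES = 200

    latencies = []
    for i in range(SAMPLES):
        q = urllib.parse.quote(QUERIES[i % len(QUERIES)])
        start = time.perf_counter()
        with urllib.request.urlopen(f"{SOLR_URL}?q={q}&rows=10&wt=json") as resp:
            json.load(resp)  # read the full response body before stopping the clock
        latencies.append((time.perf_counter() - start) * 1000.0)

    latencies.sort()
    for p in (0.50, 0.95, 0.99):
        print(f"p{int(p * 100)}: {latencies[int(len(latencies) * p)]:.1f} ms")

Is this roughly the right kind of measurement, or is there established tooling for it?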
Is there any way to analyze, benchmark, and forecast Solr resource requirements according to the planned future load? E.g. if we have 2 cloud nodes today and our data volume doubles, can we simply add two more nodes and expect roughly the same performance? What if the data volume grows 10x or 100x? How can we plan resources in advance?

I _know_ that Solr / SolrCloud scales _very well_, but is there any supporting documentation, metrics, etc. available? I have only heard reports that SolrCloud is widely used with great success, but in a company environment you normally have to show some numbers proving that the investment in the hardware will pay off and that you will not require a 5-fold emergency increase in infrastructure funding just to keep the system up and running.

If no such research document is available, I would be much obliged if you could give some hints on what and how to measure in the Solr / SolrCloud world, e.g. what the optimal resource utilization of a Solr instance is, or how to recognize that an instance is thrashing.
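For instance, I have started polling a node's Metrics API (/solr/admin/metrics, available since Solr 6.4) for heap usage, GC activity, and OS load, roughly as below, but I do not know which values indicate healthy utilization versus an instance in trouble (the node address and metric prefixes are just my guesses from the reference guide):

    import json
    import urllib.request

    NODE = "http://localhost:8983"  # placeholder node address

    # group=jvm exposes JVM/OS gauges; json.nl=map renders named lists as JSON objects.
    url = (f"{NODE}/solr/admin/metrics?group=jvm"
           "&prefix=memory.heap,os.systemLoadAverage,gc."
           "&wt=json&json.nl=map")
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)

    for registry, metrics in data["metrics"].items():
        for name, value in metrics.items():
            print(f"{registry} {name} = {value}")

Thanks,
Peter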