Hello,

We are running some tests to improve our Solr performance.

We have around 15 collections on our Solr cluster, but we are particularly interested in one collection holding a large number of documents (https://gist.github.com/AnkushKhanna/9a472bccc02d9859fce07cb0204862da).

Issue: We see high response times from this collection, for the same queries, when user load or update load increases.

What we are aiming for: low response times (under 3 seconds) during high update/query traffic.

Current collection, production:
* SolrCloud, 2 shards, 2 replicas
* Indexed: 5.4 million documents
* 45 indexed fields per document
* Soft commit: 5 seconds
* Hard commit: 10 minutes

Test setup:
* Indexed: 3 million documents
* Everything else is the same as in production
* Using Gatling to mimic the behaviour of updates and user traffic

Findings: We see the problem occurring more often when:
* the query is longer than 2000 characters (we could limit searches to 2000 characters, but is there a solution that does not require limiting the query size?)
* there are many concurrent updates
* user traffic is high

Some settings I explored:
* 1 shard and 3 replicas
* Hard commit: 5 minutes (referencing https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/)

With both of the above changes we see some improvement, but nothing drastic. (Attach images)

I would like more insight into the following questions:
* Why is there an improvement when lowering the hard commit interval? Would it be interesting to explore even lower hard commit intervals?
* Can someone provide other pointers I could explore?

Regards
Ankush Khanna
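For context, the commit intervals above correspond to the autoCommit/autoSoftCommit blocks in solrconfig.xml. A sketch of what our current production settings would look like there (the surrounding updateHandler element and openSearcher choice are shown as commonly configured, not copied from our actual config):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: 10 minutes (600000 ms); flushes the tlog,
       does not open a new searcher -->
  <autoCommit>
    <maxTime>600000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: 5 seconds (5000 ms); makes new documents
       visible to searches -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>
```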