We are building a web application that will contain posts (something like FB or YouTube). For the stable part of the data (i.e. the facets, the search results and their content), we plan to use SOLR.
What should we use for the unstable part of the data (i.e. dynamic, volatile content such as like counts, comment counts and view counts)?

Option 1) Redis

What about storing the "dynamic" data in a separate data store such as Redis? That way, every time the counts change, I do not have to reindex anything into SOLR. SOLR indexing is triggered only when new posts are added to the site, never by user activity on existing posts.

Side note: I also looked at the SOLR-Redis plugin at https://github.com/sematext/solr-redis. The plugin looks good, but I am not sure whether it can return data stored in Redis as part of the SOLR result set, i.e. in the docs. From the description, it looks more like the Redis data can be used in function queries for boosting, sorting, etc. Does anyone have experience with this?

Option 2) SOLR NRT with soft commits

We would depend on the built-in NRT (near-real-time) features. Let's say we do a soft commit every second and a hard commit every 10 seconds. Suppose a huge amount of dynamic data is created on the site across thousands of posts, e.g. 100,000 likes across 10,000 posts. This would mean soft-committing 10,000 rows every second, and then hard-committing that many rows every 10 seconds. Isn't this overkill?

Which option is preferred? How would you compare the two in terms of scalability, maintenance, feasibility, best practices, etc.? Any real-life experiences or links to articles?

Many thanks!

P.S. EFF (ExternalFileField) is not an option, as I read that the data in that file can only be used in function queries and cannot be returned as part of a document.
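To make Option 1 more concrete, here is a rough sketch of what I have in mind: SOLR returns the stable documents, and the application layer merges in the counters before rendering. Everything in it is an assumption for illustration only: the `InMemoryCounterStore` class is a hypothetical stand-in for Redis so the snippet runs on its own, and the field names (`likes`, `comments`, `views`), the post ids and the `attach_counters` helper are made up. With a real Redis, I would expect the increment to map to `HINCRBY` and the bulk read to pipelined `HMGET` calls.

```python
from collections import defaultdict

# Hypothetical counter names; not part of any SOLR or Redis schema.
COUNTER_FIELDS = ("likes", "comments", "views")


class InMemoryCounterStore:
    """Stand-in for Redis (assumption): one hash of counters per post id."""

    def __init__(self):
        self._hashes = defaultdict(dict)

    def incr(self, post_id, field, amount=1):
        # With Redis this would roughly be: HINCRBY post:<id> <field> <amount>
        current = self._hashes[post_id].get(field, 0) + amount
        self._hashes[post_id][field] = current
        return current

    def get_many(self, post_ids):
        # With Redis this would roughly be one pipelined HMGET per post id.
        # Posts with no activity yet report 0 for every counter.
        return {
            pid: {f: self._hashes.get(pid, {}).get(f, 0) for f in COUNTER_FIELDS}
            for pid in post_ids
        }


def attach_counters(solr_docs, store):
    """Return copies of the SOLR docs with the volatile counters merged in."""
    counters = store.get_many([doc["id"] for doc in solr_docs])
    return [{**doc, **counters[doc["id"]]} for doc in solr_docs]


# User activity only touches the counter store; SOLR is never reindexed.
store = InMemoryCounterStore()
store.incr("post-1", "likes")
store.incr("post-1", "likes")
store.incr("post-2", "views", 5)

# Pretend this list is one page of a SOLR response (only the stable fields).
solr_docs = [
    {"id": "post-1", "title": "First post"},
    {"id": "post-2", "title": "Second post"},
]
page = attach_counters(solr_docs, store)
```

The obvious limitation of merging in the application layer is that SOLR itself never sees the counts, so sorting, boosting or filtering by them would still need something like the plugin mentioned above.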