Hi All,

We need some advice/views on the way we push our documents into Solr (4.8.1). Here are the requirements:
1. Documents range from 5 to 100 KB in size.
2. 10-50 users actively query Solr with different kinds of data.
3. Data arrives frequently (streaming) and must be queryable within 15 seconds of being pushed to Solr.

Current scenario: We dump data to a JSON file, and a Java cron job reads this file periodically (each time a new file is created) and sends it to Solr using SolrJ (via HTTP). This file is massive and can be ~GBs in size in some cases (soft and hard Solr commits are configured appropriately).

Issues:
1. Multiple cores exist in this Solr instance and they too follow a similar pattern.
2. This causes Solr to hang, and in some cases to OOM, due to too many open file descriptors (or sometimes other issues).

We would like to know whether using DataImportHandler would give us any advantage. I took a quick glance at the Solr wiki, but it is not clear whether it offers any advantage in terms of performance in this scenario.

Regards,
Prateek Jain
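For context, this is roughly what our commit configuration looks like to satisfy the 15-second visibility requirement. The interval values below are illustrative, not necessarily our exact settings:

```xml
<!-- solrconfig.xml: example autocommit settings (values are illustrative) -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes segments to disk and truncates the transaction log.
       openSearcher=false keeps it cheap; visibility comes from soft commits. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: makes newly indexed documents visible to searchers.
       10 seconds keeps us inside the 15-second visibility window. -->
  <autoSoftCommit>
    <maxTime>10000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With this in place the client should send updates with commit disabled and let the server-side autocommits handle visibility, rather than committing per request.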