I have around 300,000 records to upload to a SolrCloud suggester. These records are dynamic, i.e. new documents will be added and some documents will be deleted on a regular basis. I have tried two approaches, and each has a problem:
1. Use FileDictionaryFactory: this method is an operational nightmare. I would need to keep regenerating the dictionary file and uploading it to ZooKeeper (I still haven't figured out how to upload a file this large to ZooKeeper), and I might need to build the suggester index on each server in the SolrCloud cluster separately. Doing this frequently does not seem feasible.

2. Use DocumentDictionaryFactory: this seems like the obvious choice, but building the index is a nightmare as well. Every time I try to build the index, I get a "No space left on device" error. I tried building it on 5K records and it succeeded, but it took 40 minutes and consumed all 10 GB of memory for the entire duration. (A rough sketch of the configuration I mean is at the end of this post.)

My question is: can the index build time be optimized if I follow the second approach? Or, if I follow the first approach, what is the ideal way of handling frequent changes that need to be indexed on SolrCloud?

Thanks and regards,
Diwakar Bhardwaj
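
P.S. For reference, this is roughly the kind of DocumentDictionaryFactory setup I am describing in option 2. The component name, fields, lookup implementation, and analyzer type below are placeholders rather than my exact config; buildOnCommit and buildOnStartup are set to false so the suggester index is only rebuilt when I explicitly ask for it:

<!-- solrconfig.xml sketch: suggest component backed by DocumentDictionaryFactory.
     "mySuggester", "title", "popularity" and "text_general" are placeholder names. -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="weightField">popularity</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <!-- avoid rebuilding on every commit; build only on demand -->
    <str name="buildOnCommit">false</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

With a setup like this, I trigger the build manually with something like /suggest?suggest=true&suggest.dictionary=mySuggester&suggest.build=true, and that build step is where I see the 40-minute runtime and the memory consumption.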