hi all. complete noob as to solrcloud here. almost-non-noob on solr in general.
we're experiencing growing pains with our data and i'm thinking through a move to solrcloud as a result. i'm hoping to find out whether that seems like a good strategy or whether we need to get other areas handled first before introducing new complexities.

here's a rundown of things:

- we are on a 30g ram aws instance
- we have ~30g tucked away in the ../solr/server/ dir
- our largest core is 6.8g w/ ~25 segments at any given time. this is also the core that our business directly runs off of, users interact with, etc.
- 5g is for a logs type of dataset that analytics can be built off of to help inform the primary core above
- 3g are taken up by 3 different third party sources that we use solr to warehouse and keep available for query, so that items in our primary core can be linked to these cores for data enrichment
- several others take up < 1g each
- and then we have dev- and demo- flavors for some of these

we had been operating on a 16gb machine till a few weeks ago (actually bumped it while at lucene revolution bc i hadn't noticed how much our cache needs had outgrown the machine till the week before!). load during imports and our heavier operations is much better now, and the machine no longer buckles under them the way it had been.

we have no master/slave replication. all of our data is 'replicated' by the fact that it exists in mysql. if solr were to go down it'd be a nice big fire, but one we could recover from within a couple hours by simply reimporting. i'd like to have a more sophisticated setup in place for fault tolerance than that, of course. i'd also like to see our heavy, many-query based operations be speedier and better able to handle several multi-threaded runs at once w/ ease.

is this a matter of:

- getting still more ram on the machine?
- cpus for faster processing?
- splitting up the read/write operations between master/slave?
- going full steam into a solrcloud configuration?

one more note.
per discussion at the conference, i'm combing through our configs to make sure we trim any fat we can. i'm also wanting to get optimization scheduled more regularly to help out w/ segmentation and garbage heap. not sure how far those two alone will get us, though.

thanks for any thoughts!

-- John Blythe