Re: Optimal configuration for high throughput indexing

2015-05-04 Thread Vinay Pothnis
Hi Shawn, Thanks for your input. The 12GB is for Solr. I did read through your wiki, and your G1-related recommended settings are already included. I tried a lower-memory config (7GB) as well, and it did not yield any better results. Right now, we are in the process of changing the updates to use SolrJ
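For readers without the wiki at hand, G1 settings "in the spirit of" Shawn's recommendations look roughly like the sketch below. The exact values are assumptions for illustration, not the poster's actual configuration, and must be tuned per heap size and workload.

```shell
# Illustrative G1 tuning block for a Solr JVM (JDK 7/8 era flags).
# Values below are assumptions; tune region size and pause target per heap.
SOLR_HEAP="12g"
GC_TUNE="-XX:+UseG1GC \
  -XX:+ParallelRefProcEnabled \
  -XX:G1HeapRegionSize=8m \
  -XX:MaxGCPauseMillis=250 \
  -XX:InitiatingHeapOccupancyPercent=75"
```

These would typically be passed to the JVM via the Solr start script's options variable.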

Re: Optimal configuration for high throughput indexing

2015-05-04 Thread Shawn Heisey
On 5/4/2015 2:36 PM, Vinay Pothnis wrote:
> But nonetheless, we will give the latest SolrJ client + CloudSolrServer a try.
> * Yes, the documents are pretty small.
> * We are using the G1 collector and there are no major GCs; however, there are a lot of minor GCs, sometimes going up to 2s per m
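Before tuning further, it helps to quantify those minor pauses. A minimal sketch of JDK 8 GC-logging flags that record each pause's duration and timestamp (the log path and exact flag set are assumptions, not from the thread):

```shell
# Sketch: GC logging so young-gen pause length/frequency can be measured.
# Log path is illustrative; PrintGCApplicationStoppedTime records total
# stop-the-world time, which is what queries actually experience.
GC_LOG_OPTS="-Xloggc:/var/log/solr/gc.log \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime"
```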

Re: Optimal configuration for high throughput indexing

2015-05-04 Thread Vinay Pothnis
Hi Erick, Thanks for your input. I think, a long time back, we made a conscious decision to skip the SolrJ client and use plain HTTP. I think it might have been because, at the time, the SolrJ client was queueing updates in its memory or something. But nonetheless, we will give the latest SolrJ client + clou

Re: Optimal configuration for high throughput indexing

2015-05-03 Thread Erick Erickson
First, you shouldn't be using HttpSolrClient; use CloudSolrServer (CloudSolrClient in 5.x). That takes the ZK address and routes the docs to the leader, reducing the network hops docs have to go through. AFAIK, in cloud setups it is in every way superior to plain HTTP. I'm guessing your docs aren't huge
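Erick's suggestion can be sketched roughly as below, using the SolrJ 5.x constructor form. The ZooKeeper hosts, collection name, field names, and batch size are all placeholders, and the omitted explicit commit assumes autoCommit is configured in solrconfig.xml; this is an illustration, not the poster's code.

```java
// Sketch: leader-aware indexing with CloudSolrClient (SolrJ 5.x).
// ZK hosts, collection name, and batch size below are illustrative.
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

import java.util.ArrayList;
import java.util.List;

public class CloudIndexer {
    public static void main(String[] args) throws Exception {
        // CloudSolrClient is given the ZooKeeper ensemble, not a Solr URL,
        // so it can discover shard leaders and route each doc directly.
        try (CloudSolrClient client =
                 new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181/solr")) {
            client.setDefaultCollection("mycollection");

            List<SolrInputDocument> batch = new ArrayList<>();
            for (int i = 0; i < 10_000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                doc.addField("body_t", "small doc " + i);
                batch.add(doc);
                if (batch.size() == 500) {  // send batches, not one doc per request
                    client.add(batch);
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                client.add(batch);
            }
            // No explicit commit: for high-throughput indexing, rely on
            // autoCommit/autoSoftCommit settings in solrconfig.xml instead.
        }
    }
}
```

Batching plus leader routing removes both the per-request HTTP overhead and the extra hop through a non-leader node.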

Optimal configuration for high throughput indexing

2015-04-30 Thread Vinay Pothnis
Hello, I have a use case with the following characteristics:
- High index update rate (adds/updates)
- High query rate
- Low index size (~800MB for 2.4 million docs)
- The documents that are created at the high rate eventually "expire" and are deleted regularly at half-hour intervals

I current
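The half-hour expiry pass described above usually boils down to a range delete on a timestamp field. A minimal sketch of building that delete-by-query string, assuming a hypothetical `expire_at_dt` field (the field name and 30-minute window are assumptions, not details from the thread):

```java
// Sketch: build a delete-by-query string for docs whose (hypothetical)
// expire_at_dt field is more than 30 minutes in the past.
import java.time.Duration;
import java.time.Instant;
import java.time.format.DateTimeFormatter;

public class ExpiryQuery {
    static String expiredDocsQuery(Instant now) {
        Instant cutoff = now.minus(Duration.ofMinutes(30));
        // Solr expects ISO-8601 UTC timestamps, e.g. 2015-04-30T12:00:00Z
        return "expire_at_dt:[* TO "
                + DateTimeFormatter.ISO_INSTANT.format(cutoff) + "]";
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2015-04-30T12:30:00Z");
        System.out.println(expiredDocsQuery(now));
        // prints expire_at_dt:[* TO 2015-04-30T12:00:00Z]
    }
}
```

The resulting string would then be passed to the client's deleteByQuery call on a half-hour schedule.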