Thanks to Ryan Ernst for pointing out that my issue is a duplicate of SOLR-4449. I think that proposal might be very useful (some supporting links are attached there and are worth reading).
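For anyone skimming the thread, here is a rough sketch of the idea under discussion: send the same request to every replica of a shard and use whichever answer arrives first. It is not the SOLR-4449 or SOLR-5092 patch and does not touch SolrJ. The names `query_replica` and `first_response` are hypothetical, and each replica is simulated with a sleep.

```python
import concurrent.futures as cf
import time


def query_replica(name, delay):
    """Hypothetical stand-in for a shard request: wait `delay` seconds, then answer."""
    time.sleep(delay)
    return name


def first_response(replicas):
    """Send the same request to every replica and return the first answer.

    `replicas` is a list of (name, simulated_delay_seconds) pairs.
    """
    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(query_replica, name, delay) for name, delay in replicas]
    try:
        # as_completed yields futures in the order they finish.
        return next(cf.as_completed(futures)).result()
    finally:
        # Do not block on the slower replicas; their answers are discarded.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    # replica_b is the fast one in this simulated run.
    print(first_response([("replica_a", 0.4), ("replica_b", 0.05)]))
```

Note that every query here costs one request per replica, which is exactly the extra cluster load Shawn objects to below.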
On Tue, Jul 30, 2013 at 11:49 PM, Isaac Hebsh <isaac.he...@gmail.com> wrote:

> Hi,
> I submitted a new JIRA for this:
> https://issues.apache.org/jira/browse/SOLR-5092
>
> A (very initial) patch is already attached. Reviews are very welcome.
>
>
> On Sun, Jul 28, 2013 at 4:50 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> You'd probably start in CloudSolrServer in SolrJ code,
>> as far as I know that's where the request is sent out.
>>
>> I'd think that would be better than changing Solr itself
>> since if you found that this was useful you wouldn't
>> be patching your Solr release, just keeping your client
>> up to date.
>>
>> Best
>> Erick
>>
>> On Sat, Jul 27, 2013 at 7:28 PM, Isaac Hebsh <isaac.he...@gmail.com> wrote:
>> > Shawn, thank you for the tips.
>> > I know the significant cons of virtualization, but I don't want to move
>> > this thread into a virtualization pros/cons in the Solr(Cloud) case.
>> >
>> > I've just asked what is the minimal code change should be made, in order to
>> > examine whether this is a possible solution or not.. :)
>> >
>> >
>> > On Sun, Jul 28, 2013 at 1:06 AM, Shawn Heisey <s...@elyograg.org> wrote:
>> >
>> >> On 7/27/2013 3:33 PM, Isaac Hebsh wrote:
>> >> > I have about 40 shards. repFactor=2.
>> >> > The cause of slower shards is very interesting, and this is the main
>> >> > approach we took.
>> >> > Note that in every query, it is another shard which is the slowest. In 20%
>> >> > of the queries, the slowest shard takes about 4 times more than the average
>> >> > shard qtime.
>> >> > While continuing investigation, remember it might be the virtualization /
>> >> > storage-access / network / gc /..., so I thought that reducing the effect
>> >> > of the slow shards might be a good (temporary or permanent) solution.
>> >>
>> >> Virtualization is not the best approach for Solr. Assuming you're
>> >> dealing with your own hardware and not something based in the cloud like
>> >> Amazon, you can get better results by running on bare metal and having
>> >> multiple shards per host.
>> >>
>> >> Garbage collection is a very likely source of this problem.
>> >>
>> >> http://wiki.apache.org/solr/SolrPerformanceProblems#GC_pause_problems
>> >>
>> >> > I thought it should be an almost trivial code change (for proving the
>> >> > concept). Isn't it?
>> >>
>> >> I have no idea what you're saying/asking here. Can you clarify?
>> >>
>> >> It seems to me that sending requests to all replicas would just increase
>> >> the overall load on the cluster, with no real benefit.
>> >>
>> >> Thanks,
>> >> Shawn