15M docs may still comfortably fit in a single shard!
I've seen up to 300M docs fit on a shard. Then
again I've seen 10M docs make things unacceptably
slow.
You simply cannot extrapolate from 10K to
5M reliably. Put all 5M docs on the stand-alone
servers and test _that_. Whenever I see numbers
lik
You may want to use the document routing (_route_) option to make your
queries faster, but above you are comparing apples with oranges:
your performance test numbers have to be based either on your
actual numbers, like 3-5 million docs per shard, or on enough docs to see
advantag
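To make the difference between these query styles concrete, here is a minimal sketch that only builds the request URLs. The host, port and collection name are placeholders, and `select_url` is a hypothetical helper, not part of any Solr client. `shards` and `_route_` are Solr request parameters; whether `_route_=shard1` selects the shard you expect depends on the collection's router, so treat that line as an assumption to verify on your own cluster.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and collection name are placeholders.
BASE = "http://localhost:8983/solr/mycollection/select"


def select_url(q, rows=10, **extra):
    """Build a /select URL; extra keyword arguments become request parameters."""
    params = {"q": q, "rows": rows}
    params.update(extra)
    return BASE + "?" + urlencode(params)


# No shard restriction: the receiving node fans the query out to every shard.
fan_out = select_url("*:*")

# Restrict the query by route (assumed to name the shard with the implicit router).
routed = select_url("*:*", _route_="shard1")

# Restrict the query with the shards parameter, as in the original test.
pinned = select_url("*:*", shards="shard1")

if __name__ == "__main__":
    for url in (fan_out, routed, pinned):
        print(url)
```

Comparing throughput between the first URL and the other two only tells you something if each shard holds a realistic number of documents.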
This is just for performance testing; we have taken 10K records per shard. In
the live scenario it would be 30-50 lakh (3-5 million) per shard. I want to search documents
across all shards, but that slows down and takes too long.
I know that in the case of SolrCloud, it will query all shard nodes and then return
result. Is ther
+1 to Susheel's question. Sharding inevitably adds
overhead. Roughly each shard is queried
for its top N docs (10 if, say, rows=10). The
doc ID and sort criteria (score by default) are returned
to the node that originally got the request. That node
then sorts the lists into the real top 10 to retur
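The merge step described above can be sketched roughly as follows. This is an illustration of the general scatter-gather idea, not Solr's actual code: `merge_top_n` and the sample data are hypothetical, and real shard responses carry more than an ID and a score.

```python
import heapq
from typing import Dict, List, Tuple

# Each shard returns its own top-N as (doc_id, score) pairs.
ShardHits = List[Tuple[str, float]]


def merge_top_n(per_shard: Dict[str, ShardHits], n: int) -> List[Tuple[str, float, str]]:
    """Merge per-shard top-N lists into the global top N by descending score."""
    candidates = [
        (doc_id, score, shard)
        for shard, hits in per_shard.items()
        for doc_id, score in hits[:n]  # a shard never contributes more than N
    ]
    # The coordinating node only needs the N best of at most shards * N entries.
    return heapq.nlargest(n, candidates, key=lambda c: c[1])


if __name__ == "__main__":
    sample = {
        "shard1": [("a", 9.1), ("b", 7.4), ("c", 2.0)],
        "shard2": [("d", 8.8), ("e", 8.5), ("f", 1.2)],
        "shard3": [("g", 9.5), ("h", 3.3), ("i", 0.4)],
    }
    for doc_id, score, shard in merge_top_n(sample, 3):
        print(doc_id, score, shard)
```

The coordinating node handles up to shards × N entries per request, so a request with rows=3000 against three shards means merging up to 9000 entries before the stored fields of the final documents are fetched.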
Hello,
Question: Do you really need sharding, or can you live without it, since you
mentioned only 10K records in one shard? What's your index/document size?
Thanks,
Susheel
On Mon, Jul 18, 2016 at 2:08 AM, kasimjinwala
wrote:
> currently I am using solrCloud 5.0 and I am facing query performanc
Currently I am using SolrCloud 5.0 and I am facing a query performance issue
while using 3 implicit shards; each shard contains around 10K records.
When I specify the shards parameter (*shards=shard1*) in the query, it gives
30K-35K qps, but when I remove the shards parameter from the query it gives
*1000-1500qp
On 1/9/2013 7:01 PM, sausarkar wrote:
> Hi Yonik,
> Could you merge this feature into the 4.0 branch? We tried 4.1; it did
> solve the CPU spike, but we ran into other issues. As we are on a very tight
> schedule, it would be very beneficial if you could merge this feature into
> the 4.0 branch.
4.1 *is* the
Hi Yonik,
Could you merge this feature into the 4.0 branch? We tried 4.1; it did
solve the CPU spike, but we ran into other issues. As we are on a very tight
schedule, it would be very beneficial if you could merge this feature into
the 4.0 branch.
Let me know.
Thanks
--
View this message in contex
On Wed, Dec 12, 2012 at 5:03 PM, sausarkar wrote:
> We still could replicate the issue in 4.1 branch i.e. queries going to one
> server (numShards=1) is being distributed among all the servers which is
> creating CPU spikes in all the servers in the cloud. Do you think this
> behavior is as expect
We could still reproduce the issue in the 4.1 branch, i.e. queries going to one
server (numShards=1) are being distributed among all the servers, which is
creating CPU spikes on all the servers in the cloud. Do you think this
behavior is expected, or will it be fixed in the 4.1 release?
--
View this mes
OK, I tried to reproduce it on trunk, and I can't (i.e. everything is
looking fine).
rm -rf example/solr/zoo_data
cp -rp example example2
cp -rp example example3
cd example
java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
cd exa
I'm still looking into this - didn't have a lot of luck seeing it with
a test and am going to look at it manually.
I'm hoping 4.1 by xmas!
We will see though...need to get others on board.
- Mark
On Tue, Dec 11, 2012 at 2:40 PM, sausarkar wrote:
> Do you know when will 4.1 be released or will
On Thu, Dec 6, 2012 at 8:08 PM, sausarkar wrote:
> Ok we think we found out the issue here. When solrcloud is started without
> specifying numShards argument solrcloud starts with a single shard but still
> thinks that there are multiple shards, so it forwards every single query to
> all the nodes
Do you know when 4.1 will be released, or whether there will be a 4.0.1 release
with bug fixes from 4.0?
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660p4026139.html
Sent from the Solr - User mailing list ar
iller-3 [via Lucene]"
> mailto:ml-node+s472066n4025457...@n3.nabble.com>>
> Date: Saturday, December 8, 2012 11:08 PM
> To: "Sarkar, Sauvik" mailto:sausar...@ebay.com>>
> Subject: Re: SolrCloud - Query performance degrades with multiple servers
>
> If
tyId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=1354918880447&am
s=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2
>
>
> Re: SolrCloud - Quer
%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&*isShard=true&*shard.url=*server1*:9090/solr/core0/|*server2*:9090/solr/core0/|*server3*:9090/solr/core0/&NOW=135491
On Dec 6, 2012, at 5:08 PM, sausarkar wrote:
> We solved the issue by explicitly adding numShards=1 argument to the solr
> start up script. Is this a bug?
Sounds like it…perhaps related to SOLR-3971…not sure though.
- Mark
Ryan, my new best friend! Please, file JIRA issue(s) for these items!
I'm sure you will get some feedback.
- Mark
On Dec 6, 2012, at 5:09 PM, Ryan Zezeski wrote:
> There are some gains to be made in Solr's distributed search code. A few
> weeks ago I spent time profiling dist search using d
There are some gains to be made in Solr's distributed search code. A few
weeks ago I spent time profiling dist search using dtrace/btrace and
found some areas for improvement. I planned on writing up some blog posts
and providing patches but I'll list them off now in case others have input.
1)
is a CPU spike each minute. Did you
also do this test on SolrCloud? Any observations or suggestions?
In Reply To
Re: SolrCloud - Query performance degrades with multiple servers
Dec 05, 2012; 7:59pm — by Mark Miller-3
This is just the std scatter gather distrib search stuff solr has bee
?
In Reply To
Re: SolrCloud - Query performance degrades with multiple servers
Dec 05, 2012; 7:59pm — by Mark Miller-3
This is just the std scatter gather distrib search stuff solr has been using
since around 1.4.
There is some overhead to that, but generally not much. I've measured it at
a
We measured that for just 3 nodes the overhead is around 100ms. We also noticed
that CPU spikes to 100% and some queries get blocked; this happens only when the
cloud has multiple nodes and does not happen on a single node. All the nodes
have the exact same configuration, JVM settings and hardware configu
M
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud - Query performance degrades with multiple servers
This is just the std scatter gather distrib search stuff solr has been using
since around 1.4.
There is some overhead to that, but generally not much. I've measured it at
around 30-50ms for
This is just the std scatter gather distrib search stuff solr has been using
since around 1.4.
There is some overhead to that, but generally not much. I've measured it at
around 30-50ms for 100 machines, each with 10 million docs, a few years ago.
So…that doesn't help you much…but FYI…
- Mark