Re: SolrCloud performance

2018-11-04 Thread Chuming Chen
Hi Shawn, Thank you very much for your analysis. I currently don’t have multiple machines to play with. I will try the setup you suggested: "one Solr instance and one ZK instance would be more efficient on a single server". Thanks again, Chuming On Nov 4, 2018, at 7:56 PM, Shawn Heisey wrote: > On 1

Re: SolrCloud performance

2018-11-04 Thread Shawn Heisey
On 11/4/2018 8:38 AM, Chuming Chen wrote: I have shared a tarball with you (apa...@elyograg.org) from Google Drive. The tarball includes the log directories of 4 nodes, solrconfig.xml, solr.in.sh, and a screenshot of the top command. The log files cover about one day. However, I restarted the solr c

Re: SolrCloud performance

2018-11-04 Thread Chuming Chen
Hi Shawn, I have shared a tarball with you (apa...@elyograg.org) from Google Drive. The tarball includes the log directories of 4 nodes, solrconfig.xml, solr.in.sh, and a screenshot of the top command. The log files cover about one day. However, I restarted the Solr cloud several times during that pe

Re: SolrCloud performance

2018-11-02 Thread Deepak Goel
Please see inline for my thoughts. Deepak

Re: SolrCloud performance

2018-11-02 Thread Shawn Heisey
On 11/2/2018 1:38 PM, Chuming Chen wrote: I am running SolrCloud 7.4 with 4 shards and 4 nodes (JVM "-Xms20g -Xmx40g"); each shard has 32 million documents and is 32 GB in size. A 40GB heap is probably completely unnecessary for an index of that size. Does each machine have one replica on

Re: solrcloud performance problem

2016-05-11 Thread Shawn Heisey
On 5/10/2016 7:46 PM, lltvw wrote: > the args used to start solr are as following, and upload my screen shot to > http://www.yupoo.com/photos/qzone3927066199/96064170/, please help to take a > look, thanks. > > -DSTOP.PORT=7989 > -DSTOP.KEY= > -DzkHost=node1:2181,node2:2181,node3:2181/solr > -Dso

Re: solrcloud performance problem

2016-05-10 Thread Shawn Heisey
On 5/9/2016 11:42 PM, lltvw wrote: > Using the jps command to double-check the parameters used to start Solr, I found that > the max heap size was already set to 10G. So I made a big mistake yesterday. > > But using the Solr admin UI, I selected the collection with the performance problem; > in the overview page I

Re: solrcloud performance problem

2016-05-09 Thread Toke Eskildsen
On Tue, 2016-05-10 at 00:41 +0800, lltvw wrote: > Recently we set up a 4.10 SolrCloud environment with about 9000 docs indexed > in it. This SolrCloud has 12 shards, each shard on a separate > machine, but when we try to search for some information on SolrCloud, the > response time is about 300ms. Could you pr

Re: solrcloud performance problem

2016-05-09 Thread Shawn Heisey
On 5/9/2016 9:11 PM, lltvw wrote: > You are right, the max heap is 512MB, thanks. 90 million documents split into 12 shards means 7.5 million documents per shard. With that many documents and a 512MB heap, you're VERY lucky if Solr doesn't experience OutOfMemoryError problems -- which will make S

Re: solrcloud performance problem

2016-05-09 Thread Shawn Heisey
On 5/9/2016 4:41 PM, lltvw wrote: > Shawn, thanks. > > Each machine has 48G of memory installed, with 20G free now. I checked the JVM > heap size using the Solr admin UI; the heap size is about 20M. What is the *max* heap? An unmodified install of Solr 5.x or later has a max heap of 512MB. In the admin

Re: solrcloud performance problem

2016-05-09 Thread Shawn Heisey
On 5/9/2016 10:52 AM, lltvw wrote: > Sorry, I missed the size of each shard; the size is about 3G each. Thanks. > > On 2016-05-10 00:41:13, lltvw wrote: >> Recently we set up a 4.10 SolrCloud environment with about 9000 docs indexed in >> it. This SolrCloud has 12 shards, each shard on a separate machine

Re: Solrcloud performance issues

2015-02-24 Thread longsan
Why do you use 15 replicas? More replicas means slower performance.

Re: Solrcloud performance issues

2015-02-12 Thread Otis Gospodnetic
Hi, Did you say you have 150 servers in this cluster? And 10 shards for just 90M docs? If so, that 150 hosts sounds like too much for all other numbers I see here. I'd love to see some metrics here. e.g. what happens with disk IO around those commits? How about GC time/size info? Are JVM mem

Re: Solrcloud performance issues

2015-02-12 Thread Timothy Potter
Hi Vijay, We're working on SOLR-6816 ... would love for you to be a test site for any improvements we make ;-) Curious if you've experimented with changing the mergeFactor to a higher value, such as 25 and what happens if you set soft-auto-commits to something lower like 15 seconds? Also, make s

RE: SolrCloud performance issues regarding hardware configuration

2014-07-21 Thread Toke Eskildsen
search engn dev [sachinyadav0...@gmail.com] wrote: > Yes, You are right my facet queries are for text analytic purpose. Does this mean that facet calls are rare (at most one at a time)? > Users will send boolean and spatial queries. current performance for spatial > queries is 100qps with 150 con

Re: SolrCloud performance issues regarding hardware configuration

2014-07-20 Thread Himanshu Mehrotra
Hi, Increasing the number of replicas per shard will help you take more concurrent users/queries resulting in increased throughput. Thanks, Himanshu On Mon, Jul 21, 2014 at 9:25 AM, search engn dev wrote: > Thanks Erick, > > /"So your choices are either to increase memory (a lot) or not do th

Re: SolrCloud performance issues regarding hardware configuration

2014-07-20 Thread search engn dev
Thanks Erick, "So your choices are either to increase memory (a lot) or not do this. It's a valid question whether this is useful information to present to a user (or are you doing some kind of analytics here?)." Yes, you are right, my facet queries are for text analytics purposes. Users will

RE: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread Toke Eskildsen
search engn dev [sachinyadav0...@gmail.com] wrote: > out of 700 million documents 95-97% values are unique approx. That's quite a lot. If you are not already using DocValues for that, you should do so. So, each shard handles ~175M documents. Even with DocValues, there is an overhead of just hav

Re: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread Erick Erickson
Right, this is the worst kind of use-case for faceting. You have 150M docs/shard and are asking up to 125M buckets to count into, plus control structures. Performance of this (even without OOMs) will be a problem. Having multiple queries execute this simultaneously will increase memory usage. So y
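A rough back-of-envelope sketch of why this facet is so expensive. It uses the thread's own figures (700 million documents, 4 shards, 95-97% unique values) and assumes, purely for illustration, that unique values are spread evenly across shards and that each bucket costs one 4-byte counter. Solr's real facet structures differ, so treat the result as an order-of-magnitude hint only.

```python
def facet_counter_estimate(docs_per_shard, unique_percent, bytes_per_counter=4):
    """Return (buckets, bytes) for one shard.

    bytes_per_counter=4 is an illustrative assumption, not a Solr internal.
    Integer arithmetic keeps the result exact.
    """
    buckets = docs_per_shard * unique_percent // 100
    return buckets, buckets * bytes_per_counter


# Figures quoted in the thread: 700M documents over 4 shards.
docs_per_shard = 700_000_000 // 4

low = facet_counter_estimate(docs_per_shard, 95)
high = facet_counter_estimate(docs_per_shard, 97)

for label, (buckets, size) in (("95% unique", low), ("97% unique", high)):
    print(f"{label}: {buckets:,} buckets, ~{size / 1e6:.0f} MB of counters per request")
```

Under these assumptions a single request needs several hundred MB for counters alone on each shard, before any control structures, and concurrent requests multiply that.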

RE: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread search engn dev
Out of 700 million documents, approximately 95-97% of the values are unique. My facet query is: http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.limit=1&facet.field=user_digest The above query throws an OOM exception as soon as I fire it at Solr.

RE: SolrCloud performance issues regarding hardware configuration

2014-07-18 Thread Toke Eskildsen
From: search engn dev [sachinyadav0...@gmail.com]: > 1 collection : 4 shards : each shard has one master and one replica > total documents : 700 million Are you using DocValues for your facet fields? What is the approximate number of unique values in your facets and what is their type (string, nu

Re: SolrCloud performance in VM environment

2013-10-23 Thread Erick Erickson
Be a bit careful here. 128G is lots of memory, you may encounter very long garbage collection pauses. Just be aware that this may be happening later. Best, Erick On Tue, Oct 22, 2013 at 5:04 PM, Tom Mortimer wrote: > Just tried it with no other changes than upping the RAM to 128GB total, and >

Re: SolrCloud performance in VM environment

2013-10-22 Thread Tom Mortimer
Just tried it with no other changes than upping the RAM to 128GB total, and it's flying. I think that proves that RAM is good. =) Will implement suggested changes later, though. cheers, Tom On 22 October 2013 09:04, Tom Mortimer wrote: > Boogie, Shawn, > > Thanks for the replies. I'm going to

Re: SolrCloud performance in VM environment

2013-10-22 Thread Tom Mortimer
Boogie, Shawn, Thanks for the replies. I'm going to try out some of your suggestions today. Although, without more RAM I'm not that optimistic.. Tom On 21 October 2013 18:40, Shawn Heisey wrote: > On 10/21/2013 9:48 AM, Tom Mortimer wrote: > >> Hi everyone, >> >> I've been working on an inst

Re: SolrCloud performance in VM environment

2013-10-21 Thread Shawn Heisey
On 10/21/2013 9:48 AM, Tom Mortimer wrote: Hi everyone, I've been working on an installation recently which uses SolrCloud to index 45M documents into 8 shards on 2 VMs running 64-bit Ubuntu (with another 2 identical VMs set up for replicas). The reason we're using so many shards for a relativel

RE: SolrCloud performance in VM environment

2013-10-21 Thread Boogie Shafer
Some basic tips:
- Try to create enough shards that the size of each index portion on a shard gets closer to the amount of RAM you have on each node (e.g. if you have a ~140GB index on 16GB nodes, try 12-16 shards).
- Start with just the initial shards; add replicas later when you have
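The arithmetic behind the shard-count tip can be sketched as follows. The 140GB index, 16GB nodes and 12-16 shard range come from the message; the helper names are hypothetical, and the calculation ignores heap and OS overhead, which is presumably why the suggested range sits above the bare minimum.

```python
import math


def min_shards(index_gb, ram_per_node_gb):
    """Smallest shard count whose per-shard index size fits in one node's RAM.

    Ignores JVM heap and OS overhead (an illustrative simplification).
    """
    return math.ceil(index_gb / ram_per_node_gb)


def per_shard_gb(index_gb, shards):
    """Index size per shard, assuming documents are spread evenly."""
    return index_gb / shards


index_gb, ram_gb = 140, 16
floor = min_shards(index_gb, ram_gb)
sizes = {n: per_shard_gb(index_gb, n) for n in (12, 16)}

print(f"bare minimum: {floor} shards")
for n, gb in sizes.items():
    print(f"{n} shards -> {gb:.2f} GB per shard on a {ram_gb} GB node")
```

With 12 to 16 shards each index portion is well under the node's RAM, leaving room for the heap and the OS disk cache.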

Re: SolrCloud Performance Issue

2013-10-21 Thread Erick Erickson
Shamik: You're right, the use of NOW shouldn't be making that much of a difference between versions. FYI, though, here's a way to use NOW and re-use fq clauses: http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/ It may well be this setting: 1000 Every second (assuming

Re: SolrCloud Performance Issue

2013-10-18 Thread Otis Gospodnetic
Hi, What happens if you have just 1 shard - no distributed search, like before? SPM for Solr or any other monitoring tool that captures OS and Solr metrics should help you find the source of the problem faster. Is disk IO the same? utilization of caches? JVM version, heap, etc.? CPU usage? network

Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
I tried commenting out NOW in bq, but it didn't make any difference in performance. I do see a minor entry in the queryfiltercache rate, which is a meager 0.02. I'm really struggling to figure out the bottleneck. Are there any known pain points I should be checking?

Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
Thanks Primoz, I was suspecting that too. But then, it's hard to imagine that the query cache alone is contributing to the big performance hit. The setting applies to the old configuration, and it works pretty well even with the query cache's low hit rate.

Re: SolrCloud Performance Issue

2013-10-16 Thread primoz . skale
The query result cache hit rate might be low due to using NOW in bf. NOW is always translated to the current time, and that of course changes from ms to ms... :) Primoz From: Shamik Bandopadhyay To: solr-user@lucene.apache.org Date: 17.10.2013 00:14 Subject: SolrCloud Performance Issue Hi
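A small illustration of the point about NOW. This is not Solr's cache code; it only mimics the idea that a clause containing NOW resolves to a concrete timestamp, so two requests a few milliseconds apart produce different cache keys, while a rounded form (Solr date math supports rounding such as NOW/DAY) resolves to the same value all day. The function name and timestamps are made up for the example.

```python
from datetime import datetime, timedelta, timezone


def resolve_now(moment, round_to_day):
    """Mimic resolving NOW (or NOW/DAY) to a concrete millisecond timestamp."""
    if round_to_day:
        moment = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    millis = moment.microsecond // 1000
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


# Two hypothetical requests arriving 5 ms apart.
first = datetime(2013, 10, 16, 12, 0, 0, 1000, tzinfo=timezone.utc)
second = first + timedelta(milliseconds=5)

raw_keys = {resolve_now(first, False), resolve_now(second, False)}
rounded_keys = {resolve_now(first, True), resolve_now(second, True)}

print(len(raw_keys), "distinct keys with plain NOW")
print(len(rounded_keys), "distinct key with NOW rounded to the day")
```

The coarser the rounding, the more requests can share a cached entry, at the cost of less precise time-based boosting or filtering.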

Re: SolrCloud Performance for High Query Volume

2013-01-18 Thread Niran Fajemisin
rsday, January 17, 2013 10:12 AM >Subject: Re: SolrCloud Performance for High Query Volume > > >Hello Niran, > > >> Now with roughly the "same" schema and solrconfig configuration > > >Can you be more specific about what was changed and how? >

Re: SolrCloud Performance - Indexing

2012-11-27 Thread Mark Miller
Yup, DIH is not optimal for SolrCloud yet. I made a few JIRA issues a short while ago that may help. I've seen people use it with SolrCloud in the past though - and it wasn't so slow…(though I'm sure slower than a single node). Search me... - Mark On Nov 27, 2012, at 1:24 PM, Mikhail Khludnev

Re: SolrCloud Performance - Indexing

2012-11-27 Thread Mikhail Khludnev
It sounds like DataImportHandler will not be really performant with SolrCloud. From what I see it should essentially work - it sends docs to the chain, which should distribute them via DistributedUpdateProcessor. But it works synchronously - no multithreading in DIH since 4.0! Does anyone have an ex

Re: SolrCloud Performance - Indexing

2012-11-27 Thread Mark Miller
To get the best speed out of SolrCloud you have to index from many clients (or threads). Even better is if you index to many nodes rather than one. Using a single thread against a single instance with replicas will be a fair amount slower with cloud than if you just used one node. - Mark On No
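A minimal sketch of the "many threads, many nodes" advice, under stated assumptions: `send_batch` is a hypothetical callable supplied by the caller (in practice it might POST the batch to a node's update handler), and the node URLs, batch size and worker count are illustrative. No real Solr client API is used or implied here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def index_in_parallel(docs, send_batch, node_urls, batch_size=500, workers=4):
    """Send batches from several threads, rotating across nodes.

    send_batch(url, batch) is a caller-supplied, hypothetical sender.
    Returns the total number of documents handed to it.
    """
    def task(numbered_batch):
        i, batch = numbered_batch
        url = node_urls[i % len(node_urls)]  # round-robin over nodes
        send_batch(url, batch)
        return len(batch)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(task, enumerate(batches(docs, batch_size))))
```

Usage would look like `index_in_parallel(docs, my_sender, ["http://node1:8983/solr/coll", "http://node2:8983/solr/coll"])`, where the URLs and `my_sender` are placeholders. The right batch size and thread count depend on the cluster and are best found by measurement.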