Hi Shawn,
Thank you very much for your analysis. I currently don’t have multiple machines
to play with. I will try the "one Solr instance and one ZK instance would be more
efficient on a single server" setup you suggested.
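If I understand it correctly, that would be something like this (ports, paths,
and heap size below are just my guesses, not tested):

# start a single external ZooKeeper, from the ZooKeeper install directory
bin/zkServer.sh start
# start one Solr node in cloud mode against it, with a moderate fixed heap
bin/solr start -c -z localhost:2181 -m 8g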
Thanks again,
Chuming
On Nov 4, 2018, at 7:56 PM, Shawn Heisey wrote:
On 11/4/2018 8:38 AM, Chuming Chen wrote:
Hi Shawn,
I have shared a tar ball with you (apa...@elyograg.org) from Google Drive. The
tar ball includes the logs directories of the 4 nodes, solrconfig.xml, solr.in.sh, and
a screenshot of the top command. The log files cover about 1 day. However, I
restarted the Solr cloud several times during that pe
Please see inline for my thoughts.
Deepak
"The greatness of a nation can be judged by the way its animals are
treated. Please consider stopping the cruelty by becoming a Vegan"
+91 73500 12833
deic...@gmail.com
Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool
"Pl
On 11/2/2018 1:38 PM, Chuming Chen wrote:
I am running Solr cloud 7.4 with 4 shards and 4 nodes (JVM "-Xms20g
-Xmx40g"); each shard has 32 million documents and is 32GB in size.
A 40GB heap is probably completely unnecessary for an index of that
size. Does each machine have one replica on
On 5/10/2016 7:46 PM, lltvw wrote:
> the args used to start solr are as follows, and I uploaded my screenshot to
> http://www.yupoo.com/photos/qzone3927066199/96064170/, please help to take a
> look, thanks.
>
> -DSTOP.PORT=7989
> -DSTOP.KEY=
> -DzkHost=node1:2181,node2:2181,node3:2181/solr
> -Dso
On 5/9/2016 11:42 PM, lltvw wrote:
> By using the jps command to double-check the params used to start Solr, I found that
> the max heap size is already set to 10G. So I made a big mistake yesterday.
>
> But using the Solr admin UI, I selected the collection with the performance problem;
> on the overview page I
On Tue, 2016-05-10 at 00:41 +0800, lltvw wrote:
> Recently we set up a 4.10 SolrCloud env with about 90 million docs indexed
> in it. This SolrCloud has 12 shards, each shard on a separate
> machine, but when we try to search for some info on the SolrCloud, the
> response time is about 300ms.
Could you pr
On 5/9/2016 9:11 PM, lltvw wrote:
> You are right, the max heap is 512MB, thanks.
90 million documents split into 12 shards means 7.5 million documents
per shard.
With that many documents and a 512MB heap, you're VERY lucky if Solr
doesn't experience OutOfMemoryError problems -- which will make S
On 5/9/2016 4:41 PM, lltvw wrote:
> Shawn, thanks.
>
> Each machine has 48G of memory installed, and now has 20G free. I checked the JVM
> heap size using the Solr admin UI; the heap size is about 20M.
What is the *max* heap? An unmodified install of Solr 5.x or later has
a max heap of 512MB.
In the admin
On 5/9/2016 10:52 AM, lltvw wrote:
> Sorry, I missed the size of each shard, the size is about 3G each. Thanks.
>
On 2016-05-10 00:41:13, lltvw wrote:
>> Recently we set up a 4.10 SolrCloud env with about 90 million docs indexed in
>> it. This SolrCloud has 12 shards, each shard on a separate machine
Why do you use 15 replicas?
More replicas makes it slower.
Hi,
Did you say you have 150 servers in this cluster? And 10 shards for just
90M docs? If so, 150 hosts sounds like too much for all the other numbers
I see here. I'd love to see some metrics here. e.g. what happens with
disk IO around those commits? How about GC time/size info? Are JVM mem
Hi Vijay,
We're working on SOLR-6816 ... would love for you to be a test site for any
improvements we make ;-)
Curious if you've experimented with changing the mergeFactor to a higher
value, such as 25, and what happens if you set soft auto-commits to
something lower, like 15 seconds? Also, make s
search engn dev [sachinyadav0...@gmail.com] wrote:
> Yes, you are right, my facet queries are for text analytics purposes.
Does this mean that facet calls are rare (at most one at a time)?
> Users will send boolean and spatial queries. Current performance for spatial
> queries is 100qps with 150 con
Hi,
Increasing the number of replicas per shard will help you handle more
concurrent users/queries, resulting in increased throughput.
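For example, a replica can be added on a running cluster with the Collections
API (collection, shard, and host names below are placeholders):

http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard1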
Thanks,
Himanshu
On Mon, Jul 21, 2014 at 9:25 AM, search engn dev
wrote:
> Thanks Erick,
>
> /"So your choices are either to increase memory (a lot) or not do th
Thanks Erick,
/"So your choices are either to increase memory (a lot) or not do this.
It's a valid question whether this is useful information to present to a
user
(or are you doing some kind of analytics here?). "/
Yes, you are right, my facet queries are for text analytics purposes. Users
will
search engn dev [sachinyadav0...@gmail.com] wrote:
> Out of 700 million documents, 95-97% of the values are unique, approx.
That's quite a lot. If you are not already using DocValues for that, you should
do so.
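As a sketch, that is a per-field change in schema.xml (field name taken from
your facet query; the stored setting is just a guess, and note that enabling
docValues requires a full reindex):

<field name="user_digest" type="string" indexed="true" stored="true" docValues="true"/>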
So, each shard handles ~175M documents. Even with DocValues, there is an
overhead of just hav
Right, this is the worst kind of use case for faceting. You have
150M docs/shard and are asking for up to 125M buckets to count
into, plus control structures. Performance of this (even without OOMs)
will be a problem. Having multiple queries execute this simultaneously
will increase memory usage.
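Back-of-the-envelope, assuming a 4-byte int counter per bucket: 125M buckets
x 4 bytes is roughly 500MB of counter arrays for a single facet request,
before any control structures, and each concurrent request needs its own.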
So y
Out of 700 million documents, 95-97% of the values are unique, approx.
My facet query is:
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.limit=1&facet.field=user_digest
The above query throws an OOM exception as soon as I fire it at Solr.
From: search engn dev [sachinyadav0...@gmail.com]:
> 1 collection : 4 shards : each shard has one master and one replica
> total documents : 700 million
Are you using DocValues for your facet fields? What is the approximate number
of unique values in your facets and what is their type (string, nu
Be a bit careful here. 128G is a lot of memory; you may encounter very long
garbage collection pauses. Just be aware that this may start happening later.
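A minimal sketch of keeping the heap modest while leaving the rest of the
128GB to the OS page cache - on newer Solr (5.x+) this goes in solr.in.sh,
on your version you'd pass the equivalent flags to the JVM directly:

SOLR_HEAP="8g"
GC_TUNE="-XX:+UseG1GC -XX:MaxGCPauseMillis=250"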
Best,
Erick
On Tue, Oct 22, 2013 at 5:04 PM, Tom Mortimer wrote:
> Just tried it with no other changes than upping the RAM to 128GB total, and
>
Just tried it with no other changes than upping the RAM to 128GB total, and
it's flying. I think that proves that RAM is good. =) Will implement
suggested changes later, though.
cheers,
Tom
On 22 October 2013 09:04, Tom Mortimer wrote:
> Boogie, Shawn,
>
> Thanks for the replies. I'm going to
Boogie, Shawn,
Thanks for the replies. I'm going to try out some of your suggestions
today. Although, without more RAM, I'm not that optimistic...
Tom
On 21 October 2013 18:40, Shawn Heisey wrote:
> On 10/21/2013 9:48 AM, Tom Mortimer wrote:
>
>> Hi everyone,
>>
>> I've been working on an inst
On 10/21/2013 9:48 AM, Tom Mortimer wrote:
Hi everyone,
I've been working on an installation recently which uses SolrCloud to index
45M documents into 8 shards on 2 VMs running 64-bit Ubuntu (with another 2
identical VMs set up for replicas). The reason we're using so many shards
for a relativel
Some basic tips:
- try to create enough shards that the size of each index portion on
a shard gets closer to the amount of RAM you have on each node (e.g. with a
~140GB index on 16GB nodes, try 12-16 shards, i.e. roughly 9-12GB per shard)
- start with just the initial shards; add replicas later when you have
Shamik:
You're right, the use of NOW shouldn't be making that much of a difference
between versions. FYI, though, here's a way to use NOW and re-use fq
clauses:
http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
It may well be this setting:
<maxTime>1000</maxTime>
Every second (assuming
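If that 1000 is the soft commit interval, the solrconfig.xml block in question
would look like this (a sketch; 1000ms opens a new searcher every second,
which keeps invalidating the caches):

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>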
Hi,
What happens if you have just 1 shard - no distributed search, like
before? SPM for Solr or any other monitoring tool that captures OS and
Solr metrics should help you find the source of the problem faster.
Is disk IO the same? utilization of caches? JVM version, heap, etc.?
CPU usage? network
I tried commenting out NOW in bq, but it didn't make any difference in
performance. I do see a minor entry in the queryfiltercache rate, which is a
meager 0.02.
I'm really struggling to figure out the bottleneck. Any known pain points I
should be checking?
Thanks Primoz, I was suspecting that too. But then, it's hard to imagine that
the query cache alone is contributing to the big performance hit. The same
setting applies to the old configuration, and it works pretty well even with
the query cache's low hit rate.
The query result cache hit rate might be low due to using NOW in bf. NOW is always
translated to the current time, and that of course changes from ms to ms... :)
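The usual fix - it is what the searchhub.org link above describes - is to
round NOW so that consecutive queries produce an identical, cacheable filter
string, e.g. (the field name here is illustrative):

fq=timestamp:[NOW/DAY-7DAYS TO NOW/DAY]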
Primoz
From: Shamik Bandopadhyay
To: solr-user@lucene.apache.org
Date: 17.10.2013 00:14
Subject: SolrCloud Performance Issue
Hi
Thursday, January 17, 2013 10:12 AM
>Subject: Re: SolrCloud Performance for High Query Volume
>
>
>Hello Niran,
>
>
>> Now with the roughly the "same" schema and solrconfig configuration
>
>
>Can you be more specific about what was changed and how?
>
Yup, DIH is not optimal for SolrCloud yet. I made a few JIRA issues a short
while ago that may help.
I've seen people use it with SolrCloud in the past though - and it wasn't so
slow… (though I'm sure slower than a single node).
Search me...
- Mark
On Nov 27, 2012, at 1:24 PM, Mikhail Khludnev
It sounds like DataImportHandler will not be really performant with
SolrCloud. From what I see it should essentially work - it sends docs to
the chain, which should distribute them via DistributedUpdateProcessor. But
it works synchronously - no multithreading in DIH since 4.0!
Does anyone have an ex
To get the best speed out of SolrCloud you have to index from many clients (or
threads). Even better is if you index to many nodes rather than one.
Using a single thread against a single instance with replicas will be a fair
amount slower with cloud than if you just used one node.
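As a sketch of the many-clients/many-threads approach in SolrJ (zkHost,
collection name, thread count, and batch sizes are placeholders;
CloudSolrClient is the current client class - at the time of this thread it
was CloudSolrServer):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ParallelIndexer {
    public static void main(String[] args) throws Exception {
        // zkHost and collection name are placeholders for this sketch
        CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build();
        client.setDefaultCollection("mycollection");

        int threads = 8; // index from many threads...
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                List<SolrInputDocument> batch = new ArrayList<>();
                for (int i = 0; i < 100_000; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", id + "-" + i);
                    batch.add(doc);
                    if (batch.size() == 1000) { // ...and in batches, not one doc at a time
                        client.add(batch);
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) client.add(batch);
                return null; // Callable, so checked exceptions are allowed
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        client.commit(); // one commit at the end, not per batch
        client.close();
    }
}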
- Mark
On No