SolrCloud - Query performance degrades with multiple servers
We are using SolrCloud and configuring it for testing purposes. We are seeing the average query time increase when the SolrCloud cluster has more than one node. We have a single-shard, 12 GB index.
Example:
1 node: average query time *~28 msec*, load 140 queries/second
3 nodes: average query time *~110 msec*, load 420 queries/second, distributed equally across the three servers, so essentially 140 qps on each node.
Is there any inter-node communication going on for queries? Is there any SolrCloud setting for tuning queries in a cloud configuration with multiple nodes? Please help.
-- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrCloud - Query performance degrades with multiple servers
We measured that for just 3 nodes the overhead is around 100 ms. We also noticed that the CPU spikes to 100% and some queries get blocked; this happens only when the cloud has multiple nodes, not on a single node. All the nodes have exactly the same configuration, JVM settings, and hardware. Any clues why this is happening?
Re: SolrCloud - Query performance degrades with multiple servers
I also ran a test with the load directed at one single server in the cloud and checked the CPU usage of the other servers. Even though no load is directed at those servers, there is a CPU spike each minute. Did you also do this test on SolrCloud? Any observations or suggestions?
In reply to: Re: SolrCloud - Query performance degrades with multiple servers, Dec 05, 2012; 7:59pm, by Mark Miller-3:
> This is just the std scatter-gather distrib search stuff Solr has been using since around 1.4. There is some overhead to that, but generally not much. I've measured it at around 30-50ms for 100 machines, each with 10 million docs, a few years ago. So…that doesn't help you much…but FYI… - Mark
Re: SolrCloud - Query performance degrades with multiple servers
OK, we think we found the issue. When SolrCloud is started without the numShards argument, it starts with a single shard but still thinks that there are multiple shards, so it forwards every single query to all the nodes in the cloud. We ran tcpdump on a node that queries are not targeted at and found that it is receiving POST requests from the node where the queries originate, with parameters such as: &start=0&fsv=true&distrib=false&isShard=true&shard.url=serve1.com We solved the issue by explicitly adding the numShards=1 argument to the Solr startup script. Is this a bug?
Re: SolrCloud - Query performance degrades with multiple servers
Spoke too early: it seems that SolrCloud is still distributing queries to all the servers even with numShards=1. We are seeing POST requests to all servers in the cluster. Please let me know what the solution is. Here is an example (isShard should be false in our case, since we have a single shard; please help):
POST /solr/core0/select HTTP/1.1
Content-Charset: UTF-8
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
Content-Length: 991
Host: server1
Connection: Keep-Alive

lowercaseOperators=true&mm=70%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&isShard=true&shard.url=server1:9090/solr/core0/|server2:9090/solr/core0/|server3:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2

In reply to: Re: SolrCloud - Query performance degrades with multiple servers, Dec 06, 2012; 6:29pm, by Mark Miller-3:
On Dec 6, 2012, at 5:08 PM, sausarkar <[hidden email]> wrote:
> We solved the issue by explicitly adding numShards=1 argument to the solr start up script. Is this a bug?
Sounds like it…perhaps related to SOLR-3971…not sure though. - Mark
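For anyone repeating the tcpdump check, the captured form body can be picked apart to see whether a request is an internal shard sub-request and which URLs it names. This is only a minimal Python sketch: the body is a shortened copy of the capture above, and `describe_request` is a hypothetical helper name, not anything from Solr.

```python
from urllib.parse import parse_qs

# Shortened copy of the captured POST body; hosts and ports are those in the capture.
CAPTURED_BODY = (
    "q=*:*&rows=3000&start=0&fsv=true&distrib=false&isShard=true"
    "&shard.url=server1:9090/solr/core0/|server2:9090/solr/core0/"
    "|server3:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2"
)


def describe_request(body):
    """Return (is_shard_request, urls) for a form-encoded Solr request body.

    Hypothetical helper for reading a capture; it does not talk to Solr.
    """
    params = parse_qs(body, keep_blank_values=True)
    # isShard=true marks a sub-request sent by another node, not by a client.
    is_shard = params.get("isShard", ["false"])[0] == "true"
    # shard.url lists one or more URLs separated by '|'.
    shard_url = params.get("shard.url", [""])[0]
    urls = [url for url in shard_url.split("|") if url]
    return is_shard, urls


is_shard, urls = describe_request(CAPTURED_BODY)
print("internal shard sub-request:", is_shard)
for url in urls:
    print("listed in shard.url:", url)
```

A request coming straight from a client would normally not carry isShard=true, so a node that receives only such requests is serving traffic forwarded by another node. Reading the '|' separated entries as alternative replicas of the same shard is my interpretation and was not confirmed in this thread.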
Re: SolrCloud - Query performance degrades with multiple servers
Thank you very much. We will wait for the results from your tests.
On Saturday, December 8, 2012 11:08 PM, Mark Miller-3 wrote:
> If that's true, we will fix it for 4.1. I can look closer tomorrow. Mark
Re: SolrCloud - Query performance degrades with multiple servers
Do you know when 4.1 will be released, or whether there will be a 4.0.1 release with bug fixes for 4.0? Thanks
Re: SolrCloud - Query performance degrades with multiple servers
We could still reproduce the issue on the 4.1 branch, i.e. queries going to one server (numShards=1) are being distributed among all the servers, which creates CPU spikes on all the servers in the cloud. Is this behavior expected, or will it be fixed in the 4.1 release?
Re: SolrCloud - Query performance degrades with multiple servers
Hi Yonik, could you merge this feature into the 4.0 branch? We tried 4.1; it did solve the CPU spike, but we ran into other issues. We are on a very tight schedule, so it would be very beneficial if you could merge this feature into the 4.0 branch. Let me know. Thanks
date query performance (TrieDate)
We are experiencing performance issues with date range queries. We have configured the date fields as follows: Our queries are rounded to the minute:
qt=ads&debugQuery=false&fl=id,StartDt_t110,...&fq=Status_i110:2&fq=StartDt_t110:{ NOW/MINUTE-150DAYS TO NOW/MINUTE-90DAYS }&start=0&rows=20
Index size: 10 million documents in Solr 4.1. At the start of every minute we see query times of about 10-12 seconds, and they increase over time. Is there a solution for this?
Re: date query performance (TrieDate)
When the query time jumps to more than 10 seconds, the Linux load average spikes to more than 100 on a 16-CPU machine. Does anyone have any suggestions?
Re: Really bad query performance for date range queries
Unfortunately we need data by the minute; we cannot go to the hour. Is there an option for 3 minutes or 5 minutes, something like NOW/3MIN? I am also noticing that when I generate around 110 queries per second (date range ones), after some time Solr stops responding and just freezes. Is there a way to cure this?
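As far as I know, Solr date math only rounds to whole units such as NOW/MINUTE or NOW/HOUR, so something like NOW/3MIN is not available. One possible workaround, which nobody in this thread suggested or tested, is to round on the client and send explicit timestamps. A minimal Python sketch, assuming UTC and a step that divides 60; the function names are made up for illustration:

```python
from datetime import datetime, timedelta, timezone


def floor_to_minutes(moment, step_minutes):
    """Round a datetime down to a multiple of step_minutes within its hour.

    step_minutes is assumed to divide 60 evenly (for example 3, 5 or 15).
    """
    floored_minute = moment.minute - (moment.minute % step_minutes)
    return moment.replace(minute=floored_minute, second=0, microsecond=0)


def solr_timestamp(moment):
    """Format a UTC datetime the way Solr date fields expect it."""
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def date_range_fq(field, now, step_minutes, oldest_days, newest_days):
    """Build an exclusive range filter with explicit, client-rounded bounds."""
    base = floor_to_minutes(now, step_minutes)
    lower = solr_timestamp(base - timedelta(days=oldest_days))
    upper = solr_timestamp(base - timedelta(days=newest_days))
    return "%s:{%s TO %s}" % (field, lower, upper)


# Fixed instant so the example is reproducible; real code would use
# datetime.now(timezone.utc) instead.
now = datetime(2013, 2, 4, 10, 17, 42, tzinfo=timezone.utc)
print(date_range_fq("StartDt_t110", now, 5, 150, 90))
```

The idea is that the filter string stays identical for the whole five-minute window, so a cached filter could be reused instead of being rebuilt every minute. Whether that removes the spikes described here is untested.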
Re: Really bad query performance for date range queries
We have a machine with 96 GB of RAM and 16 processors. The JVM is set to use 60 GB. The tests we are running are purely queries; there is no indexing going on. I don't see garbage collection when I attach VisualVM, but I do see frequent CPU spikes, about once every minute.
Re: SOLR 4 Alpha Out Of Mem Err
Hi Mark, I am also facing the same issue when trying to index in SolrCloud using DIH running on a non-leader server. The DIH server creates around 10k threads and then fails with an OOM "cannot create thread" error. Do you know when, or in which version, this issue will be solved? I think a workaround for this issue is to find the leader from ZooKeeper and run DIH on the leader. Sauvik
SOLR 4 Alpha - distributed DIH available?
If I run DIH on SolrCloud, the request can hit any one of the servers and start the import process, but if we then ask any other server for the import status, it reports that no import is running. Only the server that is running DIH returns the correct import status. So if we run DIH behind a load balancer we can get an incorrect import status, and we have to pin DIH to a specific server. So my question is: is there a distributed DIH available for SolrCloud?
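Until something distributed exists, one way to find out whether an import is running anywhere is to skip the load balancer and ask every node for its own DIH status. The following is only a rough Python sketch under several assumptions: the node URLs and core name are placeholders, the handler is registered at /dataimport, and the JSON response has a "status" field that reads "busy" while an import runs.

```python
import json
from urllib.request import urlopen

# Placeholder node list and core name; replace with the real ones.
NODES = [
    "http://server1:9090/solr",
    "http://server2:9090/solr",
    "http://server3:9090/solr",
]
CORE = "core0"


def status_url(base_url, core):
    """URL of the DIH status command on one specific node."""
    return "%s/%s/dataimport?command=status&wt=json" % (base_url.rstrip("/"), core)


def busy_nodes(status_by_node):
    """Given {node: parsed status response}, list the nodes with a running import."""
    return [node for node, response in status_by_node.items()
            if response.get("status") == "busy"]


def poll(nodes, core, timeout=5):
    """Fetch the DIH status from every node directly (makes network requests)."""
    results = {}
    for node in nodes:
        with urlopen(status_url(node, core), timeout=timeout) as reply:
            results[node] = json.loads(reply.read().decode("utf-8"))
    return results


# Offline illustration with made-up responses; a live check would be
# busy_nodes(poll(NODES, CORE)).
sample = {
    "http://server1:9090/solr": {"status": "idle"},
    "http://server2:9090/solr": {"status": "busy"},
    "http://server3:9090/solr": {"status": "idle"},
}
print(busy_nodes(sample))
```

A script that starts an import could run this check first and refuse to start a second import while any node reports busy.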
SolrCloud - load balancing
Does anyone know if queries using CommonsHttpSolrServer on SolrCloud are automatically load balanced? I am load testing with SolrMeter against one of the nodes, and I am seeing that all the nodes seem to be hit. Any clues?
Re: SolrCloud - load balancing
Actually, I am noticing that CommonsHttpSolrServer seems to be load balancing by hitting all servers in the cluster; I just wanted to confirm whether that is the case.
Re: SolrCloud - load balancing
Hi Mark, you are referring to CloudSolrServer, not CommonsHttpSolrServer, right? Does CommonsHttpSolrServer also round-robin? Do you recommend any tool for load testing SolrCloud? Thanks, Sauvik
Solr 4.0 schedule
Does anyone have any clue when the beta version of Solr 4.0 will be released? Also, is there any timeframe for the first GA release of Solr 4.0?
Re: SOLR 4 Alpha Out Of Mem Err
Hello Mark, has this issue been fixed in the BETA release? - Sauvik
Solr4.0 BETA - Error when StempelPolishStemFilterFactory
I just upgraded to Solr 4.0.0-BETA and there seems to be a problem with StempelPolishStemFilterFactory: it cannot find a resource. It seems the bug was introduced in the new beta release; this works fine in the alpha release. Here is the exception I am seeing in the logs:
SEVERE: null:java.lang.RuntimeException: java.io.IOException: Can't find resource '/org/apache/lucene/analysis/pl/stemmer_2.tbl' in classpath or 'solr/collection1/conf/', cwd /apache-solr-4.0.0-BETA/example
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:116)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:539)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:360)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:309)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:106)
    at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
    at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
    at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)
Does anyone have a clue about this?
RE: Solr4.0 BETA - Error when StempelPolishStemFilterFactory
No, I tried that. Solr is finding and loading all the other contrib jars; it complains only about the Polish one. It seems like this is a bug.
Re: Solr4.0 BETA - Error when StempelPolishStemFilterFactory
Thank you.
On Thursday, August 16, 2012 5:05 PM, steve_rowe wrote:
> I can reproduce - I agree, this seems like a bug. I've opened an issue: https://issues.apache.org/jira/browse/SOLR-3737 Thanks for reporting! Steve
SolrCloud issue - accents are not getting removed
We noticed that when we use SolrCloud, accents are not getting removed when we use the filter. We are using -Dbootstrap_conf=true to load the conf folder. The same filter works fine, however, when Solr is started standalone. Did anyone notice the same issue? Do we need to do something else for the filters in SolrCloud?
Re: SolrCloud issue - accents are not getting removed
Just found out that this was not a SolrCloud issue but an issue with the Tomcat configuration: the URIEncoding setting was missing. Adding it fixed the problem.
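For anyone hitting the same symptom, here is why a missing URIEncoding can look like a broken accent filter. My understanding (not stated in this thread) is that Tomcat versions of that time decode the request URI as ISO-8859-1 unless URIEncoding says otherwise, while clients send UTF-8. The Python sketch below only imitates the two decodings; it does not involve Tomcat or Solr, and the sample term is made up.

```python
from urllib.parse import quote, unquote

# Made-up query term containing an accented character.
term = "café"

# What a UTF-8 client puts into the URL.
on_the_wire = quote(term, encoding="utf-8")

# Decoding with the matching charset recovers the original term.
as_utf8 = unquote(on_the_wire, encoding="utf-8")

# Decoding the same bytes as ISO-8859-1 turns the single accented
# letter into two unrelated characters.
as_latin1 = unquote(on_the_wire, encoding="iso-8859-1")

print(on_the_wire)
print(as_utf8 == term)
print(as_latin1 == term)
```

With the wrong decoding, the analysis chain never sees the accented letter at all, so an accent-removing filter has nothing to fold and the query fails to match the indexed text.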
Re: SOLR 4 Alpha - distributed DIH available?
Can someone let me know how to configure DIH in a cloud environment? Should it point to one specific server, or to the load balancer so that DIH is distributed across all the servers? One problem with the load balancer approach is that there is no good way to tell whether a DIH import is already running in the cloud. Can anyone suggest a better solution?