SolrCloud - Query performance degrades with multiple servers

2012-12-05 Thread sausarkar
We are using SolrCloud and trying to configure it for testing purposes. We
are seeing that the average query time increases if we have more than
one node in the SolrCloud cluster. We have a single-shard 12 GB index.

Example:
1 node: average query time *~28 msec*, load 140 queries/second
3 nodes: average query time *~110 msec*, load 420 queries/second,
distributed equally across the three servers, so essentially 140 qps on
each node.

Is there any inter-node communication going on for queries? Is there any
SolrCloud setting for tuning queries in a cloud config with multiple nodes?
Please help.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread sausarkar
We measured that with just 3 nodes the overhead is around 100 ms. We also
noticed that CPU spikes to 100% and some queries get blocked. This happens
only when the cloud has multiple nodes and does not happen on a single node.
All the nodes have exactly the same configuration, JVM settings, and hardware.

Any clues why this is happening?





Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread sausarkar
I also ran a test with the load directed at one single server in the cloud
and checked the CPU usage of the other servers. It seems that even when no
load is directed at those servers, there is a CPU spike each minute. Did you
also run this test on SolrCloud? Any observations or suggestions?


In Reply To
Re: SolrCloud - Query performance degrades with multiple servers
Dec 05, 2012; 7:59pm — by   Mark Miller-3
This is just the std scatter gather distrib search stuff solr has been using
since around 1.4. 

There is some overhead to that, but generally not much. I've measured it at
around 30-50ms for 100 machines, each with 10 million docs, a few years
ago.

So…that doesn't help you much…but FYI… 

- Mark 



Re: SolrCloud - Query performance degrades with multiple servers

2012-12-06 Thread sausarkar
OK, we think we found the issue here. When SolrCloud is started without
specifying the numShards argument, it starts with a single shard but still
thinks that there are multiple shards, so it forwards every single query to
all the nodes in the cloud. We ran tcpdump on a node that queries were
not targeting and found that it was receiving POST requests from the node
where the queries originated:

&start=0&fsv=true&distrib=false&isShard=true&shard.url=serve1.com

We solved the issue by explicitly adding the numShards=1 argument to the Solr
startup script. Is this a bug?



Re: SolrCloud - Query performance degrades with multiple servers

2012-12-08 Thread sausarkar
Spoke too early: it seems that SolrCloud is still distributing queries to all
the servers even with numShards=1. We are seeing POST requests to all servers
in the cluster. Please let me know what the solution is. Here is an example
(the isShard parameter should be false in our case, since we have a single
shard; please help):

POST /solr/core0/select HTTP/1.1
Content-Charset: UTF-8
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
Content-Length: 991
Host: server1
Connection: Keep-Alive

lowercaseOperators=true&mm=70%&fl=EntityId&df=EntityId&q.op=AND&q.alt=*:*&qs=10&stopwords=true&defType=edismax&rows=3000&q=*:*&start=0&fsv=true&distrib=false&isShard=true&shard.url=server1:9090/solr/core0/|server2:9090/solr/core0/|server3:9090/solr/core0/&NOW=1354918880447&wt=javabin&version=2
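
For anyone reading a capture like the one above, here is a minimal sketch (Python, standard library only) that pulls out the parameters marking a request as the per-shard phase of a distributed search. The body string is a shortened copy of the captured one with the archive's bold markers removed, and the function name is made up for illustration. My reading, which may be wrong for this build, is that the '|' separator in shard.url lists equivalent replicas of one shard rather than three different shards.

```python
from urllib.parse import parse_qs

# Shortened copy of the captured POST body (bold markers from the archive removed).
captured_body = (
    "defType=edismax&rows=3000&q=*:*&start=0&fsv=true"
    "&distrib=false&isShard=true"
    "&shard.url=server1:9090/solr/core0/|server2:9090/solr/core0/|server3:9090/solr/core0/"
    "&NOW=1354918880447&wt=javabin&version=2"
)


def describe_shard_request(body):
    """Return (is_shard_request, candidate_urls) for a form-encoded request body."""
    params = parse_qs(body, keep_blank_values=True)
    # isShard=true marks a sub-request issued by another Solr node.
    is_shard = params.get("isShard", ["false"])[0] == "true"
    # Split the '|'-separated alternatives and drop empty entries.
    urls = [u for u in params.get("shard.url", [""])[0].split("|") if u]
    return is_shard, urls


is_shard, urls = describe_shard_request(captured_body)
print("shard sub-request:", is_shard)
for url in urls:
    print("candidate:", url)
```

Run against a tcpdump capture, this makes it easy to count how many incoming requests on a node are sub-requests from peers rather than client queries.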


Re: SolrCloud - Query performance degrades with multiple servers
Dec 06, 2012; 6:29pm — by   Mark Miller-3

On Dec 6, 2012, at 5:08 PM, sausarkar <[hidden email]> wrote: 

> We solved the issue by explicitly adding numShards=1 argument to the solr 
> start up script. Is this a bug? 

Sounds like it…perhaps related to SOLR-3971…not sure though. 

- Mark






Re: SolrCloud - Query performance degrades with multiple servers

2012-12-09 Thread sausarkar
Thank you very much. We will wait for the results from your tests.

From: "Mark Miller-3 [via Lucene]" <ml-node+s472066n4025457...@n3.nabble.com>
Date: Saturday, December 8, 2012 11:08 PM
To: "Sarkar, Sauvik" <sausar...@ebay.com>
Subject: Re: SolrCloud - Query performance degrades with multiple servers

If that's true, we will fix it for 4.1. I can look closer tomorrow.

Mark


Re: SolrCloud - Query performance degrades with multiple servers

2012-12-11 Thread sausarkar
Do you know when 4.1 will be released, or whether there will be a 4.0.1
release with bug fixes for 4.0?

Thanks





Re: SolrCloud - Query performance degrades with multiple servers

2012-12-12 Thread sausarkar
We could still replicate the issue in the 4.1 branch, i.e. queries going to
one server (numShards=1) are being distributed among all the servers, which
is creating CPU spikes on all the servers in the cloud. Do you think this
behavior is expected, or will it be fixed in the 4.1 release?





Re: SolrCloud - Query performance degrades with multiple servers

2013-01-09 Thread sausarkar
Hi Yonik,

Could you merge this feature into the 4.0 branch? We tried to use 4.1; it did
solve the CPU spike, but we ran into other issues. As we are on a very tight
schedule, it would be very beneficial if you could merge this feature into
the 4.0 branch.

Let me know.

Thanks





date query performance (TrieDate)

2013-02-04 Thread sausarkar
We are experiencing performance issues with date range queries. We have
configured the date fields as follows:



Our queries are rounded every minute:

qt=ads&debugQuery=false&fl=id,StartDt_t110,...&fq=Status_i110:2&fq=StartDt_t110:{
NOW/MINUTE-150DAYS TO NOW/MINUTE-90DAYS }&start=0&rows=20

Index size - 10 million documents in Solr 4.1

At the start of every minute we see query times of about 10-12 seconds, and
they increase over time. Is there a solution for this?





Re: date query performance (TrieDate)

2013-02-04 Thread sausarkar
When the query time jumps to more than 10 seconds, the Linux load average
spikes to more than 100 on a 16-CPU machine. Does anyone have any suggestions?





Re: Really bad query performance for date range queries

2013-02-04 Thread sausarkar
Unfortunately we need data by the minute; we cannot go to the hour. Is there
an option for 3 minutes or 5 minutes, something like NOW/3MIN?

I am also noticing that when I generate around 110 queries per second (date
range ones), after some time Solr stops responding and just freezes. Is there
a way to cure this?
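
As far as I know, Solr date math rounds only to whole units (NOW/MINUTE, NOW/HOUR and so on), so a 3- or 5-minute bucket would have to be computed on the client and sent as explicit timestamps. The sketch below shows one way to do that in Python. The field name and the 150/90 day window come from the query earlier in this thread; the function names and the 5-minute default are illustrative assumptions, and the bucket size should divide 60 evenly.

```python
from datetime import datetime, timedelta, timezone


def floor_to_bucket(moment, bucket_minutes):
    """Round a datetime down to the start of its N-minute bucket within the hour."""
    minute = moment.minute - moment.minute % bucket_minutes
    return moment.replace(minute=minute, second=0, microsecond=0)


def solr_date(moment):
    """Format a UTC datetime as an ISO 8601 instant with a trailing 'Z'."""
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def start_date_filter(now, bucket_minutes=5):
    """Build the fq for 'between 150 and 90 days ago', anchored to the bucket start."""
    anchor = floor_to_bucket(now, bucket_minutes)
    lower = solr_date(anchor - timedelta(days=150))
    upper = solr_date(anchor - timedelta(days=90))
    return "StartDt_t110:{%s TO %s}" % (lower, upper)


# Every call inside the same 5-minute bucket yields the identical filter string.
print(start_date_filter(datetime.now(timezone.utc)))
```

Because the filter string only changes once per bucket, an identical fq is repeated for the whole bucket, which is the same effect NOW/MINUTE gives for one minute.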





Re: Really bad query performance for date range queries

2013-02-05 Thread sausarkar
We have a 96 GB RAM machine with 16 processors. The JVM is set to use 60 GB.
The tests we are running are purely queries; there is no indexing going on.
I don't see garbage collection when I attach VisualVM, but I see frequent CPU
spikes, about once every minute.





Re: SOLR 4 Alpha Out Of Mem Err

2012-07-20 Thread sausarkar
Hi Mark,

I am also facing the same issue when trying to index in SolrCloud using
DIH running on a non-leader server. The DIH server creates around 10k
threads and then fails with an OOM "cannot create thread" error.

Do you know when, or in which version, this issue will be fixed? I think a
workaround is to find the leader from ZooKeeper and run the
DIH on the leader.
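
A rough sketch of the "find the leader" step, assuming the cluster state has already been read from ZooKeeper as JSON. The sample document below is made up for illustration, and the real layout differs between Solr 4 builds, so the search deliberately walks the whole structure and looks for any entry flagged with "leader": "true" instead of relying on a fixed nesting. The base_url and core keys are likewise assumptions about what each entry carries.

```python
import json

# Illustrative cluster state only; not copied from a real cluster.
sample_state = json.loads("""
{
  "collection1": {
    "shard1": {
      "server1:8983_solr_collection1": {
        "base_url": "http://server1:8983/solr",
        "core": "collection1"
      },
      "server2:8983_solr_collection1": {
        "base_url": "http://server2:8983/solr",
        "core": "collection1",
        "leader": "true"
      }
    }
  }
}
""")


def find_leaders(node):
    """Recursively collect every JSON object flagged with leader == "true"."""
    leaders = []
    if isinstance(node, dict):
        if node.get("leader") == "true":
            leaders.append(node)
        for child in node.values():
            leaders.extend(find_leaders(child))
    return leaders


for leader in find_leaders(sample_state):
    print(leader["base_url"], leader["core"])
```

The DIH request would then be sent to the printed base URL and core instead of to an arbitrary node.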

Sauvik





SOLR 4 Alpha - distributed DIH available?

2012-07-20 Thread sausarkar
If I run DIH on SolrCloud, the request can hit any one of the servers and
start the import process there. But if we then ask any other server for the
import status, it reports that no import is running. Only the server that is
running the DIH returns the correct import status. So if we run DIH
behind a load balancer we can get an incorrect import status, and we have to
pin DIH to a specific server.

So my question is: is there a distributed DIH available for SolrCloud?
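
Until something like that exists, one possible workaround is to bypass the load balancer for status checks and ask every node directly. The sketch below assumes the DIH handler is registered at /dataimport on each core and that its JSON status response carries a top-level "status" field that reads "busy" while an import runs. The node URLs and function names are placeholders.

```python
import json
from urllib.request import urlopen


def fetch_dih_status(base_url):
    """Ask one node for its DIH status and return the 'status' field."""
    url = base_url.rstrip("/") + "/dataimport?command=status&wt=json"
    with urlopen(url, timeout=10) as response:
        return json.load(response).get("status", "unknown")


def nodes_running_import(base_urls, fetch=fetch_dih_status):
    """Return the nodes whose DIH handler reports 'busy'."""
    return [url for url in base_urls if fetch(url) == "busy"]


# Offline demonstration with canned answers instead of real HTTP calls.
canned = {
    "http://server1:9090/solr/core0": "idle",
    "http://server2:9090/solr/core0": "busy",
    "http://server3:9090/solr/core0": "idle",
}
print(nodes_running_import(list(canned), fetch=canned.get))
```

Passing the fetch function as a parameter keeps the selection logic testable without a running cluster; in real use the default performs the HTTP call.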





SolrCloud - load balancing

2012-08-03 Thread sausarkar
Does anyone know if queries sent using CommonsHttpSolrServer to SolrCloud are
automatically load balanced? I am load testing with SolrMeter against one of
the nodes, and I am seeing that all the nodes seem to be hit. Any clues?





Re: SolrCloud - load balancing

2012-08-03 Thread sausarkar
Actually, I am noticing that CommonsHttpSolrServer seems to be load balancing
by hitting all servers in the cluster. I just wanted to confirm whether that
is the case.





Re: SolrCloud - load balancing

2012-08-03 Thread sausarkar
Hi Mark,

You are referring to CloudSolrServer, not CommonsHttpSolrServer, right?
Does CommonsHttpSolrServer also round-robin?

Do you recommend any tool for load testing SolrCloud?

Thanks,

Sauvik





Solr 4.0 schedule

2012-08-10 Thread sausarkar
Does anyone have a clue when the beta version of Solr 4.0 will be released?
Also, is there any timeframe for the first GA release of Solr 4.0?





Re: SOLR 4 Alpha Out Of Mem Err

2012-08-14 Thread sausarkar
Hello Mark,

Has this issue been fixed in the BETA release?

- Sauvik





Solr4.0 BETA - Error when StempelPolishStemFilterFactory

2012-08-16 Thread sausarkar
I just upgraded to Solr 4.0.0-BETA and there seems to be a problem with
StempelPolishStemFilterFactory: it cannot find a resource. It seems the
bug was introduced in the new beta release; this works fine in the alpha
release.

Here is the exception I am seeing in the logs:

SEVERE: null:java.lang.RuntimeException: java.io.IOException: Can't find
resource '/org/apache/lucene/analysis/pl/stemmer_2.tbl' in classpath or
'solr/collection1/conf/', cwd /apache-solr-4.0.0-BETA/example
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:116)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:539)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:360)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:309)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:106)
    at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
    at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
    at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)

Does anyone have a clue about this?






RE: Solr4.0 BETA - Error when StempelPolishStemFilterFactory

2012-08-16 Thread sausarkar
No, I tried that. Solr finds and loads all the other contrib jars and
complains only about the Polish one, so this seems like a bug.





Re: Solr4.0 BETA - Error when StempelPolishStemFilterFactory

2012-08-16 Thread sausarkar
Thank you.

From: "steve_rowe [via Lucene]" <ml-node+s472066n4001741...@n3.nabble.com>
Date: Thursday, August 16, 2012 5:05 PM
To: "Sarkar, Sauvik" <sausar...@ebay.com>
Subject: RE: Solr4.0 BETA - Error when StempelPolishStemFilterFactory

I can reproduce - I agree, this seems like a bug.

I've opened an issue: https://issues.apache.org/jira/browse/SOLR-3737

Thanks for reporting!

Steve


SolrCloud issue - accents are not getting removed

2012-08-17 Thread sausarkar
We noticed that when we use SolrCloud the accents are not getting removed
when we use the 
filter. We are using the -Dbootstrap_conf=true to load the conf folder.

This filter however is working fine in the standalone solr startup scenario.

Did anyone notice the same issue? Do we need to do something else for
SolrCloud for the filters?





Re: SolrCloud issue - accents are not getting removed

2012-08-17 Thread sausarkar
Just found out that this was not a SolrCloud issue but an issue with the
Tomcat configuration: the URIEncoding setting was missing. Adding it
fixed the problem.
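
For anyone hitting the same thing, here is a small illustration of why a missing URIEncoding can look like a broken accent filter. Clients normally percent-encode query terms as UTF-8 bytes; if the container decodes the URI as ISO-8859-1 (which I believe was the default on older Tomcat connectors unless URIEncoding="UTF-8" is set), each accented character arrives as two unrelated characters, so there is no accented letter left for the filter to fold. The search term below is just an example.

```python
from urllib.parse import quote, unquote

term = "café"
encoded = quote(term)  # UTF-8 bytes, percent-encoded

# The same bytes decoded two ways: wrong charset versus the right one.
as_latin1 = unquote(encoded, encoding="iso-8859-1")
as_utf8 = unquote(encoded, encoding="utf-8")

print("on the wire:       ", encoded)
print("decoded as latin-1:", as_latin1)
print("decoded as UTF-8:  ", as_utf8)
```

The latin-1 result contains two characters where the original had one, which is what the analysis chain would see for the query term.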









Re: SOLR 4 Alpha - distributed DIH available?

2012-08-21 Thread sausarkar
Can someone let me know how to configure DIH in a cloud environment? Should
it point to one specific server, or to the load balancer so that DIH requests
are distributed across all the servers?
One problem with the load balancer approach is that there is no good way to
tell whether a DIH import is already running in the cloud.

Can anyone suggest a better solution?


