Solr/ZK issues

2015-06-17 Thread Sunil . Srinivasan
Hi Folks,

We are seeing the following in the logs on our Solr nodes, after which the 
nodes go into multiple full GCs and eventually run out of heap. We saw this 
ticket - https://issues.apache.org/jira/browse/SOLR-7338 - and are wondering 
whether that’s the one causing it. We are currently on 4.10.0.

INFO  - 2015-06-17 08:06:28.163; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@422f41e9 
name:ZooKeeperConnection Watcher:got event WatchedEvent state:Expired type:None 
path:null path:null type:None
INFO  - 2015-06-17 08:06:28.163; 
org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper session 
was expired. Attempting to reconnect to recover relationship with ZooKeeper...
INFO  - 2015-06-17 08:06:28.166; 
org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection expired - 
starting a new one...
INFO  - 2015-06-17 08:06:28.171; 
org.apache.solr.common.cloud.ConnectionManager; Waiting for client to connect 
to ZooKeeper
INFO  - 2015-06-17 08:06:28.177; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@422f41e9 
name:ZooKeeperConnection Watcher: got event WatchedEvent state:SyncConnected 
type:None path:null path:null type:None
INFO  - 2015-06-17 08:06:28.177; 
org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper
INFO  - 2015-06-17 08:06:28.178; 
org.apache.solr.common.cloud.ConnectionManager$1; Connection with ZooKeeper 
reestablished.
INFO  - 2015-06-17 08:06:28.178; 
org.apache.solr.common.cloud.DefaultConnectionStrategy; Reconnected to ZooKeeper
INFO  - 2015-06-17 08:06:28.179; 
org.apache.solr.common.cloud.ConnectionManager; Connected:true
WARN  - 2015-06-17 08:06:28.179; org.apache.solr.cloud.RecoveryStrategy; 
Stopping recovery for core=category coreNodeName=core_node2
WARN  - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; 
Stopping recovery for core=category_shadow coreNodeName=core_node2
WARN  - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; 
Stopping recovery for core=rules_shadow coreNodeName=core_node2
WARN  - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; 
Stopping recovery for core=rules coreNodeName=core_node2
WARN  - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; 
Stopping recovery for core=catalog_shadow coreNodeName=core_node2
WARN  - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; 
Stopping recovery for core=catalog coreNodeName=core_node2
INFO  - 2015-06-17 08:06:28.180; org.apache.solr.cloud.ZkController; publishing 
core=category state=down collection=category
INFO  - 2015-06-17 08:06:28.180; org.apache.solr.cloud.ZkController; numShards 
not found on descriptor - reading it from system property
INFO  - 2015-06-17 08:06:28.186; org.apache.solr.cloud.ZkController; publishing 
core=category_shadow state=down collection=category_shadow
INFO  - 2015-06-17 08:06:28.186; org.apache.solr.cloud.ZkController; numShards 
not found on descriptor - reading it from system property
INFO  - 2015-06-17 08:06:28.189; org.apache.solr.cloud.ZkController; publishing 
core=rules_shadow state=down collection=rules_shadow
INFO  - 2015-06-17 08:06:28.189; org.apache.solr.cloud.ZkController; numShards 
not found on descriptor - reading it from system property
INFO  - 2015-06-17 08:06:28.191; org.apache.solr.cloud.ZkController; publishing 
core=rules state=down collection=rules
INFO  - 2015-06-17 08:06:28.191; org.apache.solr.cloud.ZkController; numShards 
not found on descriptor - reading it from system property
INFO  - 2015-06-17 08:06:28.193; org.apache.solr.cloud.ZkController; publishing 
core=catalog_shadow state=down collection=catalog_shadow
INFO  - 2015-06-17 08:06:28.193; org.apache.solr.cloud.ZkController; numShards 
not found on descriptor - reading it from system property
INFO  - 2015-06-17 08:06:28.194; org.apache.solr.cloud.ZkController; publishing 
core=catalog state=down collection=catalog
INFO  - 2015-06-17 08:06:28.194; org.apache.solr.cloud.ZkController; numShards 
not found on descriptor - reading it from system property
INFO  - 2015-06-17 08:06:28.198; org.apache.solr.cloud.ZkController; Replica 
core_node2 NOT in leader-initiated recovery, need to wait for leader to see 
down state.
WARN  - 2015-06-17 08:07:51.188; org.apache.solr.cloud.ZkController;
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /collections/rules_shadow/leader_elect/shard1/election
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
at 
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:290)
at 
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:287)
at
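
For anyone triaging a similar log, here is a rough Python sketch that re-joins the wrapped lines and counts ZooKeeper session expirations per minute, so they can be lined up against GC pauses. It only assumes the record prefix seen in the excerpt above ("LEVEL  - timestamp;"). The function names are made up for illustration, and it does not by itself tell you whether SOLR-7338 is involved.

```python
import re
from collections import Counter

# Start of a log record in the format shown above, e.g.
# "INFO  - 2015-06-17 08:06:28.163; "
RECORD_START = re.compile(
    r"^(INFO|WARN|ERROR)\s+-\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3});"
)


def join_records(lines):
    """Re-join wrapped lines so that each log record is one string."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation of the previous record (mail client line wrap).
            records[-1] += " " + line.strip()
    return records


def count_session_expirations(lines):
    """Count watcher events with state:Expired, bucketed per minute."""
    per_minute = Counter()
    for record in join_records(lines):
        m = RECORD_START.match(record)
        if m and "state:Expired" in record:
            per_minute[m.group(2)[:16]] += 1  # keep 'YYYY-MM-DD HH:MM'
    return per_minute


sample = [
    "INFO  - 2015-06-17 08:06:28.163; ",
    "org.apache.solr.common.cloud.ConnectionManager; Watcher ",
    "org.apache.solr.common.cloud.ConnectionManager@422f41e9 ",
    "name:ZooKeeperConnection Watcher:got event WatchedEvent state:Expired type:None ",
    "path:null path:null type:None",
    "INFO  - 2015-06-17 08:06:28.177; ",
    "org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper",
]

for minute, count in sorted(count_session_expirations(sample).items()):
    print(minute, count)
```

If the expirations cluster right after long pauses in the GC log, that would suggest the pauses are causing the session timeouts rather than the other way around.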

Re: [EXTERNAL] Grouping and group.facet performance disaster

2017-05-31 Thread Sunil . Srinivasan
Use group.cache.percent – for your index size, it might work well.
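
A minimal sketch of what that could look like when building the request in Python. The host, core name, q value and the cache percentage of 20 are all placeholders of mine and not taken from your setup; group.cache.percent takes a value from 0 to 100, and 0 (the default) leaves the cache off.

```python
from urllib.parse import urlencode

# Hypothetical host, core and handler; adjust for the real deployment.
BASE_URL = "http://localhost:8983/solr/mycore/select"


def build_grouped_query_url(cache_percent):
    """Build a grouped query URL with group.cache.percent set."""
    params = [
        ("q", "*:*"),  # placeholder; the original q was not shown
        ("fq", "((type:knihy) OR (type:defekty))"),
        ("group", "true"),
        ("group.field", "edition"),
        ("group.limit", "30"),
        ("group.ngroups", "true"),
        ("group.cache.percent", str(cache_percent)),  # assumed value
        ("rows", "30"),
    ]
    return BASE_URL + "?" + urlencode(params)


print(build_grouped_query_url(20))
```

It is worth timing the query with and without the parameter, since the cache does not help every query type.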

Thanks,

On 5/31/17, 4:16 AM, "Marek Tichy" wrote:

Hi,

I'm getting very slow response times on grouping, especially on facet
grouping.

Without grouping, the query takes 14ms, faceting 57ms.

With grouping, the query time goes up to 1131ms, with facet grouping,
the faceting goes up to the unbearable 12103 ms.

Single Solr instance, 927086 docs, 518.23 MB size, Solr 6.4.1.

Is this really the price of grouping? Are there any magic
tricks/tips/techniques to improve the speed?
The query params below.

Many thanks for any help, much appreciated.

Best
 Marek Tichy

fq=((type:knihy) OR (type:defekty))
fl=*
start=0
f.ebook_formats.facet.mincount=1
f.authorid.facet.mincount=1
f.thematicgroupid.facet.mincount=1
f.articleparts.facet.mincount=1
f.type.facet.mincount=1
f.languageid.facet.mincount=1
f.showwindow.facet.mincount=1
f.articletypeid_grouped.facet.mincount=1
f.languageid.facet.limit=10
f.ebook_formats.facet.limit=10
f.authorid.facet.limit=10
f.type.facet.limit=10
f.articleparts.facet.limit=10
f.thematicgroupid.facet.limit=10
f.articletypeid_grouped.facet.limit=10
f.showwindow.facet.limit=100
version=2.2
group.limit=30
rows=30
echoParams=all
sort=date desc,planneddate asc
group.field=edition
facet.method=enum
group.truncate=false
group.format=grouped
group=true
group.ngroups=true
stats=true
facet=true
group.facet=true
stats.field={!distinctValues=true}categoryid
facet.field={!ex=at}articletypeid_grouped
facet.field={!ex=at}type
facet.field={!ex=author}authorid
facet.field={!ex=format}articleparts
facet.field={!ex=format}ebook_formats
facet.field={!ex=lang}languageid
facet.field={!ex=sw}showwindow
facet.field={!ex=tema}thematicgroupid
stats.field={!min=true max=true}price
stats.field={!min=true max=true}yearout




Solr edismax parser with multi-word synonyms

2019-07-17 Thread Sunil Srinivasan

I have enabled the SynonymGraphFilter in my field configuration in order to 
support multi-word synonyms (I am using Solr 7.6). Here is my field 
configuration:


  



  
  





And this is my synonyms.txt file:
frozen dinner,microwave food

Scenario 1: blue shirt (query with no synonyms)

Here is my first Solr query:
http://localhost:8983/solr/base/search?q=blue+shirt&qf=title&defType=edismax&debugQuery=on

And this is the parsed query I see in the debug output:
+((title:blue) (title:shirt))

Scenario 2: frozen dinner (query with synonyms)

Now, here is my second Solr query:
http://localhost:8983/solr/base/search?q=frozen+dinner&qf=title&defType=edismax&debugQuery=on

And this is the parsed query I see in the debug output:
+(((+title:microwave +title:food) (+title:frozen +title:dinner)))

I am wondering why the first query looks for documents containing at least one 
of the two query tokens, whereas the second query looks for documents with both 
of the query tokens? I would understand if it looked for both the tokens of the 
synonyms (i.e. both microwave and food) to avoid the sausagization problem. But 
I would like to get partial matches on the original query at least (i.e. it 
should also match documents containing just the token 'dinner').

Would anyone know why the behavior is different across queries with and 
without synonyms? And how could I work around this if I wanted partial matches 
on queries that also have synonyms?

Ideally, I would like the parsed query in the second case to be:
+(((+title:microwave +title:food) (title:frozen title:dinner)))
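
To make the difference concrete, here is a toy Python model of the two parsed queries. It treats each title as a set of tokens and only answers "does this document match", ignoring scoring entirely. It is not how Lucene evaluates queries, and the sample titles and function names are invented for illustration.

```python
def matches_actual(tokens):
    """+(((+title:microwave +title:food) (+title:frozen +title:dinner)))"""
    return {"microwave", "food"} <= tokens or {"frozen", "dinner"} <= tokens


def matches_desired(tokens):
    """+(((+title:microwave +title:food) (title:frozen title:dinner)))"""
    return {"microwave", "food"} <= tokens or bool({"frozen", "dinner"} & tokens)


# Made-up titles, tokenized by lowercasing and splitting on whitespace.
titles = [
    "Frozen dinner tray",
    "Microwave food cover",
    "Dinner plate set",
    "Microwave oven",
]

for title in titles:
    tokens = set(title.lower().split())
    print(f"{title!r}: actual={matches_actual(tokens)} "
          f"desired={matches_desired(tokens)}")
```

The only title where the two differ is "Dinner plate set", which is exactly the partial match on the original query that I am after.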

I'd appreciate any help with this. Thanks!


Re: Re: Solr edismax parser with multi-word synonyms

2019-07-18 Thread Sunil Srinivasan
Hi Erick,
Is there any way I can get it to match documents containing at least one of the 
words of the original query, i.e. 'frozen' or 'dinner' or both (but not 
partial matches of the synonyms)?
Thanks,
Sunil


-----Original Message-----
From: Erick Erickson 
To: solr-user 
Sent: Thu, Jul 18, 2019 04:42 AM
Subject: Re: Solr edismax parser with multi-word synonyms


This is not a phrase query, rather it’s requiring either pair of words
to appear in the title.

You’ve told it that “frozen dinner” and “microwave food” are synonyms. 
So it’s looking for both the words “microwave” and “food” in the title field, 
or “frozen” and “dinner” in the title field.

You’d see the same thing with single-word synonyms, albeit a little less
confusingly.


Best,
Erick


> On Jul 18, 2019, at 1:01 AM, kshitij tyagi wrote:
> 
> Hi Sunil,
> 
> 1. As you have added "microwave food" to the synonym file as a multi-word
> synonym of "frozen dinner", the edismax parser finds your synonym in the
> file and treats your query as a phrase query.
> 
> This is the reason you are seeing the parsed query as +(((+title:microwave
> +title:food) (+title:frozen +title:dinner))); frozen dinner is considered
> as a phrase here.
> 
> If you want partial match on your query then you can add frozen dinner,
> microwave food, microwave, food to your synonym file and you will see the
> parsed query as:
> "+(((+title:microwave +title:food) title:microwave title:food
> (+title:frozen +title:dinner)))"
> Another option is to write your own custom query parser and use it as a
> plugin.
> 
> Hope this helps!!
> 
> kshitij
> 
> 
> On Thu, Jul 18, 2019 at 9:14 AM Sunil Srinivasan wrote:
> 
>> 
>> I have enabled the SynonymGraphFilter in my field configuration in order
>> to support multi-word synonyms (I am using Solr 7.6). Here is my field
>> configuration:
>> 
>>    
>>      
>>    
>> 
>>    
>>      
>>      > synonyms="synonyms.txt"/>
>>    
>> 
>> 
>> 
>> 
>> And this is my synonyms.txt file:
>> frozen dinner,microwave food
>> 
>> Scenario 1: blue shirt (query with no synonyms)
>> 
>> Here is my first Solr query:
>> 
>> http://localhost:8983/solr/base/search?q=blue+shirt&qf=title&defType=edismax&debugQuery=on
>> 
>> And this is the parsed query I see in the debug output:
>> +((title:blue) (title:shirt))
>> 
>> Scenario 2: frozen dinner (query with synonyms)
>> 
>> Now, here is my second Solr query:
>> 
>> http://localhost:8983/solr/base/search?q=frozen+dinner&qf=title&defType=edismax&debugQuery=on
>> 
>> And this is the parsed query I see in the debug output:
>> +(((+title:microwave +title:food) (+title:frozen +title:dinner)))
>> 
>> I am wondering why the first query looks for documents containing at least
>> one of the two query tokens, whereas the second query looks for documents
>> with both of the query tokens? I would understand if it looked for both the
>> tokens of the synonyms (i.e. both microwave and food) to avoid the
>> sausagization problem. But I would like to get partial matches on the
>> original query at least (i.e. it should also match documents containing
>> just the token 'dinner').
>> 
>> Would anyone know why the behavior is different across queries with and
>> without synonyms? And how could I work around this if I wanted partial
>> matches on queries that also have synonyms?
>> 
>> Ideally, I would like the parsed query in the second case to be:
>> +(((+title:microwave +title:food) (title:frozen title:dinner)))
>> 
>> I'd appreciate any help with this. Thanks!
>>