Solr/ZK issues
Hi Folks,

We are seeing the following in the logs on our Solr nodes, after which the nodes go into multiple full GCs and eventually run out of heap. We saw this ticket - https://issues.apache.org/jira/browse/SOLR-7338 - and are wondering whether that's the one causing it. We are currently on 4.10.0.

INFO - 2015-06-17 08:06:28.163; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@422f41e9 name:ZooKeeperConnection Watcher:got event WatchedEvent state:Expired type:None path:null path:null type:None
INFO - 2015-06-17 08:06:28.163; org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper...
INFO - 2015-06-17 08:06:28.166; org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection expired - starting a new one...
INFO - 2015-06-17 08:06:28.171; org.apache.solr.common.cloud.ConnectionManager; Waiting for client to connect to ZooKeeper
INFO - 2015-06-17 08:06:28.177; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@422f41e9 name:ZooKeeperConnection Watcher: got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
INFO - 2015-06-17 08:06:28.177; org.apache.solr.common.cloud.ConnectionManager; Client is connected to ZooKeeper
INFO - 2015-06-17 08:06:28.178; org.apache.solr.common.cloud.ConnectionManager$1; Connection with ZooKeeper reestablished.
INFO - 2015-06-17 08:06:28.178; org.apache.solr.common.cloud.DefaultConnectionStrategy; Reconnected to ZooKeeper
INFO - 2015-06-17 08:06:28.179; org.apache.solr.common.cloud.ConnectionManager; Connected:true
WARN - 2015-06-17 08:06:28.179; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=category coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=category_shadow coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=rules_shadow coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=rules coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=catalog_shadow coreNodeName=core_node2
WARN - 2015-06-17 08:06:28.180; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=catalog coreNodeName=core_node2
INFO - 2015-06-17 08:06:28.180; org.apache.solr.cloud.ZkController; publishing core=category state=down collection=category
INFO - 2015-06-17 08:06:28.180; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.186; org.apache.solr.cloud.ZkController; publishing core=category_shadow state=down collection=category_shadow
INFO - 2015-06-17 08:06:28.186; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.189; org.apache.solr.cloud.ZkController; publishing core=rules_shadow state=down collection=rules_shadow
INFO - 2015-06-17 08:06:28.189; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.191; org.apache.solr.cloud.ZkController; publishing core=rules state=down collection=rules
INFO - 2015-06-17 08:06:28.191; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.193; org.apache.solr.cloud.ZkController; publishing core=catalog_shadow state=down collection=catalog_shadow
INFO - 2015-06-17 08:06:28.193; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.194; org.apache.solr.cloud.ZkController; publishing core=catalog state=down collection=catalog
INFO - 2015-06-17 08:06:28.194; org.apache.solr.cloud.ZkController; numShards not found on descriptor - reading it from system property
INFO - 2015-06-17 08:06:28.198; org.apache.solr.cloud.ZkController; Replica core_node2 NOT in leader-initiated recovery, need to wait for leader to see down state.
WARN - 2015-06-17 08:07:51.188; org.apache.solr.cloud.ZkController;
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /collections/rules_shadow/leader_elect/shard1/election
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
	at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:290)
	at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:287)
	at
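One way to check whether a session expiry lines up with a long pause is to scan the log for silences between consecutive timestamps. The excerpt above goes quiet between 08:06:28.198 and 08:07:51.188, roughly 83 seconds. The following is a minimal sketch, assuming the log layout shown above (`LEVEL - yyyy-MM-dd HH:mm:ss.SSS; ...`). The function name `find_gaps` and the 30-second threshold are illustrative choices, not anything provided by Solr, and a gap only hints at a pause (GC or otherwise); it does not prove the cause.

```python
import re
from datetime import datetime, timedelta

# Matches the timestamp in lines such as
# "INFO - 2015-06-17 08:06:28.198; org.apache.solr.cloud.ZkController; ..."
TS_RE = re.compile(
    r"^(?:INFO|WARN|ERROR|DEBUG)\s+-\s+"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3});"
)


def find_gaps(lines, threshold=timedelta(seconds=30)):
    """Return (previous_ts, current_ts, gap) for every silence longer than threshold."""
    gaps = []
    previous = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # stack-trace and continuation lines carry no timestamp
        current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


# Three lines taken from the log excerpt in this thread.
sample = [
    "INFO - 2015-06-17 08:06:28.194; org.apache.solr.cloud.ZkController; "
    "publishing core=catalog state=down collection=catalog",
    "INFO - 2015-06-17 08:06:28.198; org.apache.solr.cloud.ZkController; "
    "Replica core_node2 NOT in leader-initiated recovery, "
    "need to wait for leader to see down state.",
    "WARN - 2015-06-17 08:07:51.188; org.apache.solr.cloud.ZkController;",
]

for start, end, gap in find_gaps(sample):
    print(f"silent for {gap.total_seconds():.1f}s between {start.time()} and {end.time()}")
```

Comparing any gaps found this way against the JVM's GC log for the same time window would show whether the pauses and the expirations coincide.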
Re: [EXTERNAL] Grouping and group.facet performance disaster
Try group.cache.percent; for your index size, it might work well.

Thanks,

On 5/31/17, 4:16 AM, "Marek Tichy" wrote:

Hi,

I'm getting very slow response times on grouping, especially on facet grouping.

Without grouping, the query takes 14 ms and faceting 57 ms.
With grouping, the query time goes up to 1131 ms; with facet grouping, the faceting goes up to an unbearable 12103 ms.

Single Solr instance, 927086 docs, 518.23 MB size, Solr 6.4.1.

Is this really the price of grouping? Are there any magic tricks/tips/techniques to improve the speed?

The query params are below.

Many thanks for any help, much appreciated.

Best
Marek Tichy

fq=((type:knihy) OR (type:defekty))
fl=*
start=0
f.ebook_formats.facet.mincount=1
f.authorid.facet.mincount=1
f.thematicgroupid.facet.mincount=1
f.articleparts.facet.mincount=1
f.type.facet.mincount=1
f.languageid.facet.mincount=1
f.showwindow.facet.mincount=1
f.articletypeid_grouped.facet.mincount=1
f.languageid.facet.limit=10
f.ebook_formats.facet.limit=10
f.authorid.facet.limit=10
f.type.facet.limit=10
f.articleparts.facet.limit=10
f.thematicgroupid.facet.limit=10
f.articletypeid_grouped.facet.limit=10
f.showwindow.facet.limit=100
version=2.2
group.limit=30
rows=30
echoParams=all
sort=date desc,planneddate asc
group.field=edition
facet.method=enum
group.truncate=false
group.format=grouped
group=true
group.ngroups=true
stats=true
facet=true
group.facet=true
stats.field={!distinctValues=true}categoryid
facet.field={!ex=at}articletypeid_grouped
facet.field={!ex=at}type
facet.field={!ex=author}authorid
facet.field={!ex=format}articleparts
facet.field={!ex=format}ebook_formats
facet.field={!ex=lang}languageid
facet.field={!ex=sw}showwindow
facet.field={!ex=tema}thematicgroupid
stats.field={!min=true max=true}price
stats.field={!min=true max=true}yearout
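For anyone trying the suggestion, here is a minimal sketch of adding group.cache.percent to a request's parameters. The parameter takes a value from 0 to 100 and defaults to 0 (caching off). The helper name `with_group_cache`, the value 20, the `/select` path, and the `q=*:*` placeholder are assumptions made for illustration; the original post does not show its `q`. Whether the cache helps the group.facet part of the request at all is something to measure rather than assume.

```python
from urllib.parse import urlencode


def with_group_cache(params, percent):
    """Return a copy of params with group.cache.percent set to percent (0-100)."""
    if not 0 <= percent <= 100:
        raise ValueError("group.cache.percent must be between 0 and 100")
    # Drop any existing setting so the parameter appears exactly once.
    updated = [(key, value) for key, value in params if key != "group.cache.percent"]
    updated.append(("group.cache.percent", str(percent)))
    return updated


# A subset of the parameters from the post above.
base_params = [
    ("q", "*:*"),  # placeholder: the original post does not show the q parameter
    ("fq", "((type:knihy) OR (type:defekty))"),
    ("group", "true"),
    ("group.field", "edition"),
    ("group.limit", "30"),
    ("group.ngroups", "true"),
    ("rows", "30"),
    ("sort", "date desc,planneddate asc"),
]

query_string = urlencode(with_group_cache(base_params, 20))
print("/select?" + query_string)
```

Timing the same request with the parameter at 0 and at a non-zero value (reading QTime from the response header) is the simplest way to see whether it makes a difference for this index.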
Solr edismax parser with multi-word synonyms
I have enabled the SynonymGraphFilter in my field configuration in order to support multi-word synonyms (I am using Solr 7.6). Here is my field configuration:

And this is my synonyms.txt file:
frozen dinner,microwave food

Scenario 1: blue shirt (query with no synonyms)

Here is my first Solr query:
http://localhost:8983/solr/base/search?q=blue+shirt&qf=title&defType=edismax&debugQuery=on

And this is the parsed query I see in the debug output:
+((title:blue) (title:shirt))

Scenario 2: frozen dinner (query with synonyms)

Now, here is my second Solr query:
http://localhost:8983/solr/base/search?q=frozen+dinner&qf=title&defType=edismax&debugQuery=on

And this is the parsed query I see in the debug output:
+(((+title:microwave +title:food) (+title:frozen +title:dinner)))

I am wondering why the first query looks for documents containing at least one of the two query tokens, whereas the second query looks for documents containing both of the query tokens. I would understand if it looked for both tokens of the synonym (i.e. both microwave and food) to avoid the sausagization problem. But I would like to get partial matches on the original query at least (i.e. it should also match documents containing just the token 'dinner').

Would anyone know why the behavior differs between queries with and without synonyms? And how could I work around this if I wanted partial matches on queries that also have synonyms?

Ideally, I would like the parsed query in the second case to be:
+(((+title:microwave +title:food) (title:frozen title:dinner)))

I'd appreciate any help with this. Thanks!
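To make the difference between the two parsed queries concrete, here is a small sketch that models only their boolean structure in plain Python. It assumes titles are lowercased and split on whitespace, and it ignores scoring, mm, and everything else Solr does; the function names and sample titles are made up for illustration.

```python
def matches_actual(tokens):
    """Models +(((+title:microwave +title:food) (+title:frozen +title:dinner)))."""
    # Each inner group requires all of its terms; either group may satisfy the query.
    return {"microwave", "food"} <= tokens or {"frozen", "dinner"} <= tokens


def matches_desired(tokens):
    """Models +(((+title:microwave +title:food) (title:frozen title:dinner)))."""
    # The synonym still needs both terms, but any one original term is enough.
    return {"microwave", "food"} <= tokens or bool({"frozen", "dinner"} & tokens)


titles = [
    "frozen dinner for two",
    "quick dinner ideas",
    "microwave food recipes",
    "microwave oven manual",
]

for title in titles:
    tokens = set(title.lower().split())
    print(f"{title!r}: actual={matches_actual(tokens)} desired={matches_desired(tokens)}")
```

Only the second title separates the two forms: it contains 'dinner' but not 'frozen', so the query Solr actually produced rejects it, while the desired form accepts it. The last title is rejected by both, since it contains only part of the synonym.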
Re: Re: Solr edismax parser with multi-word synonyms
Hi Erick,

Is there any way I can get it to match documents containing at least one of the words of the original query, i.e. 'frozen' or 'dinner' or both (but not partial matches of the synonyms)?

Thanks,
Sunil

-----Original Message-----
From: Erick Erickson
To: solr-user
Sent: Thu, Jul 18, 2019 04:42 AM
Subject: Re: Solr edismax parser with multi-word synonyms

This is not a phrase query, rather it’s requiring either pair of words to appear in the title.

You’ve told it that “frozen dinner” and “microwave food” are synonyms. So it’s looking for both the words “microwave” and “food” in the title field, or “frozen” and “dinner” in the title field.

You’d see the same thing with single-word synonyms, albeit a little less confusingly.

Best,
Erick

> On Jul 18, 2019, at 1:01 AM, kshitij tyagi wrote:
>
> Hi sunil,
>
> 1. As you have added "microwave food" to the synonyms file as a multi-word
> synonym for "frozen dinner", the edismax parser finds your synonym in the
> file and treats your query as a phrase query.
>
> This is the reason you are seeing the parsed query as
> +(((+title:microwave +title:food) (+title:frozen +title:dinner))); frozen
> dinner is considered a phrase here.
>
> If you want partial matches on your query, you can add frozen dinner,
> microwave food, microwave, food to your synonym file and you will see the
> parsed query as:
> "+(((+title:microwave +title:food) title:microwave title:food
> (+title:frozen +title:dinner)))"
>
> Another option is to write your own custom query parser and use it as a
> plugin.
>
> Hope this helps!!
>
> kshitij
>
>
> On Thu, Jul 18, 2019 at 9:14 AM Sunil Srinivasan wrote:
>
>> I have enabled the SynonymGraphFilter in my field configuration in order
>> to support multi-word synonyms (I am using Solr 7.6). Here is my field
>> configuration:
>>
>> synonyms="synonyms.txt"/>
>>
>> And this is my synonyms.txt file:
>> frozen dinner,microwave food
>>
>> Scenario 1: blue shirt (query with no synonyms)
>>
>> Here is my first Solr query:
>> http://localhost:8983/solr/base/search?q=blue+shirt&qf=title&defType=edismax&debugQuery=on
>>
>> And this is the parsed query I see in the debug output:
>> +((title:blue) (title:shirt))
>>
>> Scenario 2: frozen dinner (query with synonyms)
>>
>> Now, here is my second Solr query:
>> http://localhost:8983/solr/base/search?q=frozen+dinner&qf=title&defType=edismax&debugQuery=on
>>
>> And this is the parsed query I see in the debug output:
>> +(((+title:microwave +title:food) (+title:frozen +title:dinner)))
>>
>> I am wondering why the first query looks for documents containing at least
>> one of the two query tokens, whereas the second query looks for documents
>> with both of the query tokens? I would understand if it looked for both the
>> tokens of the synonyms (i.e. both microwave and food) to avoid the
>> sausagization problem. But I would like to get partial matches on the
>> original query at least (i.e. it should also match documents containing
>> just the token 'dinner').
>>
>> Would anyone know why the behavior is different across queries with and
>> without synonyms? And how could I work around this if I wanted partial
>> matches on queries that also have synonyms?
>>
>> Ideally, I would like the parsed query in the second case to be:
>> +(((+title:microwave +title:food) (title:frozen title:dinner)))
>>
>> I'd appreciate any help with this. Thanks!
>>
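Besides the synonym-file change and the custom parser mentioned in this thread, one further possibility is to expand synonyms on the client and send an explicit Lucene-syntax query of the shape Sunil asked for. This is a swapped-in technique, not something proposed by anyone above, and the sketch below only builds the string. The helper name `build_query` is hypothetical, terms are assumed to be plain words with no characters that need escaping, and how Solr analyzes the resulting string (for example, whether a query-time synonym filter on the field still rewrites the clauses) is not verified here and should be checked with debugQuery=on.

```python
def build_query(field, original_terms, synonym_phrases):
    """Build a Lucene-syntax string in which any original term may match,
    while each synonym phrase requires all of its terms."""
    clauses = []
    for phrase in synonym_phrases:
        # Every term of a synonym phrase is marked as required with '+'.
        required = " ".join(f"+{field}:{term}" for term in phrase.split())
        clauses.append(f"({required})")
    # Original terms stay optional, so a single one is enough to match.
    optional = " ".join(f"{field}:{term}" for term in original_terms)
    clauses.append(f"({optional})")
    return " ".join(clauses)


q = build_query("title", ["frozen", "dinner"], ["microwave food"])
print(q)
```

For the example in this thread the helper produces `(+title:microwave +title:food) (title:frozen title:dinner)`, which mirrors the inner structure of the parsed query Sunil described as ideal.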