Re: Down Replica is elected as Leader (solr v8.7.0)

2021-02-11 Thread Rahul Goswami
I haven’t delved into the exact reason for this, but what generally helps to avoid this situation in a cluster is i) During shutdown (in case you need to restart the cluster), let the overseer node be the last one to shut down. ii) While restarting, let the Overseer node be the first one to start i
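
The ordering described in this message can be sketched as follows. This is a minimal illustration, not a Solr API: the node names are hypothetical, and how the current Overseer node is identified is outside the sketch.

```python
def restart_order(nodes, overseer):
    """Return (shutdown_order, startup_order) for a full cluster restart.

    The overseer is shut down last and started first, as suggested in
    the message above. The relative order of the other nodes is kept.
    """
    if overseer not in nodes:
        raise ValueError("overseer must be one of the cluster nodes")
    others = [n for n in nodes if n != overseer]
    shutdown_order = others + [overseer]   # overseer goes down last
    startup_order = [overseer] + others    # overseer comes up first
    return shutdown_order, startup_order


# Hypothetical node names, for illustration only.
shutdown, startup = restart_order(["node1", "node2", "node3"], "node2")
```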

Re: StandardTokenizerFactory doesn't split on underscore

2021-01-09 Thread Rahul Goswami
t on underscores if that is your use case. > > On Sat, Jan 9, 2021 at 2:58 PM Rahul Goswami > wrote: > > > Nope. The underscore is preserved right after tokenization even before it > > reaches any filters. You can choose the type "text_general" and try an &

Re: StandardTokenizerFactory doesn't split on underscore

2021-01-09 Thread Rahul Goswami
g wrote: > did you configure PatternReplaceFilterFactory? > > At 2021-01-08 12:16:06, "Rahul Goswami" wrote: > >Hello, > >So recently I was debugging a problem on Solr 7.7.2 where the query was

StandardTokenizerFactory doesn't split on underscore

2021-01-07 Thread Rahul Goswami
Hello, So recently I was debugging a problem on Solr 7.7.2 where the query wasn't returning the desired results. Turned out that the indexed terms had underscore separated terms, but the query didn't. I was under the impression that terms separated by underscore are also tokenized by StandardTokeni

Re: Need urgent help -- High cpu on solr

2020-10-16 Thread Rahul Goswami
In addition to the insightful pointers by Zisis and Erick, I would like to mention an approach in the link below that I generally use to pinpoint exactly which threads are causing the CPU spike. Knowing this you can understand which aspect of Solr (search thread, GC, update thread etc) is taking mo

Re: Question about solr commits

2020-10-08 Thread Rahul Goswami
Shawn, So if the autoCommit interval is 15 seconds, and one update request arrives at t=0 and another at t=10 seconds, then will there be two timers one expiring at t=15 and another at t=25 seconds, but this would amount to ONLY ONE commit at t=15 since that one would include changes from both upda

Re: Solr 7.7 - Few Questions

2020-10-06 Thread Rahul Goswami
ard to hear back your experiences on > Solr Scale up. > > Regards, > Manisha Rahatadkar > > -Original Message- > From: Rahul Goswami > Sent: Sunday, October 4, 2020 11:49 PM > To: ch...@opensourceconnections.com; solr-user@lucene.apache.org > Subject: Re: Solr 7.7 - Few Questio

Re: Solr 7.7 - Few Questions

2020-10-04 Thread Rahul Goswami
so for example if the > > query hit one of the PDFs you could show the user the original email, > > plus the 9 other attachments, using the shared ID as a key. > > > > HTH, > > > > Charlie > > > > On 02/10/2020 01:53, Rahul Goswami wrote: > > >

Re: Solr 7.7 - Few Questions

2020-10-01 Thread Rahul Goswami
Manisha, In addition to what Shawn has mentioned above, I would also like you to reevaluate your use case. Do you *need to* index the whole document? E.g., if it's an email, the body of the email *might* be more important than any attachments, in which case you could choose to only index the email b

Re: ApacheCon at Home 2020 starts tomorrow!

2020-09-29 Thread Rahul Goswami
Thanks for sharing this Anshum. Day 1 had some really interesting sessions. Missed out on a couple that I would have liked to listen to. Are the recordings of these sessions available anywhere? -Rahul On Mon, Sep 28, 2020 at 7:08 PM Anshum Gupta wrote: > Hey everyone! > > ApacheCon at Home 2020

Re: Delete from Solr console fails

2020-09-26 Thread Rahul Goswami
it did not > > > work. > > > > > > > > > > Btw, when I try to today, I am no longer getting the "Connection > lost" > > > > > error. The delete command returns with status=success, however the > > > > document > >

Re: Delete from Solr console fails

2020-09-24 Thread Rahul Goswami
Goutham, Is the field you are trying to delete by indexed=true in the schema? If the uniqueKey is indexed=true, does delete by id work for you? ( uniqueKey:value) Also, instead of "Solr Command" if you choose the Document type as "XML" does it make any difference? Rahul On Thu, Sep 24, 2020 at

Re: How to remove duplicate tokens from solr

2020-09-17 Thread Rahul Goswami
Is this for a phrase search? If yes, then the position of the token would matter too, and I am not sure which token you would want to remove, e.g. "tshirt hat tshirt". Also, are you looking to save space and want this at index time? Or just want to remove duplicates from the search string? If this is at s

Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-18 Thread Rahul Goswami
I agree with Phill, Noble and Ilan above. The problematic term is "slave" (not master) which I am all for changing if it causes less regression than removing BOTH master and slave. Since some people have pointed out GitHub changing the "master" terminology, in my personal opinion, it was not a meas

Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Rahul Goswami
+1 on avoiding SolrCloud terminology. In the interest of keeping it obvious and simple, may I please suggest primary/secondary? On Wed, Jun 17, 2020 at 5:14 PM Atita Arora wrote: > I agree avoiding using of solr cloud terminology too. > > I may suggest going for "prime" and "clone" > (Short an

Re: when to use docvalue

2020-05-20 Thread Rahul Goswami
Eric, Thanks for that explanation. I have a follow up question on that. I find the scenario of stored=true and docValues=true to be tricky at times... would like to know when is each of these scenarios preferred over the other two for primitive datatypes: 1) stored=true and docValues=false 2) stor

Re: Solr filter cache hits not reflecting

2020-04-20 Thread Rahul Goswami
Hoss, Thank you for such a succinct explanation! I was not aware of the order of lookups (queryResultCache followed by filterCache). Makes sense now. Sorry for the false alarm! Rahul On Mon, Apr 20, 2020 at 4:04 PM Chris Hostetter wrote: > : 4) A query with different fq. > : > http://localhost

Re: Solr filter cache hits not reflecting

2020-04-20 Thread Rahul Goswami
Hi Hoss, Thanks for your detailed response. In your steps if you go a step further and search again with the same fq, you should be able to uncover the problem. Here are the step-by-step observations on Solr 8.5 (7.2.1 and 7.7.2 have the same issue) 1) Before any queries: http://localhost:8984/

Solr filter cache hits not reflecting

2020-04-20 Thread Rahul Goswami
Hello, I was trying to analyze the filter cache performance and noticed a strange thing. Upon searching with fq, the entry gets added to the cache the first time. Observing from the "Stats/Plugins" tab on Solr admin UI, the 'lookup' and 'inserts' counts get incremented. However, if I search with t

Re: Zookeeper upgrade required with Solr upgrade?

2020-02-13 Thread Rahul Goswami
eb 13, 2020 at 9:26 AM Erick Erickson wrote: > That should be OK. There were no code changes necessary for that upgrade. > see SOLR-13363 > > > On Feb 12, 2020, at 5:34 PM, Rahul Goswami > wrote: > > > > Hello, > > We are running a SolrCloud (7.2.1) cluster an

Zookeeper upgrade required with Solr upgrade?

2020-02-12 Thread Rahul Goswami
Hello, We are running a SolrCloud (7.2.1) cluster and upgrading to Solr 7.7.2. We run a separate multi node zookeeper ensemble which currently runs Zookeeper 3.4.10. Is it also required to upgrade Zookeeper (to 3.4.14 as per change.txt for Solr 7.7.2) along with Solr? I tried a few basic updates

Performance comparison for wildcard searches

2020-02-03 Thread Rahul Goswami
Hello, I am working with Solr 7.2.1 and had a question regarding the performance of wildcard searches. q=*:* vs q=id:* vs q=id:[* TO *] Can someone please rank them in the order of performance with the underlying reason? Thanks, Rahul
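
One way to compare the three forms empirically is to issue each against the same core and compare the reported query times. The sketch below only builds the request URLs and makes no claim about which form is fastest. The host, port and core name are assumptions, and `build_url` is a hypothetical helper; `rows=0` is used so that document retrieval does not dominate the timing.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and core name are assumptions.
BASE_URL = "http://localhost:8983/solr/mycore/select"

# The three query forms from the message above.
QUERIES = ["*:*", "id:*", "id:[* TO *]"]


def build_url(q, rows=0):
    """Build a /select URL for query string q, URL-encoding the parameters."""
    return BASE_URL + "?" + urlencode({"q": q, "rows": rows})


urls = [build_url(q) for q in QUERIES]
for url in urls:
    print(url)
```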

Re: How expensive is core loading?

2020-01-29 Thread Rahul Goswami
l documents and the index size (to gather stats about the Solr server), is the amount of memory consumed proportional to the index size in some way? Thanks, Rahul On Wed, Jan 29, 2020 at 6:43 PM Shawn Heisey wrote: > On 1/29/2020 3:01 PM, Rahul Goswami wrote: > > 1) How expensive is c

Re: How expensive is core loading?

2020-01-29 Thread Rahul Goswami
You might use Luke to get that info from the index files without loading > them > into Solr. > > https://code.google.com/archive/p/luke/ > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > On Jan 29, 2

How expensive is core loading?

2020-01-29 Thread Rahul Goswami
Hello, I am using Solr 7.2.1 on a Solr node running in standalone mode (-Xmx 8 GB). I wish to implement a service to monitor the server stats (like number of docs per core, index size etc.). This would require me to load the core and my concern is that for a node hosting 100+ cores, this could be ex

Solr indexing performance

2019-12-05 Thread Rahul Goswami
Hello, We have a Solr 7.2.1 Solr Cloud setup where the client is indexing in 5 parallel threads with 5000 docs per batch. This is a test setup and all documents are indexed on the same node. We are seeing connection timeout issues some time into indexing. I am yet to analyze GC pauses a

Re: [ANNOUNCE] Apache Solr 8.3.1 released

2019-12-04 Thread Rahul Goswami
Thanks Ishan. I was just going through the list of fixes in 8.3.1 (published in changes.txt) and couldn't see the below JIRA. SOLR-13971 : Velocity response writer's resource loading now possible only through startup parameters. Is it linked approp

Re: Solr 8.2 indexing issues

2019-11-21 Thread Rahul Goswami
Hi Sujatha, How did you upgrade your cluster? Did you restart each node in the cluster one by one after upgrade (while other nodes were running on 6.6.2) or did you bring down the entire cluster and bring up one upgraded node at a time? Thanks, Rahul On Thu, Nov 14, 2019 at 7:03 AM Paras Lehana

Re: Upgrade solr from 7.2.1 to 8.2

2019-11-19 Thread Rahul Goswami
Hello, Just wanted to follow up in case my question fell through the cracks :) Would appreciate help on this. Thanks, Rahul On Fri, Nov 15, 2019 at 5:32 PM Rahul Goswami wrote: > Hello, > > We are planning to upgrade our SolrCloud cluster from 7.2.1 (hosted on > Windows server)

Upgrade solr from 7.2.1 to 8.2

2019-11-15 Thread Rahul Goswami
Hello, We are planning to upgrade our SolrCloud cluster from 7.2.1 (hosted on Windows server) to 8.2. I read the documentation which mentions that I need to be on Solr 7.3 and higher to be able to upgrade

Re: Custom update processor not kicking in

2019-09-19 Thread Rahul Goswami
You have to be ready to > reindex as time passes, either to upgrade to a major version > 2 greater than what you're using now or because the requirements > change yet again. > > Best, > Erick > > On Thu, Sep 19, 2019 at 12:36 AM Rahul Goswami > wrote: > > > > Er

Re: Custom update processor not kicking in

2019-09-18 Thread Rahul Goswami
Eric, Markus, Thank you for your inputs. I made sure that the jar file is found correctly since the core reloads fine and also prints the log lines from my processor during update request (getInstance() method of the update factory). The reason why I want to insert the processor between distributed

Custom update processor not kicking in

2019-09-18 Thread Rahul Goswami
Hello, I am using solr 7.2.1 in a standalone mode. I created a custom update request processor and placed it between the distributed processor and run update processor in my chain. I made sure the chain is invoked since I see log lines from the getInstance() method of my processor factory. But I d

Re: SolrCloud indexing triggers merges and timeouts

2019-07-12 Thread Rahul Goswami
19 at 2:11 AM Rahul Goswami wrote: > Shawn, Erick, > Thank you for the explanation. The merge scheduler params make sense now. > > Thanks, > Rahul > > On Wed, Jul 3, 2019 at 11:30 AM Erick Erickson > wrote: > >> Two more tidbits to add to Shawn’s expla

Re: SolrCloud indexing triggers merges and timeouts

2019-07-04 Thread Rahul Goswami
lt) merges in “tiers” that > are of similar size. So you can have multiple merges going on > at the same time on disjoint sets of segments. > > Best, > Erick > > > On Jul 3, 2019, at 7:54 AM, Shawn Heisey wrote: > > > > On 7/2/2019 10:53 PM, Rahul Goswami wrote:

Re: SolrCloud indexing triggers merges and timeouts

2019-07-02 Thread Rahul Goswami
iculty wrapping my head around this, and would appreciate if you could help clear it for me. Thanks, Rahul On Thu, Jun 13, 2019 at 7:33 AM Shawn Heisey wrote: > On 6/6/2019 9:00 AM, Rahul Goswami wrote: > > *OP Reply* : Total 48 GB per node... I couldn't see another software > us

Re: Configuration recommendation for SolrCloud

2019-07-01 Thread Rahul Goswami
On Sat, Jun 29, 2019 at 1:13 PM Toke Eskildsen wrote: > Rahul Goswami wrote: > > We are running Solr 7.2.1 and planning for a deployment which will grow > to > > 4 billion documents over time. We have 16 nodes at disposal. I am thinking > > between 3 configurations: >

Configuration recommendation for SolrCloud

2019-06-25 Thread Rahul Goswami
Hello, We are running Solr 7.2.1 and planning for a deployment which will grow to 4 billion documents over time. We have 16 nodes at our disposal. I am thinking between 3 configurations: 1 cluster - 16 nodes vs 2 clusters - 8 nodes each vs 4 clusters - 4 nodes each Irrespective of the configuration, ea

Re: SolrCloud: Configured socket timeouts not reflecting

2019-06-24 Thread Rahul Goswami
r this part is different on the master. Regards, Rahul On Thu, Jun 20, 2019 at 8:22 PM Rahul Goswami wrote: > Hi Gus, > Thanks for the response and referencing the umbrella JIRA for these kind > of issues. I see that it won't solve the problem since the builder object > wh

Re: SolrCloud: Configured socket timeouts not reflecting

2019-06-20 Thread Rahul Goswami
sues.apache.org/jira/browse/SOLR-13457 > > -Gus > > On Tue, Jun 18, 2019 at 5:52 PM Rahul Goswami > wrote: > > > Hello, > > > > I was looking into the code to try to get to the root of this issue. > Looks > > like this is an issue after all (as of 7.2.1 w

Re: SolrCloud: Configured socket timeouts not reflecting

2019-06-18 Thread Rahul Goswami
teShardHandlerConfig().getDistributedSocketTimeout(); } I found this open JIRA on this issue: https://issues.apache.org/jira/browse/SOLR-12550?jql=text%20~%20%22distribUpdateSoTimeout%22 Should I update the JIRA with this ? Thanks, Rahul On Thu, Jun 13, 2019 at 12:00 AM Rahul Goswami wrote: > Hello, >

SolrCloud: Configured socket timeouts not reflecting

2019-06-12 Thread Rahul Goswami
Hello, I am running Solr 7.2.1 in cloud mode. To overcome a setup hardware bottleneck, I tried to configure distribUpdateSoTimeout and socketTimeout to a value greater than the default 10 mins. I did this by passing these as system properties at Solr start up time (-DdistribUpdateSoTimeout and -Ds

Re: SolrCloud indexing triggers merges and timeouts

2019-06-12 Thread Rahul Goswami
/measures. Thanks, Rahul On Thu, Jun 6, 2019 at 11:00 AM Rahul Goswami wrote: > Thank you for your responses. Please find additional details about the > setup below: > > We are using Solr 7.2.1 > > > I have a solrcloud setup on Windows server with below config: > >

Re: SolrCloud indexing triggers merges and timeouts

2019-06-06 Thread Rahul Goswami
ndex.ConcurrentMergeScheduler", "maxMergeCount":2, "maxThreadCount":2}, Thanks, Rahul On Wed, Jun 5, 2019 at 4:24 PM Shawn Heisey wrote: > On 6/5/2019 9:39 AM, Rahul Goswami wrote: > > I have a solrcloud setup on Windows server with below config: > >

SolrCloud indexing triggers merges and timeouts

2019-06-05 Thread Rahul Goswami
Hello, I have a solrcloud setup on Windows server with below config: 3 nodes, 24 shards with replication factor 2 Each node hosts 16 cores. Index size is 1.4 TB per node Xms 8 GB , Xmx 24 GB Directory factory used is SimpleFSDirectoryFactory The cloud is all nice and green for the most part. Only
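
The figures in this message are consistent with each other: 24 shards with replication factor 2 give 48 cores, which spread over 3 nodes is 16 cores per node. A small sketch of that arithmetic follows. It assumes replicas are spread evenly across nodes and treats 1 TB as 1024 GB; `cores_per_node` is a hypothetical helper, and the per-core size is only an average, not a measured value.

```python
def cores_per_node(shards, replication_factor, nodes):
    """Number of cores each node hosts, assuming an even spread of replicas."""
    total_cores = shards * replication_factor
    if total_cores % nodes != 0:
        raise ValueError("cores do not divide evenly across nodes")
    return total_cores // nodes


# Values taken from the message: 24 shards, replication factor 2, 3 nodes.
per_node = cores_per_node(shards=24, replication_factor=2, nodes=3)

# 1.4 TB of index per node, averaged over the cores on that node
# (roughly 90 GB per core under the assumptions above).
avg_core_size_gb = 1.4 * 1024 / per_node

print(per_node, round(avg_core_size_gb, 1))
```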

Re: Graph query extremely slow

2019-06-01 Thread Rahul Goswami
, since the parameters of this fq don't change shouldn't I expect to gain any advantage out of using the filterCache? Thanks, Rahul On Wed, May 22, 2019 at 7:40 AM Toke Eskildsen wrote: > On Wed, 2019-05-15 at 21:37 -0400, Rahul Goswami wrote: > > fq={!graph from=from_field to=

Re: Graph query extremely slow

2019-05-19 Thread Rahul Goswami
Hello experts, Just following up in case my previous email got lost in the big stack of queries. Would appreciate any help on optimizing a graph query. Or any pointers on the direction to investigate. Thanks, Rahul On Wed, May 15, 2019 at 9:37 PM Rahul Goswami wrote: > Hello, >

Graph query extremely slow

2019-05-15 Thread Rahul Goswami
Hello, I am running Solr 7.2.1 in standalone mode with 8GB heap. I have an index with ~4 million documents. Not too big. I am using a graph query parser to filter out some documents as below: fq={!graph from=from_field to=to_field returnRoot=false} Both from_field and to_field are indexed and of

Re: Delay searches till log replay finishes

2019-03-21 Thread Rahul Goswami
you think > your situation is different? Do you have any evidence that would be a > problem at all? > > Best, > Erick > > > > On Mar 8, 2019, at 11:05 AM, Shawn Heisey wrote: > > > > On 3/8/2019 10:44 AM, Rahul Goswami wrote: > >> 1) Is there currently

Re: Delay searches till log replay finishes

2019-03-08 Thread Rahul Goswami
latively short > for a variety of reasons, not the least of which is that it’ll grow > infinitely until a hard commit happens. > > Best, > Erick > > > On Mar 8, 2019, at 8:48 AM, Rahul Goswami wrote: > > > > What I am observing is that Solr is fully started up ev

Re: Delay searches till log replay finishes

2019-03-08 Thread Rahul Goswami
1 On Thu, Mar 7, 2019 at 11:36 PM Zheng Lin Edwin Yeo wrote: > Hi, > > Do you mean that when you startup Solr, it will automatically do the search > request even before the Solr is fully started up? > > Regards, > Edwin > > > On Fri, 8 Mar 2019 at 10:13, Rahul Goswami

Delay searches till log replay finishes

2019-03-07 Thread Rahul Goswami
Hello Solr gurus, I am using Solr 7.2.1 (non-SolrCloud). I have a situation where Solr got killed before it could commit updates to the disk resulting in log replay on startup. During this interval, I observe that a searcher is opened even before log replay has finished, resulting in some stale re

Re: Full index replication upon service restart

2019-02-21 Thread Rahul Goswami
. A _lot_. > Or you're only using a _very_ small bit of your index. > > Sorry to be so negative, but this is not a situation that's amenable to > a quick fix. > > Best, > Erick > > > > > On Mon, Feb 11, 2019 at 4:10 PM Rahul Goswami > wrote:

Re: Full index replication upon service restart

2019-02-11 Thread Rahul Goswami
arting > with Solr 7.3 and, > in particular, Solr 7.5. That might address your replicas that just > fail to replicate ever, > but won't help that replicas need to full sync anyway. > > That said, by far the simplest thing would be to stop indexing during > your maintenance

Full index replication upon service restart

2019-02-05 Thread Rahul Goswami
Hello Solr gurus, So I have a scenario where on Solr cluster restart the replica node goes into full index replication for about 7 hours. Both replica nodes are restarted around the same time for maintenance. Also, during usual times, if one node goes down for whatever reason, upon restart it agai

Re: SPLITSHARD not working as expected

2019-01-30 Thread Rahul Goswami
created post split? Regards, Rahul On Wed, Jan 30, 2019 at 1:18 AM Rahul Goswami wrote: > Thanks for the reply Jan. I have been referring to documentation for > SPLITSHARD on 7.2.1 > <https://lucene.apache.org/solr/guide/7_2/collections-api.html#splitshard> > which > see

Re: Error using collapse parser with /export

2019-01-29 Thread Rahul Goswami
ight be a bug after all (while working with /export)? Thanks, Rahul On Sun, Jan 27, 2019 at 9:55 PM Rahul Goswami wrote: > Hi Joel, > > Thanks for responding to the query. > > Answers to your questions: > 1) After collapsing is it not possible to use the /select handler?

Re: SPLITSHARD not working as expected

2019-01-29 Thread Rahul Goswami
ink you need a > screenshot here, what you describe is the default behaviour. > > -- > Jan Høydahl, search solution architect > Cominvent AS - www.cominvent.com > > > 28. jan. 2019 kl. 09:05 skrev Rahul Goswami : > > > > Hello, > > I am using Solr 7.2.1. I c

SPLITSHARD not working as expected

2019-01-28 Thread Rahul Goswami
Hello, I am using Solr 7.2.1. I created a two node example collection on the same machine. Two shards with two replicas each. I then called SPLITSHARD on shard2 and expected the split shards to have one replica on each node. However I see that for shard2_1, both replicas reside on the same node. Is

Re: Error using collapse parser with /export

2019-01-27 Thread Rahul Goswami
> Streaming Expression? > > Either of those cases would be the typical uses of these features. > > Joel Bernstein > http://joelsolr.blogspot.com/ > > > On Sun, Jan 20, 2019 at 10:13 PM Rahul Goswami > wrote: > > > Hello, > > > > Following up on my

Re: Error using collapse parser with /export

2019-01-20 Thread Rahul Goswami
an 17, 2019 at 4:58 PM Rahul Goswami wrote: > Hello, > > I am using SolrCloud on Solr 7.2.1. > I get the NullPointerException in the Solr logs (in ExportWriter.java) > when the /stream handler is invoked with a search() streaming expression > with qt="/export" contain

Error using collapse parser with /export

2019-01-17 Thread Rahul Goswami
Hello, I am using SolrCloud on Solr 7.2.1. I get the NullPointerException in the Solr logs (in ExportWriter.java) when the /stream handler is invoked with a search() streaming expression with qt="/export" containing fq="{!collapse field=id_field sort="time desc"} (among other fq's. I tried elimina

Re: Able to search with indexed=false and docvalues=true

2018-11-20 Thread Rahul Goswami
particularly functional for any industry size load anyway. Thanks, Rahul On Tue, Nov 20, 2018 at 3:37 AM Toke Eskildsen wrote: > On Mon, 2018-11-19 at 22:19 -0500, Rahul Goswami wrote: > > I am using SolrCloud 7.2.1. My understanding is that setting > > docvalues=true would optimize fac

Re: Error:Missing Required Fields for Atomic Updates

2018-11-19 Thread Rahul Goswami
/> > stored="true" required="false" multiValued="false" /> > stored="true" required="false" multiValued="false" /> > stored="true" required="false" multiValued="false" /> > req

Re: Error:Missing Required Fields for Atomic Updates

2018-11-19 Thread Rahul Goswami
What’s your update query? You need to provide the unique id field of the document you are updating. Rahul On Mon, Nov 19, 2018 at 10:58 PM Rajeswari Kolluri < rajeswari.koll...@oracle.com> wrote: > Hi, > > > > > > Using Solr 7.5.0. While performing atomic updates on a document on Solr > Cloud

Able to search with indexed=false and docvalues=true

2018-11-19 Thread Rahul Goswami
I am using SolrCloud 7.2.1. My understanding is that setting docvalues=true would optimize faceting, grouping and sorting; but for a field to be searchable it needs to be indexed=true. However I was dumbfounded today when I executed a successful search on a field with below configuration: However