RE: Solr Down Issue

2020-08-10 Thread Rashmi Jain
Hello Team, Can you please check this on high priority? Regards, Rashmi From: Rashmi Jain Sent: Sunday, August 9, 2020 7:21 PM To: solr-user@lucene.apache.org Cc: Ritesh Sinha Subject: Solr Down Issue Hello Team, I am Rashmi Jain. I have implemented Solr on one of our

Fwd: Help For solr in data config.xml regarding fetching record

2020-08-10 Thread Rajat Diwate
Hi Team, I need some help with this issue. I have posted it on Nabble but got no response yet. Kindly check the shared link and please provide a solution; if not, please reply with a suggested solution or a mailing list where I can get help with this issue. https://lucene.472066.n3.nabble.com/Need-Help-Fo

Re: java.lang.StackOverflowError if pass long string in q parameter

2020-08-10 Thread Furkan KAMACI
Hi Kumar, The problem you have here is a StackOverflowError, which is not related to the character limit of the q parameter. First of all, consider using pagination to fetch data from Solr. Secondly, share your configuration settings for starting up Solr and how much index you have, to check whether you ne

Force open a searcher in solr.

2020-08-10 Thread Akshay Murarka
Hey, I have a use case where none of the documents in my Solr index are changing, but I still want to open a new searcher through the curl API. On executing the below curl command curl "XXX.XX.XX.XXX:9744/solr/mycollection/update?openSearcher=true&commit=true" it doesn't open a new searcher. Below
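The commit-with-openSearcher request described above can be sketched as the hypothetical helper below (host and collection names are placeholders); as the replies in this thread explain, Solr may still skip reopening the searcher when the index is unchanged.

```python
from urllib.parse import urlencode

def open_searcher_url(host: str, collection: str) -> str:
    """Build the /update URL asking for a hard commit plus a new searcher.
    Note: Solr deliberately skips reopening the searcher when the index has
    not changed, so this request alone can be a no-op on an unchanged index."""
    params = urlencode({"commit": "true", "openSearcher": "true"})
    return f"http://{host}/solr/{collection}/update?{params}"

url = open_searcher_url("localhost:8983", "mycollection")
print(url)
```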

Re: java.lang.StackOverflowError if pass long string in q parameter

2020-08-10 Thread Dominique Bejean
Hi, It looks like the uid field is a text field with a graph filter. Do you really need this for this specific large "OR" query? Can't you use a string field instead? Do you need to compute the score for this query? Maybe you can use fq instead of q? You will get performance improvements by not com
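The fq-instead-of-q suggestion above can be sketched like this (field name and ID values are made up for illustration); a filter query is cached and excluded from scoring, which is the source of the performance gain:

```python
# Hypothetical request parameters: the large OR list goes into fq, where it is
# cached and not scored, instead of q. "uid" would be a string (not text) field.
ids = ["u1", "u2", "u3"]
params = {
    "q": "*:*",                              # match-all; no relevance scoring needed
    "fq": "uid:(" + " OR ".join(ids) + ")",  # filter query over the id list
}
print(params["fq"])
```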

Re: Solr Down Issue

2020-08-10 Thread Dominique Bejean
Hi, Did you analyse your GC logs? If not, that is the first thing to do. Enable GC logs and use a tool like https://gceasy.io/ Please provide more details about your configuration (JVM settings, ...) and use case (QPS, queries, ...). We just know you have 28 million indexed books (just metadata or

Re: Force open a searcher in solr.

2020-08-10 Thread Erick Erickson
In a word, “no”. There is explicit code to _not_ open a new searcher if the index hasn’t changed because it’s an expensive operation. Could you explain _why_ you want to open a new searcher even though the index is unchanged? The reason for the check in the first place is that nothing has chang

How to forcefully open new searcher, in case when there is no change in Solr index

2020-08-10 Thread raj.yadav
I have a use case where none of the documents in my Solr index are changing, but I still want to open a new searcher through the curl API. On executing the below curl command curl "XXX.XX.XX.XXX:9744/solr/mycollection/update?openSearcher=true&commit=true" it doesn't open a new searcher. Below is

Re: Force open a searcher in solr.

2020-08-10 Thread Akshay Murarka
Hey, So I have external file fields that have some data that gets updated regularly. Whenever those get updated, we need the open-searcher operation to happen. The values in these external files are used in boosting and other function/range queries. On Mon, Aug 10, 2020 at 5:08 PM Erick Erickson wro

Re: Why External File Field is marked as indexed in solr admin SCHEMA page?

2020-08-10 Thread raj.yadav
Hi Chris, Chris Hostetter-3 wrote > ...ExternalFileField is "special" and as noted in its docs it is not > searchable -- it doesn't actually care what the indexed (or "stored") > properties are ... but the default values of those properties as assigned > by the schema defaults are still there

Re: Production Issue: TIMED_WAITING - Will net.ipv4.tcp_tw_reuse=1 help?

2020-08-10 Thread Doss
Hi, In the Solr 8.3.1 source I see the following, which I assume could be the reason for the issue "Max requests queued per destination 3000 exceeded for HttpDestination": solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java: private static final int MAX_OUTSTANDING_REQUEST

Re: Force open a searcher in solr.

2020-08-10 Thread Erick Erickson
Ah, ok. That makes sense. I wonder if your use-case would be better served, though, by “in-place updates”, see: https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html This has been around since Solr 6.5… Best, Erick > On Aug 10, 2020, at 8:24 AM, Akshay Murarka wrote: >
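An in-place update payload, per the guide linked above, might look like the sketch below (document ID and field name are hypothetical); in-place updates require the target field to be single-valued, numeric, with docValues=true, and neither indexed nor stored:

```python
import json

# Hypothetical in-place update: "popularity" must be declared in the schema as
# a single-valued numeric field with docValues=true, indexed=false, stored=false.
doc = {"id": "doc-1", "popularity": {"set": 42}}
payload = json.dumps([doc])  # body for POST /solr/<collection>/update
print(payload)
```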

Re: How to forcefully open new searcher, in case when there is no change in Solr index

2020-08-10 Thread Erick Erickson
Are you also posting the same question as Akshay Murarka? Please do not do this if so; use one e-mail address. Would in-place updates serve your use-case better? See: https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html > On Aug 10, 2020, at 8:17 AM, raj.yadav wrote: >

Re: Production Issue: TIMED_WAITING - Will net.ipv4.tcp_tw_reuse=1 help?

2020-08-10 Thread Dominique Bejean
Hi Doss, A lot of TIMED_WAITING connections occur in high-TCP-traffic infrastructure, as in a LAMP stack when the Apache server can no longer connect to the MySQL/MariaDB database. In this case, tweaking net.ipv4.tcp_tw_reuse is a possible solution (but never net.ipv4.tcp_tw_recycle, as you sug

RE: Can create collections with Drupal 8 configset

2020-08-10 Thread Shane Brooks
> The class that doesn't load (in your error message) is not located in the > icu4j jar. It is located in the lucene-analyzers-icu-X.Y.Z.jar file, which > is found in the contrib/analysis-extras/lucene-libs subdirectory. Thanks Shawn, this was exactly my issue. That subdirectory was missing co

Replicas in Recovery During Atomic Updates

2020-08-10 Thread Anshuman Singh
Hi, We have a SolrCloud cluster with 10 nodes. We have 6B records ingested into the collection. Our use case requires atomic updates ("inc") on 5 fields. Now almost 90% of documents are atomic updates, and as soon as we start our ingestion pipelines, multiple shards start going into recovery, sometimes

Re: Replicas in Recovery During Atomic Updates

2020-08-10 Thread Erick Erickson
Good question, what do the logs say? You’ve provided very little information to help diagnose the issue. As to your observation that atomic updates are expensive, that’s true. Under the covers, Solr has to go out and fetch the document, overlay your changes and then re-index the full document. So,

Re: Force open a searcher in solr.

2020-08-10 Thread raj.yadav
Erick Erickson wrote > Ah, ok. That makes sense. I wonder if your use-case would be better > served, though, by “in place updates”, see: > https://lucene.apache.org/solr/guide/8_1/updating-parts-of-documents.html > This has been around since Solr 6.5… As per the documentation, `in place update` is o

Re: Production Issue: TIMED_WAITING - Will net.ipv4.tcp_tw_reuse=1 help?

2020-08-10 Thread Doss
Hi Dominique, Thanks for your response. Find the details below; please let me know if I missed anything. *- hardware architecture and sizing* >> CentOS 7, VMs, 4 CPUs, 66GB RAM, 16GB heap, 250GB SSD *- JVM version / settings* >> Red Hat, Inc. OpenJDK 64-Bit Server VM, version:"14.0.1 14.0

Re: Replicas in Recovery During Atomic Updates

2020-08-10 Thread Jörn Franke
How do you ingest it exactly with atomic updates? Is there an update processor in between? What are your settings for hard/soft commit? For the shards going into recovery - do you have a log entry or something? What is the Solr version? How do you set up ZK? > Am 10.08.2020 um 16:24 schrieb

Re: Production Issue: TIMED_WAITING - Will net.ipv4.tcp_tw_reuse=1 help?

2020-08-10 Thread Dominique Bejean
Doss, See below. Dominique Le lun. 10 août 2020 à 17:41, Doss a écrit : > Hi Dominique, > > Thanks for your response. Find below the details, please do let me know if > anything I missed. > > > *- hardware architecture and sizing* > >> Centos 7, VMs,4CPUs, 66GB RAM, 16GB Heap, 250GB SSD > > >

Re: Force open a searcher in solr.

2020-08-10 Thread Erick Erickson
Right, but you can use those with function queries. Assuming your eff entry is a doc ID plus a single numeric value, I was wondering if you can accomplish what you need with function queries... > On Aug 10, 2020, at 11:30 AM, raj.yadav wrote: > > Erick Erickson wrote >> Ah, ok. That makes sense. I
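As a sketch of the function-query suggestion above, the external file field's per-document value can be referenced with the field() function in a boost (the field name here is hypothetical):

```python
# Hypothetical edismax request: rank_eff stands in for an external file field
# whose per-document value feeds a multiplicative boost via field().
params = {
    "defType": "edismax",
    "q": "solr",
    "boost": "field(rank_eff)",  # function query over the external file field
}
print(params["boost"])
```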

Re: Replicas in Recovery During Atomic Updates

2020-08-10 Thread Anshuman Singh
Just to give you an idea, this is how we are ingesting: {"id": 1, "field1": {"inc": 20}, "field2": {"inc": 30}, "field3": 40, "field4": "some string"} We are using Solr 8.5.1. We have not configured any update processor. Hard commit happens every minute or at 100k docs; soft commit happens every
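For reference, the atomic-update document above could be built as in this sketch (field names taken from the thread); "inc" adds to the stored value, while fields sent with plain values replace the existing value outright:

```python
import json

# Sketch of the atomic-update document from the thread.
# "inc" increments the stored value; plain values ("field3", "field4") replace it.
doc = {
    "id": 1,
    "field1": {"inc": 20},
    "field2": {"inc": 30},
    "field3": 40,
    "field4": "some string",
}
payload = json.dumps([doc])  # body for POST /solr/<collection>/update
print(payload)
```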

Replication not occurring to newly added SOLRCloud nodes

2020-08-10 Thread Shane Brooks
Main info: SolrCloud 7.7.3, ZooKeeper 3.4.14 I have a 2-node SolrCloud installation with 3 ZooKeeper instances, configured in AWS to autoscale. I am currently testing with 9 collections. My issue is that when I scale out and a node is added to the SolrCloud cluster, I get replication to the new node

Cannot add replica during backup

2020-08-10 Thread Ashwin Ramesh
Hi everybody, We are using Solr 7.6 (SolrCloud). We noticed that when a backup is running, we cannot add any replicas to the collection. By the looks of it, the job to add the replica is put into the Overseer queue, but it is not being processed. Is this expected? And are there any workarounds?

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-10 Thread Ashwin Ramesh
I would love an answer to this too! On Fri, Aug 7, 2020 at 12:18 AM Bram Van Dam wrote: > Hey folks, > > Been reading up about the various ways of creating backups. The whole > "shared filesystem for Solrcloud backups"-thing is kind of a no-go in > our environment, so I've been looking for ways

Re: Production Issue: TIMED_WAITING - Will net.ipv4.tcp_tw_reuse=1 help?

2020-08-10 Thread Doss
Hi Dominique, Thanks for the response. I don't think I would use JVM version 14; OpenJDK 11 is, in my opinion, the best choice among LTS versions. >> We will try changing it. You change a lot of default values. Any specific reasons? It seems very aggressive! >> Our product team wants data to be

Re: Cannot add replica during backup

2020-08-10 Thread Aroop Ganguly
12 hours is extreme; we take backups of 10TB worth of indexes in 15 mins using the collection backup API. How are you taking the backup? Do you actually see any backup progress, or are you just seeing the task in the overseer queue linger? I have seen restore tasks hanging in the queue forever des

Re: Cannot add replica during backup

2020-08-10 Thread Ashwin Ramesh
Hey Aroop, the general process for our backup is: - Connect all machines to an EFS drive (AWS's NFS service) - Call the collections API to backup into EFS - ZIP the directory once the backup is completed - Copy the ZIP into an s3 bucket I'll probably have to see which part of the process is the sl

Re: Cannot add replica during backup

2020-08-10 Thread Aroop Ganguly
Hi Ashwin, Thanks for sharing this detail. Do you mind sharing how big each of these indices is? I am almost sure this is related to network capacity and constraints in your AWS setup. Yes, if you can confirm that the backup is complete, or you just want the system to move on discarding the backu

Re: Cannot add replica during backup

2020-08-10 Thread Ashwin Ramesh
Hi Aroop, We have 16 shards, each approx 30GB - the total is ~480GB. I'm also pretty sure it's a network issue. Very interesting that you can index 20x the data in 15 min! >> It would also help to ensure your overseer is on a node with a role that exempts it from any Solr index responsibilities. How w

Re: Solr + Parquets

2020-08-10 Thread Russell Jurney
There are ways to load data directly from Spark to Solr, but I didn't find any of them satisfactory, so I just create enough Spark partitions with repartition() (increase partition count) / coalesce() (decrease partition count) that I get as many Parquet files as I want, and then I use a bash script to i
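The "script to iterate and post" step described above could be sketched like this hypothetical helper (file paths and collection name are made up, and nothing is executed here; it only builds the bin/post command lines, one per exported file):

```python
# Hypothetical sketch: one bin/post invocation per file exported by Spark.
# Paths and the collection name are placeholders for this example only.
def post_commands(files, collection="mycollection"):
    return [f"bin/post -c {collection} {path}" for path in files]

cmds = post_commands(["export/part-00000.json", "export/part-00001.json"])
for c in cmds:
    print(c)
```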

Re: Solr + Parquets

2020-08-10 Thread Aroop Ganguly
> script to iterate and load the files via the post command. You mean you load parquet files over post? That sounds unbelievable … Do you mean you created a Solr doc for each parquet record in a partition and used SolrJ or some other Java lib to post the docs to Solr? df.mapPartitions(p => { ///batch th

Re: Solr + Parquets

2020-08-10 Thread Russell Jurney
Sorry, I'm a goofball. I use Parquet but use bzip2 JSON format for the last hop. Thanks, Russell Jurney @rjurney russell.jur...@gmail.com LI FB datasyndrome.com On Mon, Aug 10, 2020 at 7:56 PM Aroop

Re: Cannot add replica during backup

2020-08-10 Thread Aroop Ganguly
> We have 16 shards each approx 30GB - total is ~480GB. I'm also pretty sure > it's a network issue. Very interesting that you can index 20x the data in > 15 min! Not index but backup an index in 15min. >>> It would also help to ensure your overseer is on a node with a role that > exempts it fro