Re: How expensive is core loading?

2020-01-29 Thread Emir Arnautović
Hi Rahul, It depends. You might have warm up queries that would populate caches. For each core Solr exposes JMX stats so you can read just those without "touching" the core. You can also try using some of the existing tools for monitoring Solr, but I don’t think that any of them provides you info about

Re: Solr fact response strange behaviour

2020-01-29 Thread Mikhail Khludnev
What happens at AutoCompleteAPI.java:170? On Wed, Jan 29, 2020 at 9:28 PM Kaminski, Adi wrote: > Sure, thanks for the guidance and the assistance anyway. > > Here is the stack trace: > [29/01/20 08:09:41:041 IST] [http-nio-8080-exec-2] ERROR api.BaseAPI: > There was

Re: How expensive is core loading?

2020-01-29 Thread Edward Ribeiro
Hi, Luke was a standalone app and is now a Lucene module. Read here: https://github.com/DmitryKey/luke You don't need Solr to use it (LukeRequestHandler is a plus). Best, Edward Em qua, 29 de jan de 2020 20:35, Rahul Goswami escreveu: > Thanks for your response Walter. But I could not find

Re: Can I create 1000 cores in SOLR CLOUD

2020-01-29 Thread Natarajan, Rajeswari
Good to know Shawn. Thanks, Rajeswari On 1/29/20, 12:52 PM, "Shawn Heisey" wrote: On 1/27/2020 4:59 AM, Vignan Malyala wrote: > We are currently using solr without cloud with 500 cores. It works good. > > Now we are planning to expand it using solr cloud with 1000 cores, (2 c

Re: Clarity on Stable Release

2020-01-29 Thread Dave
But! If we don’t have people throwing a new release into production and finding real world problems we can’t trust that the current release problems will be exposed and then remedied, so it’s a double edged sword. I personally agree with staying a major version back, but that’s because it takes

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Shawn Heisey
On 1/29/2020 2:48 PM, Karl Stoney wrote: I know the images didn't load btw so when I say spike I mean p95th response time going from 50ms to 100-120ms momentarily. I agree with Erick on looking at what users can actually notice. When the normal response time is 50 milliseconds, even if that d

Re: Clarity on Stable Release

2020-01-29 Thread Jeff
Thanks Shawn! Your answer is very helpful. Especially your note about keeping up to date with the latest major version after a number of releases. On Wed, Jan 29, 2020 at 6:35 PM Shawn Heisey wrote: > On 1/29/2020 11:24 AM, Jeff wrote: > > Now, we are considering 8.2.0, 8.3.1, or 8.4.1 to use as

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Erick Erickson
Autowarming is significantly misunderstood. One of its purposes in “the bad old days” was to rebuild very expensive on-heap structures for searching, sorting, grouping, and function queries. These are exactly what docValues are designed to make much, much faster. If you are still using spinning d

Re: How expensive is core loading?

2020-01-29 Thread Rahul Goswami
Hi Shawn, Thanks for the inputs. I realize I could have been clearer. By "expensive", I mean expensive in terms of memory utilization. E.g.: Let's say I have a core with an index size of 10 GB that is not loaded on startup as per configuration. If I load it in order to know the total documents and the

Re: How expensive is core loading?

2020-01-29 Thread Shawn Heisey
On 1/29/2020 3:01 PM, Rahul Goswami wrote: 1) How expensive is core loading if I am only getting stats like the total docs and size of the index (no expensive queries)? 2) Does the memory consumption on core loading depend on the index size ? 3) What is a reasonable value for transient cache size

Re: Clarity on Stable Release

2020-01-29 Thread Shawn Heisey
On 1/29/2020 11:24 AM, Jeff wrote: Now, we are considering 8.2.0, 8.3.1, or 8.4.1 to use as they seem to be stable. But it is hard to determine if we should be using the bleeding edge or a few minor versions back since each of these includes many bug fixes. It is unclear to me why some fixes get

Re: How expensive is core loading?

2020-01-29 Thread Rahul Goswami
Thanks for your response Walter. But I could not find a Java api for Luke for writing my tool. Is there one? I also tried using the LukeRequestHandler that comes with Solr, but invoking it causes the Solr core to be loaded. Rahul On Wed, Jan 29, 2020 at 5:20 PM Walter Underwood wrote: > You mi

Re: Easiest way to export the entire index

2020-01-29 Thread Edward Ribeiro
Hi Amanda, Below is a crude prototype in Bash that fetches documents from Solr using cursorMark: https://gist.github.com/eribeiro/de1588aaa1759c02ea40cc281e8aedc8 It is only a crude prototype, but it should shed some light on your use case (I copied the code below too): Best, Edward --

Re: How expensive is core loading?

2020-01-29 Thread Walter Underwood
You might use Luke to get that info from the index files without loading them into Solr. https://code.google.com/archive/p/luke/ wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jan 29, 2020, at 2:01 PM, Rahul Goswami wrote: > > Hello, > I am using

How expensive is core loading?

2020-01-29 Thread Rahul Goswami
Hello, I am using Solr 7.2.1 on a Solr node running in standalone mode (-Xmx 8 GB). I wish to implement a service to monitor the server stats (like number of docs per core, index size, etc.). This would require me to load the core and my concern is that for a node hosting 100+ cores, this could be ex
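
One of the two stats asked about here, the on-disk index size, can in principle be read from the file system without loading the core. The following is only a minimal sketch in Python, assuming the core keeps its Lucene index files in a single flat directory (often `<instanceDir>/data/index`, but `dataDir` can point elsewhere); the function name and any path passed to it are hypothetical. It does not give document counts, and the number can shift while a commit or merge is rewriting files.

```python
import os


def index_size_bytes(index_dir):
    """Return the total size in bytes of the regular files directly inside index_dir.

    Assumption: index_dir is the directory holding the core's Lucene index
    files (for example <instanceDir>/data/index). Subdirectories are ignored.
    """
    total = 0
    with os.scandir(index_dir) as entries:
        for entry in entries:
            # Count only regular files; do not follow symlinks.
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

A monitoring service could call this once per core directory on a schedule, since it only touches file metadata and never asks Solr to open a searcher.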

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Karl Stoney
So interestingly, by tweaking my filter cache I've got the warming time down to 1s (from 10!) and also reduced my memory footprint due to the smaller cache size. However, I still get these latency spikes (these changes have made no difference to them). So the theory about them being due to the warm

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Walter Underwood
Looking at the log, that takes one or two seconds after a complete batch reload (master/slave). So that is loading a cold index, all new files. This is not a big index, about a half million book titles. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > O

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Karl Stoney
Out of curiosity, could you define "fast"? I'm wondering what sort of figures people target their searcher warm time at From: Walter Underwood Sent: 29 January 2020 21:13 To: solr-user@lucene.apache.org Subject: Re: Solr Searcher 100% Latency Spike I use a static

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Walter Underwood
I use a static set of warming queries, about 20 of them. That is fast and gets a decent amount of the index into file buffers. Your top queries won’t change much unless you have a news site or a seasonal business. Like this: introduction inter

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Karl Stoney
Hey Shawn, Thanks for the reply - funnily enough that is exactly what I'm trialing now. I've significantly lowered the autoWarm (as well as the size) and still have a 0.95+ cache hit rate through searcher loads. I'm going to continue to tweak these values down so long as I keep the hit rate ab

Re: Solr Searcher 100% Latency Spike

2020-01-29 Thread Shawn Heisey
On 1/29/2020 12:44 PM, Karl Stoney wrote: Looking for a bit of support here.  When we soft commit (every 10 minutes), we get a latency spike that means response times for solr are loosely double, as you can see in this screenshot: Attachments almost never make it to the list. We cannot see an

Re: Can I create 1000 cores in SOLR CLOUD

2020-01-29 Thread Shawn Heisey
On 1/27/2020 4:59 AM, Vignan Malyala wrote: We are currently using solr without cloud with 500 cores. It works good. Now we are planning to expand it using solr cloud with 1000 cores, (2 cores for each of my client with different domain data). SolrCloud starts having scalability issues once yo

Re: Operation backup caused exception : AccessDeniedException

2020-01-29 Thread Shawn Heisey
On 1/29/2020 3:26 AM, Salmaan Rashid Syed wrote: I was trying to execute the backup command using curl command on my work computer to see why EC2 instance was giving the previous error. On my current computer, I have root privileges. But when I execute the command on my work computer, I have a di

Re: Performance Issue since Solr 7.7 with wt=javabin

2020-01-29 Thread Karl Stoney
Could anyone produce a patch for 7.7 please? From: Florent Sithi Sent: 29 January 2020 14:34 To: solr-user@lucene.apache.org Subject: Re: Performance Issue since Solr 7.7 with wt=javabin yes thanks so much, fixed in 8.4.0

Solr Searcher 100% Latency Spike

2020-01-29 Thread Karl Stoney
Hi All, Looking for a bit of support here. When we soft commit (every 10 minutes), we get a latency spike that means response times for solr are loosely double, as you can see in this screenshot: [cid:ed9fa791-0776-43fc-8f22-d8a568f5c084] These do correlate to GC spikes (albeit not particularl

Re: Solr fact response strange behaviour

2020-01-29 Thread Jason Gerlowski
Thanks Adi, There's no SolrJ code in your stacktrace, so this was something other than SOLR-13780 apparently. Best of luck! Jason On Wed, Jan 29, 2020 at 1:28 PM Kaminski, Adi wrote: > > Sure, thanks for the guidance and the assistance anyway. > > Here is the stack trace: > Here is the stack t

For Hierarchical data structure is Graph Query a good option ?

2020-01-29 Thread sambasivarao giddaluri
Hi, I have data in a hierarchical structure, e.g.: parent --> children --> grandchildren. Use case: get parent docs by adding filters on children and grandchildren, or get grandchildren docs by adding filters on parent and children. To accommodate this use case I have flattened the docs by addin

RE: Solr fact response strange behaviour

2020-01-29 Thread Kaminski, Adi
Sure, thanks for the guidance and the assistance anyway. Here is the stack trace: [29/01/20 08:09:41:041 IST] [http-nio-8080-exec-2] ERROR api.BaseAPI: There was an Exception calling Solr java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long at

Clarity on Stable Release

2020-01-29 Thread Jeff
TL;DR: I am having difficulty deciding on a release that is stable to use and would like this to be easier. Recently it has been rather difficult to figure out what release to use based on its stability. This is probably in part because of the rapid release cadence and also the versioning being

Unable to get ICUFoldingFilterFactory class loaded in unsecured 8.4.1 SolrCloud

2020-01-29 Thread Andy C
I have a schema currently used with Solr 7.3.1 that uses the ICU contrib extensions. Previously I used a directive in the solrconfig.xml to load the icu4j and lucene-analyzers-icu jars. The 8.4 upgrade notes indicate that this approach is no longer supported for SolrCloud unless you enable authen

KeeperErrorCode= BadVersion

2020-01-29 Thread Rajeswari Natarajan
Hi, Getting the below exception. We have SolrCloud 7.6 installed and have commented out the below in solrconfig.xml. What could be the reason? Thanks, Rajeswari 2020-01-17T13:03:40.84206185Z 2020-01-17 13:03:40,841 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got us

Re: Easiest way to export the entire index

2020-01-29 Thread Steve Ge
@Amanda You can try using curl and write the output to a file: curl http://localhost:8983/Solr?q={theSolrQuery} > out.json where theSolrQuery is your query; you need to specify all attrs you want exported, not just *. If you are on Windows, there is a Windows curl tool you can download to use. Steve On Wed, Ja
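
The URL in the curl command above is schematic. As a rough illustration of what a full request URL might look like, here is a small Python sketch that assembles one against the standard `/select` handler. The base URL, the core name, and the helper name `build_select_url` are assumptions for illustration only; `q`, `fl`, `rows`, and `wt` are the usual Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query="*:*", fields=("*",), rows=10):
    """Build a /select URL for a core (hypothetical helper, not part of Solr).

    base_url is something like "http://localhost:8983/solr" (assumed).
    fields lists the stored fields to return, joined into the fl parameter.
    """
    params = {
        "q": query,
        "fl": ",".join(fields),
        "rows": rows,
        "wt": "json",
    }
    # urlencode percent-escapes characters such as '*', ':' and ','.
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"
```

The resulting string can be handed to curl or wget as-is; quoting it in the shell avoids problems with `&` and `*`.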

Re: Performance Issue since Solr 7.7 with wt=javabin

2020-01-29 Thread Florent Sithi
yes thanks so much, fixed in 8.4.0

Solr Nested Documents not properly working

2020-01-29 Thread Yirmiyahu Fischer
Could you please answer my question on https://stackoverflow.com/questions/59566421/solr-nested-documents-not-properly-setup Thank you. Yirmiyahu Fischer Senior Developer Signature IT

Re: Solr fact response strange behaviour

2020-01-29 Thread Jason Gerlowski
Hey Adi, There was a separate JIRA for this on the SolrJ objects it sounds like you're using: SOLR-13780. That JIRA was fixed, apparently in 8.3, so I'm surprised you're still seeing the issue. If you include the full stacktrace and a snippet of code to reproduce, I'm curious to take a look. Th

Re: Bug in scoreNodes function of streaming expressions?

2020-01-29 Thread Pratik Patel
Thanks a lot. I will update the ticket with more details if appropriate. Pratik On Wed, Jan 29, 2020 at 10:07 AM Joel Bernstein wrote: > Here is the ticket: > https://issues.apache.org/jira/browse/SOLR-14231 > > > Joel Bernstein > http://joelsolr.blogspot.com/ > > > On Wed, Jan 29, 2020 at 10:0

Re: Easiest way to export the entire index

2020-01-29 Thread David Hastings
I do this often and just create a 30 GB file using wget. On Wed, Jan 29, 2020 at 10:21 AM Emir Arnautović < emir.arnauto...@sematext.com> wrote: > Hi Amanda, > I assume that you have all the fields stored so you will be able to export > full document. > > Several thousands records should not be to

Re: Easiest way to export the entire index

2020-01-29 Thread Emir Arnautović
Hi Amanda, I assume that you have all the fields stored so you will be able to export the full document. Several thousand records should not be too much to use regular start+rows to paginate results, but the proper way of doing that would be to use cursors. Adjust page size to avoid creating huge
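
The cursor approach mentioned above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a tested export tool: the uniqueKey field is assumed to be `id`, the select URL passed to `http_fetch` is hypothetical, and both function names are made up for illustration. It relies on the documented cursor contract: start with `cursorMark=*`, sort on the uniqueKey, and stop when `nextCursorMark` comes back unchanged.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_fetch(select_url):
    """Return a function that sends params to select_url and parses the JSON reply."""
    def fetch(params):
        with urlopen(f"{select_url}?{urlencode(params)}") as resp:
            return json.load(resp)
    return fetch


def export_all(fetch_page, unique_key="id", rows=500):
    """Yield every document by following Solr's cursorMark until it stops advancing."""
    cursor = "*"  # initial cursor value defined by Solr
    while True:
        page = fetch_page({
            "q": "*:*",
            "sort": f"{unique_key} asc",  # cursors require a sort on the uniqueKey
            "rows": rows,
            "wt": "json",
            "cursorMark": cursor,
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # unchanged cursor means no more results
            break
        cursor = next_cursor
```

Something like `json.dump(list(export_all(http_fetch(url))), out_file)` would then write the whole result set to one JSON file; for larger indexes, writing page by page keeps memory use bounded by the page size.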

Re: Bug in scoreNodes function of streaming expressions?

2020-01-29 Thread Joel Bernstein
Here is the ticket: https://issues.apache.org/jira/browse/SOLR-14231 Joel Bernstein http://joelsolr.blogspot.com/ On Wed, Jan 29, 2020 at 10:03 AM Joel Bernstein wrote: > Hi Pratik, > > I'll create the ticket now and report back. If you've got a fix please > post it to the ticket and I'll try

Re: Bug in scoreNodes function of streaming expressions?

2020-01-29 Thread Joel Bernstein
Hi Pratik, I'll create the ticket now and report back. If you've got a fix please post it to the ticket and I'll try to get this in for the next release. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Jan 28, 2020 at 11:52 AM pratik@semandex wrote: > Joel Bernstein wrote > > Ok, that so

Re: Solr Cloud on Docker?

2020-01-29 Thread Scott Stults
One of our clients has been running a big Solr Cloud (100-ish nodes, TB index, billions of docs) in kubernetes for over a year and it's been wonderful. I think during that time the biggest scrapes we got were when we ran out of disk space. Performance and reliability have been solid otherwise. Like

Easiest way to export the entire index

2020-01-29 Thread Amanda Shuman
Dear all: I've been asked to produce a JSON file of our index so it can be combined and indexed with other records. (We run solr 5.3.1 on this project; we're not going to upgrade, in part because funding has ended.) The index has several thousand rows, but nothing too drastic. Unfortunately, this

Re: Can I create 1000 cores in SOLR CLOUD

2020-01-29 Thread Vignan Malyala
Guys, Did anyone work on this type of thing? Can you please help with this? For real time deployment and issues? On Mon, Jan 27, 2020 at 5:29 PM Vignan Malyala wrote: > Hi all, > > We are currently using solr without cloud with 500 cores. It works good. > > Now we are planning to expand it using

Re: Solr 7.7: Using Tika in Production

2020-01-29 Thread Erick Erickson
I doubt that’d work. When Solr gets an update, it forwards the document to the leader of the shard it’s going to eventually reside on. Among other things, the Solr node hosting no replicas would need to go to ZK and pull down the config you've created for Tika to know what to do. There’s no tech

Re: Query Regarding SOLR cross collection join

2020-01-29 Thread Mikhail Khludnev
It's time to enforce and document field type constraints https://issues.apache.org/jira/browse/SOLR-14230. On Mon, Jan 27, 2020 at 4:12 PM Doss wrote: > @ Alessandro Benedetti , Thanks for your input! > > @ Mikhail Khludnev , I made docValues="true" for from & to and did a index > rotation, now

Re: Performance Issue since Solr 7.7 with wt=javabin

2020-01-29 Thread Jan Høydahl
Check out SOLR-14013 which I believe is what you are looking for Jan > 29. jan. 2020 kl. 11:46 skrev Florent Sithi : > > Hi Paras, > > Thanks for your answer and your ideas ;) > > I have the exact same issue than Andy "wt=javabin&version=2"

Re: Spell check with data from database and not from english dictionary

2020-01-29 Thread seeteshh
Hello Jan, Let me work on your suggestions too. Also, I had one query: while working on the spell check component, I don't get any suggestion for an incorrectly typed word. Example: in spellcheck.q, I type "Teh" instead of "The" or "saa" instead of "sea" "responseHeader":{ "status":0, "QTim

Re: Performance Issue since Solr 7.7 with wt=javabin

2020-01-29 Thread Florent Sithi
Hi Paras, Thanks for your answer and your ideas ;) I have the exact same issue as Andy: "wt=javabin&version=2" has really poor performance compared to wt=json. I'm using: - solr 7.7.2 - OpenJDK8U-jdk_x64_linux_hotspot_8u222b10 or jdk-8u241-linux-x64 (same behaviour) The server has much R

Re: Operation backup caused exception : AccessDeniedException

2020-01-29 Thread Salmaan Rashid Syed
Hi Shawn, I was trying to execute the backup command using curl command on my work computer to see why EC2 instance was giving the previous error. On my current computer, I have root privileges. But when I execute the command on my work computer, I have a different problem. It states that the path

Re: In-place re-indexing after DocValue schema change

2020-01-29 Thread moscovig
Thank you Emir. I tried this locally (changing the schema, re-indexing everything in place) and I wasn't able to sort on the docValues fields anymore (someone actually mentioned this before on that forum - https://lucene.472066.n3.nabble.com/DocValues-error-td4240116.html) with the next error "Error from server a

Re: In-place re-indexing after DocValue schema change

2020-01-29 Thread Emir Arnautović
Hi, 1. No, it’s not valid. Solr will look at the schema to see if it can use docValues or if it has to uninvert the field, and it assumes that all fields will have doc values. You might see anything from wrong results to errors if you do something like that. 2. Not sure if it would work, but It is not better t

In-place re-indexing after DocValue schema change

2020-01-29 Thread moscovig
Hi all, We are about to alter our schema with some DocValue annotations. According to the docs, we should either delete all docs and re-insert, or create a new collection with the new schema. 1. Is it valid to modify the schema in the current collection, where all documents were created without docV