Re: Schema update/Reload during Live traffic

2018-03-23 Thread Shawn Heisey
On 3/23/2018 10:32 PM, Susheel Kumar wrote: I did a schema update on the SolrCloud source CDCR cluster and the same on the target. After a collection reload, I noticed "error opening searcher" / IndexWriter closed etc. on the leader node, while all replicas went into recovery mode. Later after restarting Solr on

Schema update/Reload during Live traffic

2018-03-23 Thread Susheel Kumar
Hello, I did a schema update on the SolrCloud source CDCR cluster and the same on the target. After a collection reload, I noticed "error opening searcher" / IndexWriter closed etc. on the leader node, while all replicas went into recovery mode. Later, after restarting Solr on the leader, I noticed the below too many file ope

Re: Problem accessing /solr/_shard1_replica_n1/get

2018-03-23 Thread Shawn Heisey
On 3/23/2018 4:08 AM, Hendrik Haddorp wrote: I did not define a /get request handler but I also don't see one being default in the solrconfig.xml files that come with Solr 7.2.1. Do I need to add that as described in https://www.garysieling.com/blog/fixing-solrj-error-expected-mime-type-applica

Re: Some performance questions....

2018-03-23 Thread Shawn Heisey
On 3/23/2018 1:13 PM, Deepak Goel wrote: > Yes I am now creating a client object only once. On Linux it has superb results (performance improves by around two times). However on Windows it has no improvement > > *Software | Throughput (/sec) | Response Time (msec) | Utilization (%CPU) > UnTuned (Windows)

Re: Solr on HDInsight to write to Active Data Lake

2018-03-23 Thread Abhi Basu
I'll try it out. Thanks Abhi On Fri, Mar 23, 2018, 6:22 PM Rick Leir wrote: > Abhi > Check your lib directives. > > https://lucene.apache.org/solr/guide/6_6/lib-directives-in-solrconfig.html#lib-directives-in-solrconfig > > I suspect your jars are not in a lib dir mentioned in solrconfig.xml >

Re: Some performance questions....

2018-03-23 Thread Rick Leir
Deep, What is the test so I can try it. 75 or 90 ms .. is that the JVM startup time? Cheers -- Rick >> >> >I have stated the numbers which I found during my test. The best way to >verify them is for someone else to run the same test. Otherwise I don't >see >how we can verify the results -- S

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-23 Thread Shawn Heisey
On 3/23/2018 3:47 PM, Webster Homer wrote: > Just FYI I had a project recently where I tried to use cursorMark in > Solrcloud and solr 7.2.0 and it was very unreliable. It couldn't even > return consistent numberFound values. I posted about it in this forum. > Using the start and rows arguments in

RE: InetAddressPoint support in Solr or other IP type?

2018-03-23 Thread Mike Cooper
Thanks David. Is there a reason we wouldn't want to base the Solr implementation on the InetAddressPoint class? https://lucene.apache.org/core/7_2_1/misc/org/apache/lucene/document/InetAddressPoint.html I realize that is in the "misc" package for now, so it's not part of core Lucene. But it is

Re: Some performance questions....

2018-03-23 Thread Shawn Heisey
On 3/23/2018 11:31 AM, Deepak Goel wrote: > Do you have any specific questions about the benchmark setup? How many docs are in the Solr index?  How much disk space does it consume?  How much total memory is in the machine?  How much memory is allocated to Java heaps?  Is there any other software r

Re: solrj question

2018-03-23 Thread Shawn Heisey
On 3/23/2018 3:24 PM, Webster Homer wrote: > I see this in the output: > Lexical error at line 1, column 1759. Encountered: after : > "/select?defType=edismax&start=0&rows=25&... > It has basically the entire solr query which it obviously couldn't parse. > > solrQuery = new SolrQuery(log.getQuery

Re: Solr on HDInsight to write to Active Data Lake

2018-03-23 Thread Rick Leir
Abhi Check your lib directives. https://lucene.apache.org/solr/guide/6_6/lib-directives-in-solrconfig.html#lib-directives-in-solrconfig I suspect your jars are not in a lib dir mentioned in solrconfig.xml Cheers -- Rick On March 23, 2018 11:12:17 AM EDT, Abhi Basu <9000r...@gmail.com> wrote: >MS

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-23 Thread Webster Homer
Just FYI I had a project recently where I tried to use cursorMark in Solrcloud and solr 7.2.0 and it was very unreliable. It couldn't even return consistent numberFound values. I posted about it in this forum. Using the start and rows arguments in SolrQuery did work reliably so I abandoned cursorMa
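
For readers unfamiliar with the cursorMark paging style this thread compares against start/rows, the loop can be sketched roughly as below. This is a minimal illustration in Python, not SolrJ code: `fetch_page` is a hypothetical callable standing in for whatever sends the request, the flattened response shape (`docs`, `nextCursorMark`) is a simplification, and the sort on `id` assumes the collection's uniqueKey field is named `id`.

```python
def iterate_with_cursor(fetch_page, rows=100):
    """Yield every document by following cursor marks.

    fetch_page is a hypothetical callable: it takes a dict of request
    parameters and returns {"docs": [...], "nextCursorMark": "..."}.
    """
    cursor = "*"  # "*" asks for the first page
    while True:
        response = fetch_page({
            "cursorMark": cursor,
            "rows": rows,
            # Cursor paging needs a sort that includes the uniqueKey field;
            # "id" is an assumption about the schema.
            "sort": "id asc",
        })
        yield from response["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # The same mark coming back means there is nothing more to read.
            return
        cursor = next_cursor
```

Unlike start/rows, the caller never computes an offset; each request simply passes back the mark returned by the previous one.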

solrj question

2018-03-23 Thread Webster Homer
I am working on a program to play back queries from a log file. It seemed straightforward. The log has the Solr query written to it via the SolrQuery.toString method. The SolrQuery class has a constructor which takes a string. It does instantiate a SolrQuery object, however when I try to actuall
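
One possible way to approach the playback problem is to split the logged string back into individual parameters instead of treating it as a single query. The sketch below is in Python with only the standard library and is not the poster's code. It assumes the log line is a URL-encoded `key=value&key=value` list, optionally prefixed by a handler path such as `/select?`; the function name is hypothetical.

```python
from urllib.parse import parse_qsl


def parse_logged_query(logged):
    """Split a logged query string into a dict of parameter -> list of values."""
    # Drop an optional handler path such as "/select?" before the parameters.
    if logged.startswith("/") and "?" in logged:
        logged = logged.split("?", 1)[1]
    params = {}
    # parse_qsl URL-decodes each pair; repeated keys (e.g. fq) are kept in order.
    for key, value in parse_qsl(logged, keep_blank_values=True):
        params.setdefault(key, []).append(value)
    return params
```

The resulting dictionary can then be fed, parameter by parameter, into whatever client object issues the request.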

Re: InetAddressPoint support in Solr or other IP type?

2018-03-23 Thread David Smiley
Hi, For IPv4, use TrieIntField with precisionStep=8 For IPv6 https://issues.apache.org/jira/browse/SOLR-6741 There's nothing there yet; you could help out if you are familiar with the codebase. Or you might try something relatively simple involving edge ngrams. ~ David On Thu, Mar 22, 2018 a
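
The IPv4 suggestion relies on storing each address as a 32-bit integer. A minimal sketch of one such mapping follows, in Python with only the standard library. The helper names are hypothetical, and the offset by 2**31 is an assumption made here so that values fit a signed 32-bit int field while keeping address order intact for range queries.

```python
import ipaddress

OFFSET = 2**31  # shifts the unsigned range 0..2**32-1 into the signed int range


def ipv4_to_signed_int(addr):
    """Map a dotted-quad IPv4 address onto a signed 32-bit integer, preserving order."""
    return int(ipaddress.IPv4Address(addr)) - OFFSET


def signed_int_to_ipv4(value):
    """Inverse of ipv4_to_signed_int."""
    return str(ipaddress.IPv4Address(value + OFFSET))


def cidr_to_range(cidr):
    """Return the (lowest, highest) signed ints covered by a CIDR block."""
    network = ipaddress.ip_network(cidr)
    return (
        ipv4_to_signed_int(str(network.network_address)),
        ipv4_to_signed_int(str(network.broadcast_address)),
    )
```

With this mapping, a subnet lookup becomes an ordinary integer range query between the two values returned by `cidr_to_range`.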

Re: querying vs. highlighting: complete freedom?

2018-03-23 Thread Erick Erickson
Arturas: Try to field-qualify your hl.q parameter. That looks like: hl.q=trans:Kundigung or hl.q=trans:Kündigung I saw the exact behavior you describe when I did _not_ specify the field in the hl.q parameter, i.e. hl.q=Kundigung or hl.q=Kündigung didn't show all highlights. But when I did spe
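
To make the field-qualified form concrete, here is a small sketch that builds the request parameters with the field name prefixed onto the highlight query. It is written in Python with only the standard library; the function name is hypothetical, and the field name `trans` is taken from the message above.

```python
from urllib.parse import urlencode


def highlight_params(query, field, highlight_term):
    """Build a URL-encoded parameter string with a field-qualified hl.q."""
    return urlencode({
        "q": query,
        "hl": "true",
        "hl.fl": field,
        # Qualify the highlight query with the field so that the field's
        # analysis chain is the one applied to the term.
        "hl.q": f"{field}:{highlight_term}",
    })
```

For example, `highlight_params("Kundigung", "trans", "Kündigung")` produces a string whose `hl.q` value decodes to `trans:Kündigung`.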

Re: Some performance questions....

2018-03-23 Thread Deepak Goel
Deepak "Please stop cruelty to Animals, help by becoming a Vegan" +91 73500 12833 deic...@gmail.com Facebook: https://www.facebook.com/deicool LinkedIn: www.linkedin.com/in/deicool "Plant a Tree, Go Green" On Fri, Mar 23, 2018 at 11:38 PM, Shawn Heisey wrote: > On 3/23/2018 11:21 AM, Deepak Go

Re: querying vs. highlighting: complete freedom?

2018-03-23 Thread Arturas Mazeika
Hi Erick, Thanks for the update and the info. Your post brought quite a bit of light into the picture and now I understand quite a bit more about what you are saying. Your explanation makes sense and can be quite useful in certain scenarios. What struck me from your description is that you are

Re: Some performance questions....

2018-03-23 Thread Shawn Heisey
On 3/23/2018 11:21 AM, Deepak Goel wrote: >> I tried the above suggestion. The throughput and utilisation remain the >> same (they dont increase even if I increase the load). The response time >> comes down. >> Are you still creating a new client object for every query?  Changing how the client ob

Re: Solr OpenNLP integration

2018-03-23 Thread Shawn Heisey
On 3/23/2018 10:28 AM, Алексей Пономаренко wrote: > Hi, I have issues to integrate OpenNLP to Solr. > > https://stackoverflow.com/questions/49433989/can-not-apply-patch-lucene-2899-patch-to-solr-on-windows > here is a link to SO that describes problem. > > Can you help me? That SO question has an

RE: Solr 7.1 and 5.4 differences in bf

2018-03-23 Thread Homero Gonzalez
Hi Erick, I am OK with getting differences because Lucene uses a different similarity algorithm in qf, which in the new version may be better. The problem I am reporting is with the bf behavior. Since queryNorm varies from query to query, I have not found a way to have consistent boost results betwee

Re: Some performance questions....

2018-03-23 Thread Deepak Goel
Deepak "Please stop cruelty to Animals, help by becoming a Vegan" +91 73500 12833 deic...@gmail.com Facebook: https://www.facebook.com/deicool LinkedIn: www.linkedin.com/in/deicool "Plant a Tree, Go Green" On Tue, Mar 20, 2018 at 3:32 AM, Shawn Heisey wrote: > On 3/16/2018 4:24 PM, Deepak Goel

Re: CDCR performance issues

2018-03-23 Thread Tom Peters
Thanks for responding. My responses are inline. > On Mar 23, 2018, at 8:16 AM, Amrit Sarkar wrote: > > Hey Tom, > > I'm also having issue with replicas in the target data center. It will go >> from recovering to down. And when one of my replicas go to down in the >> target data center, CDCR wil

Re: Some performance questions....

2018-03-23 Thread Deepak Goel
Deepak "Please stop cruelty to Animals, help by becoming a Vegan" +91 73500 12833 deic...@gmail.com Facebook: https://www.facebook.com/deicool LinkedIn: www.linkedin.com/in/deicool "Plant a Tree, Go Green" On Thu, Mar 22, 2018 at 1:25 AM, Deepak Goel wrote: > > > > > Deepak > "Please stop crue

Re: querying vs. highlighting: complete freedom?

2018-03-23 Thread Erick Erickson
bq: this is not a typical case that one searches for a keyword but highlights something else This isn't really an unusual case; apparently I misled you. What I was trying to convey is that the analysis chain used is firmly attached to a particular _field_. There's no way to say "use one analysis

Re: Solr 7.1 and 5.4 differences in bf

2018-03-23 Thread Erick Erickson
I would not focus at all on getting the same ordering. There are ongoing improvements and changes, for instance: LUCENE-7368: Removed query normalization Instead, I'd focus on the question "is the ranking I'm seeing in 7.x better or worse than 5.4?" and tune until you could say "yes"... Best, Erick

Solr OpenNLP integration

2018-03-23 Thread Алексей Пономаренко
Hi, I have issues integrating OpenNLP into Solr. https://stackoverflow.com/questions/49433989/can-not-apply-patch-lucene-2899-patch-to-solr-on-windows Here is a link to the SO question that describes the problem. Can you help me?

Re: solrcloud Auto-commit doesn't seem reliable

2018-03-23 Thread Webster Homer
It's been a while since I had time to look further into this. I'll have to go back through logs, which I need to get retrieved by an admin. On Fri, Mar 23, 2018 at 8:45 AM, Amrit Sarkar wrote: > Elaino, > > When you say commits not working, the solr logs not printing "commit" > messages? or docu

Solr 7.1 and 5.4 differences in bf

2018-03-23 Thread Homero Gonzalez
Hi, I am working on the migration of SOLR 5.4 to 7.1 and I have not been able to get the same order in the results. Looks like the problem is with the bf parameter. We use edismax with both boost and bf functions. It is important to have some functions as bf so they add to the score and the imp

Solr on HDInsight to write to Active Data Lake

2018-03-23 Thread Abhi Basu
MS Azure does not support Solr 4.9 on HDI, so I am posting here. I would like to write index collection data to HDFS (hosted on ADL). Note: I am able to get to ADL from the hadoop fs command line, so Hadoop is configured correctly to get to ADL: hadoop fs -ls adl:// This is what I have done so far: 1

Re: solrcloud Auto-commit doesn't seem reliable

2018-03-23 Thread Amrit Sarkar
Elaino, When you say commits are not working, do you mean the Solr logs are not printing "commit" messages, or that documents are not appearing when we search? Amrit Sarkar Search Engineer Lucidworks, Inc. 415-589-9269 www.lucidworks.com Twitter http://twitter.com/lucidworks LinkedIn: https://www.linkedin.com/in/sarkaram

Re: CDCR performance issues

2018-03-23 Thread Susheel Kumar
Yes, Amrit. To clarify, we have a 30 sec soft commit on the target data center. For the test, when we use the Documents tab, the default Commit Within=1000 ms makes the commit happen quickly on the source, and then we just wait for it to appear on the target data center per its commit strategy. On Fri, Mar 23, 2018 at

Re: CDCR performance issues

2018-03-23 Thread Amrit Sarkar
Susheel, That is the correct behavior: the "commit" operation is not propagated to the target, and the documents will be visible in the target as per the commit strategy devised there. Amrit Sarkar Search Engineer Lucidworks, Inc. 415-589-9269 www.lucidworks.com Twitter http://twitter.com/lucidworks LinkedIn:

Re: CDCR performance issues

2018-03-23 Thread Susheel Kumar
Just a simple check: if you go to the source Solr and index a single document from the Documents tab, then keep querying the target Solr for the same document, how long does it take the document to appear in the target data center? In our case, I can see the document show up in the target within 30 sec which is our soft co

Re: CDCR performance issues

2018-03-23 Thread Amrit Sarkar
Hey Tom, > I'm also having issue with replicas in the target data center. It will go from recovering to down. And when one of my replicas go to down in the target data center, CDCR will no longer send updates from the source to the target. Are you able to figure out the issue? As long as the

Re: querying vs. highlighting: complete freedom?

2018-03-23 Thread Arturas Mazeika
Hi Matheis (Stefan), Thanks for the questions. This made me look at the problem from a distance and re-frame the situation. Good questions indeed. Trying to go around: consider a user who describes herself as being a BMW fan, being convinced that all BMWs need to be the blackest color possible (f

Re: querying vs. highlighting: complete freedom?

2018-03-23 Thread Stefan Matheis
Perhaps we try it the other way round .. what's your use case for this? I'm trying to think of a situation where I'd need this as a user. The only reason I see myself doing this is CTRL+F in a page when the search result is not immediately visible for me ;) On Mar 23, 2018 9:41 AM, "Arturas Maze

Problem accessing /solr/_shard1_replica_n1/get

2018-03-23 Thread Hendrik Haddorp
Hi, I have a Solr Cloud 7.2.1 setup and used SolrJ (7.2.1) to create 1000 collections with a few documents. During that, exceptions showed up multiple times in the Solr logs because an access to the /get handler of a collection failed. The call stack looks like this: at org.apache.solr.client.

Re: querying vs. highlighting: complete freedom?

2018-03-23 Thread Arturas Mazeika
Hi Erick et al, From your answer I understand that this is not a typical case that one searches for a keyword but highlights something else. Since we have two parameters (q vs hl.q) I thought they are freely combinable. From your answer I understand that this is not really the case. My current un