Upgrading SolrJ from 6.5.1 to 8.0.0

2019-03-20 Thread Lahiru Jayasekera
Hi all, I need help implementing the following code in SolrJ 8.0.0: private SolrClient server, adminServer; this.adminServer = new HttpSolrClient(SolrClientUrl); this.server = new HttpSolrClient( SolrClientUrl + "/" + mapping.getCoreName() ); if (serverUserAuth) { HttpClientUtil.setBasicAuth(

Re: RegexReplaceProcessorFactory pattern to detect multiple \n

2019-03-20 Thread Zheng Lin Edwin Yeo
Hi Paul, I would like to check whether there is any difference in performance between the two patterns: (\n\W*){2,} and [ \t\x0b\f]*\r?\n Regards, Edwin On Thu, 14 Mar 2019 at 09:36, Zheng Lin Edwin Yeo wrote: > Hi Paul, > > Thanks for your reply. > > So far we did not find cases
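For readers comparing the two patterns quoted above, here is a minimal Java sketch of what each one matches. It is not taken from the thread: the class name, the method names, and the choice of a single "\n" as the replacement string are all assumptions made for illustration, and the sketch shows matching behaviour only, not performance.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the class and method names are not from the thread.
class NewlinePatternDemo {
    // Pattern 1 from the message: two or more newlines, each optionally
    // followed by a run of non-word characters.
    static final Pattern MULTI_NEWLINE = Pattern.compile("(\\n\\W*){2,}");

    // Pattern 2 from the message: optional horizontal whitespace
    // (space, tab, vertical tab, form feed) followed by a line ending.
    static final Pattern TRAILING_SPACE = Pattern.compile("[ \\t\\x0b\\f]*\\r?\\n");

    // Assumption: every match is replaced by a single "\n";
    // the excerpt does not state which replacement was used.
    static String collapseNewlines(String s) {
        return MULTI_NEWLINE.matcher(s).replaceAll("\n");
    }

    static String stripTrailingSpace(String s) {
        return TRAILING_SPACE.matcher(s).replaceAll("\n");
    }

    public static void main(String[] args) {
        String sample = "line one \t\r\n\n\nline two\n";
        // Print with newlines made visible so the two results can be compared.
        System.out.println(collapseNewlines(sample).replace("\n", "\\n"));
        System.out.println(stripTrailingSpace(sample).replace("\n", "\\n"));
    }
}
```

Under these assumptions the second pattern only normalizes each line ending and does not merge consecutive blank lines by itself, so the two patterns are not interchangeable. Any timing comparison would need a proper benchmark harness run against representative documents.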

Re: Gather Nodes Streaming

2019-03-20 Thread Zheng Lin Edwin Yeo
Hi, What is the fieldType of your 'to' field? Which tokenizers/filters is it using? Also, which Solr version are you using? Regards, Edwin On Thu, 21 Mar 2019 at 01:57, Susmit Shukla wrote: > Hi, > > Trying to use solr streaming 'gatherNodes' function. It is for extracting > email graph based

ClassCastException on partial update TrieDateField Solr 7.7.1

2019-03-20 Thread damienk
Hi, I've upgraded a collection from Solr 6 to Solr 7.7.1 and now when I do a partial update on a doc and set a TrieDateField I'm seeing a ClassCastException. I understand TrieDateFields are deprecated and I am planning to re-index using a DatePointField, but I was expecting this to work. Has a

Re: Upgrading tika

2019-03-20 Thread Geoffrey Willis
Thanks for the explanation, it makes sense. I have noticed that sometimes a PDF document with spaces in the name can kill Tika and, as a result, Solr, so I get your point. Was trying to keep my webApp all JavaScript/TypeScript so went with the exposed extract/update handler. In my looking around I di

RE: Upgrading tika

2019-03-20 Thread Phil Scadden
While using the update/extract handler is good for testing, Tika is a heavyweight with the risk that a bad document would compromise the Solr instance, and Tika even with ordinary docs is a hog. I wrote code with solrj to do the indexing and run it on a completely different machine to the solr instan

Re: BM25F in Solr

2019-03-20 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
If you want a 'global' IDF across different fields, maybe one solution is to use a copyfield to copy all the fields into a common field (e.g., title, authors, body, footer all copied into a copyfield called text), and then you should be able to use it with a function query or by implementing your own

Re: Re: Re: obfuscated password error

2019-03-20 Thread Branham, Jeremy (Experis)
Hard to see in email, particularly because my email server strips URLs, but a few things I would suggest: Be sure there aren’t any spaces after your line continuation characters ‘\’. This has bit me before. Check the running process's JVM args and compare `ps -ef | grep solr` Also, I’d recomme

Re: Range query syntax on a polygon field is returning all documents

2019-03-20 Thread David Smiley
Hi Mitchell, Seems like there's a bug based on what you've shown. * Can you please try RptWithGeometrySpatialField instead of SpatialRecursivePrefixTreeFieldType to see if the problem goes away? This could point to a precision issue; though still what you've seen is suspicious. * Can you try one o

Gather Nodes Streaming

2019-03-20 Thread Susmit Shukla
Hi, Trying to use solr streaming 'gatherNodes' function. It is for extracting email graph based on from and to fields. It requires 'to' field to be a single value field with docvalues enabled since it is used internally for sorting and unique streams. The 'to' field can contain multiple email addr

Re: Upgrading tika

2019-03-20 Thread Geoffrey Willis
Could you expand on that please? I’m currently building a webApp that submits documents to Solr/Tika via the update/extract handler and it’s working fine. What do you mean when you say “You do not want to have your Solr instance processing via Tika”? If that’s a bad design choice please elaborat

Re: Nested geofilt query for LTR feature

2019-03-20 Thread David Smiley
Hi, I've never used the LTR module, but I suspect I might know what the error is. I think that the "query" Function Query has parsing limitations on what you pass to it. At least it used to. Try to put the embedded query onto another parameter and then refer to it with a dollar-sign. See the e

CDCR one source multiple targets

2019-03-20 Thread Arnold Bronley
Hi, is it possible to use CDCR with one source SolrCloud cluster and multiple target SolrCloud clusters? I tried to edit the zkHost setting in source cluster's solrconfig file by adding multiple comma-separated values for target zkhosts for multiple target clusters. But the CDCR replication happen

BM25F in Solr

2019-03-20 Thread Jan Høydahl
Hi There have been several discussions in the past on how to do BM25F scoring in Solr. People have mentioned BlendedTermQuery and in Lucene 8.0 we got a new BM25FQuery. What I mainly want is to normalize the doc freq (IDF) across fields, so that e.g. title field uses same doc-freq as body field

Re: Re: obfuscated password error

2019-03-20 Thread Satya Marivada
Sending again, with highlighted text in yellow. So I got a chance to do a diff of the environments solr-6.3.0 folder within contents. solr-6.3.0/bin/solr file has the difference highlighted in yellow. Any idea of what is going on in that if else in solr file? *The working configuration file cont

RE: Upgrading tika

2019-03-20 Thread Tannen, Lev (USAEO) [Contractor]
Thank you Shawn and Erick, I truly did not want to dive into the Tika and CXF worlds, but it looks like I have no choice. -Original Message- From: Shawn Heisey Sent: Wednesday, March 20, 2019 11:09 AM To: solr-user@lucene.apache.org Subject: Re: Upgrading tika On 3/20/2019 8:24 AM, Tannen, Lev

Re: Use of ShingleFilter causing very large BooleanQuery structures in Solr 7.1

2019-03-20 Thread Erick Erickson
The Apache mail server aggressively strips attachments, so yours didn’t come through. People often provide links to images stored somewhere else. As to why this is behaving this way, I’m pretty clueless. A _complete_ shot in the dark is the query parsing changed its default for split on white

Re: Upgrading tika

2019-03-20 Thread Shawn Heisey
On 3/20/2019 8:24 AM, Tannen, Lev (USAEO) [Contractor] wrote: I still need your advice. The program I have to fix uses class AutoDetectParser along with Solrj for parsing PDF files before sending the result to the solr server. To do this it linked two tika jar files taken from the solr distribu

Re: Upgrading tika

2019-03-20 Thread Erick Erickson
Well, I’d have to do the same thing, go spelunking in Tika.. When I used it from SolrJ, I just linked to the Tika distro and it “just worked”, but I admit that was a while ago. Your best bet would probably be the Tika user’s list. Best, Erick > On Mar 20, 2019, at 7:24 AM, Tannen, Lev (USAEO)

Re: Re: obfuscated password error

2019-03-20 Thread Satya Marivada
So I got a chance to do a diff of the environments solr-6.3.0 folder within contents. solr-6.3.0/bin/solr file has the difference highlighted in yellow. Any idea of what is going on in that if else in solr file? *The working configuration file contents are (ssl.properties below has the keystore p

RE: Upgrading tika

2019-03-20 Thread Tannen, Lev (USAEO) [Contractor]
Hi Erick, I still need your advice. The program I have to fix uses class AutoDetectParser along with Solrj for parsing PDF files before sending the result to the solr server. To do this it linked two tika jar files taken from the solr distribution. Namely: tika-core and tika-parsers. Maybe it u

Suggester case (in)sensitive

2019-03-20 Thread Moritz Schmidt
Hello everyone. I’m trying to build autocomplete functionality. My setup works but has one problem: When using HighFrequencyDictionaryFactory the Suggestion-Results I get are all lowercase as defined in my schema.xml: Without the LowerCaseFilterFactory I get my results as I wa

Use of ShingleFilter causing very large BooleanQuery structures in Solr 7.1

2019-03-20 Thread Hubert-Price, Neil
Hello All, We have a recently upgraded system that went from Solr 4.6 to Solr 7.1 (used as part of an ecommerce application). In the upgraded version we are seeing frequent issues with very high Solr memory usage for certain types of query, but the older 4.6 version does not produce the same r