Re: CloudSolrClient getDocCollection

2019-02-08 Thread Hendrik Haddorp
Hi Jason, thanks for your answer. Yes, you would need one watch per state.json and thus one watch per collection. That should however not really be a problem with ZK. I would assume that the Solr server instances need to monitor those nodes to be up to date on the cluster state. Using org.apa

Re: Java object binding not working

2019-02-08 Thread Jason Gerlowski
Hi Swapnil, Ray did suggest a potential cause. Your Java object has "name" as a String, but Solr returns the "name" value as an ArrayList. Usually Solr returns ArrayLists when the field in question is multivalued, so it's a safe bet that Solr is treating your "name" field as multivalued. You can

Re: CloudSolrClient getDocCollection

2019-02-08 Thread Jason Gerlowski
Hi Hendrik, I'll try to answer, and let others correct me if I stray. I wasn't around when CloudSolrClient was written, so take this with a grain of salt: "Why does the client need that timeout? Wouldn't it make sense to use a watch?" You could probably write a CloudSolrClient that uses watch

Re: Ignore accent in a request

2019-02-08 Thread elisabeth benoit
Yes, you do, and use the char filter at both index and query time. On Fri, Feb 8, 2019 at 19:20, SAUNIER Maxence wrote: > For the charFilter, do I need to reindex all documents? > > -Original Message- > From: Erick Erickson > Sent: Friday, February 8, 2019 18:03 > To: solr-user > Subject: R

Fwd: Java object binding not working

2019-02-08 Thread Swapnil Katkar
Hi, It would be beneficial to me if you provide me at least some hint to resolve this problem. Thanks in advance! Regards, Swapnil Katkar -- Forwarded message - From: Swapnil Katkar Date: Tue, Feb 5, 2019 at 10:58 PM Subject: Fwd: Java object binding not working To: Hello,

RE: Ignore accent in a request

2019-02-08 Thread SAUNIER Maxence
For the charFilter, do I need to reindex all documents? -Original Message- From: Erick Erickson Sent: Friday, February 8, 2019 18:03 To: solr-user Subject: Re: Ignore accent in a request Elisabeth's suggestion is spot on for the accent. One other thing I noticed. You are using Keyword

Re: Ignore accent in a request

2019-02-08 Thread Erick Erickson
Elisabeth's suggestion is spot on for the accent. One other thing I noticed. You are using KeywordTokenizerFactory combined with EdgeNGramFilterFactory. This implies that you can't search for individual _words_, only prefix queries, i.e. "je", "je s", "je su", "je sui", "je suis". You can't search for "suis" fo
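
The effect Erick describes can be illustrated with a small simulation. This is only a sketch in plain Python, not Solr's analysis chain: the helper names are hypothetical, and the `min_gram` / `max_gram` values are assumptions chosen to mirror the example in the message.

```python
def edge_ngrams(token, min_gram=2, max_gram=25):
    """Return the leading-edge n-grams of a single token."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def keyword_then_edge_ngrams(text, min_gram=2, max_gram=25):
    # Keyword-style tokenization: the whole input stays one token,
    # so every gram starts at the first character of the input.
    return edge_ngrams(text, min_gram, max_gram)


def words_then_edge_ngrams(text, min_gram=2, max_gram=25):
    # Whitespace-style tokenization: each word is its own token,
    # so grams are produced for the start of every word.
    grams = []
    for word in text.split():
        grams.extend(edge_ngrams(word, min_gram, max_gram))
    return grams


whole = keyword_then_edge_ngrams("je suis")
per_word = words_then_edge_ngrams("je suis")
print(whole)
print(per_word)
print("suis" in whole, "suis" in per_word)
```

With the whole input kept as one token, "suis" never appears as an indexed term, which is why only prefixes of the full string can match.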

Re: Query of Death Lucene/Solr 7.6

2019-02-08 Thread Michael Gibney
Hi Markus, As of 7.6, LUCENE-8531 reverted a graph/Spans-based phrase query implementation (introduced in 6.5 -- LUCENE-7699) to an implementation that builds a separate phrase query for each pos

Re: Indexing in one collection affect index in another collection

2019-02-08 Thread Zheng Lin Edwin Yeo
Hi Shawn, Thanks for your reply. Although the space in the OS disk cache could be the issue, we didn't face this problem previously, especially in our other setup using Solr 6.5.1, which contains much more data (more than 1 TB) than our current setup in Solr 7.6.0, in which the dat

Re: Get recent documents from solr

2019-02-08 Thread Jan Høydahl
Add a field to the schema which will insert the actual indexing date. Then query q=*:*&sort=indextime desc. But if you re-index everything (why?), then you need to map some date stamp from the source DB into the same field in the Solr schema, which you can then sort on. You're indexing from 4 DB views in
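
As a rough illustration of the query Jan suggests, the request parameters could be assembled as below. This is a sketch only: the `indextime` field name comes from the message, while the helper name and the `rows` value are assumptions added for the example.

```python
from urllib.parse import urlencode, parse_qs


def recent_docs_query(sort_field="indextime", rows=10):
    """Build a URL-encoded query string that returns the newest documents first."""
    params = {
        "q": "*:*",                     # match all documents
        "sort": f"{sort_field} desc",   # newest indexing date first
        "rows": rows,                   # assumed page size, not from the thread
    }
    return urlencode(params)


query_string = recent_docs_query()
print(query_string)
print(parse_qs(query_string))
```

The resulting string would be appended to a collection's select handler URL; the sort only works if the field is actually populated at index time.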

RE: Ignore accent in a request

2019-02-08 Thread SAUNIER Maxence
Thank you! -Original Message- From: elisabeth benoit Sent: Friday, February 8, 2019 14:12 To: solr-user@lucene.apache.org Subject: Re: Ignore accent in a request Hello, We use solr 7 and use with mapping-ISOLatin1Accent.txt containing lines like # À => A "\u00C0" => "A" # Á =>

RE: Solr Index Size after reindex

2019-02-08 Thread Mathieu Menard
Hi Andrea, I've checked this information and here is the result:

          PRODUCTION   STAGING
numDocs   5.365.213    4.537.651
MaxDoc    5.845.469    5.129.556

It seems that there are more than 800.000 additional docs in PRODUCTION, which would explain the larger index size. But there is a thing that I don't
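
The figures above can be checked with a little arithmetic. The sketch below assumes the usual reading of these statistics, where the gap between maxDoc and numDocs is the number of deleted documents not yet merged away; the dictionary layout and function name are illustrative only.

```python
# Figures copied from the message above (dots are thousands separators).
stats = {
    "PRODUCTION": {"numDocs": 5_365_213, "maxDoc": 5_845_469},
    "STAGING": {"numDocs": 4_537_651, "maxDoc": 5_129_556},
}


def deleted_docs(core_stats):
    """Documents still held in segments but marked as deleted."""
    return core_stats["maxDoc"] - core_stats["numDocs"]


live_difference = stats["PRODUCTION"]["numDocs"] - stats["STAGING"]["numDocs"]

for name, core_stats in stats.items():
    print(name, "deleted docs:", deleted_docs(core_stats))
print("extra live docs in PRODUCTION:", live_difference)
```

PRODUCTION holds 827,562 more live documents than STAGING, which matches the "more than 800.000" estimate, while STAGING actually carries more deleted documents (591,905 versus 480,256).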

RE: change in White Space when upgrading 6.6 to 7.4

2019-02-08 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
> Can we take this thread back to the mailing list, please? It would be good to > allow other people to weigh in! Sure -Original Message- From: Matt Pearce Sent: Friday, February 08, 2019 6:45 AM To: Oakley, Craig (NIH/NLM/NCBI) [C] Subject: Re: change in White Space when upgrading 6.

Re: RegexReplaceProcessorFactory pattern to detect multiple \n

2019-02-08 Thread Zheng Lin Edwin Yeo
Hi Paul, Regarding the regex (\n\s*){2,} that we are using, when we try it on https://regex101.com/, it is able to give us the correct result for all the examples (i.e. all of them will only have , and not more than that like what we are getting in Solr in our earlier examples). Could there be a p
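
For reference, the behaviour of the pattern itself can be checked outside Solr. The sketch below uses Python's `re` module rather than the Java regex engine Solr uses, so it only approximates what RegexReplaceProcessorFactory does; the `[BREAK]` replacement string and the function name are hypothetical placeholders, since the actual replacement is not shown in the message.

```python
import re

# Pattern quoted in the thread: two or more newlines, each optionally
# followed by whitespace.
PATTERN = re.compile(r"(\n\s*){2,}")


def collapse_blank_lines(text, replacement="[BREAK]"):
    """Replace each run of two or more newlines with a single marker."""
    return PATTERN.sub(replacement, text)


samples = ["a\nb", "a\n\nb", "a\n\n\n\nb", "a\n \n\t\nb"]
for sample in samples:
    print(repr(sample), "->", repr(collapse_blank_lines(sample)))
```

In this simulation every run of blank lines collapses to exactly one marker, and a single newline is left alone. If Solr produces more markers than this, the input reaching the processor may differ from the test strings, for example by containing `\r\n` or literal backslash sequences.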

Re: Solr Index Size after reindex

2019-02-08 Thread Andrea Gazzarini
Hi Mathieu, what about the docs in the two infrastructures? Do they have the same numbers (numdocs / maxdocs)? Any meaningful message (error or not) in log files? Andrea On 08/02/2019 14:19, Mathieu Menard wrote: Hello, I would like to have your point of view about an observation we have

Get recent documents from solr

2019-02-08 Thread shruti suri
Hi, I want to get the latest updated documents from Solr. I am indexing data from multiple views and each view has its own update date. Also, I am running a full-indexing job every 4 hours, so I can't use solrtimestamp(NOW). Is there any Solr functionality by which I can achieve this? Thanks Shruti ---

Re: Ignore accent in a request

2019-02-08 Thread elisabeth benoit
Hello, We use solr 7 and use with mapping-ISOLatin1Accent.txt containing lines like
# À => A
"\u00C0" => "A"
# Á => A
"\u00C1" => "A"
# Â => A
"\u00C2" => "A"
# Ã => A
"\u00C3" => "A"
# Ä => A
"\u00C4" => "A"
# Å => A
"\u00C5" => "A"
# Ā Ă Ą =>
"\u0100" => "A"
"\u0102" => "A"
"\u0104" =
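
The idea behind such a mapping file can be sketched in a few lines of Python. This is a simulation of character mapping, not Solr's char filter: the uppercase entries are the ones quoted above, while the `\u00E9` entry and the function name are assumptions added so the "avarié" case from this thread can be shown.

```python
# Character mapping modelled on the lines quoted from the mapping file.
ACCENT_MAP = {
    "\u00C0": "A", "\u00C1": "A", "\u00C2": "A",
    "\u00C3": "A", "\u00C4": "A", "\u00C5": "A",
    "\u0100": "A", "\u0102": "A", "\u0104": "A",
    "\u00E9": "e",  # assumed entry (é), not among the lines shown above
}
TRANSLATION_TABLE = str.maketrans(ACCENT_MAP)


def fold_accents(text):
    """Apply the mapping to every character, leaving unmapped ones as-is."""
    return text.translate(TRANSLATION_TABLE)


indexed = fold_accents("je suis avarié")
queried = fold_accents("je suis avarie")
print(indexed)
print(indexed == queried)
```

Because the same folding is applied at index time and at query time, both spellings end up as the same text, which is why existing documents need to be reindexed after the filter is added.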

Re: Solr relevancy score different on replicated nodes

2019-02-08 Thread Aman Tandon
Hi Erick, I find this thread very relevant to people who are facing the same problem. In our case, we have a signals aggregation collection which has a total of around 8 million records. We have a SolrCloud architecture (3 shards and 4 replicas) and the whole size of the index is around 2.5

Query of Death Lucene/Solr 7.6

2019-02-08 Thread Markus Jelsma
Hello (apologies for cross-posting), While working on SOLR-12743, using 7.6 on two nodes and 7.2.1 on the remaining four, we stumbled upon a situation where the 7.6 nodes quickly succumb when a 'Query-of-Death' is issued; 7.2.1 up to 7.5 are all unaffected (tested and confirmed). Following Smi

Issue with dataimport xml validation with dtd and jetty: conflict of use for user.dir variable

2019-02-08 Thread jerome . dupont
Hello, I use solr and dataimport to index xml files with a dtd. The dtd is referenced like this Previously we were using solr4 in a tomcat container. During the import process, solr tries to validate the xml file with the dtd. To find it we were defining -Duser.dir=pathToDtD and solr could find

RE: Ignore accent in a request

2019-02-08 Thread Gopesh Sharma
We have fixed this type of issue with synonyms, by adding SynonymFilterFactory (before Solr 7). -Original Message- From: SAUNIER Maxence Sent: Friday, February 8, 2019 3:36 PM To: solr-user@lucene.apache.org Subject: RE: Ignore accent in a request Hello, Thanks for your answer. I ha

RE: Ignore accent in a request

2019-02-08 Thread SAUNIER Maxence
Hello, Thanks for your answer. I have tested:
select?defType=dismax&q=je suis avarié&qf=content: 90.000 results
select?defType=dismax&q=je suis avarie&qf=content: 60.000 results
With avarié, I don't find documents with avarie, and with avarie, I don't find documents with avarié. I want to find the