Re: Question on solr metrics

2020-10-27 Thread Emir Arnautović
Hi, In order to see time range metrics, you’ll need to collect metrics periodically, send them to some storage, and then query/visualise them. Solr has exporters for some popular backends, or you can use a cloud-based solution. One such solution is ours: https://sematext.com/integrations/solr-monit

Re: Question on metric values

2020-10-26 Thread Andrzej Białecki
The “requests” metric is a simple counter. Please see the documentation in the Reference Guide on the available metrics and their meaning. This counter is initialised when the replica starts up, and it’s not persisted (so if you restart this Solr node it will reset to 0). If by “frequency” you
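Because the counter only ever grows until a restart resets it, a request rate has to be derived from periodic samples. A minimal sketch, assuming you already collect (timestamp, count) pairs yourself; the function name and the sample values are hypothetical and not part of Solr:

```python
def request_rate(samples):
    """Return per-second rates between consecutive (timestamp, count) samples.

    A drop in the counter is treated as a restart: the counter began again
    at 0, so the new value is used as the number of requests since the reset.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Normal case: difference of two readings. After a reset: the new reading.
        delta = c1 - c0 if c1 >= c0 else c1
        rates.append(delta / elapsed)
    return rates


# Hypothetical samples taken every 60 seconds; the node restarts before the last one.
samples = [(0, 1200), (60, 1500), (120, 1800), (180, 90)]
print(request_rate(samples))
```

After a reset the computed rate is only a lower bound, since requests served between the last sample and the restart are lost with the counter.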

Re: Question about solr commits

2020-10-08 Thread Erick Erickson
This is a bit confused. There will be only one timer that starts at time T when the first doc comes in. At T+15 seconds, all docs that have been received since time T will be committed. The first doc to hit Solr _after_ T+15 seconds starts a single new timer and the process repeats. Best, Erick >
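The single-timer behaviour described here can be modelled in a few lines. This is only a simulation of the description, not Solr code; `commit_times` is a hypothetical helper, and treating a document that arrives exactly at the deadline as part of the earlier commit is an assumption:

```python
def commit_times(arrivals, max_time):
    """Simulate autoCommit: one timer at a time, started by the first doc
    that arrives while no timer is pending."""
    commits = []
    deadline = None
    for t in sorted(arrivals):
        # A doc arriving after the pending deadline starts a new timer.
        # (Assumption: a doc arriving exactly at the deadline joins that commit.)
        if deadline is None or t > deadline:
            deadline = t + max_time
            commits.append(deadline)
    return commits


# Updates at t=0 and t=10 with a 15-second interval: a single commit at t=15.
print(commit_times([0, 10], 15))
# Later docs at t=20 and t=50 each start a fresh timer.
print(commit_times([0, 10, 20, 50], 15))
```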

Re: Question about solr commits

2020-10-08 Thread Rahul Goswami
Shawn, So if the autoCommit interval is 15 seconds, and one update request arrives at t=0 and another at t=10 seconds, then will there be two timers one expiring at t=15 and another at t=25 seconds, but this would amount to ONLY ONE commit at t=15 since that one would include changes from both upda

Re: Question about solr commits

2020-10-07 Thread yaswanth kumar
Thank you very much both Eric and Shawn Sent from my iPhone > On Oct 7, 2020, at 10:41 PM, Shawn Heisey wrote: > > On 10/7/2020 4:40 PM, yaswanth kumar wrote: >> I have the below in my solrconfig.xml >> >> >> ${solr.Data.dir:} >> >> >> ${solr.autoCommit.maxTime:6000

Re: Question about solr commits

2020-10-07 Thread Shawn Heisey
On 10/7/2020 4:40 PM, yaswanth kumar wrote: I have the below in my solrconfig.xml ${solr.Data.dir:} ${solr.autoCommit.maxTime:6} false ${solr.autoSoftCommit.maxTime:5000} Does this mean even though we are always sending da

Re: Question about solr commits

2020-10-07 Thread Erick Erickson
Yes. > On Oct 7, 2020, at 6:40 PM, yaswanth kumar wrote: > > I have the below in my solrconfig.xml > > > > ${solr.Data.dir:} > > > ${solr.autoCommit.maxTime:6} > false > > > ${solr.autoSoftCommit.maxTime:5000} > > > > Does this mean even thoug

Re: Question on sorting

2020-07-22 Thread Saurabh Sharma
Hi, It is because the field is a string and the numbers are being sorted lexicographically. It has nothing to do with the number of digits. Thanks Saurabh On Thu, Jul 23, 2020, 11:24 AM Srinivas Kashyap wrote: > Hello, > > I have schema and field definition as shown below: > > omitNorms="true"/> > > > /
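The effect is easy to reproduce outside Solr. A small sketch with made-up values, comparing string ordering with numeric ordering:

```python
# Hypothetical field values, stored as strings.
values = ["9", "10", "100", "25"]

# String comparison goes character by character, so "10" < "100" < "25" < "9".
as_strings = sorted(values)

# Comparing by numeric value gives the order most people expect.
as_numbers = sorted(values, key=int)

print(as_strings)
print(as_numbers)
```

In Solr terms, the usual remedy is to sort on a numeric field type rather than a string field.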

Re: Question regarding replica leader

2020-07-20 Thread Vishal Vaibhav
So how do we recover from such a state? When I try ADDREPLICA, it returns a 503. Also, my node has multiple replicas, most of which are dead. How do we get rid of those dead replicas via a script? Is that a possibility? On Mon, 20 Jul 2020 at 11:00 AM, Radu Gheorghe wrote: > Hi Vish

Re: Question regarding replica leader

2020-07-19 Thread Radu Gheorghe
Hi Vishal, I think that’s true, yes. The cluster has a leader (overseer), but this particular shard doesn’t seem to have a leader (yet). Logs should give you some pointers about why this happens (it may be, for example, that each replica is waiting for the other to become a leader, because each

Re: Question regarding replica leader

2020-07-19 Thread Vishal Vaibhav
Hi, any pointers on this? On Wed, 15 Jul 2020 at 11:13 AM, Vishal Vaibhav wrote: > Hi Solr folks, > > I am using SolrCloud 8.4.1. I am using > `/solr/admin/collections?action=CLUSTERSTATUS`. Hitting this endpoint I > get a list of replicas in which one is active but neither of them is > lead
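Checking for leaderless shards can be scripted once the CLUSTERSTATUS response has been parsed as JSON. A minimal sketch, assuming the usual response layout (cluster, collections, shards, replicas, with the leader replica carrying `"leader": "true"`); verify that layout against your Solr version. The collection and core names in the sample are hypothetical:

```python
def shards_without_leader(cluster_status):
    """Return (collection, shard) pairs in which no replica is flagged as leader."""
    missing = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            replicas = shard.get("replicas", {}).values()
            if not any(r.get("leader") == "true" for r in replicas):
                missing.append((coll_name, shard_name))
    return missing


# Hypothetical, heavily trimmed response for illustration only.
status = {
    "cluster": {
        "collections": {
            "products": {
                "shards": {
                    "shard1": {"replicas": {
                        "core_node1": {"state": "active", "leader": "true"},
                        "core_node2": {"state": "active"},
                    }},
                    "shard2": {"replicas": {
                        "core_node3": {"state": "active"},
                        "core_node4": {"state": "down"},
                    }},
                }
            }
        }
    }
}
print(shards_without_leader(status))
```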

Re: Question about Atomic Update

2020-06-15 Thread david . davila
Information Analysis and Fraud Investigation Technologies Phone: 915828763 Extension: 36763 From: "Erick Erickson" To: solr-user@lucene.apache.org Date: 15/06/2020 14:27 Subject: Re: Question about Atomic Update All Atomic Updates do is 1> read all the

Re: Question about Atomic Update

2020-06-15 Thread Erick Erickson
All Atomic Updates do is 1> read all the stored fields from the record being updated 2> overlay your updates 3> re-index the document. At <3> it’s exactly as though you sent the entire document again, so your observation that the whole document is re-indexed is accurate. If the fields you want
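The three steps can be pictured with plain dictionaries. This is a toy model of the description, not Solr's implementation; `apply_atomic_update` is a hypothetical helper and only the `set`, `add`, and `inc` modifiers are modelled:

```python
import copy


def apply_atomic_update(stored_doc, update):
    """Model an atomic update: read stored fields, overlay changes, return the full doc."""
    doc = copy.deepcopy(stored_doc)            # 1> read all the stored fields
    for field, change in update.items():       # 2> overlay the updates
        if not isinstance(change, dict):
            continue                           # plain values such as the id are left alone
        for op, arg in change.items():
            if op == "set":
                doc[field] = arg
            elif op == "add":
                current = doc.get(field, [])
                if not isinstance(current, list):
                    current = [current]
                doc[field] = current + (arg if isinstance(arg, list) else [arg])
            elif op == "inc":
                doc[field] = doc.get(field, 0) + arg
            else:
                raise ValueError(f"modifier not modelled in this sketch: {op}")
    return doc                                 # 3> this whole document is re-indexed


stored = {"id": "1", "title": "old", "tags": ["a"], "views": 3}
update = {"id": "1", "title": {"set": "new"}, "tags": {"add": "b"}, "views": {"inc": 2}}
print(apply_atomic_update(stored, update))
```

The model also shows why a field that is not stored cannot survive the update: step 1 has nothing to read for it.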

Re: question about setup for maximizing solr performance

2020-06-01 Thread Shawn Heisey
On 6/1/2020 9:29 AM, Odysci wrote: Hi, I'm looking for some advice on improving performance of our solr setup. Does anyone have any insights on what would be better for maximizing throughput on multiple searches being done at the same time? thanks! In almost all cases, adding memory will pr

Re: Question about the max num of solr node

2020-01-03 Thread Jörn Franke
Why do you want to set up so many? What are your designs in terms of volumes / number of documents etc.? > On 03.01.2020 at 10:32, Hongxu Ma wrote: > > Hi community > I plan to set up a 128 host cluster: 2 solr nodes on each host. > But I have a little concern about whether solr can support so m

Re: Question about Luke

2019-11-20 Thread Tomoko Uchida
Hello, > Is it different from checkIndex -exorcise option? > (As far as I recently learned, checkIndex -exorcise will delete unreadable > indices. ) If you mean the desktop app Luke, "Repair" is just a wrapper around CheckIndex.exorciseIndex(). There is no difference between doing "Repair" from Luke GUI

Re: Question about startup memory usage

2019-11-14 Thread Shawn Heisey
On 11/14/2019 1:46 AM, Hongxu Ma wrote: Thank you @Shawn Heisey , you help me many times. My -xms=1G When restart solr, I can see the progress of memory increasing (from 1G to 9G, took near 10s). I have a guess: maybe solr is loading some needed files into heap memo

Re: Question about startup memory usage

2019-11-14 Thread Hongxu Ma
What's your thoughts? thanks. From: Shawn Heisey Sent: Thursday, November 14, 2019 1:15 To: solr-user@lucene.apache.org Subject: Re: Question about startup memory usage On 11/13/2019 2:03 AM, Hongxu Ma wrote: > I have a solr-cloud cluster with a big col

Re: Question about startup memory usage

2019-11-13 Thread Shawn Heisey
On 11/13/2019 2:03 AM, Hongxu Ma wrote: I have a solr-cloud cluster with a big collection, after startup (no any search/index operations), its jvm memory usage is 9GB (via top: RES). Cluster and collection info: each host: total 64G mem, two solr nodes with -xmx=15G collection: total 9B billion

Re: Question about memory usage and file handling

2019-11-11 Thread Erick Erickson
(1) no. The internal Ram buffer will pretty much limit the amount of heap used however. (2) You actually have several segments. “.cfs” stands for “Compound File”, see: https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/lucene70/package-summary.html "An optional "virtual" file co

Re: Question about memory usage and file handling

2019-11-11 Thread Shawn Heisey
On 11/11/2019 1:40 PM, siddharth teotia wrote: I have a few questions about Lucene indexing and file handling. It would be great if someone can help with these. I had earlier asked these questions on gene...@lucene.apache.org but was asked to seek help here. This mailing list (solr-user) is for

Re: Question regarding subqueries

2019-10-03 Thread Bram Biesbrouck
Hi Mikhail, You're right, I'm probably over-complicating things. I was stuck trying to combine a function in a regular query using a local variable, but Solr doesn't seem to bend the way my mind did ;-) Anyway, I worked around it using your suggestion and/or a slightly modified prefix parser plugi

Re: Question regarding subqueries

2019-10-02 Thread Mikhail Khludnev
Hello, Bram. Something like that is possible in principle, but it will take enormous efforts to tackle exact syntax. Why not something like children.fq=-parent:true ? On Wed, Oct 2, 2019 at 8:52 PM Bram Biesbrouck < bram.biesbro...@reinvention.be> wrote: > Hi all, > > I'm struggling with a littl

Re: Question about "No registered leader" error

2019-09-18 Thread Hongxu Ma
hen this error happens. Thanks again. From: Shawn Heisey Sent: Wednesday, September 18, 2019 20:21 To: solr-user@lucene.apache.org Subject: Re: Question about "No registered leader" error On 9/18/2019 6:11 AM, Shawn Heisey wrote: > On 9/17/2019 9:3

Re: Question about "No registered leader" error

2019-09-18 Thread Erick Erickson
Check whether the oom killer script was called. If so, there will be log files obviously relating to that. I've seen nodes mysteriously disappear as a result of this with no message in the regular solr logs. If that's the case, you need to increase your heap. Erick On Wed, Sep 18, 2019 at 8:21 AM

Re: Question about "No registered leader" error

2019-09-18 Thread Shawn Heisey
On 9/18/2019 6:11 AM, Shawn Heisey wrote: On 9/17/2019 9:35 PM, Hongxu Ma wrote: My questions: * Is this error possibly caused by a "long gc pause"? my solr zkClientTimeout=6 It's possible. I can't say for sure that this is the issue, but it might be. A followup. I was thinking a

Re: Question about "No registered leader" error

2019-09-18 Thread Shawn Heisey
On 9/17/2019 9:35 PM, Hongxu Ma wrote: My questions: * Is this error possibly caused by a "long gc pause"? my solr zkClientTimeout=6 It's possible. I can't say for sure that this is the issue, but it might be. * If so, how can I prevent this error from happening? My thoughts: using G

Re: Question: Solr perform well with thousands of replicas?

2019-09-04 Thread Hongxu Ma
warning message to user. From: Erick Erickson Sent: Monday, September 2, 2019 21:20 To: solr-user@lucene.apache.org Subject: Re: Question: Solr perform well with thousands of replicas? > why so many collection/replica: it's our customer needs, for example: ea

Re: Question: Solr perform well with thousands of replicas?

2019-09-02 Thread Erick Erickson
> > > From: Erick Erickson > Sent: Friday, August 30, 2019 20:05 > To: solr-user@lucene.apache.org > Subject: Re: Question: Solr perform well with thousands of replicas? > > “no registered leader” is the effect of some problem usually, not

Re: Question: Solr perform well with thousands of replicas?

2019-09-02 Thread Hongxu Ma
base table mappings a collection. * this env is just a test cluster: I want to verify the max collection number solr can support stably. From: Erick Erickson Sent: Friday, August 30, 2019 20:05 To: solr-user@lucene.apache.org Subject: Re: Question: Solr perform

Re: Question: Solr perform well with thousands of replicas?

2019-08-30 Thread Erick Erickson
“no registered leader” is the effect of some problem usually, not the root cause. In this case, for instance, you could be running out of file handles and see other errors like “too many open files”. That’s just one example. One common problem is that Solr needs a lot of file handles and the sy

Re: Question: Solr perform well with thousands of replicas?

2019-08-30 Thread Jörn Franke
What is the reason for this number of replicas? Solr should work fine, but maybe it is worth consolidating some collections to also avoid administrative overhead. > On 29.08.2019 at 05:27, Hongxu Ma wrote: > > Hi > I have a solr-cloud cluster, but it's unstable when collection number is big:

Re: Question: Solr perform well with thousands of replicas?

2019-08-29 Thread Hongxu Ma
To: solr-user@lucene.apache.org Subject: Re: Question: Solr perform well with thousands of replicas? On 8/28/2019 9:27 PM, Hongxu Ma wrote: > I have a solr-cloud cluster, but it's unstable when collection number is big: > 1000 replica/core per solr node. > > To sol

Re: Question: Solr perform well with thousands of replicas?

2019-08-29 Thread Shawn Heisey
On 8/28/2019 9:27 PM, Hongxu Ma wrote: I have a solr-cloud cluster, but it's unstable when collection number is big: 1000 replica/core per solr node. To solve this issue, I have read the performance guide: https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems I noted there i

Re: Question: Solr perform well with thousands of replicas?

2019-08-29 Thread Erick Erickson
There are two factors: 1> the raw number of replicas on a Solr node. 2> total resources Solr needs. You say “..it’s unstable…”. _How_ is it unstable? What symptoms are you seeing? You might want to review: https://cwiki.apache.org/confluence/display/solr/UsingMailingLists And not as you add mo

Re: Question: Solr perform well with thousands of replicas?

2019-08-28 Thread Hendrik Haddorp
Hi, we are usually using Solr Clouds with 5 nodes and up to 2000 collections and a replication factor of 2. So we have close to 1000 cores per node. That is on Solr 7.6 but I believe 7.3 worked as well. We tuned a few caches down to a minimum as otherwise the memory usage goes up a lot. The Solr

Re: question about solrCloud joining

2019-08-23 Thread Mikhail Khludnev
Raised https://issues.apache.org/jira/browse/SOLR-13716 On Wed, Aug 21, 2019 at 10:37 AM Lisheng Wang wrote: > Hi Mikhail, > > okay. > > below is 2 requests: > > both are select from "movieDirectors" collection join "movies" collection > which has 2 shards. > > > http://localhost:8983/solr/mov

Re: question about solrCloud joining

2019-08-21 Thread Mikhail Khludnev
I'm not sure, but it might be an issue. It makes sense to add a negative test and assert the exception at https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/cloud/DistribJoinFromCollectionTest.java On Wed, Aug 21, 2019 at 10:37 AM Lisheng Wang wrote: > Hi Mikhail,

Re: question about solrCloud joining

2019-08-21 Thread Lisheng Wang
Hi Mikhail, okay. Below are 2 requests: both select from the "movieDirectors" collection and join the "movies" collection, which has 2 shards. http://localhost:8983/solr/movieDirectors/select?fq=%7B!join%20from%3Ddirector_id%20fromIndex%3Dmovies%20to%3Did%7Dtitle%3A%22Dunkirk%22&q=*%3A* http://localhost
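The percent-encoded request is hard to read as posted. A short sketch using only the Python standard library decodes the first URL so the join filter is visible; nothing here talks to Solr:

```python
from urllib.parse import parse_qs, urlsplit

# First request from the message, copied as posted.
url = (
    "http://localhost:8983/solr/movieDirectors/select"
    "?fq=%7B!join%20from%3Ddirector_id%20fromIndex%3Dmovies%20to%3Did%7D"
    "title%3A%22Dunkirk%22&q=*%3A*"
)

# parse_qs splits the query string and percent-decodes each value.
params = parse_qs(urlsplit(url).query)
for name, decoded in params.items():
    print(name, "=", decoded[0])
```

Decoded, the filter is a join from `director_id` in the `movies` collection to `id` in `movieDirectors`, applied to `title:"Dunkirk"`, with `q=*:*` as the main query.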

Re: question about solrCloud joining

2019-08-21 Thread Mikhail Khludnev
Ok. Still hard to follow. Can you clarify which collection you run these queries on? Collection name (url segment before /select) is more significant than any port (jvm) identity. On Wed, Aug 21, 2019 at 5:14 AM Lisheng Wang wrote: > Hi Mikhail > > Thanks for your response, but question is not

Re: question about solrCloud joining

2019-08-20 Thread Lisheng Wang
Hi Mikhail, Thanks for your response, but the question is not related to "title:Get Out"; maybe I did not describe it clearly. I know SolrCloud joining does not work on an index that is split into multiple shards. But why, when I run "{!join from=director_id fromIndex=movies to=id}title:"Dunkirk"" on 8984 (

Re: question about solrCloud joining

2019-08-20 Thread Mikhail Khludnev
Hello, Lisheng. I barely follow, but couldn't the space symbol in "title:Get Out" cause the problem ? Check debugQuery and nested query in local param. On Tue, Aug 20, 2019 at 6:35 PM Lisheng Wang wrote: > Hi Erick > > Thanks for your quick response and remaining me about attachment issue. > >

Re: question about solrCloud joining

2019-08-20 Thread Lisheng Wang
Hi Erick, Thanks for your quick response and for reminding me about the attachment issue. Yes, I run on 2 different JVMs; that is not related to whether they are on the same machine or not. Let me describe my scenario. I have two collections. I start 2 nodes on my laptop on 2 different JVMs; the ports are 8983 and 8984.

Re: question about solrCloud joining

2019-08-20 Thread Erick Erickson
None of your images came through, the mail server aggressively strips attachments. You’ll have to put them somewhere and provide a link. Given that, I’m guessing without much data so this may be totally misguided. You mention ports 8983 and 8984. Assuming those are two different Solr JVMs, the

Re: Question regarding Solr fq query

2019-06-28 Thread Saurabh Sharma
Hi, Images are not visible. Please upload on some image sharing platform and share the link. Thanks On Fri, 28 Jun, 2019, 11:00 PM Krishna Kammadanam, wrote: > Hello, > > > > I am a back-end developer working with Solr 4.0 version. > > > > I am running into so many issues, but trying to unders

Re: Question regarding negated block join queries

2019-06-17 Thread Erick Erickson
Bram: Here’s a fuller explanation that you might be interested in: https://lucidworks.com/2011/12/28/why-not-and-or-and-not/ Best, Erick > On Jun 17, 2019, at 11:32 AM, Bram Biesbrouck > wrote: > > On Mon, Jun 17, 2019 at 7:11 PM Shawn Heisey wrote: > >> On 6/17/2019 4:46 AM, Bram Biesbrou

Re: Question regarding negated block join queries

2019-06-17 Thread Bram Biesbrouck
On Mon, Jun 17, 2019 at 7:11 PM Shawn Heisey wrote: > On 6/17/2019 4:46 AM, Bram Biesbrouck wrote: > > q={!parent which=-(parentUri:*)}*:* > > Pure negative queries do not work in Lucene. Sometimes, when you do a > single-clause negative query, Solr is able to detect the problem and > automatica

Re: Question regarding negated block join queries

2019-06-17 Thread Shawn Heisey
On 6/17/2019 4:46 AM, Bram Biesbrouck wrote: q={!parent which=-(parentUri:*)}*:* Pure negative queries do not work in Lucene. Sometimes, when you do a single-clause negative query, Solr is able to detect the problem and automatically make an adjustment so the query works. This happens tran
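A common workaround for a pure negative clause is to give it something to subtract from by prefixing the match-all query `*:*`. The helper below is a hypothetical, deliberately naive illustration of that rewrite; it only looks at a single clause that starts with `-` and does not parse query syntax:

```python
def make_matchable(clause):
    """Prefix a pure negative clause with the match-all query so it has a base set."""
    clause = clause.strip()
    if clause.startswith("-"):
        # "-field:value" alone matches nothing; "*:* -field:value" means
        # "every document except those matching field:value".
        return "*:* " + clause
    return clause


print(make_matchable("-parentUri:*"))
print(make_matchable("parentUri:*"))
```

Whether the rewritten clause is appropriate as the `which` filter of a block join parent query depends on the schema, so treat this as a starting point only.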

Re: Question RE: Contents of Field Value Cache

2019-05-06 Thread benrollinger
Mikhail Khludnev-2 wrote > Hello, > Every FVC entry corresponds to a field, but capped by max size. So, > it's > really odd that its numbers peaked at some point of time. Note that some > caches support showItems parameter, check the doc. > > On Sat, May 4, 2019 at 11:04 AM benrollinger < > r

Re: Question RE: Contents of Field Value Cache

2019-05-04 Thread Mikhail Khludnev
Hello, Every FVC entry corresponds to a field, but capped by max size. So, it's really odd that its numbers peaked at some point of time. Note that some caches support showItems parameter, check the doc. On Sat, May 4, 2019 at 11:04 AM benrollinger wrote: > Good Evening, > > Running into a p

Re: Question on Solr/WordPress Integration

2019-03-01 Thread markus kalkbrenner
If you’re more familiar with PHP you can do the same using the Solarium library instead of SolrJ for Java. Once the PDFs are extracted and indexed, Drupal is an alternative to WordPress as a front end. Using the Search API Solr module you can access and „present“ any existing Solr index without a

Re: Question on Solr/WordPress Integration

2019-03-01 Thread Erick Erickson
Writing a Java (SolrJ) program that traverses a filesystem and extracts the contents of PDF is actually quite simple, see: https://lucidworks.com/2012/02/14/indexing-with-solrj/ (you can ignore the RDBMS stuff). That code is a little out of date so may need some very minor tweaks. Tika (the li

Re: Question on Solr/WordPress Integration

2019-03-01 Thread Paul Buiocchi
Thank you Shawn ! Sent from Yahoo Mail on Android On Fri, Mar 1, 2019 at 12:25 PM, Paul Buiocchi wrote: Greetings,  I have a couple of questions about Solr /Wordpress integration -  First , I am not "committed to using WordPress as a front end. If there is a better front end option , I

Re: Question on Solr/WordPress Integration

2019-03-01 Thread Shawn Heisey
On 3/1/2019 10:25 AM, Paul Buiocchi wrote: I have a couple of questions about Solr /Wordpress integration - You would need to talk to the person who wrote the plugin for Wordpress that integrates with Solr. If they indicate that a question can only be answered by the Solr project, then bring

Re: Question about IndexSearcher.search()

2019-01-25 Thread Shawn Heisey
On 1/24/2019 11:11 PM, NDelt wrote: Hello. I'm trying to make sample search application using Lucene. You're on the solr-user mailing list. If you want help with Lucene, you'll need to ask your question on the java-user mailing list instead. https://lucene.apache.org/core/discussion.html T

Re: Question about Solr concept

2019-01-03 Thread Alexandre Rafalovitch
I believe the answer is yes, but specifics depend on whether you mean online or offline index creation (as in when does the content appear) and also why you want to do so. Couple of ideas: 1) If you just want to make sure all updates are visible at once, you can control that with commit strategie

Re: Question about elevations

2018-11-19 Thread Ray Niu
One more thing to add: if there are fqs, they will be evaluated as well. On Mon, Nov 19, 2018 at 1:24 PM, Edward Ribeiro wrote: > Just complementing Alessandro's answer: > 1. the elevateIds are inserted into the query, server side (a query > expansion indeed); > 2. the query is executed; > 3. elevatedIds (if f

Re: Question about elevations

2018-11-19 Thread Edward Ribeiro
Just complementing Alessandro's answer: 1. the elevateIds are inserted into the query, server side (a query expansion indeed); 2. the query is executed; 3. elevatedIds (if found) are popped up to the top of the search results via boosting; Edward On Mon, Nov 19, 2018 at 3:41 PM Alessandro Benedet

Re: Question about elevations

2018-11-19 Thread Alessandro Benedetti
As far as I remember the answer is no. You could take a deep look into the code, but as far as I remember the elevated doc Ids must be in the index to be elevated. Those ids will be added to the query built, a sort of query expansion server side. And then the search executed. Cheers - ---

Re: question for rule based replica placement

2018-09-02 Thread Wei
Thanks Erick. Suppose I have 5 hosts h1,h2,h3,h4,h5 and want to create a 5X2 solr cloud of 5 shards, 2 replicas per shard. On each host I will run two solr JVMs, each hosts a single solr core. Solr's default 'snitch' provide a 'host' tag, so I wonder if I can use it to prevent any host from have t

Re: question for rule based replica placement

2018-09-02 Thread Erick Erickson
You need to provide a "snitch" and define a rule appropriately. This is a variant of "rack awareness". Solr considers two JVMs running on the same physical host as completely separate Solr instances, so to get replicas on different hosts you need a snitch etc. Best, Erick On Sun, Sep 2, 2018 at 4

Re: Question on query time boosting

2018-08-23 Thread Kydryavtsev Andrey
Hi Pratic, I believe that your observations are correct. The score for each individual query (in your example it's a wildcard query like 'concept_name:(*semantic*)^200') is calculated by complex formulas (one of the possible implementations with a good explanation is described here https://lucene.a

Re: Question about updating indexes on solrcloud with single instance solr

2018-08-20 Thread Erick Erickson
There are two choices: 1> shut down all three replicas and copy the index to each one then start them up. 2> DELETEREPLICA on two of them, update the remaining one, then issue an ADDREPLICA to get the other two back. Of the two, I'd go with <2>. When you ADDREPLICA Solr will take care of copying
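Option 2 maps onto two Collections API calls. The sketch below only builds the request URLs and sends nothing; the base URL, collection, shard, and replica names are hypothetical placeholders to be replaced with values from your own cluster:

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


base = "http://localhost:8983/solr"  # hypothetical node address

# Remove one of the extra replicas (repeat for each replica to drop).
delete_url = collections_api_url(
    base, "DELETEREPLICA", collection="mycoll", shard="shard1", replica="core_node2"
)

# After updating the remaining replica, add replicas back; Solr copies the index.
add_url = collections_api_url(base, "ADDREPLICA", collection="mycoll", shard="shard1")

print(delete_url)
print(add_url)
```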

Re: Question about updating indexes on solrcloud with single instance solr

2018-08-20 Thread Sushant Vengurlekar
Thanks for the reply, Erick. I have one shard per replica but I have 3 replicas on the SolrCloud. So how do I update from the standalone Solr core to these 3 replicas? On Mon, Aug 20, 2018 at 2:43 PM Erick Erickson wrote: > Assuming that your stand-alone indexes are a single core (i.e. not > sharde

Re: Question about updating indexes on solrcloud with single instance solr

2018-08-20 Thread Erick Erickson
Assuming that your stand-alone indexes are a single core (i.e. not sharded), then just create a single-shard collection with the appropriate schema. From there I'd shut my Solr instance down, copy the index files "to the right place" and fire it all back up. I'd do this with a single-replica SolrCl

Re: Question regarding searching Chinese characters

2018-08-14 Thread Christopher Beer
Hi all, Thanks for this enlightening thread. As it happens, at Stanford Libraries we’re currently working on upgrading from Solr 4 to 7 and we’re looking forward to using the new dictionary-based word splitting in the ICUTokenizer. We have many of the same challenges as Amanda mentioned, and th

Re: Question regarding searching Chinese characters

2018-07-24 Thread Tomoko Uchida
Hi Amanda, > do all I need to do is modify the settings from smartChinese to the ones you posted here Yes, the settings I posted should work for you, at least partially. If you are happy with the results, it's OK! But please take this as a starting point because it's not perfect. > Or do I need

Re: Question regarding searching Chinese characters

2018-07-24 Thread Amanda Shuman
Hi Tomoko, Thanks so much for this explanation - I did not even know this was possible! I will try it out but I have one question: do all I need to do is modify the settings from smartChinese to the ones you posted here: Or do I need to still do something with the SmartChineseAnalyzer

Re: Question

2018-07-23 Thread Alexandre Rafalovitch
That depends on what you mean by "unstructured" and "handle". If by "unstructured" you mean things like PDFs and MSWord - which are structured under the covers, then yes. Solr ships with Apache Tika to ingest such documents (see shipped examples as well as Data Import Handler example). E.g. http:/

Re: Question

2018-07-23 Thread Andrea Gazzarini
Hi Driss, I think the answer to the first question is yes, but I guess it doesn't help you much. Second and third questions: "It depends". You should describe your context better, narrowing the questions as much as possible ("how can we do it" is definitely much too generic). Best, Andrea On Mon

Re: Question regarding searching Chinese characters

2018-07-20 Thread Tomoko Uchida
Yes, while traditional - simplified transformation would be out of the scope of Unicode normalization, you would like to add ICUNormalizer2CharFilterFactory anyway :) Let me refine my example settings: Regards, Tomoko On Sat, Jul 21, 2018 at 2:54, Alexandre Rafalovitch wrote: > Would ICUNormalize

Re: Question regarding searching Chinese characters

2018-07-20 Thread Alexandre Rafalovitch
Would ICUNormalizer2CharFilterFactory do? Or at least serve as a template of what needs to be done. Regards, Alex. On 20 July 2018 at 12:40, Walter Underwood wrote: > Looks like we need a charfilter version of the ICU transforms. That could run > before the tokenizer. > > I’ve never built a

Re: Question regarding searching Chinese characters

2018-07-20 Thread Walter Underwood
Looks like we need a charfilter version of the ICU transforms. That could run before the tokenizer. I’ve never built a charfilter, but it seems like this would be a good first project for someone who wants to contribute. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.o

Re: Question regarding searching Chinese characters

2018-07-20 Thread Tomoko Uchida
Exactly. More concretely, the starting point is: replacing your analyzer to and see if the results are as expected. Then research other filters if your requirements are not met. Just a reminder: HMMChineseTokenizerFactory does not handle traditional characters as I noted previous in po

Re: Question regarding searching Chinese characters

2018-07-20 Thread Walter Underwood
I expect that this is the line that does the transformation: This mapping is a standard feature of ICU. More info on ICU transforms is in this doc, though not much detail on this particular transform. http://userguide.icu-project.org/transforms/general wunder Walter Underwood wun...@wunde

Re: Question regarding searching Chinese characters

2018-07-20 Thread Susheel Kumar
I think so. I used the exact as in github On Fri, Jul 20, 2018 at 10:12 AM, Amanda Shuman wrote: > Thanks! That does indeed look promising... This can be added on top of > Smart Chinese, right? Or is it an alternative? > > > -- > Dr. Amanda Shum

Re: Question regarding searching Chinese characters

2018-07-20 Thread Tomoko Uchida
Hi, There is ICUTransformFilter (included in the Solr distribution), which should also work for you. See the example settings: https://lucene.apache.org/solr/guide/7_4/filter-descriptions.html#icu-transform-filter Combine it with HMMChineseTokenizer. https://lucene.apache.org/solr/guide/7_4/langu

Re: Question regarding searching Chinese characters

2018-07-20 Thread Amanda Shuman
Thanks! That does indeed look promising... This can be added on top of Smart Chinese, right? Or is it an alternative? -- Dr. Amanda Shuman Post-doc researcher, University of Freiburg, The Maoist Legacy Project PhD, University of California, Santa Cru

Re: Question regarding searching Chinese characters

2018-07-20 Thread Susheel Kumar
I think CJKFoldingFilter will work for you. I put 舊小說 in the index and then each of A, B, C or D in the query, and they seem to match; CJKFF is transforming 舊 to 旧. On Fri, Jul 20, 2018 at 9:08 AM, Susheel Kumar wrote: > Lack of my chinese language knowledge but if you want, I can do quic
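The idea behind this kind of folding is that traditional and simplified forms are mapped to one form at both index and query time, so either spelling matches. A toy sketch with a two-character table; the table is an illustration only and says nothing about how CJKFoldingFilter is implemented or what it covers:

```python
# Tiny illustrative mapping from traditional to simplified characters.
TRADITIONAL_TO_SIMPLIFIED = str.maketrans({"舊": "旧", "說": "说"})


def fold(text):
    """Map known traditional characters to their simplified forms."""
    return text.translate(TRADITIONAL_TO_SIMPLIFIED)


indexed = fold("舊小說")   # text as indexed, written with traditional characters
query = fold("旧小说")     # the same phrase typed with simplified characters
print(indexed, query, indexed == query)
```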

Re: Question regarding searching Chinese characters

2018-07-20 Thread Susheel Kumar
I lack Chinese language knowledge, but if you want, I can do a quick test for you in the Analysis tab if you can tell me what to put in the index and query windows... On Fri, Jul 20, 2018 at 8:59 AM, Susheel Kumar wrote: > Have you tried to use CJKFoldingFilter https://github.com/sul-dlss/ > CJKFoldingF

Re: Question regarding searching Chinese characters

2018-07-20 Thread Susheel Kumar
Have you tried CJKFoldingFilter (https://github.com/sul-dlss/CJKFoldingFilter)? I am not sure if this would cover your use case, but I am using this filter and so far have had no issues. Thanks On Fri, Jul 20, 2018 at 8:44 AM, Amanda Shuman wrote: > Thanks, Alex - I have seen a few of those links but

Re: Question regarding searching Chinese characters

2018-07-20 Thread Amanda Shuman
Thanks, Alex - I have seen a few of those links but never considered transliteration! We use lucene's Smart Chinese analyzer. The issue is basically what is laid out in the old blogspot post, namely this point: "Why approach CJK resource discovery differently? 2. Search results must be as scrip

Re: Question regarding searching Chinese characters

2018-07-20 Thread Alexandre Rafalovitch
This is probably your start, if not read already: https://lucene.apache.org/solr/guide/7_4/language-analysis.html Otherwise, I think your answer would be somewhere around using ICU4J, IBM's library for dealing with Unicode: http://site.icu-project.org/ (mentioned on the same page above) Specifical

Re: Question regarding TLS version for solr

2018-05-24 Thread Christopher Schultz
2018-05-24 09:05:17.153 INFO > (coreLoadExecutor-7-thread-2-processing-n:9.109.122.113:8984_solr) > [c:document r:core_node1 x:document] o.a.s.u.SolrIndexConfig > IndexWriter infoStream solr logging is enabled [\] sleep: bad > character in argument What does the solr.log file say? The

Re: Question regarding TLS version for solr

2018-05-24 Thread Anchal Sharma2
n argument Thanks & Regards, - Anchal Sharma e-Pricer Development ES Team Mobile: +9871290248 -Christopher Schultz wrote: - To: solr-user@lucene.apache.org From: Christopher Schultz Date: 05/23/2018 07:29PM Subject: Re: Ques

Re: Question regarding TLS version for solr

2018-05-23 Thread Christopher Schultz
Anchal, On 5/23/18 2:38 AM, Anchal Sharma2 wrote: > Thank you for replying .But ,I checked the java version solr using > ,and it is already version 1.8. > > @Christopher ,can you let me know what steps you followed for TLS > authentication on solr

Re: Question regarding TLS version for solr

2018-05-22 Thread Anchal Sharma2
ards, - Anchal Sharma e-Pricer Development ES Team Mobile: +9871290248 -Christopher Schultz wrote: - To: solr-user@lucene.apache.org From: Christopher Schultz Date: 05/17/2018 06:29PM Subject: Re: Question regarding TLS version for solr -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 S

Re: Question regarding TLS version for solr

2018-05-17 Thread Christopher Schultz
Shawn, On 5/17/18 4:23 AM, Shawn Heisey wrote: > On 5/17/2018 1:53 AM, Anchal Sharma2 wrote: >> We are using solr version 5.3.0 and have been trying to enable >> security on our solr .We followed steps mentioned on site >> -https://lucene.apache

Re: Question regarding TLS version for solr

2018-05-17 Thread Shawn Heisey
On 5/17/2018 1:53 AM, Anchal Sharma2 wrote: We are using Solr version 5.3.0 and have been trying to enable security on our Solr. We followed the steps mentioned on the site https://lucene.apache.org/solr/guide/6_6/enabling-ssl.html . But by default it picks TLS version 1.0, which is causing an issu
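
Since this thread turns on which TLS version is negotiated and which Java version Solr runs on, a quick way to see what a given JVM offers is to ask it directly. The following is only a sketch: the class name TlsProtocolCheck is made up for illustration, and it reports the defaults of whichever JVM runs it, not the protocols that Solr's own SSL configuration ends up enabling.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;

// Hypothetical helper (not part of Solr): lists the TLS protocol versions
// that the running JVM enables by default and those it supports at all.
final class TlsProtocolCheck {

    // Protocols the JVM's default SSLContext enables out of the box.
    static String[] enabledProtocols() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters().getProtocols();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    // Protocols the JVM could use if they were explicitly enabled.
    static String[] supportedProtocols() {
        try {
            return SSLContext.getDefault().getSupportedSSLParameters().getProtocols();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no default SSLContext available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("enabled by default: " + Arrays.toString(enabledProtocols()));
        System.out.println("supported:          " + Arrays.toString(supportedProtocols()));
    }
}
```

Running this with the same java binary that starts Solr shows whether newer TLS versions are available to that JVM in the first place; what the server actually negotiates still depends on its own SSL settings.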

Re: question about updates to shard leaders only

2018-05-15 Thread Mark Miller
Yeah, basically ConcurrentUpdateSolrClient is a shortcut to getting multi threaded bulk API updates out of the single threaded, single update API. The downsides to this are: It is not cloud aware - you have to point it at a server, you have to add special code to see if there are any errors, you do

Re: question about updates to shard leaders only

2018-05-15 Thread Erick Erickson
bq. But don't forget a final client.add(list) after the while-loop ;-) Ha! But only "if (list.size() > 0)" And then there was the memorable time I forgot the "list.clear()" when I sent the batch and wondered why my indexing progress got slower and slower... Not to mention the time I re-used the
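
The pitfalls mentioned above (sending the last partial batch only if the list is non-empty, and clearing the list after each send) can be sketched as follows. This is a minimal illustration, not SolrJ code: the Consumer parameter is an assumed stand-in for a call such as client.add(list), and the names BatchingSketch and sendInBatches are hypothetical, so the example runs without a Solr server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a client-side batching loop. The "sender" stands in
// for a real update call (for example client.add(list) in SolrJ).
final class BatchingSketch {

    // Sends docs in batches of at most batchSize; returns the number of batches sent.
    static <T> int sendInBatches(Iterable<T> docs, int batchSize, Consumer<List<T>> sender) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int batchesSent = 0;
        for (T doc : docs) {
            batch.add(doc);
            if (batch.size() >= batchSize) {
                // Hand over a copy so that clearing the buffer cannot affect the sender.
                sender.accept(new ArrayList<>(batch));
                // Without this clear(), every later batch would resend all earlier docs.
                batch.clear();
                batchesSent++;
            }
        }
        // The final, partial batch: only send it if there is anything left.
        if (batch.size() > 0) {
            sender.accept(new ArrayList<>(batch));
            batchesSent++;
        }
        return batchesSent;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            docs.add(i);
        }
        List<Integer> sizes = new ArrayList<>();
        int sent = sendInBatches(docs, 100, b -> sizes.add(b.size()));
        System.out.println(sent + " batches, sizes " + sizes);
    }
}
```

Copying the buffer before handing it over is a design choice made here for safety; a loop that sends synchronously and clears afterwards could pass the buffer itself.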

Re: question about updates to shard leaders only

2018-05-15 Thread Shawn Heisey
On 5/15/2018 12:12 AM, Bernd Fehling wrote: OK, I have the CloudSolrClient with SolrJ now running but it seems a bit slower compared to ConcurrentUpdateSolrClient. This was not expected. The logs show that CloudSolrClient sends the docs only to the leaders. So the only advantage of CloudSolrClien

Re: question about updates to shard leaders only

2018-05-15 Thread Bernd Fehling
On 15.05.2018 at 14:33, Erick Erickson wrote: You might find this useful: https://lucidworks.com/2015/10/05/really-batch-updates-solr-2/ I have seen that already and can confirm it. From my observations about a 3x3 cluster with 3 servers and my hardware: - have at least 6 CPUs on each server

Re: question about updates to shard leaders only

2018-05-15 Thread Erick Erickson
You might find this useful: https://lucidworks.com/2015/10/05/really-batch-updates-solr-2/ One tricky bit: Assuming docs have a random distribution amongst shards, you should batch so at least 100 docs go to each _shard_. You can see from the link that the speedup is mostly going from 1 to 100. S
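
The sizing rule of thumb above is simple arithmetic: if documents spread evenly across shards on average, a batch needs roughly (number of shards) x 100 documents for each shard to receive about 100. A small sketch, with the hypothetical names ShardBatchSize and minBatchSize, and the 3-shard figure taken from the 3x3 cluster mentioned elsewhere in this thread:

```java
// Hypothetical helper: smallest client-side batch size that gives each shard,
// on average, docsPerShard documents when docs are spread evenly across shards.
final class ShardBatchSize {

    static int minBatchSize(int numShards, int docsPerShard) {
        if (numShards < 1 || docsPerShard < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        // multiplyExact throws on int overflow instead of wrapping silently.
        return Math.multiplyExact(numShards, docsPerShard);
    }

    public static void main(String[] args) {
        // 3 shards, aiming for about 100 docs per shard per batch.
        System.out.println(minBatchSize(3, 100));
    }
}
```

Because the distribution is random, individual shards will get somewhat more or fewer than the target in any given batch; the figure is an average, not a guarantee.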

Re: question about updates to shard leaders only

2018-05-15 Thread Bernd Fehling
Hi Erick, yes indeed, batching solved it. I used ConcurrentUpdateSolrClient with queue size of 1 but CloudSolrClient doesn't have this feature. I built my own queue now. Ah!!! So I obviously use default NRT but actually don't need it because I don't have any NRT data to index. A latency of se

Re: question about updates to shard leaders only

2018-05-15 Thread Erick Erickson
What did you do to solve your performance problem? Batching updates is one thing that helps performance. bq. I thought that only the leaders are under load until any commit and then replicate to the other replicas. True if (and only if) you're using PULL or TLOG replicas. When using the default

Re: question about updates to shard leaders only

2018-05-15 Thread Bernd Fehling
Thanks, solved, performance is good now. Regards, Bernd On 15.05.2018 at 08:12, Bernd Fehling wrote: OK, I have the CloudSolrClient with SolrJ now running but it seems a bit slower compared to ConcurrentUpdateSolrClient. This was not expected. The logs show that CloudSolrClient sends the docs o

Re: question about updates to shard leaders only

2018-05-14 Thread Bernd Fehling
OK, I have the CloudSolrClient with SolrJ now running but it seems a bit slower compared to ConcurrentUpdateSolrClient. This was not expected. The logs show that CloudSolrClient sends the docs only to the leaders. So the only advantage of CloudSolrClient is that it is "Cloud aware"? With Concurre

Re: question about updates to shard leaders only

2018-05-09 Thread Mark Miller
It's been a while since I've been in this deeply, but it should be something like: sendUpdateOnlyToShardLeaders will select the leaders for each shard as the load balanced targets for update. The updates may not go to the *right* leader, but only the leaders will be chosen, followers (non leader r

Re: question about updates to shard leaders only

2018-05-09 Thread Erick Erickson
You may not need to deal with any of this. The default CloudSolrClient call creates a new LBHttpSolrClient for you. So unless you're doing something custom with any LBHttpSolrClient you create, you don't need to create one yourself. Second, the default for CloudSolrClient.add() is to take the lis
