Changing Leadership in SolrCloud

2018-02-27 Thread zahra121
Suppose I have a node which is a leader in SolrCloud. When I block this leader's SolrCloud and Zookeeper ports by the command "firewall-cmd --remove-port=/tcp --permanent", the leader does not change automatically and this leader status remains active in solr admin UI. Thus, I decided to change t

Re: Changing Leadership in SolrCloud

2018-02-27 Thread Shawn Heisey
On 2/27/2018 1:36 AM, zahra121 wrote: Suppose I have a node which is a leader in SolrCloud. When I block this leader's SolrCloud and Zookeeper ports by the command "firewall-cmd --remove-port=/tcp --permanent", the leader does not change automatically and this leader status remains active in sol

Re: Rename solrconfig.xml

2018-02-27 Thread Shawn Heisey
On 2/27/2018 12:59 AM, Zheng Lin Edwin Yeo wrote: Regarding the core.properties, understand from the Solr guide that we need to define the "config" properties first. However, my core.properties will only be created when I create the collection from the command http://localhost:8983/solr/admin/col

Re: NRT replicas miss hits and return duplicate hits when paging solrcloud searches

2018-02-27 Thread Emir Arnautović
Hi Webster, Since you are returning all hits, returning the last page is almost as heavy for Solr as returning all documents. Maybe you should consider just returning one large page and completely avoid this issue. I agree with you that this should be handled by Solr. ES solved this issue with “

Re: Changing Leadership in SolrCloud

2018-02-27 Thread Zahra Aminolroaya
Thanks Shawn for the reply. When I try to add a document to Solr I get the "no route to host" exception. This means that SolrCloud is aware of the blocked ports; however, ZooKeeper does not automatically change the leader! -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.htm

RE: Question on "other language" than english stemmers and using both

2018-02-27 Thread Markus Jelsma
Hello, Mixing language specific filters in the same analyzer is not going to give predictable or desirable results. Instead, create separate text_en and text_de fieldTypes and fields. See Solr's default schema.xml, it has many examples of various languages. Depending on what query parser you

Re: Changing Leadership in SolrCloud

2018-02-27 Thread Amin Raeiszadeh
I don't understand your problem clearly, but the Solr Admin UI has some bugs. To check the state of your cloud nodes, use the CLUSTERSTATUS command: /admin/collections?action=CLUSTERSTATUS In some cases your command has completed but you can't see the result in the Admin UI. On Tue, Feb 27, 2018 at 12:49 PM, Shawn Heisey wrote:
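As a rough illustration of the CLUSTERSTATUS suggestion above, the sketch below builds the request URL and picks the leader of each shard out of a response. The base URL, the collection name `demo`, the helper names, and the sample response are all assumptions for illustration. The sample only mimics the general shape of a CLUSTERSTATUS JSON response, so compare it with what your own cluster returns.

```python
from urllib.parse import urlencode


def cluster_status_url(base_url):
    """Build the Collections API URL for action=CLUSTERSTATUS (JSON output)."""
    params = {"action": "CLUSTERSTATUS", "wt": "json"}
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def find_leaders(status):
    """Map (collection, shard) to (node_name, state) for replicas flagged as leader."""
    leaders = {}
    collections = status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica in shard.get("replicas", {}).values():
                # The flag is assumed to arrive as the string "true"; accept a bool too.
                if replica.get("leader") in ("true", True):
                    leaders[(coll_name, shard_name)] = (
                        replica.get("node_name"),
                        replica.get("state"),
                    )
    return leaders


# Hypothetical response, trimmed to the fields used above.
SAMPLE_STATUS = {
    "cluster": {
        "collections": {
            "demo": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {
                                "node_name": "host1:8983_solr",
                                "state": "active",
                                "leader": "true",
                            },
                            "core_node2": {
                                "node_name": "host2:8983_solr",
                                "state": "active",
                            },
                        }
                    }
                }
            }
        }
    }
}

print(cluster_status_url("http://localhost:8983/solr"))
print(find_leaders(SAMPLE_STATUS))
```

Reading the leader flag from the API response rather than from the Admin UI separates "the replica state is active" from "this replica is still the leader", which is the distinction the thread keeps returning to.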

RE: Question on "other language" than english stemmers and using both

2018-02-27 Thread TG Servers
Ok thank you. Sounds like a bit more reading into the whole thing. It's just a tool for me so I didn't want to go too deep into it, but sometimes a must is a must. :) Default schema.xml? I just get this managed_schema file when installing. Do you mean that one? Am 27. Februar 2018 11:12:39 vor

Re: Changing Leadership in SolrCloud

2018-02-27 Thread Zahra Aminolroaya
The leader status is active. My main question is how I can change the leader in SolrCloud. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

RE: Question on "other language" than english stemmers and using both

2018-02-27 Thread Markus Jelsma
Maybe check the example directory, it has lots of languages configured: https://github.com/apache/lucene-solr/blob/master/solr/example/files/conf/managed-schema And be sure to check out the manual on the subject: https://lucene.apache.org/solr/guide/7_2/language-analysis.html -Original me

When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread 苗海泉
I have run into a serious problem while using Solr. We use Solr 6.0; our daily data volume is about 500 billion documents, we create a collection every hour, there are more than a thousand collections online, and we have 49 Solr nodes. If the number of collections is less than 800, the speed is

RE: Question on "other language" than english stemmers and using both

2018-02-27 Thread TG Servers
Ok thanks! Thomas On 27 February 2018 at 11:36:52 AM, Markus Jelsma wrote: Maybe check the example directory, it has lots of languages configured: https://github.com/apache/lucene-solr/blob/master/solr/example/files/conf/managed-schema And be sure to check out the manual on the subject: h

Re: Changing Leadership in SolrCloud

2018-02-27 Thread Shalin Shekhar Mangar
When you block communication between Zookeeper and the leader, the ZK client inside Solr will disconnect and its session will expire after the session timeout. At this point a new leader should be elected automatically. The default timeout is 30 seconds. You should be able to see the value in solr.

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread Emir Arnautović
Hi, To get a more complete picture, can you tell us how many shards/replicas you have per collection? Also, what is the index size on disk? Did you check GC? BTW, using a 32GB heap prevents you from using compressed oops, resulting in less memory available than with 31GB. Thanks, Emir -- Monitoring - Log

Re: Rename solrconfig.xml

2018-02-27 Thread Zheng Lin Edwin Yeo
Hi Shawn, Yes, I'm running SolrCloud. Meaning we have to create all the cores in the collection with the default solrconfig.xml first? Then we have to modify the core.properties, and rename the solrconfig.xml. After which, we have to reload the renamed config to ZooKeeper, then reload the collect

Re: is it appropriate to use external cache for whole shards

2018-02-27 Thread Emir Arnautović
Hi, Assuming you have some web interface, it is not uncommon to apply caching in web browser/middle layer/Solr. The question is if you can live with stale data or if you have some nice mechanism to invalidate data when needed. Solr does that “blindly” - on every commit that includes opening sear

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread 苗海泉
Thank you for the reply. Each collection has 25 shards with one replica each; each Solr node has about 5 TB on disk. GC has been checked and modified as follows: SOLR_JAVA_MEM="-Xms32768m -Xmx32768m " GC_TUNE=" \ -XX:+UseG1GC \ -XX:+PerfDisableSharedMem \ -XX:+ParallelRefProcEnabled \ -XX:G1HeapRegionSize=8m \ -XX:MaxGCPaus

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread 苗海泉
In addition, we found that the rate was normal when the number of collections was kept below 936, and that it became slower and slower at 984. Therefore, we could only temporarily delete the older collections, but now we need more collections online. We have found no good solution, and this has puzzled us for a long

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread Emir Arnautović
Hi, It is hard to tell without looking more into your metrics. It seems to me that you are reaching the limits of your cluster. I would double-check whether memory is the issue. If I got it right, you have ~1120 shards per node. It takes some heap just to keep them open. If you have some caches enabled an

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread 苗海泉
Thanks for your reply again. I think there may be a misunderstanding: we have 49 Solr nodes, each collection has 25 shards, each shard has only one replica of the data (there is no copy), and I have reduced part of the cache. If you need the metric data, I can look it up and tell you, i

Re: Solr Phrase Count : How to get count of a phrase in a text field solr

2018-02-27 Thread aneeshkappu
Found the solution: put `debug=results` at the end of the Solr URL and it will also give you the phrase frequency. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
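A minimal sketch of what such a request URL might look like, assuming a hypothetical host, a collection named `articles`, and a text field named `body` (none of these come from the thread). The only parameter taken from the message is `debug=results`. The phrase frequency would then have to be read out of the explain section of the debug output.

```python
from urllib.parse import urlencode


def phrase_debug_url(base_url, collection, field, phrase):
    """Build a /select URL for a phrase query with debug=results appended."""
    params = {
        "q": f'{field}:"{phrase}"',  # quoted so Solr treats it as a phrase
        "wt": "json",
        "debug": "results",          # the parameter suggested in the message
    }
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical values for illustration only.
url = phrase_debug_url("http://localhost:8983/solr", "articles", "body", "john smith")
print(url)
```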

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread Emir Arnautović
Ah, so there are ~560 shards per node and not all nodes are indexing at the same time. Why is that? You can get better throughput by indexing on all nodes. If you are happy with the shard size, you can create a new collection with 49 shards every 2h, keep everything else the same, and index on all nodes. Back
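The per-node figures in this exchange can be reproduced with simple arithmetic. The sketch below assumes 1,100 online collections, a count chosen only because it matches the estimates quoted here (the thread itself says "more than a thousand"). It also assumes shard copies are spread evenly across the 49 nodes.

```python
def shard_copies_per_node(collections, shards_per_collection, copies_per_shard, nodes):
    """Average number of shard copies (cores) each node hosts, assuming an even spread."""
    return collections * shards_per_collection * copies_per_shard / nodes


# Assumed collection count; 25 shards and 49 nodes come from the thread.
one_copy = shard_copies_per_node(1100, 25, 1, 49)    # one copy of each shard
two_copies = shard_copies_per_node(1100, 25, 2, 49)  # if each shard had two copies

print(round(one_copy), round(two_copies))  # prints "561 1122"
```

With one copy per shard the result is close to the ~560 mentioned above. Doubling the copies gives roughly the ~1120 from the earlier message, which is presumably how the first estimate arose.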

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread 苗海泉
Thank you. We originally had 49 shards on 49 nodes, but later found that in this setup Solr and ZooKeeper often disconnected; too many nodes in ZooKeeper caused Solr instability, so we reduced it to 25. If performance later cannot keep up, we will need to increase it again. Very slow when Solr and ZooKeeper not found an

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread Emir Arnautović
This does not show much: only that your heap is around 75% (24-25GB). I was thinking that you should compare metrics (heap/GC as well) when running without issues and when running with issues, and see if something can be concluded. About instability: Do you run ZK on dedicated nodes? Emir --

Searching for a phrase in proximity to another token in SOLR

2018-02-27 Thread Deyan Yotsov
Hello, Is there a way to achieve something along these lines: "("john smith") josh"~12 Thank you, Deyan

Re: NRT replicas miss hits and return duplicate hits when paging solrcloud searches

2018-02-27 Thread Webster Homer
Emir, Using tlog replica types addresses my immediate problem. The secondary issue is that all of our searches show inconsistent results. These are all normal paging use cases. We regularly test our relevancy, and these differences create confusion among the testers. Moreover, we are migrating from

Re: Searching for a phrase in proximity to another token in SOLR

2018-02-27 Thread Erick Erickson
Did you try the ComplexPhraseQueryParser? See: https://lucene.apache.org/solr/guide/6_6/other-parsers.html Best, Erick On Tue, Feb 27, 2018 at 7:23 AM, Deyan Yotsov wrote: > Hello, > > Is there a way to achieve something along these lines: > > "("john smith") josh"~12 > > Thank you, > > Deyan >

Gentle reminder RE: Object not fetched because its identifier appears to be already in processing

2018-02-27 Thread YELESWARAPU, VENKATA BHAN
Information Classification: ** Limited Access If any of you experts could help, we would greatly appreciate it. Thank you. From: YELESWARAPU, VENKATA BHAN Sent: Friday, February 23, 2018 8:30 AM To: 'd...@lucene.apache.org' ; 'solr-user@lucene.apache.org' Subject: Object not fetched because its

Re: Gentle reminder RE: Object not fetched because its identifier appears to be already in processing

2018-02-27 Thread Cassandra Targett
There is not enough information here for anyone to answer. You mention a "below message", but there is no message that we can see. If it was in an attachment to the mail, it got stripped by the mail server. If you want a response, please provide in the body of the mail details such as: the error m

Solr crashing StandardWrapperValve

2018-02-27 Thread Wael Kader
Hello, SOLR kept crashing today over and over again. I am running a single node Solr instance on Cloudera with 140 GB of data. Things were working fine until today. I have a replication server that I am replicating data to, but it wasn't working before and was fixed today, so I thought maybe its

Re: Solr crashing StandardWrapperValve

2018-02-27 Thread Erick Erickson
You'd really have to talk to Cloudera for support, the version of Solr shipped with CDH isn't a standard distro. Best, Erick On Tue, Feb 27, 2018 at 8:25 AM, Wael Kader wrote: > Hello, > > SOLR kept crashing today over and over again . > I am running a single node solr instance on Cloudera with

Re: Gentle reminder RE: Object not fetched because its identifier appears to be already in processing

2018-02-27 Thread Shawn Heisey
On 2/27/2018 7:08 AM, YELESWARAPU, VENKATA BHAN wrote: While indexing job is running we are seeing the below message for all the objects. Object not fetched because its identifier appears to be already in processing This time, I am going to include you as a CC on the message.  This is not no

New payload handling 7.2

2018-02-27 Thread Markus Jelsma
Hello, Our payload handling has been broken since Lucene/Solr 7.2: we sometimes get 0.0 = AveragePayloadFunction.docScore() for some but not all query clauses. We only have payloads on some terms, to signal the similarity it needs to 'punish' the term, e.g. being an article or adjective. I examin

Configuration of SOLR Cluster

2018-02-27 Thread James Keeney
I'm setting up a Solr cluster in AWS cloud and I need help with the configuration of ZooKeeper. The cluster has 3 ZK nodes and 3 Solr nodes. There are two behaviors that are of concern: *1 - ZK ensemble not accepting return of node* Currently, when a ZK node in the ensemble goes down the ensemble

Re:SOLR Similarity Difference

2018-02-27 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
Hi Rick, I don't think the issue is BM25 vs TFIDF (the old similarity); it seems to be due more to the "matching" logic. You are asking to match: "(Action AND Technical AND Temporaries AND t/a AND CTR AND Corporation)" This (in theory) means that you want to retrieve **only** the documents that con

Defining Document Transformers in Solr Configuration

2018-02-27 Thread simon
We do quite complex data pulls from a Solr index for subsequent analytics, currently using a home-grown Python API. Queries might include a handful of pseudofields which this API rewrites to an aliased field invoking a Document Transformer in the 'fl' parameter list. For example 'numcites' is tra

Re:Defining Document Transformers in Solr Configuration

2018-02-27 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
I don't think you can define a docTransformer in the SolrConfig at the moment; I agree it would be a cool feature. Maybe one possibility could be to use the update request processors [1] and precompute the fields at index time. It would be more expensive in disk and index time, but then it woul

Re: SOLR Similarity Difference

2018-02-27 Thread Rick Leir
Rick Did you experiment in the SolrAdmin analysis page? It would possibly tell you whether your chain is doing what you expect. Then you need to consider that boolean logic is not strictly boolean in Solr. There is a Lucidworks blog which explains this nicely; every now and then someone posts th

Re: Configuration of SOLR Cluster

2018-02-27 Thread Shawn Heisey
On 2/27/2018 10:57 AM, James Keeney wrote: > *1 - ZK ensemble not accepting return of node* > Currently, when a ZK node in the ensemble goes down the ensemble is able to > do what it should do and keeps working. However when I bring the 3rd node > back online the other two nodes reject connection r

Re: Configuration of SOLR Cluster

2018-02-27 Thread James Keeney
Shawn - First, it's good to know that this is unusual behavior. That actually helps as it lets me know that I should keep digging. Here are a couple of things that might help. In the configuration I am calling out all three ZK nodes. Here is the configuration of Solr: -DSTOP.KEY=solrrocks -DSTO

Re: When the number of collections exceeds one thousand, the construction of indexing speed drops sharply

2018-02-27 Thread 苗海泉
Thank you. I looked at the memory footprint: I set collection to start at 75%, and memory occupancy is at about 76%. Also, our ZooKeeper is not on a dedicated server, which may be the cause of the instability. What else do you recommend I check? 2018-02-27 22:37 GMT+08:00 Emir Arnautović : > This does not s

Re: Defining Document Transformers in Solr Configuration

2018-02-27 Thread simon
On Tue, Feb 27, 2018 at 5:34 PM, Diego Ceccarelli (BLOOMBERG/ LONDON) < dceccarel...@bloomberg.net> wrote: > I don't think you can define docTransformer in the SolrConfig at the > moment, I agree it would be a cool feature. > > Maybe one possibility could be to use the update request processors [1]

Re: Configuration of SOLR Cluster

2018-02-27 Thread Shawn Heisey
On 2/27/2018 6:42 PM, James Keeney wrote: -DzkHost=:2181,:2181,:2181 This looks correct, except that with AWS, I have no idea whether you need the internal IP addressing or the external IP addressing.  If all of the machines involved (both servers and clients) are able to communicate on the

Re: Defining Document Transformers in Solr Configuration

2018-02-27 Thread Mikhail Khludnev
Hello, Simon. You can define a search handler that has numcites:[subquery]&numcites.fl=pmid&numcites.q={!terms f=md_c_pmid v=$row.pmid}&numcites.rows=10&numcites.logParamsList=q or something like that. On Tue, Feb 27, 2018 at 11:20 PM, simon wrote: > We do quite complex data pulls from a So
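For readers who want to try the parameters above before putting them into a search handler, here is a hedged sketch that sends them as ordinary request parameters instead. The `numcites.*` values are copied from the message. Placing `numcites:[subquery]` in `fl` next to `pmid`, the `*:*` main query, and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode


def subquery_params(main_query):
    """Request parameters for a [subquery] pseudo-field named numcites."""
    return [
        ("q", main_query),
        # Assumption: pmid is listed in fl so that $row.pmid can be resolved.
        ("fl", "pmid,numcites:[subquery]"),
        # The numcites.* values below are taken verbatim from the message.
        ("numcites.fl", "pmid"),
        ("numcites.q", "{!terms f=md_c_pmid v=$row.pmid}"),
        ("numcites.rows", "10"),
        ("numcites.logParamsList", "q"),
    ]


# URL-encoded query string, ready to append to a /select URL.
query_string = urlencode(subquery_params("*:*"))
print(query_string)
```

Once the request behaves as expected, the same `numcites.*` values could be moved into the defaults of a search handler, which is what the message proposes.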

solr src 6.0 ant error

2018-02-27 Thread 苗海泉
I encountered an error while compiling the Solr 6.0 source. I have installed Ant and Ivy, and then ran "ant eclipse" in the Solr 6 source directory to generate an Eclipse project, which failed with the following error: " Buildfile: D:\solr-6.0.0-src\solr-6.0.0\build.xml BUILD

Re: Changing Leadership in SolrCloud

2018-02-27 Thread Zahra Aminolroaya
Thanks Shalin. Our "zkClientTimeout" is 3, so the leader should have changed by now; however, the previous leader is still active. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Changing Leadership in SolrCloud

2018-02-27 Thread Shalin Shekhar Mangar
When you say it is active, I presume you mean the "state" as returned by the Cluster Status API or as shown on the UI. But is it still the leader? Are you sure the firewall rules are correct? Do you see disconnected or session expiry exceptions in the leader logs? On Wed, Feb 28, 2018 at 12:21 PM,

Re: Gentle reminder RE: Object not fetched because its identifier appears to be already in processing

2018-02-27 Thread Shawn Heisey
On 2/28/2018 12:06 AM, YELESWARAPU, VENKATA BHAN wrote: Thank you for your reply Shawn. I'm not part of that user list so I never received any emails so far. Could you please subscribe me (vyeleswar...@statestreet.com) or let me know the process? Also I would greatly appreciate if you could for