Re: [SolrCloud] No config data found after SolrCloud server restart

2018-02-08 Thread A.Guptar
Hello Shawn, I finally managed to find the issue. The zoo.cfg files in ZooKeeper and Solr were different, and there was an autopurge option enabled in ZooKeeper. After removing the autopurge option and aligning the zoo.cfg in ZooKeeper and Solr, the configs and aliases no longer went missing after serve
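
One way to spot the kind of zoo.cfg mismatch described above is to parse both files and list the keys whose values differ. This is only a sketch in Python: the file contents below are made up for illustration, and the helper names `parse_zoo_cfg` and `diff_settings` are hypothetical, not part of Solr or ZooKeeper.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys that differ.

    None marks a key that is missing on one side.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Made-up file contents, for illustration only.
zk_cfg = """
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
"""
solr_cfg = """
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
"""

differences = diff_settings(parse_zoo_cfg(zk_cfg), parse_zoo_cfg(solr_cfg))
for key, (zk_value, solr_value) in differences.items():
    print(f"{key}: zookeeper={zk_value!r} solr={solr_value!r}")
```

In practice the two strings would be read from the actual zoo.cfg files on each side; the paths are deployment-specific and are not given in the message.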

Re: Relevancy Tuning For Solr With Apache Nutch 2.3

2018-02-08 Thread Mukhopadhyay, Aratrika
As a follow-up question, I went ahead and disabled the boosting from Nutch's side. Now all the documents coming in from Nutch have a default value of 1.0. I know that one of you had asked me to remove any other boosts I had, including the one from Nutch. When you said remove the boost what did

Re: Hard commits blocked | non-solrcloud v6.6.2

2018-02-08 Thread mmb1234
> Setting openSearcher to false on autoSoftCommit makes no sense. That was my mistake in my solrconfig.xml. Thank you for identifying it. I have corrected it. I then removed my custom element from my solrconfig.xml and both hard commit and /solr/admin/core hang issues seemed to go away for a cou

Re: High CPU and Physical Memory Usage in solr with 4000 user load

2018-02-08 Thread Shawn Heisey
On 2/8/2018 12:41 AM, rubi.hali wrote: > We are using Solr-6.6.2 with one master and 4 slaves, each having 4 CPU cores > and 16GB RAM. We were doing load testing with 4000 users and 800-odd search > keywords, which resulted in 95% CPU usage in less than 3 minutes and > affected our QUERY Response

Re: Hard commits blocked | non-solrcloud v6.6.2

2018-02-08 Thread Shawn Heisey
On 2/7/2018 9:01 PM, mmb1234 wrote: > I am seeing that after some time hard commits in all my solr cores stop and > each one's searcher has an "opened at" date hours ago even though they > are continuing to ingest data successfully (index size increasing > continuously). > > http://localho

lat/long (location) field context filters for autosuggestions

2018-02-08 Thread Deepak Udapudi
Hi all, We are trying to apply a context filter for autosuggestions by passing the lat/long of a particular location. We know that we can apply the context field for a particular field which takes one value. How do we apply the same context filtering for a field which takes two values lat/long i

RE: SolrCloud: How best to do backups?

2018-02-08 Thread Davis, Daniel (NIH/NLM) [C]
I would suggest you have a separate EBS to save the backup from each server. These EBS volumes would be mounted all the time, but only modified by a backup. Then, you can create an AWS Lambda function that runs on a periodic trigger from CloudWatch, and does the following: - run the backu

Re: Facets OutOfMemoryException

2018-02-08 Thread Shawn Heisey
On 2/8/2018 5:36 AM, LOPEZ-CORTES Mariano-ext wrote: > We have just 1 field "status" in facets with a cardinality of 93. > > We realize that increasing memory will work. But do you think it's necessary? > > Thanks in advance. 2GB for 27 million docs seems a little bit small even WITHOUT facets. You

Re: SolrCloud: How best to do backups?

2018-02-08 Thread John Bickerstaff
This article may be of some use... What isn't clear is what effect either of the two strategies mentioned would have on serving responses to queries... It would be nice if the backup was a "low priority thread" compared to the needs of the server in question, but I've never had to dig that deep b

Re: SolrCloud: How best to do backups?

2018-02-08 Thread John Bickerstaff
Hmmm... Can you (fairly quickly) reproduce this AWS environment (including the indexes)? Or does it require that several week process to provision new Solr boxes...? What happens now if one of those ec2 instances gets into trouble? Do you have autoscaling groups set up? On Thu, Feb 8, 2018 at

SolrCloud: How best to do backups?

2018-02-08 Thread Kelly, Frank
We have a large SolrCloud deployment on AWS (350m documents spread across 3 collections, each with 3 shards and 3 replicas) Running on 3 x r3.xlarge’s with the data stored on EBS drives with Provisioned IOPS Currently it’s handling 38m requests per day My question is how best should we back-up

Re: Hard commits blocked | non-solrcloud v6.6.2

2018-02-08 Thread mmb1234
> If you issue a manual commit > (http://blah/solr/core/update?commit=true) what happens? That call never returned back to client browser. So I also tried a core reload and did capture in the thread dump. That too never returned. "qtp310656974-1022" #1022 prio=5 os_prio=0 tid=0x7ef25401000

RE: Relevancy Tuning For Solr With Apache Nutch 2.3

2018-02-08 Thread Mukhopadhyay, Aratrika
Hi Alessandro, Nutch provides a certain set of fields to Solr via the solr-index mapping.xml. One of the fields that Nutch is providing is a field named boost. This value is calculated via the scoring-opic plugin within Nutch and is an index time boost. I would just like to r

Re: Using Context field unable to get autosuggestion for zip code having '-'.

2018-02-08 Thread Alessandro Benedetti
With that configuration you want to auto suggest Office names filtering them by zip code. Not sure why you perform an ngram analysis though. How do you want to filter by zip code ? Exact Search ? Edge ngram ? Regards - --- Alessandro Benedetti Search Consultant, R&D Software En

RE: Relevancy Tuning For Solr With Apache Nutch 2.3

2018-02-08 Thread Alessandro Benedetti
Uhm, not really. I am just saying that if you are running a version >=6.6.0, keep in mind that the index time boost you think you are enabling is not actually working anymore. You are now mentioning a Nutch boost field... Can you elaborate on that? It may be a completely different thing... How is thi

Re: Hard commits blocked | non-solrcloud v6.6.2

2018-02-08 Thread Erick Erickson
This is very odd. Do you by any chance have custom code in place that's not closing searchers properly? If you take a heap dump, how many searchers do you have open? If you issue a manual commit (http://blah/solr/core/update?commit=true) what happens? Best, Erick On Wed, Feb 7, 2018 at 8:01 PM,
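
The manual commit suggested above can also be issued from a script with a client-side timeout, so that a commit which never returns shows up as a failure instead of hanging the caller. A minimal sketch in Python using only the standard library: the helper names `commit_url` and `try_commit` and the 30-second default are assumptions for illustration, and `http://blah/solr/core` is the placeholder URL from the message, not a real endpoint.

```python
import urllib.request


def commit_url(core_url):
    """Build the explicit-commit URL for a core, as quoted in the message."""
    return core_url.rstrip("/") + "/update?commit=true"


def try_commit(core_url, timeout_s=30):
    """Issue the commit and return the HTTP status code.

    Returns None if the request times out, the connection fails,
    or the server answers with an HTTP error status.
    """
    try:
        with urllib.request.urlopen(commit_url(core_url), timeout=timeout_s) as resp:
            return resp.status
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None


# Placeholder URL from the message; nothing is sent here.
print(commit_url("http://blah/solr/core"))
```

Calling `try_commit(...)` against a live core is what actually sends the request; a `None` result with a generous timeout would be consistent with the blocked-commit symptom discussed in this thread.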

RE: Relevancy Tuning For Solr With Apache Nutch 2.3

2018-02-08 Thread Mukhopadhyay, Aratrika
Hi Alessandro, If I understand correctly, then upgrading the Solr that we are running from 6.4 to 6.6 should automatically remove the Nutch boost field? Aratrika -Original Message- From: Alessandro Benedetti [mailto:a.benede...@sease.io] Sent: Thursday, February 08, 201

RE: Facets OutOfMemoryException

2018-02-08 Thread Erick Erickson
First thing I'd try is setting docValues="true". You must blow away your entire index, like rm -rf data, and reindex. On Feb 8, 2018 04:37, "LOPEZ-CORTES Mariano-ext" < mariano.lopez-cortes-...@pole-emploi.fr> wrote: > We have just 1 field "status" in facets with a cardinality of 93. > > We re

Re: Opinions on ExtractingRequestHandler

2018-02-08 Thread Charlie Hull
On 08/02/2018 11:47, Frederik Van Hoyweghen wrote: Hey everyone, What are your experiences on making (in production) use of Solr's ExtractingRequestHandler? I've been reading some mixed remarks so I was wondering what your actual experiences with it are. Personally, I feel like setting up a se

Re: Opinions on ExtractingRequestHandler

2018-02-08 Thread Sreenivas.T
Frederik, We have also used a separate service, which uses Tika and then uses SolrJ to index the content. The main reason we went for this approach is to have the flexibility to manipulate/transform data over and above what Tika does. What I understand is that if there is no other transformation nee

High CPU and Physical Memory Usage in solr with 4000 user load

2018-02-08 Thread rubi.hali
Hi, We are using Solr-6.6.2 with one master and 4 slaves, each having 4 CPU cores and 16GB RAM. We were doing load testing with 4000 users and 800-odd search keywords, which resulted in 95% CPU usage in less than 3 minutes and affected our QUERY Responses. There was also a spike in physical memory

Using Context field unable to get autosuggestion for zip code having '-'.

2018-02-08 Thread Nareshkumar P
Hi Team, Problem: Unable to get autosuggestion for zip code having '-'. We are using the context field as part of autosuggestion search. When we try with context search, we are not getting results for zip codes having '-', e.g. "92240-3064". Hence we made the following change in managed-schema and

RE: Payload fields

2018-02-08 Thread Brian Yee
I also noticed a weird pattern with the results. This is from doing fl=Zone01:payload(DeliveryPayload,01) for each zone. Can't figure out why this would happen. "DeliveryPayload": "01|20180210 02|20180211 03|20180212 04|20180213 05|20180214 06|20180215 07|20180216 08|20180217 09|
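
To make the layout of the field value above easier to follow, here is a small Python sketch that splits a whitespace-separated list of `term|payload` tokens into a mapping. It only illustrates the data format shown in the message; it is not how Solr decodes payloads internally, and the function name `parse_delimited_payloads` is hypothetical.

```python
def parse_delimited_payloads(value, delimiter="|"):
    """Split whitespace-separated 'term|payload' tokens into a dict.

    Terms and payloads are kept as strings, so a leading zero
    such as in "01" is preserved.
    """
    result = {}
    for token in value.split():
        term, sep, payload = token.partition(delimiter)
        if not sep:
            raise ValueError(f"token without delimiter: {token!r}")
        result[term] = payload
    return result


# First three tokens of the DeliveryPayload value quoted above.
sample = "01|20180210 02|20180211 03|20180212"
zones = parse_delimited_payloads(sample)
print(zones["02"])
```

Keeping the terms as strings matters here because the zone identifiers carry leading zeros, so "01" and "1" are different keys.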

RE: Relevancy Tuning For Solr With Apache Nutch 2.3

2018-02-08 Thread Alessandro Benedetti
With: boost from Nutch's side. If you refer to Index Time boost, this was deprecated some time ago [1], at least as of 6.6.0. [1] http://lucene.apache.org/solr/6_6_0/solr-solrj/deprecated-list.html - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sea

Re: Best Practice about solr cloud schema

2018-02-08 Thread Pratik Patel
That makes it clear. Thanks a lot for your help. Pratik On Feb 7, 2018 10:33 PM, "Erick Erickson" wrote: > It can pretty much be used as-is, _except_ > > you'll find one or more entries in your request handlers like: > _text_ > > Change "_text_" to something in your schema, that's the defau

Re: Design Question

2018-02-08 Thread Deepthi P
Hi Emir, Thank you. The dictionary is static but the descriptions are long text. If I denormalize, each document will have 100+ descriptions, and I have 8 million records in the collection. There is a lot of repetition of descriptions and the index becomes large. I am trying to avoid that. Also data load

RE: Relevancy Tuning For Solr With Apache Nutch 2.3

2018-02-08 Thread Mukhopadhyay, Aratrika
Thank you Charlie. This has been very helpful. The reason one boost value is 2.0 while the other is 0.03 is simply because I wasn't sure if the boost I was applying in the first place may have been too "gentle". I will start by disabling the boost from Nutch's side and install Quepid as per your s

Re: Clusterstatus Action

2018-02-08 Thread Chris Ulicny
Got a chance to take a look at the source on master branch for the CLUSTERSTATUS action, and it just passes the parameter on as given instead of splitting it. Opened a JIRA issue with the start of a patch: SOLR-11950 On Wed, Jan 31, 2018 at 7:53

RE: Facets OutOfMemoryException

2018-02-08 Thread LOPEZ-CORTES Mariano-ext
We have just 1 field "status" in facets with a cardinality of 93. We realize that increasing memory will work. But do you think it's necessary? Thanks in advance. -Original Message- From: Zisis T. [mailto:zist...@runbox.com] Sent: Thursday, 8 February 2018 13:14 To: solr-user@lucene.apache.o

Re: Facets OutOfMemoryException

2018-02-08 Thread Zisis T.
I believe that things like the following will affect faceting memory requirements -> how many fields do you facet on -> what is the cardinality of each one of them -> what is your QPS rate, but 2GB for 27M documents seems too low. Did you try to increase the memory on Solr's JVM? -- Sent from:

Re: Spellcheck collations results

2018-02-08 Thread Alessandro Benedetti
Given this configuration, you may state that if no collation is returned, there was no collation returning results after: - getting back a maximum of 7 corrections for misspelled terms - getting a max of 10,000 combinations of collations to extendedResults - test 3 collations against the index to ch

Re: Judging the MoreLikeThis results for relevancy

2018-02-08 Thread Alessandro Benedetti
Hi, I have personally been working a lot with the MoreLikeThis and I am close to contributing a refactor of that module (mostly to break up the monolithic giant facade class). First of all, the MoreLikeThis handler will return the original document (not scored) + the similar documents (scored). The

Opinions on ExtractingRequestHandler

2018-02-08 Thread Frederik Van Hoyweghen
Hey everyone, What are your experiences on making (in production) use of Solr's ExtractingRequestHandler? I've been reading some mixed remarks so I was wondering what your actual experiences with it are. Personally, I feel like setting up a separate service which is solely responsible for parsin

Facets OutOfMemoryException

2018-02-08 Thread LOPEZ-CORTES Mariano-ext
We are experiencing memory problems with facet filters (OutOfMemory, Java heap). If we disable facets, it works OK. Our infrastructure: 3 nodes Solr, 2048 MB RAM; 3 nodes Zookeeper, 1024 MB RAM. Size: 27 million documents. Any ideas? Thanks in advance!

Re: facet.method=uif not working in solr cloud?

2018-02-08 Thread Toke Eskildsen
On Fri, 2018-02-02 at 17:40 -0800, Wei wrote: > I tried to debug a bit and see that when executing on a cloud solr > server, although I put > facet.field=color&q=*:*&facet.method=uif&facet.mincount=1 in > the request url, at the point it reaches SimpleFacet inside > req.params it somehow has been r
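
For anyone reproducing this, the request parameters quoted above can be assembled and checked programmatically before comparing them with what the server receives. A minimal Python sketch using only the standard library: the base URL and the collection name `mycollection` are placeholders and not taken from the thread, and `build_select_url` is a hypothetical helper.

```python
from urllib.parse import urlencode, parse_qs

# Parameters exactly as quoted in the message above.
params = [
    ("facet.field", "color"),
    ("q", "*:*"),
    ("facet.method", "uif"),
    ("facet.mincount", "1"),
]


def build_select_url(base_url, params):
    """Return a /select URL with the given parameters percent-encoded."""
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical node and collection; replace with real values.
url = build_select_url("http://localhost:8983/solr/mycollection", params)
print(url)

# Decode the query string again to confirm what is actually being sent.
sent = parse_qs(url.split("?", 1)[1])
print(sent["facet.method"])
```

Decoding the URL on the client side gives a baseline for comparison with the parameters seen inside the request on the server, which is where the message reports the value had changed.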

Re: Relevancy Tuning For Solr With Apache Nutch 2.3

2018-02-08 Thread Charlie Hull
On 07/02/2018 21:59, Mukhopadhyay, Aratrika wrote: Hello , I am attempting to tune my results that I retrieve from solr to boost the importance of certain fields. The syntax of the query I am using is as follows : http://localhost:8983/solr/housegov_data/select?indent=on&q=QUERY&defTy