Re: Replication and soft commits for NRT searches

2015-10-14 Thread MOIS Martin (MORPHO)
Hello, the background for my question is that one of the requirements for our injection tool is that it should report that a new document has been successfully enrolled to the cluster only if it is available on all replicas. The automated integration test for this feature will submit a document

Re: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-14 Thread Zheng Lin Edwin Yeo
Hi Adrian, The main Solr is able to start up, but the cores cannot be loaded. I can get to the admin page once in a while (not always the case), and if I can get there, it says there are no cores available. Otherwise, the browser will just say "This page can't be displayed". If I use the URL to

Autostart Zookeeper and Solr using scripting

2015-10-14 Thread Adrian Liew
Hi, I am trying to implement some scripting to detect if all Zookeepers have started in a cluster, then restart the solr servers. Has anyone achieved this yet through scripting? I also saw there is the ZookeeperClient that is available in .NET via a nuget package. Not sure if this could be als
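One possible shape for such a check, as a minimal sketch rather than a tested deployment script: it sends ZooKeeper's `ruok` four-letter command to each server and waits until all of them answer `imok`. The function names, the port default, and the retry settings are assumptions made for illustration. Note that `ruok` only shows that a server process is running, not that the ensemble has formed a quorum, and that newer ZooKeeper releases may require four-letter commands to be explicitly whitelisted.

```python
import socket
import time


def zk_is_ok(host, port=2181, timeout=2.0):
    """Send ZooKeeper's 'ruok' command; a running server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            return conn.recv(16) == b"imok"
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def wait_for_ensemble(hosts, probe=zk_is_ok, attempts=30, delay=2.0):
    """Return True once every host passes the probe, False after `attempts` tries."""
    for _ in range(attempts):
        if all(probe(host) for host in hosts):
            return True
        time.sleep(delay)
    return False
```

A wrapper script could call `wait_for_ensemble(["zk1", "zk2", "zk3"])` (hypothetical host names) and restart the Solr services only when it returns True.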

RE: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-14 Thread Adrian Liew
Hi Edwin, Solr 5.3.0 seems to be working for me using NSSM. I am operating on a Windows Server 2012. I did put start -f -p 8983. Are you getting errors? Is Solr not starting up? Best regards, Adrian -Original Message- From: Zheng Lin Edwin Yeo [mailto:edwinye...@gmail.com] Sent: Thu

Re: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-14 Thread Zheng Lin Edwin Yeo
Hi Anders, Yes, I did put the -f param for running it in foreground. I put start -f -p 8983 in the Arguments parameters in NSSM service installer. Is that the correct place to put for Solr 5.3.0? I did the same way for Solr 5.1 and it was working then. I'm using Windows 8.1. Regards, Edwin On

RE: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-14 Thread Adrian Liew
Hi, I am trying to implement some scripting to detect if all Zookeepers have started in a cluster, then restart the solr servers. Has anyone achieved this yet through scripting? I also saw there is the ZookeeperClient that is available in .NET via a nuget package. Not sure if this could be als

Re: partial search EdgeNGramFilterFactory

2015-10-14 Thread Brian Narsi
Thank you Erick. Yes it was the default search field. So for the following SellerName: 1) cardinal healthcare products 2) cardinal healthcare 3) postoperative cardinal healthcare 4) surgical cardinal products My requirement is: q=SellerName:cardinal - all 4 records returned q=SellerName:healthca

Re: Can I use tokenizer twice ?

2015-10-14 Thread vitaly bulgakov
Steve, /You could achieve what you want by copying to another field and defining a separate analyzer for each. One would create shingles, and the other edge ngrams. / Could you please elaborate this. I am not sure I understand how to do it by using copyField. -- View this message in context

Re: Grouping facets: Possible to get facet results for each Group?

2015-10-14 Thread Peter Sturge
Yes, you are right about that - I've used pivots before and they do need to be used judiciously. Fortunately, we only ever use single-value fields, as it gives some good advantages in a heavily sharded environment. Our document structure is, by its very nature, always flat, so it could be an impedi

Re: Can I use tokenizer twice ?

2015-10-14 Thread Steve Rowe
Hi, Analyzers must have exactly one tokenizer, no more and no less. You could achieve what you want by copying to another field and defining a separate analyzer for each. One would create shingles, and the other edge ngrams. Steve > On Oct 14, 2015, at 11:58 AM, vit wrote: > > I have Sol
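To picture the two-field idea, here is a rough sketch in plain Python. It is not Solr code and not Steve's configuration; it only imitates what two separate analysis chains would emit for the same input, one producing shingles and the other producing edge n-grams per word. The function names, the shingle size, and the gram limits are assumptions made for illustration.

```python
def shingles(tokens, size=2):
    """Join each run of `size` adjacent tokens into one shingle."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def edge_ngrams(token, min_gram=2, max_gram=10):
    """Return the leading substrings of a token, from min_gram to max_gram chars."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze_shingle_field(text):
    # Chain for the original field: whitespace tokenize, then shingle.
    return shingles(text.split())


def analyze_prefix_field(text):
    # Chain for the copied field: whitespace tokenize, then edge n-gram each word.
    return [gram for word in text.split() for gram in edge_ngrams(word)]
```

Each chain starts from its own tokenizer, which is why the two behaviours live in two fields instead of one analyzer with two tokenizers.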

Re: AutoComplete Feature in Solr

2015-10-14 Thread Salman Ansari
Actually what you mentioned Alessandro is something interesting for me. I am looking to boost the ranking of some suggestions based on some dynamic criteria (let's say how frequent they are used). Do I need to update the boost field each time I request the suggestion (to capture the frequency)? If

Re: slow queries

2015-10-14 Thread Pushkar Raste
You may want to start solr with following settings to enable logging GC details. Here are some flags you might want to enable. -Xloggc:/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintHeapAtGC Onc

Re: slow queries

2015-10-14 Thread Erick Erickson
bq: They are definitely cached. The second time runs in no time. That's not what I was referring to. Submitting the same query over will certainly hit the queryResultCache and return in almost no time. What I meant was do things like vary the fq clause you have where you've set cache=false. Or var

Re: partial search EdgeNGramFilterFactory

2015-10-14 Thread Erick Erickson
try adding &debug=true to your query. The query q=SellerName:cardinal he actually parses as q=SellerName:cardinal defaultSearchField:he so I suspect you're getting on the default search field. I'm not sure EdgeNGram is what you want here though. That only grams individual tokens, so CARDINAL is g
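As an illustration of "only grams individual tokens", the sketch below mimics edge n-gramming applied after whitespace tokenization. It is plain Python, not Solr's implementation; the lowercasing step and the gram size limits are assumptions chosen for the example.

```python
def edge_ngrams_per_token(text, min_gram=2, max_gram=15):
    """Collect the leading substrings of each whitespace-separated token."""
    grams = set()
    for token in text.lower().split():
        upper = min(max_gram, len(token))
        for n in range(min_gram, upper + 1):
            grams.add(token[:n])
    return grams


indexed = edge_ngrams_per_token("CARDINAL HEALTH")
# Prefixes of single words are present in the set...
has_word_prefixes = "card" in indexed and "he" in indexed
# ...but no gram ever spans the space between two tokens.
has_cross_token_gram = "cardinal he" in indexed
```

Under these assumptions a prefix such as "card" or "he" is found, while a prefix that crosses the word boundary, such as "cardinal he", is never indexed as one term.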

partial search EdgeNGramFilterFactory

2015-10-14 Thread Brian Narsi
I have the following fieldtype in my schema: and the following field: With the following data: SellerName:CARDINAL HEALTH When I do the following search q:SellerName:cardinal I get back the results with SellerName: CARDINAL HEALTH (correct) or I do the search q:SellerName:

Re: slow queries

2015-10-14 Thread Lorenzo Fundaró
On 14 October 2015 at 18:18, Pushkar Raste wrote: > Consider > 1. Turning on docValues for fields you are sorting, faceting on. This will > require to reindex your data > Yes. I am considering doing this. > 2. Try using TrieInt type field you are trying to do range search on (you > may have to

Re: slow queries

2015-10-14 Thread Lorenzo Fundaró
<> "debug": { "rawquerystring": "*:*", "querystring": "*:*", "parsedquery": "(+MatchAllDocsQuery(*:*))/no_coord", "parsedquery_toString": "+*:*", " explain": { "Product:47047358": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:3223": "\n1.0 = (MATCH) MatchAllDoc

Re: slow queries

2015-10-14 Thread Pushkar Raste
Consider 1. Turning on docValues for fields you are sorting, faceting on. This will require reindexing your data 2. Try using TrieInt type field you are trying to do range search on (you may have to fiddle with precisionStep) to balance index size vs performance. 3. If slowness is intermittent - tu

Re: are there any SolrCloud supervisors?

2015-10-14 Thread Jeff Wartes
I’m aware of two public administration tools: This was announced to the list just recently: https://github.com/bloomreach/solrcloud-haft And I’ve been working on this: https://github.com/whitepages/solrcloud_manager Both of these hook the Solrcloud client’s ZK access to inspect the cluster state

Can I use tokenizer twice ?

2015-10-14 Thread vit
I have Solr 4.2 I need to do the following: 1. white space tokenize 2. create shingles 3. use EdgeNGramFilter for each word in shingles, but not in a shingle as a string So can I do this? * * * * -- View this message in context: http://lucene.472066.n3.nabble.com/Can-I-use-tokenizer-twice

Re: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-14 Thread Anders Thulin
Did you add the f param for running it in foreground? I noticed that the Solr service was restarted indefinitely when running it as a background service. It's also needed to stop the windows service. This test worked well here (on Windows 2012): REM Test for running solr 5.3.1 as a windows service

Re: Replication and soft commits for NRT searches

2015-10-14 Thread Erick Erickson
bq: If a timeout between shard leader and replica can lead to a smaller rf value (because replication has timed out), is it possible to increase this timeout in the configuration? Why do you care? If it timed out, then the follower will no longer be active and will not serve queries. The Cloud vie

Bioinformatics search event in Cambridge UK Feb 3rd & 4th 2016

2015-10-14 Thread Charlie Hull
Hi all, We're helping to run an event in Cambridge UK next year which will be an open workshop on search for bioinformatics: http://www.ebi.ac.uk/pdbe/about/events/open-source-search-bioinformatics Do please spread the word to anyone working with biological data and open source search! It's li

Re: slow queries

2015-10-14 Thread Susheel Kumar
Hi Lorenzo, Can you provide which solr version you are using, index size on disks & hardware config (memory/processor on each machine). Thanks, Susheel On Wed, Oct 14, 2015 at 6:03 AM, Lorenzo Fundaró < lorenzo.fund...@dawandamail.com> wrote: > Hello, > > I have following conf for filters and co

RE: How to formulate query

2015-10-14 Thread Prasanna S. Dhakephalkar
Hi Susheel, Mikhail, Erick, Thanks for replies. I need to learn more. Regards, Prasanna. -Original Message- From: Susheel Kumar [mailto:susheel2...@gmail.com] Sent: Tuesday, October 13, 2015 12:54 AM To: solr-user@lucene.apache.org Subject: Re: How to formulate query Hi Prassana, Thi

Re: slow queries

2015-10-14 Thread Erick Erickson
A couple of things don't particularly make sense here: You specify edismax, q=*:* yet you specify qf= You're searching across whatever you defined as the default field in the request handler. What do you see if you attach &debug=true to the query? I think this clause is wrong: (cents_ri: [* 3000]

slow queries

2015-10-14 Thread Lorenzo Fundaró
Hello, I have following conf for filters and commits : Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57, acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8, regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd) ${solr.autoCommit.max

Re: Using SimpleNaiveBayesClassifier in solr

2015-10-14 Thread Alessandro Benedetti
ahahah absolutely not, you don't sound dumb. You need only a basic knowledge of how Lucene manages IndexReaders and IndexSearchers. On 14 October 2015 at 09:08, Yewint Ko wrote: > Thank Ales and Tommaso for your replies > > So, is it like the classifier query the whole index db and load onto mem

Re: AutoComplete Feature in Solr

2015-10-14 Thread Alessandro Benedetti
Using the suggester feature you can in some cases rank the suggestions based on an additional numeric field. It's not your use case, you actually want to use a search handler with a well defined schema that will allow you for example to query on an edge ngram token filtered field, applying a geo dis

Re: Grouping facets: Possible to get facet results for each Group?

2015-10-14 Thread Alessandro Benedetti
mmm let's say that nested facets are a subset of Pivot Facets. If pivot faceting works with the classic flat document structure, the sub facets work with any nested structure. So be careful about pivot faceting in a flat document with multi valued fields, because you lose the relation across

Re: Solr Pagination

2015-10-14 Thread Jan Høydahl
I have not benchmarked various numbers of segments at different sizes on different HW etc, so my hunch could very well be wrong for Salman’s case. I don’t know how frequent the updates to his data are either. Have you done #segments benchmarking for your huge datasets? -- Jan Høydahl, search solu

Re: Using SimpleNaiveBayesClassifier in solr

2015-10-14 Thread Yewint Ko
Thank Ales and Tommaso for your replies So, is it like the classifier query the whole index db and load onto memory first before running tokenizer against InputDocument? It sounds like if I don't close the classifier and my index is big, i might need bigger machine. Anyway to reverse the order? D

Re: catchall fields or multiple fields

2015-10-14 Thread elisabeth benoit
Thanks for your suggestion Jack. In fact we're doing geographic search (fields are country, state, county, town, hamlet, district) So it's difficult to split. Best regards, Elisabeth 2015-10-13 16:01 GMT+02:00 Jack Krupansky : > Performing a sequence of queries can help too. For example, if