Re: New comer - Benoit Vanalderweireldt

2016-02-25 Thread Erick Erickson
There are also ways to help other than coding: documentation, Javadocs, writing test cases (look at the code coverage reports and pick something not already covered). Review or comment on patches, work on the new AngularJS UI. Try installing Solr and note any ambiguous docs and suggest better on

Re: importing 4.10.2 solr cloud repository to 5.4.1

2016-02-25 Thread Neeraj Bhatt
Thanks Shawan. My index meets the atomic update requirements, so I want to use DIH because of its convenience. I am in a SolrCloud with 5 shards (with a separate ZooKeeper ensemble), so will I have to put 5 entity tags so that I can give 5 different URLs, one for each shard? Thanks, Neeraj On Wed, Feb 24, 2016 a

Re: New comer - Benoit Vanalderweireldt

2016-02-25 Thread Shawn Heisey
On 2/25/2016 4:34 PM, Benoit Vanalderweireldt wrote: > I have just joined this mailing list, I would love to contribute to Apache > SOLR (I am a certified Java developer OCA and OCP) > > Can someone guide me and assign me a first task on Jira (my username is : > b.vanalderweireldt) ? Thanks for

Re: New comer - Benoit Vanalderweireldt

2016-02-25 Thread Jan Høydahl
Hi Welcome and thanks for volunteering. Here’s our guide for newcomers: http://wiki.apache.org/solr/HowToContribute Also see the web site http://lucene.apache.org/solr/resources.html#community After reading this, please feel welcome to come back to ask further questions! When it comes to assigni

Re: Query time de-boost

2016-02-25 Thread Jack Krupansky
0.1 is a fractional boost - all intra-query boosts are multiplicative, not additive, so term^0.1 reduces the term by 90%. -- Jack Krupansky On Wed, Feb 24, 2016 at 11:29 AM, shamik wrote: > Binoy, 0.1 is still a positive boost. With title getting the highest > weight, > this won't make any diff
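
A rough numeric illustration of the multiplicative point above, in Python. The raw score of 2.0 and the helper name `apply_boost` are made up for this sketch; real Lucene scoring involves more factors (normalization, similarity), so treat this as arithmetic only.

```python
def apply_boost(raw_score: float, boost: float) -> float:
    """Scale a clause score by its boost (multiplicative, not additive)."""
    return raw_score * boost


raw_score = 2.0  # hypothetical unboosted score for the term
boosted = apply_boost(raw_score, 0.1)

# Fraction by which the contribution shrank: 1 - boost
reduction = 1 - boosted / raw_score

print(f"boosted score: {boosted:.2f}, reduction: {reduction:.0%}")
```

Under this simplified model, a boost below 1 shrinks the clause's contribution but never makes it negative, which is why 0.1 is still a positive boost.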

Re: Query time de-boost

2016-02-25 Thread Walter Underwood
Another approach is to boost everything but that content. This bq should work: *:* -ContentGroup:"Developer's Documentation" Or a function query in the boost parameter, with an if statement. Or make ContentGroup an enum with different values for each group, and use a function query to boost by

New comer - Benoit Vanalderweireldt

2016-02-25 Thread Benoit Vanalderweireldt
Dear solr users community, I have just joined this mailing list, I would love to contribute to Apache SOLR (I am a certified Java developer OCA and OCP) Can someone guide me and assign me a first task on Jira (my username is : b.vanalderweireldt) ? Cheers Benoit

Re: Query time de-boost

2016-02-25 Thread Binoy Dalal
According to the edismax documentation, negative boosts are supported, so you should certainly give it a try. https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser On Fri, 26 Feb 2016, 03:45 shamik wrote: > Emir, I don't think Solr supports a negative boosting *^-99* syntax

Re: Why QueryWeight with Custom Similarity

2016-02-25 Thread Markus, Sascha
Sorry, clicked send too early :-) With the above additional code the calculation is done by the default similarity and the behaviour is as expected. I think this is an issue of the implementation but I didn't find one in Jira. Should I create one? Cheers Sascha On Thu, Feb 25, 2016 at 11:12 PM,

Re: Query time de-boost

2016-02-25 Thread shamik
Emir, I don't think Solr supports a negative boosting *^-99* syntax like this. I can certainly do something like: bq=(*:* -ContetGroup:"Developer's Documentation")^99, but then I can't have my other bq parameters. This doesn't work --> bq=Source:simplecontent^10 Source:Help^20 (*:* -ContetGroup:"Devel
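
One possible way around the single-bq limitation described above, sketched in Python: send each boost clause as its own bq parameter instead of packing them into one. This assumes the (e)dismax parser in use accepts repeated bq parameters. The field name ContentGroup follows the spelling used elsewhere in this thread, the boost values are the ones quoted above, and the q value is only a placeholder.

```python
from urllib.parse import urlencode, parse_qs

# Each clause becomes a separate bq parameter (assumed to be supported).
boost_clauses = [
    "Source:simplecontent^10",
    "Source:Help^20",
    '(*:* -ContentGroup:"Developer\'s Documentation")^99',
]

params = [("defType", "edismax"), ("q", "solr")]  # q is a placeholder query
params += [("bq", clause) for clause in boost_clauses]

# URL-encode the parameters; repeated keys are kept as repeated bq=... pairs.
query_string = urlencode(params)

# Round-trip check: decoding yields the original clauses in order.
decoded = parse_qs(query_string)

print(query_string)
```

The resulting string can be appended to a select URL; whether the combined boosts rank documents as intended still has to be checked against real data.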

Re: Why QueryWeight with Custom Similarity

2016-02-25 Thread Markus, Sascha
Hi, I finally found the source of the problem I'm having with the custom similarity. The setting: - Solr 5.4.1 - the SpecialSimilarity extends ClassicSimilarity - for one field this similarity is configured. Everything else uses ClassicSimilarity because of Result: - most calculation is done by

Re: Stopping Solr JVM on OOM

2016-02-25 Thread CP Mishra
Solr & Lucene dev folks must be catching Throwable for a reason. Anyway, I am asking for solutions that I can use. On Thu, Feb 25, 2016 at 3:06 PM, Fuad Efendi wrote: > The best practice: do not ever try to catch Throwable or its descendants > Error, VirtualMachineError, OutOfMemoryError, and et

Re: Stopping Solr JVM on OOM

2016-02-25 Thread Fuad Efendi
The best practice: do not ever try to catch Throwable or its descendants Error, VirtualMachineError, OutOfMemoryError, etc. Never ever. Also, do not swallow InterruptedException in a loop. A few simple rules to avoid a hanging application. If we follow these, there will be no question "what i

Stopping Solr JVM on OOM

2016-02-25 Thread CP Mishra
Looking at the previous threads (and in our tests), the OOM script specified on the command line does not work because the OOM exception is trapped and converted to a RuntimeException. So, what is the best way to stop Solr when it gets into an OOM state? The only way I see is to override multiple handlers and do System.e

Storage of internal value

2016-02-25 Thread Jens Ivar Jørdre
Hi all I am looking for ways of having the functionality of https://issues.apache.org/jira/browse/SOLR-1997 on Solr 5.X. Is there an alternate way to achieve this rather than creating the field type suggested by SOLR-1997? If not possible would

Timeout connecting to index replication master causing slave core failure (0 documents).

2016-02-25 Thread Russell McOrmond
We are running "5.4.0 1718046 - upayavira - 2015-12-04 23:16:46" on a series of index replication slaves of a single master. The master is behind a VPN connection to a slower network. There are times when that network might have timeouts, and we need our applications to be robust against that type

Re: Solr 6.0

2016-02-25 Thread Renaud Delbru
Hi Shawn, On 25/02/16 14:07, Shawn Heisey wrote: The CDCR functionality is currently present in the master branch, but I do not know for sure whether it will be included in the 6.0 release. I am not involved with that feature and have no idea how stable the code is. CDCR is stable and is runnin

Re: Solr 6.0

2016-02-25 Thread Yonik Seeley
On Thu, Feb 25, 2016 at 9:07 AM, Shawn Heisey wrote: > http://yonik.com/solr-6/ For those of you in the NYC area, I'm giving a talk soon on Solr 6 (and depending on the timing, "Preview" could turn into "Overview" :-) NYC Apache Lucene/Solr Meetup Solr 6 Feature Preview Wednesday, March 9, 2016 6

Suggester in SOLR 5.4.2

2016-02-25 Thread jori.gielis
Hi all, In our setup we are using SOLR for regular search and suggestions. For the auto complete function we are using SuggestComponent. In our search index file it is possible to have titles with the same name. This works as expected for search because every title has a different subtitle and is

Re: Solr 6.0

2016-02-25 Thread Shawn Heisey
On 2/25/2016 6:32 AM, Steven White wrote: > Where can I learn more about the upcoming Solr 6.0? I understand the > release date cannot be known, but I hope the features and how it differs > from 5.x are known. Yonik (creator of Solr) has a blog post about features in the upcoming version: http:

Re: (Solr 5.5) How do beginners modify dynamic schema now that it is default?

2016-02-25 Thread Jan Høydahl
Good point. We should definitely aim for GUI support for adding field types. Perhaps also support a text-field where people can copy-paste a Schema JSON command, e.g. “add-field-type”, that will be processed just as if it was POST’ed. A cool extra feature would be to detect whether people paste

Re: both way synonyms with ManagedSynonymFilterFactory

2016-02-25 Thread Jan Høydahl
Created https://issues.apache.org/jira/browse/SOLR-8737 to handle this -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 22. feb. 2016 kl. 11.21 skrev Jan Høydahl : > > Hi > > Did you get any Further with this? > I reproduced your situation with Solr 5.5. > > Think t

Solr 6.0

2016-02-25 Thread Steven White
Hi, Where can I learn more about the upcoming Solr 6.0? I understand the release date cannot be known, but I hope the features and how it differs from 5.x are known. Thank you Steve

faceting on correlated multi-valued fields?

2016-02-25 Thread Andreas Hubold
Hi, I'm thinking about indexing articles with tags in a denormalized way as follows multiValued="true"/> stored="false" multiValued="true"/> An article can have multiple tags. Each tag has a description and an ID. The multi-valued fields tagIds and tagDescriptions have the same length a

Re: WhitespaceTokenizerFactory and PathHierarchyTokenizerFactory

2016-02-25 Thread Anil
Hi, the search can be any free text, IP address, or path, and special characters should not be treated as text delimiters. 10.20 must return 10.20.30.112 /var/log must return /var/log/bigdata Please let me know if you need any additional details. Thanks. Regards, Anil On 25 February 2016 at 18:20
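
To make the two examples above concrete, here is a small Python sketch of prefix-style tokens, similar in spirit to what a path-hierarchy tokenizer emits for a delimited value. The function name `hierarchy_prefixes` is hypothetical and this is not Lucene's implementation; it only shows why indexing such prefixes would let 10.20 match 10.20.30.112 and /var/log match /var/log/bigdata.

```python
def hierarchy_prefixes(value: str, delimiter: str) -> list[str]:
    """Return every delimiter-aligned prefix of value, shortest first."""
    parts = value.split(delimiter)
    prefixes = []
    for end in range(1, len(parts) + 1):
        prefix = delimiter.join(parts[:end])
        if prefix:  # skip the empty prefix produced by a leading delimiter
            prefixes.append(prefix)
    return prefixes


ip_tokens = hierarchy_prefixes("10.20.30.112", ".")
path_tokens = hierarchy_prefixes("/var/log/bigdata", "/")

print(ip_tokens)
print(path_tokens)
```

A query term would match when it equals one of the indexed prefixes. How this combines with free-text search on the same field is a separate question that the sketch does not address.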

Re: WhitespaceTokenizerFactory and PathHierarchyTokenizerFactory

2016-02-25 Thread Jack Krupansky
You still haven't stated exactly what your query requirements are. In Solr you should always start with an analysis of how people will expect to query the data and then work backwards to how to store and index the data to achieve the desired queries. Note that the standard tokenizer will tokenize

multiple sources or mysql imports

2016-02-25 Thread John Blythe
hi all, i'm currently populating my documents via a mysql query. it occurred to me that i have another source of similar data that would be helpful to use that resides in the same database, but in another table. the two tables share nothing relationally so there's no joining that can occur that i

Re: Query time de-boost

2016-02-25 Thread Emir Arnautovic
Hi Shamik, You are right, boosting with values that are lower than 1 is still positive, but you can boost with a negative value and that should do the trick, so you can do bq=ContenGroup-local:Developer^-99 (note that it can result in a negative score). If you need more than just Developer/Others you

Re: What search metrics are useful?

2016-02-25 Thread Charlie Hull
On 24/02/2016 16:20, Walter Underwood wrote: Click through rate (CTR) is fundamental. That is easy to understand and integrates well with other business metrics like conversion. CTR is at least one click anywhere in the result set (first page, second page, …). Count multiple clicks as a single su
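
The CTR definition quoted above can be sketched in a few lines of Python. The input shape (one click count per search) and the function name `click_through_rate` are assumptions made for this illustration, not something specified in the thread.

```python
def click_through_rate(clicks_per_search: list[int]) -> float:
    """Share of searches with at least one click on any result page.

    Multiple clicks within one search count as a single success.
    """
    if not clicks_per_search:
        return 0.0
    successes = sum(1 for clicks in clicks_per_search if clicks > 0)
    return successes / len(clicks_per_search)


# Hypothetical log: four searches with 0, 1, 3 and 0 clicks respectively.
sample_log = [0, 1, 3, 0]
ctr = click_through_rate(sample_log)

print(f"CTR: {ctr:.0%}")
```

Counting a search with three clicks the same as a search with one keeps the metric a simple success rate per search rather than a click volume.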