Re: Wildcard search makes no sense!!

2014-10-02 Thread waynemailinglist
OK, I think I understand your points there. Just to clarify: say the term was "Large increased" and my filters went something like:
Large|increased
Large|increase|increased
large|increase|increased
Would the final tokens indexed be large|increase|increased? Once again, thanks for all the help.

Re: Wildcard search makes no sense!!

2014-10-02 Thread Erick Erickson
Right, prior to 3.6, the standard way to handle wildcards was, essentially, to pre-analyze the terms that had wildcards. This works fine for simple filters, such as lowercasing, but doesn't work so well for things like stemming. So you're doing what can be done at this point, but

Re: Wildcard search makes no sense!!

2014-10-02 Thread Shawn Heisey
On 10/2/2014 4:33 AM, waynemailinglist wrote:
> Something that is still not clear in my mind is how this tokenising works.
> For example with the filters I have when I run the analyser I get:
> Field: Hello You
>
> Hello|You
> Hello|You
> Hello|You
> hello|you
> hello|you
>
>
> Does this mean th

Re: Wildcard search makes no sense!!

2014-10-02 Thread waynemailinglist
Many, many thanks for the replies - they helped me start to understand how this works. I'm using 3.5, so this explains a lot. What I have done is: if the query contains a *, I make the query lowercase before sending it to Solr. This seems to have solved this issue given your explanation
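
A minimal sketch of the client-side workaround described above, assuming the query is a plain string built by the application before it is sent to Solr. The function name prepare_query is hypothetical and not part of any Solr client library; it only illustrates the "lowercase if the query contains a *" rule.

```python
def prepare_query(q: str) -> str:
    """Lowercase the query only when it contains a wildcard.

    Assumption: the index-time analysis chain lowercases tokens, and the
    Solr version in use (prior to 3.6, per the thread) does not apply the
    lowercase filter to wildcard terms itself.
    """
    return q.lower() if "*" in q else q


if __name__ == "__main__":
    print(prepare_query("Capit*"))          # wildcard present: lowercased
    print(prepare_query("Capital Health"))  # no wildcard: left unchanged
```

Note that this only compensates for lowercasing. It does not help with filters such as stemming, which change the indexed token itself.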

Re: Wildcard search makes no sense!!

2014-10-01 Thread Erick Erickson
Two things:
1> What version of Solr are you using? If it's prior to 3.6, then the bits that handle applying lowercaseFilter to wildcards aren't in the code.
2> What do you see if you add &debug=query? I just tried it with your analysis chain and it seemed to work. Did you completely blow your ind

Re: Wildcard search makes no sense!!

2014-10-01 Thread Alexandre Rafalovitch
If you use "*", you use the multiterm analysis path, which is semi-hidden and is a lot more limited than what is done with normal tokens: https://wiki.apache.org/solr/MultitermQueryAnalysis The analyzer components that are NOT multiterm aware cannot be used that way. Looking at: http://www.solr-start.

Re: Wildcard search makes no sense!!

2014-10-01 Thread waynemailinglist
I'm still stuck on this actually. I would really appreciate any pointers. If I search for:
query 1: Κώστας
result: Κώστας
query 2: Κώστα*
result:
I've looked at the analyser but I don't really understand what I'm looking at, if I'm honest. It gives the output:
Field (name): title
Field value: Κ

Re: Wildcard search makes no sense!!

2014-10-01 Thread waynemailinglist
Ahmet - many thanks - I removed the EnglishPorterFilterFactory and reindexed, and this seems to behave as expected now. Jack - thanks as well - I'm very much a noob with this, and that's a great tip.

Re: Wildcard search makes no sense!!

2014-10-01 Thread Jack Krupansky
The presence of a wildcard in a query term short circuits some portions of the analysis process. Some token filters like lower case can still be performed on the query terms, but others, like stemming, cannot. So, either simplify the analysis (be more selective of what token filters you use), or

Re: Wildcard search makes no sense!!

2014-10-01 Thread Toke Eskildsen
On Wed, 2014-10-01 at 13:16 +0200, Wayne W wrote:
> query 2: capit*
> result: Capital Health
>
> query 3: capita*
> result:
You are likely using a stemmer for the field: "Capital Health" gets indexed as "capit" and "health", so there are no tokens starting with "capita". Turn off the stemmer or
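
The effect described above can be sketched in a few lines of Python. This is only an illustration, not how Solr is implemented: TOY_STEMS is a hypothetical stand-in for a real stemmer and contains just the one mapping reported in this thread ("capital" to "capit"), and index_tokens and prefix_match are made-up helper names.

```python
# Toy stand-in for a stemmer: NOT the real Porter algorithm, only the single
# mapping reported in the thread ("Capital" is indexed as "capit").
TOY_STEMS = {"capital": "capit"}


def index_tokens(text: str, stem: bool = True) -> list[str]:
    """Split on whitespace, lowercase, and optionally apply the toy stemmer."""
    tokens = [t.lower() for t in text.split()]
    if stem:
        tokens = [TOY_STEMS.get(t, t) for t in tokens]
    return tokens


def prefix_match(query: str, tokens: list[str]) -> bool:
    """Compare a trailing-wildcard query against the indexed tokens as-is.

    The query term is not stemmed, mirroring the point made in the thread
    that stemming cannot be applied to wildcard terms.
    """
    prefix = query.rstrip("*")
    return any(t.startswith(prefix) for t in tokens)


if __name__ == "__main__":
    stemmed = index_tokens("Capital Health")
    unstemmed = index_tokens("Capital Health", stem=False)
    print(stemmed, prefix_match("capit*", stemmed), prefix_match("capita*", stemmed))
    print(unstemmed, prefix_match("capita*", unstemmed))
```

With the toy stemmer on, "capit*" finds a token but "capita*" does not, because no indexed token is longer than "capit". With stemming off, "capita*" matches "capital".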

Re: Wildcard search makes no sense!!

2014-10-01 Thread Ahmet Arslan
Hi,
You probably have a stemmer, and it is reducing "Capital" to "capit". That's the reason. Either remove the stemmer from the analyser chain or add a keyword repeat filter.
Ahmet
On Wednesday, October 1, 2014 2:16 PM, Wayne W wrote: Hi, I don't understand this at all. We are indexing some contact names.