Re: WordDelimiterFilterFactory with Wildcards

2017-07-27 Thread Erick Erickson
bq: To me this seems like a design flaw. The Solr fieldtypes seem like they allow a developer to create types that should handle wildcards intelligently. Well, that's pretty impossible. WordDelimiter(Graph)FilterFactory is a case in point. It's designed to break up on uppercase/lowercase/numeric/n

Re: WordDelimiterFilterFactory with Wildcards

2017-07-27 Thread Webster Homer
It doesn't seem to matter what you do in the query analyzer; if the query has a wildcard, the analyzer isn't used. That is exactly the behavior I observed. The solution was to set preserveOriginal="1" and change the ETL process to not strip the dashes, letting the index analyzer do that. We have a lot of lega
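
A toy model of the behavior described above, as a hedged sketch rather than Solr's actual implementation. It assumes a word-delimiter step that splits on hyphens and also emits the concatenated form, and it assumes a wildcard term is compared to indexed tokens exactly as typed. The helper names `index_tokens` and `wildcard_match` are hypothetical.

```python
from fnmatch import fnmatchcase


def index_tokens(value, preserve_original):
    """Toy stand-in for index-time word-delimiter analysis of one value.

    Assumption for illustration: split on hyphens, emit each part plus the
    concatenation, and optionally keep the original string as a token.
    """
    parts = [p for p in value.split("-") if p]
    tokens = set(parts)
    tokens.add("".join(parts))
    if preserve_original:
        tokens.add(value)
    return tokens


def wildcard_match(pattern, tokens):
    # The wildcard term is NOT split or rewritten; it is matched as typed.
    return any(fnmatchcase(tok, pattern) for tok in tokens)


without_original = index_tokens("1234-12-1", preserve_original=False)
with_original = index_tokens("1234-12-1", preserve_original=True)

print(wildcard_match("1234-12*", without_original))  # False
print(wildcard_match("1234-12*", with_original))     # True
```

In this model the hyphenated wildcard only finds a match when the original, hyphenated token is kept in the index, which is the effect reported for preserveOriginal="1".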

Re: WordDelimiterFilterFactory with Wildcards

2017-07-27 Thread Saurabh Sethi
Webster, did you try escaping the special character (assuming you did not do what Shawn did by replacing - with some other text and your indexed tokens have -)? On Thu, Jul 27, 2017 at 12:03 PM, Webster Homer wrote: > Shawn, > Thank you for that. I didn't know about that feature of the WDF. It d

Re: WordDelimiterFilterFactory with Wildcards

2017-07-27 Thread Webster Homer
Shawn, Thank you for that. I didn't know about that feature of the WDF. It doesn't help my situation, but it's great to know about. Googling Solr wildcard searches, I found this link: http://lucene.472066.n3.nabble.com/Wildcard-search-not-working-with-search-term-having-special-characters-and-digits-t

Re: WordDelimiterFilterFactory with Wildcards

2017-07-26 Thread Erick Erickson
The Admin/Analysis page is useful here. It'll show you what each bit of your query analysis chain does and may well point you to the part of the chain that's the problem. Best, Erick On Wed, Jul 26, 2017 at 11:33 AM, Webster Homer wrote: > checked the Pattern Replace it's OK. Can't use the prese

Re: WordDelimiterFilterFactory with Wildcards

2017-07-26 Thread Webster Homer
Checked the Pattern Replace; it's OK. Can't use preserveOriginal since it preserves the hyphens too, which I don't want. It would be best if it didn't touch the * at all. On Wed, Jul 26, 2017 at 1:30 PM, Saurabh Sethi wrote: > My guess is PatternReplaceFilterFactory is most likely the cause.

Re: WordDelimiterFilterFactory with Wildcards

2017-07-26 Thread Saurabh Sethi
My guess is PatternReplaceFilterFactory is most likely the cause. Also, based on your query, you might want to set preserveOriginal=1. You can take one filter out at a time and see which one is altering the query. On Wed, Jul 26, 2017 at 11:13 AM, Webster Homer wrote: > 1. KeywordTokenizer - we

Re: WordDelimiterFilterFactory with Wildcards

2017-07-26 Thread Webster Homer
1. KeywordTokenizer - we want to treat the entire field as a single term to parse
2. preserveOriginal = "0". Thought about changing this to 1
3. 6.2.2
This is the fieldtype

Re: WordDelimiterFilterFactory with Wildcards

2017-07-26 Thread Saurabh Sethi
1. What tokenizer are you using?
2. Do you have preserveOriginal="1" flag set in your filter?
3. Which version of Solr are you using?
On Wed, Jul 26, 2017 at 10:48 AM, Webster Homer wrote: > I have several fieldtypes that use the WordDelimiterFilterFactory > > We have a fieldtype for cas numbers

WordDelimiterFilterFactory with Wildcards

2017-07-26 Thread Webster Homer
I have several fieldtypes that use the WordDelimiterFilterFactory. We have a fieldtype for CAS numbers, which look like 1234-12-1: numbers separated by hyphens. Users often leave out the hyphens and either use spaces or just string the numbers together. The WDF seemed like a great solution especia
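
The matching goal described in this post can be illustrated with a small sketch. This is not what the WDF does internally; it only shows, under the assumption that a CAS number is identified by its digits alone, how the three input styles mentioned (hyphens, spaces, or digits run together) could reduce to one key. The function name `normalize_cas` is hypothetical.

```python
import re


def normalize_cas(text):
    """Reduce a CAS-style number to its digits only (illustrative assumption)."""
    return re.sub(r"\D", "", text)


variants = ["1234-12-1", "1234 12 1", "1234121"]
keys = {normalize_cas(v) for v in variants}
print(keys)  # {'1234121'}
```

A wildcard query such as 1234-12* would still need separate handling, since the trailing * must survive whatever normalization is applied to the rest of the term.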