Re: Tokenizer question

2012-10-30 Thread Jack Krupansky
Maybe it would be simplest to use a PatternReplaceCharFilter to eliminate the ".jpg", and then use the StandardTokenizer, or use the white space tokenizer and the Word Delimiter Filter. -- Jack Krupansky -Original Message- From: RL Sent: Tuesday, October 30, 2012 3:57 AM To: solr-use
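
A rough illustration of the first suggestion, written in plain Python rather than as Solr configuration: remove the ".jpg" suffix from the raw text first, then tokenize what is left, in the same order a char filter runs ahead of a tokenizer. The regex, the function names, and the sample file name are assumptions made for illustration. The real PatternReplaceCharFilter is set up in the field type's analyzer chain, and StandardTokenizer follows word-break rules that this simple split does not reproduce.

```python
import re

# Hypothetical pattern: a trailing ".jpg", matched case-insensitively.
JPG_SUFFIX = re.compile(r"\.jpg$", re.IGNORECASE)

def strip_extension(text):
    """Stand-in for a char filter: rewrite the raw text before tokenizing."""
    return JPG_SUFFIX.sub("", text)

def tokenize(text):
    """Crude stand-in for a tokenizer: lowercase runs of letters and digits."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def analyze(text):
    """Char filter first, tokenizer second."""
    return tokenize(strip_extension(text))

print(analyze("Holiday_Photo-01.JPG"))
```

Because the suffix is removed before tokenizing, "jpg" never becomes a token of its own.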

Re: Tokenizer Question

2011-07-20 Thread Jamie Johnson
Thanks, I'll try that now, I'm assuming I need to add the position increment and offset attributes? On Wed, Jul 20, 2011 at 3:44 PM, Chris Hostetter wrote: > > When the QueryParser gives hunks of text to an analyzer, and that analyzer > produces multiple terms, the query parser has to decide how

Re: Tokenizer Question

2011-07-20 Thread Chris Hostetter
When the QueryParser gives hunks of text to an analyzer, and that analyzer produces multiple terms, the query parser has to decide how to build a query out of it. If the terms have identical position information, then it always builds an "OR" query (this is the typical synonym situation). If
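
The decision described here can be sketched in a few lines of Python. This is a toy model, not Lucene code: `build_query` and its (term, position increment) input are hypothetical names. The same-position branch follows the explanation above; the phrase branch for differing positions is an assumption based on the PhraseQuery reported in the 2010 thread further down. A mix of shared and distinct positions is collapsed to a plain phrase here for simplicity.

```python
def build_query(field, tokens):
    """tokens: list of (term, position_increment) pairs from an analyzer."""
    if not tokens:
        return None
    if len(tokens) == 1:
        return f"{field}:{tokens[0][0]}"
    # An increment of 0 means "same position as the previous token".
    same_position = all(inc == 0 for _, inc in tokens[1:])
    if same_position:
        # Stacked terms (the synonym case): any one of them may match.
        return "(" + " OR ".join(f"{field}:{term}" for term, _ in tokens) + ")"
    # Terms at different positions: treated as a phrase in this sketch.
    phrase = " ".join(term for term, _ in tokens)
    return f'{field}:"{phrase}"'

print(build_query("name", [("robert", 1), ("bob", 0)]))
print(build_query("HouseNumber", [("39", 1), ("43", 1)]))
```

A custom filter that wants the "OR" behavior would therefore need to emit its extra terms with a position increment of 0.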

Re: Tokenizer Question

2011-07-20 Thread Jamie Johnson
My use case really isn't names, I just used that as a simplification. I did look at the Synonym filter to see if I could implement a similar filter (if that was a more appropriate place to do so) but even after doing that I ended up with the same result. On Wed, Jul 20, 2011 at 12:07 PM, Kyle Lee

Re: Tokenizer Question

2011-07-20 Thread Kyle Lee
I'm not sure how to accomplish what you're asking, but have you considered using a synonyms file? This would also allow you to catch ostensibly unrelated name substitutes such as Robert -> Bob and Richard -> Dick. On Wed, Jul 20, 2011 at 10:57 AM, Jamie Johnson wrote: > I have a query which star

Re: Tokenizer question

2010-01-11 Thread rswart
Crystal clear. Thanks for your response and time!

Re: Tokenizer question

2010-01-11 Thread Avlesh Singh
> > If the analyzer produces multiple Tokens, but they all have the same > position then the QueryParser produces a BooleanQuery with all SHOULD > clauses. -- This is what allows simple synonyms to work. > You rock Hoss!!! This is exactly the explanation I was looking for .. it is as simple as it

Re: Tokenizer question

2010-01-11 Thread Chris Hostetter
: q=PostCode:(1078 pw)+AND+HouseNumber:(39-43) : : the resulting parsed query contains a phrase query: : : +(PostCode:1078 PostCode:pw) +PhraseQuery(HouseNumber:"39 43") This stems from some fairly fundamental behavior in the QueryParser ... each "chunk" of input that isn't deemed "markup (ie:
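
To make the "chunk" idea concrete, here is a small Python sketch that reproduces the shape of the parsed query quoted above. It assumes the parser splits the clause text on whitespace into chunks and that the field's analyzer splits each chunk on non-alphanumeric characters; the actual analyzer for these fields is not shown in the thread, and `parse_clause` is a hypothetical helper, not a Solr or Lucene function.

```python
import re

def parse_clause(field, text):
    """Turn the text inside field:(...) into a list of sub-queries."""
    parts = []
    for chunk in text.split():  # each whitespace-separated chunk is analyzed alone
        tokens = re.findall(r"[0-9A-Za-z]+", chunk.lower())
        if len(tokens) == 1:
            parts.append(f"{field}:{tokens[0]}")
        elif len(tokens) > 1:
            # One chunk that analyzes to several tokens becomes a phrase.
            phrase = " ".join(tokens)
            parts.append(f'PhraseQuery({field}:"{phrase}")')
    return parts

print(parse_clause("PostCode", "1078 pw"))
print(parse_clause("HouseNumber", "39-43"))
```

"1078 pw" is two chunks that each yield one term, while "39-43" is a single chunk that yields two tokens, which is why only the house number ends up as a phrase.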

Re: Tokenizer question

2010-01-11 Thread rswart
We are using the standard query parser (so no dismax). Fieldtype is solr.TextField with the following query analyzer:

Re: Tokenizer question

2010-01-11 Thread Grant Ingersoll
And also, what query parser are you using? On Jan 11, 2010, at 2:46 PM, Grant Ingersoll wrote: > What do your FieldTypes look like for the fields in question? > > On Jan 10, 2010, at 10:05 AM, rswart wrote: > >> >> Hi, >> >> This is probably an easy question. >> >> I am doing a simple query

Re: Tokenizer question

2010-01-11 Thread Grant Ingersoll
What do your FieldTypes look like for the fields in question? On Jan 10, 2010, at 10:05 AM, rswart wrote: > > Hi, > > This is probably an easy question. > > I am doing a simple query on postcode and house number. If the housenumber > contains a minus sign like: > > q=PostCode:(1078 pw)+AND+H