WordDelimiterFilter

On Friday 24 September 2010 02:42:52 Dennis Gearon wrote:
> WDF is not WTF(what I think when I see WDF), right ;-)
>
> What is WDF?
>
> Dennis Gearon
>
> Signature Warning
> ----------------
> EARTH has a Right To Life,
> otherwise we all die.
>
> Read 'Hot, Flat, and Crowded'
> Laugh at http://www.yert.com/film.php
>
> --- On Thu, 9/23/10, Markus Jelsma <markus.jel...@buyways.nl> wrote:
> > From: Markus Jelsma <markus.jel...@buyways.nl>
> > Subject: RE: Search a URL
> > To: solr-user@lucene.apache.org
> > Date: Thursday, September 23, 2010, 2:11 PM
> >
> > Try setting generateWordParts=1 in your WDF. Also, having a
> > WhitespaceTokenizer makes little sense for URL's, there should be
> > no whitespace in a URL, the StandardTokenizer can tokenize a URL.
> > Anyway, the problem is your WDF.
> >
> > -----Original message-----
> > From: Max Lynch <ihas...@gmail.com>
> > Sent: Thu 23-09-2010 23:00
> > To: solr-user@lucene.apache.org;
> > Subject: Search a URL
> >
> > Is there a tokenizer that will allow me to search for parts of a URL?
> > For example, the search "google" would match on the data
> > "http://mail.google.com/dlkjadf"
> >
> > This tokenizer factory doesn't seem to be sufficient:
> >
> > <fieldType name="text_standard" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory"
> >             generateWordParts="0" generateNumberParts="1"
> >             catenateWords="1" catenateNumbers="1" catenateAll="0"
> >             splitOnCaseChange="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory"
> >             language="English" protected="protwords.txt"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory"
> >             generateWordParts="0" generateNumberParts="1"
> >             catenateWords="1" catenateNumbers="1" catenateAll="0"
> >             splitOnCaseChange="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory"
> >             language="English" protected="protwords.txt"/>
> >   </analyzer>
> > </fieldType>
> >
> > Thanks.
>
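
For reference, a minimal sketch of the WordDelimiterFilter change suggested in the quoted reply. Only generateWordParts differs from the original config; every other attribute is copied unchanged. This assumes the same filter line is used in both the index and the query analyzer, and it has not been tested against the data in question. The tokenizer swap mentioned in the reply is left out here to keep the change to a single attribute.

```xml
<!-- Sketch only: generateWordParts flipped from "0" to "1" so the filter
     also emits the individual word parts of a delimited token.
     All other attributes are taken as-is from the original fieldType. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="0"
        splitOnCaseChange="1"/>
```

With word parts enabled, a single whitespace-delimited token such as http://mail.google.com/dlkjadf should be split at the punctuation into parts like mail, google and com, which is what lets a query for "google" find it. A reindex is needed after changing the index-time analyzer.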
Markus Jelsma - Technisch Architect - Buyways BV http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350