Re: Search a URL

2010-09-24 Thread Markus Jelsma

RE: Search a URL

2010-09-23 Thread Dennis Gearon

RE: Search a URL

2010-09-23 Thread Markus Jelsma
Try setting generateWordParts=1 in your WDF (WordDelimiterFilter). Also, a WhitespaceTokenizer makes little sense for URLs: a URL should contain no whitespace, and the StandardTokenizer can tokenize a URL. Anyway, the problem is your WDF.

-Original message-
From: Max Lynch
Sent: Thu 23-09-2010 23:00
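
To illustrate why generateWordParts matters here, the following is a rough Python sketch of the effect, not Solr's actual WordDelimiterFilter. It assumes whitespace tokenization (so the URL arrives as one token), treats every non-letter character as a delimiter, and ignores the filter's other options such as case-change splitting and number parts. The function name `analyze` and the example URL are hypothetical.

```python
import re


def analyze(text, generate_word_parts):
    """Approximate an index-time analysis chain for a URL field.

    Assumption: whitespace tokenization followed by a simplified
    word-delimiter step. This is NOT Solr's WordDelimiterFilter, only a
    sketch of what generateWordParts=1 contributes.
    """
    tokens = text.split()  # whitespace tokenization: a URL stays one token
    if not generate_word_parts:
        # Simplification: the whole URL is passed through as a single term.
        return [t.lower() for t in tokens]
    parts = []
    for token in tokens:
        # Emit each alphabetic run as its own term (the "word parts").
        parts.extend(p.lower() for p in re.findall(r"[A-Za-z]+", token))
    return parts


url = "http://www.example.com/film.php"  # hypothetical example URL
print(analyze(url, generate_word_parts=False))
print(analyze(url, generate_word_parts=True))
```

Under these assumptions, a query for a single word such as "example" can only match when the word parts are indexed as separate terms; with the URL kept as one term, only the full URL matches.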

Re: Search a URL

2010-09-23 Thread dl
LetterTokenizerFactory will emit each contiguous sequence of letters as a token and discard everything else. http, https, com, etc. would need to be stopwords. Alternatively, you can try PatternTokenizerFactory with a regular expression if you are looking for a specific part of the URL. On Sep 23, 2010, at 10:59
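
The two approaches above can be sketched in Python. This is only an approximation of the behavior described, not the Solr tokenizers themselves. The stopword set (including "www"), the host-matching regular expression, the function names, and the example URL are all assumptions made for illustration.

```python
import re

# Assumed stopword list; the thread names http, https and com, "www" is added here.
URL_STOPWORDS = {"http", "https", "www", "com"}

# Assumed pattern: capture the host that follows the scheme.
HOST_PATTERN = re.compile(r"^[a-z]+://([^/:]+)", re.IGNORECASE)


def letter_tokens(text, stopwords=URL_STOPWORDS):
    """Keep each contiguous run of letters, then drop stopwords."""
    runs = re.findall(r"[^\W\d_]+", text)  # letters only, no digits or underscores
    return [t.lower() for t in runs if t.lower() not in stopwords]


def host_part(url):
    """Extract one specific part of the URL with a capture group."""
    match = HOST_PATTERN.match(url)
    return match.group(1) if match else None


url = "http://www.example.com/film.php"  # hypothetical example URL
print(letter_tokens(url))
print(host_part(url))
```

The letter-based approach indexes every word in the URL and relies on the stopword list to drop the noise, while the pattern-based approach indexes only the part the regular expression selects.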