Re: Search a URL

2010-09-24 Thread Markus Jelsma
die. > > Read 'Hot, Flat, and Crowded' > Laugh at http://www.yert.com/film.php > > --- On Thu, 9/23/10, Markus Jelsma wrote: > > From: Markus Jelsma > > Subject: RE: Search a URL > > To: solr-user@lucene.apache.org > > Date: Thursday, September 23,

RE: Search a URL

2010-09-23 Thread Dennis Gearon
e: > From: Markus Jelsma > Subject: RE: Search a URL > To: solr-user@lucene.apache.org > Date: Thursday, September 23, 2010, 2:11 PM > Try setting generateWordParts=1 in > your WDF. Also, having a WhitespaceTokenizer makes little > sense for URLs, there should be no whitespa

RE: Search a URL

2010-09-23 Thread Markus Jelsma
10 23:00 To: solr-user@lucene.apache.org; Subject: Search a URL Is there a tokenizer that will allow me to search for parts of a URL? For example, the search "google" would match on the data "http://mail.google.com/dlkjadf". This tokenizer factory does

Re: Search a URL

2010-09-23 Thread dl
LetterTokenizerFactory will use each contiguous sequence of letters and discard the rest. http, https, com, etc. would need to be stopwords. Alternatively you can try PatternTokenizerFactory with a regular expression if you are looking for a specific part of the URL. On Sep 23, 2010, at 10:59
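
To make the letter-run suggestion above concrete, here is a rough Python emulation of the idea. It is not Solr code and does not call any Solr or Lucene API: the function names and the stopword set are hypothetical, and the regular expression only covers ASCII letters, which is a simplification of what a real letter tokenizer accepts.

```python
import re

# Hypothetical stopword list; the thread only names http, https, and com.
STOPWORDS = {"http", "https", "com"}


def letter_tokens(text):
    """Return each maximal run of ASCII letters, discarding everything else."""
    return re.findall(r"[A-Za-z]+", text)


def searchable_tokens(text, stopwords=STOPWORDS):
    """Drop stopwords (compared case-insensitively) from the letter runs."""
    return [tok for tok in letter_tokens(text) if tok.lower() not in stopwords]


if __name__ == "__main__":
    url = "http://mail.google.com/dlkjadf"
    print(letter_tokens(url))      # all letter runs, including http and com
    print(searchable_tokens(url))  # the runs left after stopword removal
```

With the example URL from the original question, "google" survives as its own token, which is why a query for "google" could then match. Digits in a URL would be dropped entirely under this scheme, so it may not suit URLs where numeric path segments matter.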

Search a URL

2010-09-23 Thread Max Lynch
Is there a tokenizer that will allow me to search for parts of a URL? For example, the search "google" would match on the data "http://mail.google.com/dlkjadf". This tokenizer factory doesn't seem to be sufficient: