Re: Indexing a word in url

2008-04-02 Thread Simon Rosenthal
I also couldn't get the exact results I wanted for indexing URL components using WordDelimiterFilter or PatternTokenizer, so I resorted to adding a new field ('pathparts'), plus a few lines of code in our content preprocessor, which generates the tokens and submits documents to Solr for indexing. -Si
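A minimal sketch of the kind of preprocessing step described above, assuming Python. The thread does not show the actual preprocessor code, so the helper name path_parts, the example URL, and the document layout are all hypothetical. It splits the URL path into tokens that could be submitted in a separate 'pathparts' field.

```python
import re
from urllib.parse import urlsplit


def path_parts(url):
    """Return lowercase alphanumeric tokens from the path of a URL.

    Hypothetical helper: the real preprocessor is not shown in the thread.
    """
    path = urlsplit(url).path
    # Split on every run of characters that is not an ASCII letter or digit,
    # then drop the empty strings produced by leading or trailing separators.
    return [tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", path) if tok]


# Hypothetical document as it might be handed to the indexer.
example_url = "http://www.example.com/products/Solr-Guide/index.html"
doc = {
    "url": example_url,
    "pathparts": path_parts(example_url),
}
```

Generating the tokens before submission keeps the analysis chain for the field trivial (for example, whitespace tokenization only), at the cost of moving that logic out of the schema.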

Re: Indexing a word in url

2008-04-01 Thread Chris Hostetter
: Actually I want to use anything that is not alphabet or digit to be the
: separator - anything between them will be a word (so that I can use the URL
: fragment to see what is indexed about this site)...any suggestion?
In addition to Mike's suggestion of trying out the WordDelimiterFilter, tak
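The requirement quoted above (every character that is not a letter or digit acts as a separator) can be illustrated outside Solr with a short Python sketch. The function name url_words is hypothetical, and the sketch only covers ASCII letters and digits; it shows the intended tokenization, not how any Solr tokenizer is implemented.

```python
import re


def url_words(url):
    """Return every maximal run of ASCII letters or digits in the URL."""
    # Anything outside [A-Za-z0-9] is treated as a separator and discarded.
    return re.findall(r"[A-Za-z0-9]+", url)


words = url_words("http://www.nabble.com/Indexing-a-word-in-url-tp16397739p16397739.html")
```

Note that a mixed run such as 'tp16397739p16397739' stays a single token under this rule, because it contains no separator characters.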

Re: Indexing a word in url

2008-03-31 Thread Vinci
s generation on only (no catenation), and an additional stopwords
> like that excludes a few tokens like 'http'.
>
> -Mike
-- View this message in context: http://www.nabble.com/Indexing-a-word-in-url-tp16397739p16411091.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Indexing a word in url

2008-03-31 Thread Mike Klaas
On 31-Mar-08, at 10:50 AM, Vinci wrote:
> Hi all, I would like to ask, if I want to index a word in a URL, which data type and parser should I use?
Depends on how you want to search it. I use WordDelimiterFilter with parts generation on only (no catenation), and an additional stopwords li
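As a rough illustration of the effect of this setup, here is a Python approximation, not Solr code. It assumes that "parts generation" means splitting at non-alphanumeric characters, at lower-to-upper case changes, and at letter/digit boundaries, with no catenated tokens emitted. The STOPWORDS set is hypothetical apart from 'http', which is the only token named in the thread, and lowercasing is added here only so the stopword match is case-insensitive.

```python
import re

# Hypothetical stopword list; the thread only gives 'http' as an example.
STOPWORDS = {"http", "https", "www"}

# Word parts: an all-caps run, or an optionally capitalized lowercase run.
# Number parts: a run of digits. Everything else acts as a delimiter.
PART = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def url_parts(url):
    """Approximate parts-only word delimiting followed by stopword removal."""
    parts = [p.lower() for p in PART.findall(url)]
    return [p for p in parts if p not in STOPWORDS]


parts = url_parts("http://www.example.com/SolrUser2008")
```

Unlike a plain split on non-alphanumeric characters, this also breaks 'SolrUser2008' into 'solr', 'user' and '2008', which is the main difference parts generation makes for URL fragments.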

Indexing a word in url

2008-03-31 Thread Vinci
Hi all, I would like to ask: if I want to index a word in a URL, which data type and parser should I use? Thank you, Vinci -- View this message in context: http://www.nabble.com/Indexing-a-word-in-url-tp16397739p16397739.html Sent from the Solr - User mailing list archive at Nabble.com.