Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-20 Thread Diego Fernandez
> > If we wanted to keep the StandardTokenizer (because we make use of the
> > token types) but wanted to use the WDFF to get combinations of words that
> > are split with certain characters (mainly - and /, but possibly others as
> > well), what is the suggested way of accomplishing this? Would we just
> > have to extend the JFlex file for the tokenizer and re-compile it?
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/WordDelimiterFilterFactory-and-StandardTokenizer-tp4131628p4136146.html
> > Sent from the Solr - User mailing list archive at Nabble.com.

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-20 Thread Ahmet Arslan
> If we wanted to keep the StandardTokenizer (because we make use of the
> token types) but wanted to use the WDFF to get combinations of words that
> are split with certain characters (mainly - and /, but possibly others as
> well), what is the suggested way of accomplishing this? Would we just have
> to extend the JFlex file for the tokenizer and re-compile it?

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-20 Thread Diego Fernandez
> If we wanted to keep the StandardTokenizer (because we make use of the
> token types) but wanted to use the WDFF to get combinations of words that
> are split with certain characters (mainly - and /, but possibly others as
> well), what is the suggested way of accomplishing this? Would we just have
> to extend the JFlex file for the tokenizer and re-compile it?

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-16 Thread Shawn Heisey
On 5/16/2014 9:24 AM, aiguofer wrote:
> Jack Krupansky-2 wrote
>> Typically the white space tokenizer is the best choice when the word
>> delimiter filter will be used.
>>
>> -- Jack Krupansky
>
> If we wanted to keep the StandardTokenizer (because we make use of the
> token types) but wanted to use the WDFF to get combinations of words that
> are split with certain characters (mainly - and /, but possibly others as
> well), what is the suggested way of accomplishing this?

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-16 Thread Ahmet Arslan
> Would we just have to extend the JFlex file for the tokenizer and
> re-compile it?

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-16 Thread aiguofer
If we wanted to keep the StandardTokenizer (because we make use of the token
types) but wanted to use the WDFF to get combinations of words that are split
with certain characters (mainly - and /, but possibly others as well), what is
the suggested way of accomplishing this? Would we just have to extend the
JFlex file for the tokenizer and re-compile it?

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-04-16 Thread Jack Krupansky
Typically the white space tokenizer is the best choice when the word
delimiter filter will be used.

-- Jack Krupansky

-----Original Message-----
From: Shawn Heisey
Sent: Wednesday, April 16, 2014 11:03 PM
To: solr-user@lucene.apache.org
Subject: Re: WordDelimiterFilterFactory and StandardTokenizer
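[Editor's sketch illustrating the recommendation above; the attribute values are assumed for illustration and do not come from this thread. Because solr.WhitespaceTokenizerFactory leaves "wi-fi" intact as a single token, WordDelimiterFilterFactory can both split it into "wi"/"fi" and catenate it into "wifi":]

```xml
<!-- Sketch only: whitespace tokenizer feeds the intact "wi-fi" token
     to the word delimiter filter, which can then split and catenate. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- generateWordParts: wi-fi -> wi, fi
         catenateWords:     wi-fi -> wifi (emitted as well) -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```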

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-04-16 Thread Shawn Heisey
On 4/16/2014 8:37 PM, Bob Laferriere wrote:
>> I am seeing odd behavior from WordDelimiterFilterFactory (WDFF) when
>> used in conjunction with StandardTokenizerFactory (STF).
>> I see the following results for the document of “wi-fi”:
>>
>> Index: “wi”, “fi”
>> Query: “wi”, “fi”, “wifi”

WordDelimiterFilterFactory and StandardTokenizer

2014-04-16 Thread Bob Laferriere
I am seeing odd behavior from WordDelimiterFilterFactory (WDFF) when used in conjunction with StandardTokenizerFactory (STF). If I use the following configuration:
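[The poster's actual configuration is truncated in this archive. For context, a minimal field type combining the two factories might look like the sketch below; the field name and attribute values are assumptions, not the poster's config. Note that StandardTokenizer itself splits "wi-fi" on the hyphen, so the word delimiter filter receives "wi" and "fi" as separate tokens and its catenate options have nothing left to join within one token — one reason the whitespace tokenizer is usually recommended in front of WDFF, as noted earlier in the thread.]

```xml
<!-- Minimal sketch (assumed values); NOT the poster's actual config.
     StandardTokenizer already splits "wi-fi" into "wi" and "fi"
     before the word delimiter filter runs. -->
<fieldType name="text_stf_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            catenateWords="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```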