Re: catch alls and nuances

2016-02-02 Thread John Blythe
yo, erick: thanks for the reply. yes, i was only meaning my own custom fieldType. my bad on not sticking w my original example. i've been using the StandardTokenizerFactory to break out the stream. While I understand the tokenization/stream on paper, perhaps I'm not connecting all the dots I need

Re: catch alls and nuances

2016-02-02 Thread Susheel Kumar
Hi John - You can take a closer look at the different options for WordDelimiterFilterFactory at https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory to see if they meet your requirements, and use the Analysis tab in the Solr Admin UI. If you still have questions, you can sha

Re: catch alls and nuances

2016-02-02 Thread Erick Erickson
bq: Have now begun writing my own. I hope by that you mean defining your own fieldType, at least until you're sure that none of the zillion things you can do with an analysis chain suit your needs. If you haven't already looked _seriously_ at the admin/analysis page (you have to choose a core to hav

Re: catch alls and nuances

2016-02-02 Thread John Blythe
I had been using text_general at the time of my email's writing. Have tried a couple of the other stock ones (text_en, text_en_splitting, _tight). Have now begun writing my own. I began to wonder if simply doing one of the above, such as text_general, with a fuzzy distance (probably just ~1) would

Re: catch alls and nuances

2016-02-01 Thread Erick Erickson
Likely you also have WordDelimiterFilterFactory in your fieldType; that's what will split on alphanumeric transitions. So you should be able to use wildcards here, e.g. 1234L*. However, that'll only work if you have preserveOriginal set in WordDelimiterFilterFactory in your indexing chain. And ju
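
For illustration only, here is a minimal sketch of a fieldType along the lines described above, with preserveOriginal set on WordDelimiterFilterFactory in the index-time chain. The fieldType name text_catalog, the choice of WhitespaceTokenizerFactory, and the other filter attributes are assumptions for the sake of the example, not the poster's actual schema.

```xml
<!-- Hypothetical fieldType; the name and tokenizer choice are assumptions. -->
<fieldType name="text_catalog" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Splits on letter/digit transitions (e.g. 1234L -> 1234, L) and,
         with preserveOriginal="1", also indexes the unsplit token 1234L. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            splitOnNumerics="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With the original token kept in the index, a prefix query such as text:1234L* has a whole term to match against. The Analysis tab in the Solr Admin UI is the place to confirm what a given chain actually emits for a sample catalog number.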

catch alls and nuances

2016-02-01 Thread John Blythe
Hi there, I have a catch-all field called 'text' that I copy my item description, manufacturer name, and the item's catalog number into. I'm having an issue with keeping the broadness of the tokenizers in place whilst still allowing some good precision in the case of very specific queries. The r