Re: AlphaNumeric search in Solr

Chris Hostetter Sun, 16 Mar 2008 23:41:31 -0700

:   I read some documentation on the WordDelimiterFilter.  Just to clarify my
: thinking, I understand that if I use WordDelimiterFilter and search for a
: term like axd100 it will break it into two tokens "axd" and "100".  But then
: when I do my search should Solr match the documents containing both these
: tokens?


Text Analysis for the purposes of building and searching an inverted index
like Lucene is all about having "complementary" tokenization/filtering at
index time and at query time.

Yes, WordDelimiterFilter can help match on the types of things in your 
example, but only using it to analyze your query strings will not provide
a magic bullet -- you have to have also used it with the appropriate 
settings at index time in order for the correct tokens to be indexed so 
you can find them when you query.
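
To make the point about complementary analysis concrete, here is a toy
sketch in Python (not Solr code).  delimiter_tokens is a hypothetical
stand-in that only imitates the letter/digit splitting done by
WordDelimiterFilter; the real filter has many more options.  The sketch
also assumes a simple "every query token must be present" match.

```python
import re
from collections import defaultdict

def whitespace_tokens(text):
    """Lowercase and split on whitespace only (no word-delimiter splitting)."""
    return text.lower().split()

def delimiter_tokens(text):
    """Toy imitation of WordDelimiterFilter: also split at letter/digit boundaries."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(re.findall(r"[a-z]+|[0-9]+", word))
    return tokens

def build_index(docs, analyze):
    """Build a token -> set-of-doc-ids inverted index using the given analyzer."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index

def search_all(index, query, analyze):
    """Return ids of documents that contain every token of the analyzed query."""
    tokens = analyze(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

docs = {1: "part axd100 in stock", 2: "axd 100 units shipped"}

plain_index = build_index(docs, whitespace_tokens)  # indexed WITHOUT splitting
split_index = build_index(docs, delimiter_tokens)   # indexed WITH splitting

# Query-side splitting against an index built without it.
mismatched = search_all(plain_index, "axd100", delimiter_tokens)
# The same analysis on both sides.
matched = search_all(split_index, "axd100", delimiter_tokens)
```

Against the plain index, splitting only the query finds the document that
happens to contain "axd" and "100" as separate words and misses the one
that literally contains "axd100".  With the same analysis applied on both
sides, both documents are found.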

:   In my application when I try to search for "axd 100" I get several
: documents back, but when I search for axd100 with WordDelimiterFilter on, I
: don't get back any results.  I was assuming that if WordDelimiterFilter
: breaks axd100 into two tokens - "axd" and "100", then the search should
: behave exactly as if I was searching for the string "axd 100".

Not exactly. A better approximation would be searching for the *phrase*
"axd 100" ... but it really depends on how you configure it.

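The difference between two separate terms and a phrase can be sketched
with another toy example in Python (again, not Solr code; the helper
names are made up for illustration).  It assumes separate terms are
combined with AND; in Solr that depends on the configured default
operator.

```python
def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def matches_all_terms(doc_tokens, terms):
    """True if every term occurs somewhere in the document, in any order."""
    return all(term in doc_tokens for term in terms)

def matches_phrase(doc_tokens, terms):
    """True if the terms occur next to each other, in the given order."""
    n = len(terms)
    return any(doc_tokens[i:i + n] == terms
               for i in range(len(doc_tokens) - n + 1))

docs = {
    1: "axd 100 units shipped",
    2: "100 units of axd shipped",
}
terms = ["axd", "100"]

# Separate terms: both documents qualify because each contains both tokens.
term_hits = {doc_id for doc_id, text in docs.items()
             if matches_all_terms(tokenize(text), terms)}

# Phrase: only the document where "axd" is immediately followed by "100".
phrase_hits = {doc_id for doc_id, text in docs.items()
               if matches_phrase(tokenize(text), terms)}
```

Here the separate-term search matches both documents, while the phrase
search matches only the first, which is why the two kinds of query can
return different results for what looks like the same input.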
Bottom line: if you don't have control over how your index is built, and 
it isn't built using any configuration of the WordDelimiterFilter, you are
better off avoiding it at query time.




-Hoss
