Re: AW: SOLR Indexing/Querying

2007-05-31 Thread Walter Underwood
I solved something similar to this by creating a "stemmer" for part numbers. Variations like "-BN" on the end can be treated as inflections in the part number language, similar to plurals in English. I used a set of regexes to match and transform, in some cases generating multiple "root" part numb
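
A minimal sketch of the part-number "stemmer" idea described above, assuming two hypothetical regex rules (a trailing "-BN"-style suffix and a single leading letter prefix). The actual rule set is not shown in the message, so the names `RULES` and `stem_part_number` and both patterns are illustrative only:

```python
import re

# Hypothetical rules; the real regexes from the message are not shown.
# Each rule captures the candidate "root" part number in a named group.
RULES = [
    # Trailing variant code such as "-BN" (assumed: 1-3 letters after a hyphen).
    re.compile(r"^(?P<root>[A-Z0-9]+)-[A-Z]{1,3}$"),
    # Single leading letter before an all-digit number, e.g. "D3555" (assumed).
    re.compile(r"^[A-Z](?P<root>[0-9]+)$"),
]


def stem_part_number(part):
    """Return the normalized part number followed by any derived roots."""
    normalized = part.strip().upper()
    roots = [normalized]
    for rule in RULES:
        match = rule.match(normalized)
        if match and match.group("root") not in roots:
            # A part number may yield more than one root form.
            roots.append(match.group("root"))
    return roots
```

Applying the same function at index time and at query time would let "3555-BN" and "3555" meet on the shared root "3555", much as a plural and a singular meet on one stem.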

Re: AW: SOLR Indexing/Querying

2007-05-31 Thread Chris Hostetter
: It looks a lot like using Solr's standard "WordDelimiterFilter" (see the sample schema.xml) does what you need. WordDelimiterFilter will only get you so far. It can split the indexed text of "3555LHP" into tokens "3555" and "LHP"; and the user entered "D3555" into the tokens "D" and "3555" --

AW: SOLR Indexing/Querying

2007-05-31 Thread Burkamp, Christian
Hi there, It looks a lot like using Solr's standard "WordDelimiterFilter" (see the sample schema.xml) does what you need. It splits on alphabetic-to-numeric boundaries and on the various kinds of intra-word delimiters like "-", "_" or ".". You can decide whether the parts are put together agai
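
The splitting behaviour described here can be approximated in a few lines. This is only a rough sketch, not Solr's implementation: it assumes ASCII input, treats every non-alphanumeric character as a delimiter, and ignores the filter's other behaviours (such as recombining parts). The function name `split_word_parts` is hypothetical:

```python
import re


def split_word_parts(token):
    """Split a token into runs of letters and runs of digits.

    Letter/digit boundaries and delimiters such as "-", "_" or "."
    both end a run, so delimiters never appear in the output.
    """
    return re.findall(r"[A-Za-z]+|[0-9]+", token)


indexed = split_word_parts("3555LHP")  # parts of the indexed text
queried = split_word_parts("D3555")    # parts of the user's query
shared = [part for part in queried if part in indexed]
```

With the examples from this thread, the indexed text and the query share only the numeric part, which is why splitting alone may match more loosely than intended.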