I solved something similar to this by creating a "stemmer" for part
numbers. Variations like "-BN" on the end can be treated as inflections
in the part number language, similar to plurals in English.
I used a set of regexes to match and transform, in some cases generating
multiple "root" part numb
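
For illustration, a minimal Python sketch of the regex-based "stemmer" idea. The rule list is hypothetical: only the "-BN" suffix comes from this thread, and the leading-letter rule is an assumption added to show how one input can yield several roots. It is not the actual rule set described above.

```python
import re

# Hypothetical (pattern, replacement) rules. Only "-BN" is taken from the
# thread; the prefix rule is an assumption made for illustration.
RULES = [
    (re.compile(r"-BN$"), ""),           # strip a trailing "-BN" variation
    (re.compile(r"^[A-Z](?=\d)"), ""),   # assumed: drop one leading letter before digits
]

def stem_part_number(part_number):
    """Return every candidate root for a part number, sorted."""
    roots = {part_number.strip().upper()}
    for pattern, replacement in RULES:
        # Apply each rule to every root found so far, so rules can combine.
        for candidate in list(roots):
            transformed = pattern.sub(replacement, candidate)
            if transformed:
                roots.add(transformed)
    return sorted(roots)

print(stem_part_number("3555-bn"))
print(stem_part_number("D3555-BN"))
```

Indexing every root alongside the original, and stemming the query the same way, would let variants of one part match each other.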
: It looks a lot like using Solr's standard "WordDelimiterFilter" (see the
: sample schema.xml) does what you need.
WordDelimiterFilter will only get you so far. It can split the indexed
text of "3555LHP" into tokens "3555" and "LHP"; and the user entered
"D3555" into the tokens "D" and "3555" --
Hi there,
It looks a lot like using Solr's standard "WordDelimiterFilter" (see the sample
schema.xml) does what you need.
It splits on alphabetic-to-numeric boundaries and on the various kinds of
intra-word delimiters like "-", "_" or ".". You can decide whether the parts
are put together agai
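
To make the splitting rules concrete, here is a rough Python approximation of the two behaviours described above (letter/digit boundaries and intra-word delimiters). It is only a sketch, not Solr's implementation: the function name is made up, and the real filter has further options that are not modelled here.

```python
import re

# A token is a run of letters or a run of digits. Anything else
# ("-", "_", ".", ...) acts as a delimiter and is dropped.
TOKEN = re.compile(r"[A-Za-z]+|[0-9]+")

def split_word(text):
    """Approximate splitting on letter/digit boundaries and delimiters."""
    return TOKEN.findall(text)

indexed = split_word("3555LHP")   # ['3555', 'LHP']
query = split_word("D3555")       # ['D', '3555']
shared = set(indexed) & set(query)

print(indexed, query, shared)
```

With the two part numbers mentioned in this thread, the indexed text and the query share only the token "3555".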