Take a close look at WordDelimiterFilterFactory. It's designed to deal
with things like part numbers, phone numbers and the like, and I think
the example you gave is in the same class of problem. It'll take
a bit to get your head around what it does, but it'll perform better
than regexes, assuming you can get what you need out of it.
And the admin/analysis page will help you _greatly_ in understanding
what the effects of the various parameters are.
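
As a starting point, here is a minimal sketch of a field type built around
WordDelimiterFilterFactory. The field type name "text_parts" is made up, and
the choice of WhitespaceTokenizerFactory is an assumption on my part:
StandardTokenizerFactory splits on hyphens itself, so the filter would never
see "B-12" as one token. The flag values are only a guess at what you need,
so check them on the analysis page.

```xml
<!-- Sketch only: "text_parts" is a hypothetical name; tune the flags on the admin/analysis page. -->
<fieldType name="text_parts" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so hyphenated terms reach the word delimiter filter intact. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Emit the parts, the joined form, and the original token. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="1"
            catenateWords="1"
            catenateNumbers="1"
            catenateAll="1"
            splitOnNumerics="1"
            preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With settings like these, "B-12" and "B12" should both end up producing a
"b12" token, and "Vitamin-A" should produce "vitamina" alongside its parts.
"Vitamin A" with a space is still split into two tokens by the tokenizer, so
it will only match on the individual parts.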

Best,
Erick

On Wed, Mar 22, 2017 at 11:06 AM, Mark Johnson
<mjohn...@emersonecologics.com> wrote:
> Is it possible to configure Solr to treat text that matches a regex as a
> phrase?
>
> I have a database full of products, and the Title and Description fields
> are text_en, tokenized via the StandardTokenizerFactory. This works in most
> cases, but a number of products have names like:
>
>  - Vitamin A
>  - Vitamin-A
>  - Vitamin B12
>  - Vitamin B-12
> ...and so on
>
> I have a regex that will match all of the permutations and would like to
> configure the field type so that anything that matches the regex pattern is
> treated as a single token, instead of being broken up by spaces, etc. Is
> that possible?
