On Thu, Jan 20, 2011 at 4:08 PM, shm <s...@dbc.dk> wrote:
> Hi, I have a little problem regarding indexing, that i don't know
> how to solve, i need to index the same data in different ways
> into the same field. The problem is a normalization problem, and
> here is an example:
>
> I have a special character \uA732, which i need to normalize in
> two different ways for phrase searching. So if i encounter this
> character in, for example, title field I would like it to result
> in these two phrase fields:
>
>        raw data = "\uA732lborg"
>        phrase.title= "ålborg"
>        phrase.title= "aalborg"
[...]

You could use a multi-valued field together with a
ScriptTransformer in the DataImportHandler: read in
the raw data, have the ScriptTransformer do the
normalisation, and store both output versions in the
multi-valued field (or in two separate fields, if you
prefer).
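
As a rough sketch of the normalisation step only (not of the
DataImportHandler configuration), the logic might look like the
Python below. The function name phrase_variants and the lowercasing
are my assumptions, based on the example in your mail; inside a
ScriptTransformer the same logic would be written as a script
function that puts both values into the row.

```python
# Illustrative sketch only: phrase_variants is a hypothetical helper,
# and lowercasing the result is an assumption taken from the example
# ("\uA732lborg" -> "ålborg" / "aalborg").

AA_LIGATURE = "\uA732"  # U+A732 LATIN CAPITAL LETTER AA


def phrase_variants(raw):
    """Return the values to store in a multi-valued phrase field."""
    if AA_LIGATURE not in raw:
        # Nothing special to normalise: a single value is enough.
        return [raw.lower()]
    # Replace before lowercasing, so the ligature is not turned into
    # its lowercase form (U+A733) before we get a chance to match it.
    return [
        raw.replace(AA_LIGATURE, "å").lower(),
        raw.replace(AA_LIGATURE, "aa").lower(),
    ]


variants = phrase_variants("\uA732lborg")
print(variants)
```

Each element of the returned list would become one value of
phrase.title, so a phrase query on either spelling can match.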

Regards,
Gora
