On Thu, Jan 20, 2011 at 4:08 PM, shm <s...@dbc.dk> wrote:
> Hi, I have a little problem regarding indexing, that i don't know
> how to solve, i need to index the same data in different ways
> into the same field. The problem is a normalization problem, and
> here is an example:
>
> I have a special character \uA732, which i need to normalize in
> two different ways for phrase searching. So if i encounter this
> character in, for example, title field I would like it to result
> in these two phrase fields:
>
> raw data = "\uA732lborg"
> phrase.title= "ålborg"
> phrase.title= "aalborg"
[...]

You could use a multi-valued field along with a ScriptTransformer in the DataImportHandler. Read in the raw data, call the ScriptTransformer to do the normalisation, and store both output versions in the multi-valued field (or, if you prefer, store them in two separate fields).

Regards,
Gora
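
For illustration, the transformer function might look roughly like the sketch below. It is untested against a real import and makes some assumptions: the function names normaliseVariants and addPhraseTitle are made up for this example, the raw value is assumed to arrive in a column called title, and only \uA732 is handled, as in your example.

```javascript
// Hypothetical helper: returns both normalised forms of a raw string.
// Only U+A732 is replaced here; other characters would need their own rules.
function normaliseVariants(raw) {
  return [
    raw.replace(/\uA732/g, '\u00E5'), // U+00E5 is "å"
    raw.replace(/\uA732/g, 'aa')
  ];
}

// Function in the shape ScriptTransformer expects: it receives the row
// (a java.util.Map) and returns it. The column name 'title' is an assumption.
function addPhraseTitle(row) {
  var raw = row.get('title');
  if (raw != null) {
    var variants = normaliseVariants(String(raw));
    // A java.util.List value is what makes the field multi-valued.
    var values = new java.util.ArrayList();
    for (var i = 0; i < variants.length; i++) {
      values.add(variants[i]);
    }
    row.put('phrase.title', values);
  }
  return row;
}
```

The functions would go inside a <script> element in the DIH configuration file, with transformer="script:addPhraseTitle" set on the entity. The phrase.title field also needs multiValued="true" in schema.xml.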