Is there a set pattern to how these words occur, or do they occur randomly in the text, i.e., in some places it'll be "subtitle" and in others "s u b t i t l e"?
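
If the spacing does turn out to be regular (single letters separated by exactly one space), one option might be to collapse those runs before the text is sent to Solr. Here is a minimal sketch in Python. It is not a Solr feature: the function name `collapse_spaced_words` and the three-letter minimum are assumptions of mine, and it presumes the extracted text is preprocessed outside Solr.

```python
import re

# A run of three or more single letters, each separated by exactly one space,
# delimited by whitespace (or the start/end of the text) on both sides.
# The three-letter minimum is an arbitrary assumption to reduce false positives.
SPACED_WORD = re.compile(r"(?<!\S)(?:[^\W\d_] ){2,}[^\W\d_](?!\S)")


def collapse_spaced_words(text):
    """Join letter-spaced words such as 'S u b t i t l e' into 'Subtitle'."""
    return SPACED_WORD.sub(lambda m: m.group(0).replace(" ", ""), text)


if __name__ == "__main__":
    print(collapse_spaced_words("The S u b t i t l e goes here"))
```

This is only a heuristic. Genuine sequences of one-letter words would be merged too, and two letter-spaced words in a row separated by a single space would be joined into one token, so it depends on how consistent the PDF extraction output is.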

On Tue, 23 Feb 2016, 05:01 Francisco Andrés Fernández <fra...@gmail.com> wrote:

> Hi all,
> I'm extracting some text from pdf. As result, some important words end with
> spaces between characters. I know they are words but, don't know how to
> make Solr detect and index them.
> For example, I could have the word "Subtitle" that I want to detect,
> written like "S u b t i t l e". If I would parse the text with a standard
> tokenizer, the word will be lost.
> How could I make Solr detect this type of word occurrence?
> Many thanks,
>
> Francisco
>
--
Regards,
Binoy Dalal