Hi, I'm an infant in the Solr/Lucene family, just a couple of months old.
We are trying to find a way to combine words into a single compound word at both index and query time. For example, if a document contains "sea bird", it should be indexed as "seabird", and any query containing "sea bird" should also look for "seabird", not only in the qf fields but also in the pf, pf2 and pf3 fields. We are using the edismax query parser.

Our problem is not at index time, where we have achieved this by writing our own token filter, but at query time. The token filter reads a dictionary file with entries of the form "prefix,suffix" and emits both regular and compound tokens as it encounters them.

We configured our filter at query time as well, but found that individual clauses such as field:sea, field:bird, etc. are created first and only then sent to the analyzer. First of all, can someone please confirm whether this part of my understanding is correct? As a result, we are forced to emit sea and bird as individual tokens, because the filter never sees them in sequence.

Is it possible to achieve this by means other than pre-processing the query before sending it to Solr? Could a CharFilter be used instead? Are CharFilters applied before the query clauses are created?

I can provide more details as necessary. This mail has already crossed TL;DR limits for many :)

Parvesh Garg
http://www.zettata.com
+91 963 222 5540
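
P.S. To make the behaviour described above concrete, here is a minimal plain-Java sketch. It is not the actual Lucene TokenFilter and uses no Lucene APIs; the names CompoundSketch, loadPairs and combine are hypothetical, and emitting the compound right after the suffix token is an assumption. It only illustrates why the compound appears when the tokens arrive in sequence and disappears when each clause is analyzed on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: models the "prefix,suffix" dictionary
// lookup on plain strings instead of a Lucene TokenStream.
class CompoundSketch {

    // Parses dictionary lines such as "sea,bird" into a set of "prefix,suffix" keys.
    static Set<String> loadPairs(List<String> lines) {
        Set<String> pairs = new HashSet<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            if (parts.length == 2) {
                pairs.add(parts[0].trim() + "," + parts[1].trim());
            }
        }
        return pairs;
    }

    // Emits every input token unchanged; when the previous and current tokens
    // form a dictionary pair, also emits their concatenation (assumed order).
    static List<String> combine(List<String> tokens, Set<String> pairs) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            String current = tokens.get(i);
            out.add(current);
            if (i > 0) {
                String previous = tokens.get(i - 1);
                if (pairs.contains(previous + "," + current)) {
                    out.add(previous + current);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> pairs = loadPairs(Arrays.asList("sea,bird"));

        // Index-time situation: the filter sees the tokens in sequence.
        System.out.println(combine(Arrays.asList("sea", "bird", "flies"), pairs));
        // prints [sea, bird, seabird, flies]

        // Query-time situation as described: each clause is analyzed separately.
        System.out.println(combine(Arrays.asList("sea"), pairs));  // prints [sea]
        System.out.println(combine(Arrays.asList("bird"), pairs)); // prints [bird]
    }
}
```

In the last two calls the lookup never has a previous token to pair with, which is the situation the question is about.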