I am trying to use solr.SynonymFilterFactory on a faceted field in Solr 1.3. I am using Solr to index resources from a media library. The data comes from various sources, some of which I do not control, so I need to map the resource types in the data to common terms for faceting. For example:

video, audio => digital media
film, laser disc, vhs video => other

I am using solr.KeywordTokenizerFactory for the analyzer, but Solr will not treat multiple words as a single token. A single-word to single-word mapping (e.g. film => other) works perfectly. A single-word to two-word mapping (e.g. film => other stuff) produces two terms, which is unsuitable for faceting. A two-word to single-word mapping (e.g. vhs video => videotape) doesn't seem to match at all. I've tried this with and without the tokenizerFactory="solr.KeywordTokenizerFactory" attribute in the synonym filter element. I've also tried escaping the space in the synonym file (e.g. video => digital\bmedia). Is it possible to use the synonym filter to map multi-word terms for a faceted field? If so, what am I missing?
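
For reference, here is a minimal sketch of the kind of field type described above. It is not a verified working configuration. The field type name "resource_type_facet", the file name "synonyms.txt", and the ignoreCase and expand settings are assumptions chosen for illustration; only the tokenizer and the tokenizerFactory attribute come from the description.

```xml
<!-- schema.xml fragment (sketch). "resource_type_facet" and "synonyms.txt"
     are placeholder names, not taken from a real configuration. -->
<fieldType name="resource_type_facet" class="solr.TextField">
  <analyzer>
    <!-- Emit the entire field value as one token so facet values stay whole -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- tokenizerFactory sets how entries in the synonyms file are split;
         this is the attribute that was tried both present and absent.
         ignoreCase and expand values here are assumptions. -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt"
            ignoreCase="true"
            expand="false"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

The intent is that each rule in the synonyms file (for example, vhs video => videotape) is read as one whole token on each side of the arrow, matching the single token produced by the keyword tokenizer.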