Hi Alex,

I implemented something similar using the rules described on this page:
http://en.wikipedia.org/wiki/American_and_British_English_spelling_differences

The idea is to normalize the British spelling to the American form at both
index and query time, using a tokenizer that takes in a word and, if the word
matches one of the rules, returns the converted form.

My rules were modeled as a chain of transformations. Each transformation had
a set of (pattern, action) pairs. The transformations were:

a) word_replacement (e.g. artefact => artifact): the source word is
   normalized directly into the specified target word.
b) prefix rules (e.g. anae => ane for anemic): the prefix characters of the
   word, if matched, are transformed into the target.
c) suffix rules (e.g. tre => ter for center): like prefix rules, except they
   work on the suffix.
d) infix rules (e.g. moeb => meb for ameba): replace characters in the middle
   of the word.

I cannot share the actual rules, but they should be relatively simple to
figure out from the wiki page, if you want to go that route.

HTH
Sujit

On Aug 7, 2012, at 7:08 AM, Alexander Cougarman wrote:

> Dear friends,
>
> Is there a downloadable synonym file for American-British words? This page
> has some, for example the VarCon file, but it's not in the Solr synonym.txt
> file.
>
> We need something that can normalize words like "center" to "centre". The
> VarCon file has it, but it's in the wrong format.
>
> Thank you in advance :)
>
> Sincerely,
> Alex
>
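
For illustration, here is a minimal Python sketch of the rule chain described in the reply above. It is not the actual rule set (which was not shared): the table names, the normalize() function, and the order in which the rule types are applied are all assumptions, and the tables hold only the four examples mentioned in the message. A real Solr setup would wrap equivalent logic in an analysis component rather than a standalone function.

```python
# Hypothetical rule tables; only the examples from the message are included.
WORD_RULES = {"artefact": "artifact"}   # whole-word replacement
PREFIX_RULES = [("anae", "ane")]        # anaemic -> anemic
SUFFIX_RULES = [("tre", "ter")]         # centre  -> center
INFIX_RULES = [("moeb", "meb")]         # amoeba  -> ameba


def normalize(word):
    """Map a British spelling to its American form (assumed rule order)."""
    w = word.lower()

    # 1. Whole-word rules win outright.
    if w in WORD_RULES:
        return WORD_RULES[w]

    # 2. Prefix rules: rewrite the start of the word.
    for src, dst in PREFIX_RULES:
        if w.startswith(src):
            w = dst + w[len(src):]
            break

    # 3. Suffix rules: rewrite the end of the word.
    for src, dst in SUFFIX_RULES:
        if w.endswith(src):
            w = w[:-len(src)] + dst
            break

    # 4. Infix rules: match strictly inside the word, not at either edge.
    for src, dst in INFIX_RULES:
        idx = w.find(src, 1)
        if idx > 0 and idx + len(src) < len(w):
            w = w[:idx] + dst + w[idx + len(src):]
            break

    return w


if __name__ == "__main__":
    for british in ["artefact", "anaemic", "centre", "amoeba", "solr"]:
        print(british, "=>", normalize(british))
```

Applying the same function at index time and at query time means both sides agree on a single spelling, so no synonym file is needed for the words the rules cover. Broad suffix rules such as tre => ter will also rewrite words they should not, so a real rule set would need an exception list.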