kkieser, it just occurred to me that Solr might actually fit the bill. Your scenario is definitely not a typical use of Solr, but the novel use I am about to describe could get you what you want.
A Solr index is composed of "documents", which typically correspond to a user document, a database record, or something like that. In your case, though, each "document" would be a single word: one of your good words or one of your bad words. You could have a boolean field indicating which type it is, and you could index the word several ways, including phonetically. When you want to check a piece of text against the word list, you use Solr's More-Like-This feature, configured appropriately, to tell you which word documents (e.g. naughty words) it matches. You could even facet on the naughty boolean to know how many of each type matched. What I described is definitely not a task for Endeca.

~ David Smiley

-----
Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context: http://lucene.472066.n3.nabble.com/Endeca-vs-Solr-tp832826p833019.html
Sent from the Solr - User mailing list archive at Nabble.com.
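
P.S. To make the data model concrete, here is a rough sketch in plain Python. It is not Solr and does not use More-Like-This; it only imitates the idea of one "document" per word, a naughty boolean, an exact and a phonetic index key, and a facet-style count. The word list, `phonetic_key`, `build_index`, and `match_text` are all made up for illustration, and the phonetic key is far cruder than the phonetic filters Solr offers.

```python
import re
from collections import Counter

# Hypothetical word list; each entry plays the role of one Solr "document".
WORD_DOCS = [
    {"word": "darn", "naughty": True},
    {"word": "heck", "naughty": True},
    {"word": "kind", "naughty": False},
    {"word": "helpful", "naughty": False},
]


def phonetic_key(word):
    """Crude stand-in for a phonetic encoding: keep the first letter,
    drop later vowels (plus y, h, w), and collapse repeated letters."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return ""
    key = word[0]
    for ch in word[1:]:
        if ch in "aeiouyhw":
            continue
        if ch != key[-1]:
            key += ch
    return key


def build_index(docs):
    """Index every word document under its exact form and its phonetic key."""
    index = {}
    for doc in docs:
        for key in (doc["word"], "ph:" + phonetic_key(doc["word"])):
            index.setdefault(key, []).append(doc)
    return index


def match_text(text, index):
    """Return the matched words and a facet-style count per word type."""
    matched = {}
    for token in re.findall(r"[a-z]+", text.lower()):
        for key in (token, "ph:" + phonetic_key(token)):
            for doc in index.get(key, []):
                matched[doc["word"]] = doc
    facets = Counter(doc["naughty"] for doc in matched.values())
    return sorted(matched), {"naughty": facets[True], "good": facets[False]}


index = build_index(WORD_DOCS)
words, counts = match_text("Well daarn, that was kind of you", index)
print(words, counts)
```

Here the misspelled "daarn" still hits the "darn" entry through the phonetic key, while "kind" matches exactly. In Solr, the analysis chain on the word field and the faceting component would take over the roles of `phonetic_key` and the `Counter`.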