: Otherwise, I'd do it via copy fields. Your first field is your main
: field and is analyzed as before. Your second field does the profanity
: detection and simply outputs a single token at the end, safe/unsafe.
you don't even need custom code for this ... copyField all your text into a 'has_profanity' field where you use a suitable Tokenizer followed by a KeepWordFilter that only keeps profane words, and then a PatternReplaceFilter that matches .* and replaces each of them with "HELL_YEA" ... now a search for "has_profanity:HELL_YEA" finds all profane docs, with the added bonus that the scores are based on how many profane words occur in the doc. it could also be used as a filter query (probably negated) as needed.

-Hoss
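As a concrete illustration, the analysis chain Hoss describes could look roughly like this in schema.xml. This is a hedged sketch, not from the original mail: the field names `text` and `has_profanity` follow the discussion, but the type name `profanity_flag`, the word list file `profane_words.txt`, and the tokenizer/lowercase choices are assumptions.

```xml
<!-- sketch only: type name, word-list filename, and tokenizer choice are assumptions -->
<fieldType name="profanity_flag" class="solr.TextField">
  <analyzer>
    <!-- any suitable tokenizer works; StandardTokenizer is one option -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- keep ONLY the words listed in profane_words.txt -->
    <filter class="solr.KeepWordFilterFactory" words="profane_words.txt"/>
    <!-- collapse every surviving token into the same marker token -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern=".*" replacement="HELL_YEA"/>
  </analyzer>
</fieldType>

<field name="text" type="text_general" indexed="true" stored="true"/>
<field name="has_profanity" type="profanity_flag" indexed="true" stored="false"/>

<!-- feed the main field's raw text through the profanity chain -->
<copyField source="text" dest="has_profanity"/>
```

Because every profane word in a doc becomes one more "HELL_YEA" token in `has_profanity`, a query like `has_profanity:HELL_YEA` matches any doc with at least one hit, and term frequency makes heavily profane docs score higher; negated as `fq=-has_profanity:HELL_YEA` it filters them out.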