On Thu, Feb 11, 2010 at 10:49 AM, Grant Ingersoll wrote:
>
> Otherwise, I'd do it via copy fields. Your first field is your main field
> and is analyzed as before. Your second field does the profanity detection
> and simply outputs a single token at the end, safe/unsafe.
>
> How long are your
In an UpdateRequestProcessor (processing an AddUpdateCommand), I have
a SolrInputDocument with a field 'content' that has termVectors="true"
in schema.xml. Is it possible to get access to that field's term
vector in the URP?
on how to
implement this efficiently with Lucene/Solr.
mike
On Thu, Jan 28, 2010 at 4:31 PM, Otis Gospodnetic wrote:
>
> How about this crazy idea - a custom TokenFilter that stores the safe flag in
> ThreadLocal?
>
>
>
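The ThreadLocal idea quoted above could look roughly like the sketch below. It is plain Java rather than a real Lucene TokenFilter: the class SafeFlagFilter, its SAFE field, and the use of an Iterator<String> as the token stream are all hypothetical stand-ins chosen so the example runs without Lucene on the classpath. It assumes tokens arrive one at a time and that whoever reads the flag does so on the same thread after the stream has been fully consumed.

```java
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

// Plain-Java stand-in for a Lucene TokenFilter; every name here is hypothetical.
final class SafeFlagFilter implements Iterator<String> {
    // Per-thread flag, readable by code that runs later on the same thread.
    static final ThreadLocal<Boolean> SAFE = ThreadLocal.withInitial(() -> Boolean.TRUE);

    private final Iterator<String> input;
    private final Set<String> profaneWords; // assumed to hold lower-case words

    SafeFlagFilter(Iterator<String> input, Set<String> profaneWords) {
        this.input = input;
        this.profaneWords = profaneWords;
        SAFE.set(Boolean.TRUE); // reset the flag for each new document
    }

    @Override
    public boolean hasNext() {
        return input.hasNext();
    }

    @Override
    public String next() {
        String token = input.next();
        if (profaneWords.contains(token.toLowerCase(Locale.ROOT))) {
            SAFE.set(Boolean.FALSE); // remember the hit, but keep indexing
        }
        return token; // tokens pass through unchanged
    }
}
```

The flag only means something once the whole stream has been read, so this approach depends on analysis and the later read of SAFE happening on one thread, in that order.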
> ----- Original Message
> > From: M
We'd like to implement a profanity detector for documents during indexing.
That is, given a file of profane words, we'd like to be able to mark a
document as safe or not safe if it contains any of those words so that we
can have something similar to Google's safe search.
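As a minimal sketch of the check described here, assuming the profane words have already been read from the file into a collection and that whole-token, case-insensitive matching is good enough: ProfanityDetector and isSafe are hypothetical names, not part of Lucene or Solr, and the regex split is only a stand-in for whatever analyzer the field would actually use.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene or Solr.
final class ProfanityDetector {
    private final Set<String> profaneWords = new HashSet<>();

    // Words are trimmed and lower-cased so the lookup is case-insensitive.
    ProfanityDetector(Iterable<String> words) {
        for (String w : words) {
            String trimmed = w.trim().toLowerCase(Locale.ROOT);
            if (!trimmed.isEmpty()) {
                profaneWords.add(trimmed);
            }
        }
    }

    // Returns true if no token of the text appears in the word list.
    boolean isSafe(String text) {
        // Split on anything that is not a letter or digit (assumption:
        // in Solr the field's analyzer would do the tokenizing instead).
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (profaneWords.contains(token)) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would build one detector from the word file and store the result of isSafe(content) in a separate field on the document, which queries could then filter on. Because matching is per token, a word that merely contains a listed word as a substring is not flagged.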
I'm trying to figure out