n’t.
>
> Thanks
>
> Sid
>
> > On Jan 8, 2018, at 4:38 PM, John Blythe wrote:
> >
> > you could use the keepwords functionality. have a field that only keeps
> > profanity and then you can query against that field having its default
>
> you could use the keepwords functionality. have a field that only keeps
> profanity and then you can query against that field having its default
> value vs. profane text
>
> --
> John Blythe
>
>> On Mon, Jan 8, 2018 at 3:12 PM, Sadiki Latty wrote:
>>
>>
overkill
hence why I was thinking the list. The data being inserted is from sources that
we have “control” over. This requirement is simply for the worst case scenario
that we miss something. We might also want to allow this profanity which is why
we need to flag it rather than strip it all
gards,
Markus
-Original message-
> From:Davis, Daniel (NIH/NLM) [C]
> Sent: Monday 8th January 2018 23:12
> To: solr-user@lucene.apache.org
> Subject: RE: Profanity
>
> Fun topic. Same complicated issues as normal search:
>
> Multilingual support? Is &quo
Fun topic. Same complicated issues as normal search:
Multilingual support? Is "Merde" profanity too, or just in French?
Multi-word synonyms? Does "God Damn" become "goddamn", or do you treat
"Damn" and &
text input field for 'profanity' and set another boolean field
if it matches or doesn't. Whether you use a list of words - or an SVM or
another machine learning algorithm - to detect profanity is up to you.
Cheers,
Markus
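For illustration only, the "check the text field and set a boolean field" idea above could be sketched in plain Python, run before the document is sent to the indexer. This is a minimal sketch under assumptions: the word list, the field names `text` and `has_profanity`, and the helper `flag_document` are all hypothetical and are not part of Solr or of anything posted in this thread.

```python
import re

# Hypothetical word list; in practice it would be loaded from a file.
PROFANITY = {"damn", "merde"}


def flag_document(doc, text_field="text", flag_field="has_profanity"):
    """Return a copy of doc with a boolean flag field set from the word list.

    Field names are assumptions made for this sketch.
    """
    # Simple lowercase word tokenization; a real analyzer would do more.
    tokens = re.findall(r"\w+", doc.get(text_field, "").lower())
    flagged = dict(doc)  # leave the caller's document untouched
    flagged[flag_field] = any(token in PROFANITY for token in tokens)
    return flagged
```

The text itself is left alone and only the flag changes, which fits the requirement of flagging rather than stripping. Swapping the set lookup for a classifier would only change the line that computes the flag.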
-Original message-
> From:Sadiki Latty
> Sen
you could use the keepwords functionality. have a field that only keeps
profanity and then you can query against that field having its default
value vs. profane text
--
John Blythe
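A rough imitation of the keep-words idea in plain Python, to show what the second field would end up holding. This is only a sketch: in Solr the filtering would be done by a keep-word filter in the field's analysis chain, not by code like this, and the word list, the default value "clean", and the name `profanity_field` are assumptions made for the example.

```python
import re

# Hypothetical keep list and default value, chosen for this sketch only.
KEEP_WORDS = {"damn", "merde"}
DEFAULT_VALUE = "clean"


def profanity_field(text):
    """Keep only the tokens that are on the list; fall back to the default."""
    kept = [t for t in re.findall(r"\w+", text.lower()) if t in KEEP_WORDS]
    return " ".join(kept) if kept else DEFAULT_VALUE
```

Documents whose field still holds the default value are the clean ones, so a query on that value separates them from documents containing listed words.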
On Mon, Jan 8, 2018 at 3:12 PM, Sadiki Latty wrote:
> Hey
>
> I would like to find a solution to flag
Hey
I would like to find a solution to flag (at index-time) profanity. Ideally,
it would function similarly to stopwords in the sense that I can
have a predefined list that is read, and if a token is on the list, that document
is 'flagged' in a different field. Does anyo
A problem is that your profanity list will not stop growing, and with
each new word you will want to rescrub the index.
We had a thousand-word NOT clause in every query (a filter query would
be true for 99% of the index) until we switched to another
arrangement.
Another small problem was that I
: Otherwise, I'd do it via copy fields. Your first field is your main
: field and is analyzed as before. Your second field does the profanity
: detection and simply outputs a single token at the end, safe/unsafe.
you don't even need custom code for this ... copyField all your te
On Thu, Feb 11, 2010 at 10:49 AM, Grant Ingersoll wrote:
>
> Otherwise, I'd do it via copy fields. Your first field is your main field
> and is analyzed as before. Your second field does the profanity detection
> and simply outputs a single token at the end, safe/unsafe.
>
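A small sketch of the "single token at the end" behaviour described in the quoted text, written as a Python generator rather than a real Lucene TokenFilter. The function name and the word set are hypothetical; the point is only that the second field's analysis reads the token stream and emits one token, safe or unsafe.

```python
def safe_unsafe_filter(tokens, profane_words):
    """Read tokens until a listed word is seen, then yield one verdict token."""
    verdict = "safe"
    for token in tokens:
        if token.lower() in profane_words:
            verdict = "unsafe"
            break  # one match is enough to mark the document
    yield verdict
```

With one such token per document, a search on the second field for "safe" selects the clean documents, and the main field is still analyzed as before.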
On Jan 28, 2010, at 4:46 PM, Mike Perham wrote:
> We'd like to implement a profanity detector for documents during indexing.
> That is, given a file of profane words, we'd like to be able to mark a
> document as safe or not safe if it contains any of those words so that we
&
> - A TokenFilter would allow me to tap into the existing analysis pipeline so
> I get the tokens for free but I can't access the document.
https://issues.apache.org/jira/browse/SOLR-1536
On Fri, Jan 29, 2010 at 12:46 AM, Mike Perham wrote:
> We'd like to implement a pro
ike Perham
> > To: solr-user@lucene.apache.org
> > Sent: Thu, January 28, 2010 4:46:54 PM
> > Subject: implementing profanity detector
> >
> > We'd like to implement a profanity detector for documents during indexing.
> > That is, given a file of profane words, we&
ginal Message
>> From: Mike Perham
>> To: solr-user@lucene.apache.org
>> Sent: Thu, January 28, 2010 4:46:54 PM
>> Subject: implementing profanity detector
>>
>> We'd like to implement a profanity detector for documents during indexing.
>> That is,
r-user@lucene.apache.org
> Sent: Thu, January 28, 2010 4:46:54 PM
> Subject: implementing profanity detector
>
> We'd like to implement a profanity detector for documents during indexing.
> That is, given a file of profane words, we'd like to be able to mark a
> document as safe or no
We'd like to implement a profanity detector for documents during indexing.
That is, given a file of profane words, we'd like to be able to mark a
document as safe or not safe if it contains any of those words so that we
can have something similar to Google's safe search.
I'm
17 matches