[ 
https://issues.apache.org/jira/browse/SOLR-14787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291072#comment-17291072
 ] 

Gus Heck commented on SOLR-14787:
---------------------------------

[~jbernste] This operates at the token level, not the document level. Fields 
and joins would filter at the document level. In the simple equals case the 
payload might be a string such as "noun" or "verb", and you could search for 
documents where the word "set" was used as a "noun". One could also score 
tokens for "offensiveness" (or something else), encode that score as a payload, 
and then match (or avoid matching) only tokens that were more offensive than X, 
or vice versa; that analysis could be context-sensitive, NLP-based stuff. These 
sorts of things likely slow down and inflate the index, but they enable 
detailed token-by-token functionality not otherwise available.
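
To make the token-level idea concrete, here is a rough sketch in plain Java of 
the per-token comparison, run against the delimited field value from the 
example document in the issue description. It is only an illustration and not 
the code in the patch: the class PayloadCheckSketch, its method names, and the 
"eq" spelling for the default equals case are all hypothetical, and no Lucene 
or Solr classes are used.

```java
import java.util.Locale;

/** Illustrative only: mimics a per-token payload comparison, not Solr's implementation. */
final class PayloadCheckSketch {

    /** True if `term` occurs in `fieldValue` with a payload that satisfies `op` against `threshold`. */
    static boolean matches(String fieldValue, String term, String op, float threshold) {
        // Tokens are whitespace separated; each token is "term|payload".
        for (String token : fieldValue.trim().split("\\s+")) {
            int bar = token.indexOf('|');
            if (bar < 0) {
                continue; // a token without a payload cannot match
            }
            if (!token.substring(0, bar).equals(term)) {
                continue; // a different term
            }
            float payload = Float.parseFloat(token.substring(bar + 1));
            if (compare(payload, op, threshold)) {
                return true; // one matching token is enough for the document to match
            }
        }
        return false;
    }

    /** Applies the comparison named by `op`; "eq" is a hypothetical name for the equals default. */
    static boolean compare(float payload, String op, float threshold) {
        switch (op.toLowerCase(Locale.ROOT)) {
            case "gt":  return payload > threshold;
            case "gte": return payload >= threshold;
            case "lt":  return payload < threshold;
            case "lte": return payload <= threshold;
            case "eq":  return payload == threshold;
            default: throw new IllegalArgumentException("unknown op: " + op);
        }
    }
}
```

With the field value "cat|0.75 dog|2.0", matches(value, "cat", "gt", 0.5f) 
is true, while the same call with a threshold of 0.75f is false because the 
comparison is strict.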

> Inequality support in Payload Check query parser
> ------------------------------------------------
>
>                 Key: SOLR-14787
>                 URL: https://issues.apache.org/jira/browse/SOLR-14787
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Kevin Watters
>            Assignee: Gus Heck
>            Priority: Major
>             Fix For: master (9.0)
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The goal of this ticket/pull request is to support a richer set of matching 
> and filtering based on term payloads. This patch extends the 
> PayloadCheckQueryParser to add a new local param, "op".
> The value of "op" can be one of the following:
>  * gt - greater than
>  * gte - greater than or equal
>  * lt - less than
>  * lte - less than or equal
> If "op" is not specified, it defaults to the current behavior of equals.
> In addition to the operation, you can specify a "threshold" local parameter.
> This provides the ability to search for the term "cat" so long as its 
> payload has a value greater than 0.75.
> One use case is to classify a document into various categories, each with an 
> associated confidence or probability that the classification is correct. 
> That can be indexed into a delimited payload field. Searches can then find 
> and match documents that were tagged with the "cat" category with a 
> confidence greater than 0.5.
> Example Document
> {code:java}
> { 
>   "id":"doc_1",
>   "classifications_payload":["cat|0.75 dog|2.0"]
> }
> {code}
> Example Syntax
> {code:java}
> {!payload_check f=classifications_payload payloads='1' op='gt' threshold='0.5'}cat
> {code}
