yashmayya opened a new pull request, #13139:
URL: https://github.com/apache/pinot/pull/13139

   - The JSON index currently doesn't support nested (within `AND` / `OR`) 
exclusive predicates (`NOT IN`, `!=`, `IS NULL`) with the `JSON_MATCH` filter 
operator.
   - The reason is that the existing semantics for exclusive predicates are 
inconsistent and confusing.
   - A document like:
     - ```
        [
          {
            "key1": "value1",
            "key2": 1
          },
          {
            "key1": "value2",
            "key2": 2
          }
        ]
       ```
       matches `"$[*].key2" = 1`, but not `"$[*].key2" != 2` for example.
   - Essentially, with wildcard array access, an inclusive predicate matches 
the array when ANY element satisfies it, whereas an exclusive predicate matches 
the array only when ALL elements satisfy it.
   - Furthermore, a document like:
     - ```
        [
          {
            "key1": "value1"
          }
        ]
       ```
       matches `"$[0].key2" != 2` which is counterintuitive since the object 
doesn’t contain `key2` at all.
   - Both of these inconsistencies stem from the way exclusive predicates are 
implemented. They are currently handled by finding the [flattened doc 
IDs](https://docs.pinot.apache.org/basics/indexing/json-index#example) of the 
corresponding inclusive predicate, mapping them back to the corresponding 
unflattened doc IDs, and then inverting the resulting bitmap. This is also why 
nested exclusive predicates aren’t currently supported: the intermediate 
flattened doc ID results can’t be combined with flattened doc ID results from 
other child predicates while maintaining the same semantics.
   - This patch fixes the above inconsistencies so that `[*]` now means the 
same for both exclusive and inclusive predicates, and documents where a key 
doesn't exist are no longer matched by the exclusive predicates `NOT IN` and 
`!=`. It also adds support for nested exclusive predicates.
   - The new implementation for `NOT IN` and `!=` makes use of logic similar to 
the implementation for the regex and range predicates 
(https://github.com/apache/pinot/pull/12568).
   - Essentially, we first get all the values for the corresponding key. Then 
we iterate through the values, and for each value that matches the predicate, 
we add the list of corresponding flattened doc IDs to the result.
   - The `IS NULL` predicate is handled by inverting the flattened doc ID 
bitmap of the corresponding `IS NOT NULL` predicate.
   - Suggested labels: `feature`, `backward-incompat`, `release-notes`
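
To make the pre-patch behavior described above concrete, here is a minimal Python sketch of the "evaluate the inclusive predicate, map to unflattened doc IDs, then invert" approach. It is a simplified model and not Pinot's actual code: the names `flatten`, `match_eq` and `old_not_eq` are hypothetical, each top-level array element is assumed to become exactly one flattened record, and plain Python sets stand in for bitmaps.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

Doc = List[Dict[str, Any]]  # a document is a top-level JSON array of objects


def flatten(docs: List[Doc]) -> Tuple[List[Tuple[int, Dict[str, Any]]], List[int]]:
    """Turn every array element into one flattened record (simplified model)."""
    records = []      # flattened ID -> (array index, element)
    flat_to_doc = []  # flattened ID -> unflattened doc ID
    for doc_id, doc in enumerate(docs):
        for array_index, element in enumerate(doc):
            records.append((array_index, element))
            flat_to_doc.append(doc_id)
    return records, flat_to_doc


def match_eq(docs: List[Doc], key: str, value: Any,
             index: Optional[int] = None) -> Set[int]:
    """Inclusive predicate: doc IDs where ANY selected element has key == value.

    index=None models `$[*]`; index=0 models `$[0]`.
    """
    records, flat_to_doc = flatten(docs)
    flat_ids = {
        fid for fid, (array_index, element) in enumerate(records)
        if (index is None or array_index == index)
        and key in element and element[key] == value
    }
    # Map flattened IDs back to unflattened doc IDs.
    return {flat_to_doc[fid] for fid in flat_ids}


def old_not_eq(docs: List[Doc], key: str, value: Any,
               index: Optional[int] = None) -> Set[int]:
    """Pre-patch exclusive predicate: invert the inclusive result per document."""
    return set(range(len(docs))) - match_eq(docs, key, value, index)


# The two example documents from the description (doc IDs 0 and 1).
docs = [
    [{"key1": "value1", "key2": 1}, {"key1": "value2", "key2": 2}],
    [{"key1": "value1"}],
]
```

In this model, `old_not_eq(docs, "key2", 2)` leaves out the first document because one of its elements has `key2 = 2`, and `old_not_eq(docs, "key2", 2, index=0)` includes the second document even though it has no `key2` at all, which mirrors the two surprises listed above.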
   
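The new approach described in this PR (iterate over the values of a key and collect flattened doc IDs for the values that satisfy the predicate) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the actual implementation: `build_index`, `flat_ids_where`, `flat_ids_is_null` and `to_doc_ids` are hypothetical names, the posting lists are plain dicts of sets instead of bitmaps, and combining a nested `AND` by intersecting flattened doc IDs is an assumed reading of the description.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Set, Tuple

Doc = List[Dict[str, Any]]  # a document is a top-level JSON array of objects
Postings = Dict[str, Dict[Any, Set[int]]]  # key -> value -> flattened doc IDs


def build_index(docs: List[Doc]) -> Tuple[Postings, List[int]]:
    """Build toy posting lists; each array element is one flattened record."""
    postings: Postings = defaultdict(lambda: defaultdict(set))
    flat_to_doc: List[int] = []  # flattened ID -> unflattened doc ID
    for doc_id, doc in enumerate(docs):
        for element in doc:
            flat_id = len(flat_to_doc)
            flat_to_doc.append(doc_id)
            for key, value in element.items():
                postings[key][value].add(flat_id)
    return postings, flat_to_doc


def flat_ids_where(postings: Postings, key: str,
                   predicate: Callable[[Any], bool]) -> Set[int]:
    """Union the flattened IDs of every value of `key` that satisfies `predicate`.

    Records without `key` have no value to test, so they are never returned.
    """
    result: Set[int] = set()
    for value, flat_ids in postings.get(key, {}).items():
        if predicate(value):
            result |= flat_ids
    return result


def flat_ids_is_null(postings: Postings, flat_to_doc: List[int], key: str) -> Set[int]:
    """IS NULL: invert the flattened-ID result of IS NOT NULL."""
    not_null = flat_ids_where(postings, key, lambda _value: True)
    return set(range(len(flat_to_doc))) - not_null


def to_doc_ids(flat_ids: Set[int], flat_to_doc: List[int]) -> Set[int]:
    """Map flattened doc IDs back to unflattened doc IDs."""
    return {flat_to_doc[fid] for fid in flat_ids}


# The two example documents from the description (doc IDs 0 and 1).
docs = [
    [{"key1": "value1", "key2": 1}, {"key1": "value2", "key2": 2}],
    [{"key1": "value1"}],
]
postings, flat_to_doc = build_index(docs)

# key2 != 2, evaluated on flattened records
not_eq_2 = flat_ids_where(postings, "key2", lambda v: v != 2)
# key1 = 'value1' AND key2 != 2, combined before mapping back to doc IDs
key1_is_value1 = flat_ids_where(postings, "key1", lambda v: v == "value1")
nested_and = key1_is_value1 & not_eq_2
```

Because every predicate here yields flattened doc IDs, the exclusive result can be intersected or unioned with other child predicates before the final mapping step. In this toy model the first document matches `key2 != 2` (one element has `key2 = 1`), while the second document, which lacks `key2`, matches only the `IS NULL` case.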


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

