wirybeaver commented on issue #12774:
URL: https://github.com/apache/pinot/issues/12774#issuecomment-2041665938

   Since the JSON index internally stores flattened doc IDs, the 
transformation is tricky, but it is still doable if we also keep the number of 
flattened doc IDs per docID. The idea is below.
   Mutable Index
   
   | docID              | 0 | 1 | 2 | 3 | 4  |
   |--------------------|---|---|---|---|----|
   | flattenDocIDLength | 3 | 5 | 7 | 2 | 10 |
   
   Let's say the sortedDocID list in the immutable index is [2, 3, 4, 1, 0].
   
   | docID              | 2 | 3 | 4  | 1 | 0 |
   |--------------------|---|---|----|---|---|
   | flattenDocIDLength | 7 | 2 | 10 | 5 | 3 |
   
   Let's say we compute an array sortPos that gives each docID's position in 
sortedDocID. For the example above, sortPos = [4, 3, 0, 1, 2]. Given a 
mutableDocID, its position in sortedDocID is sortPos[mutableDocID]. (Assume the 
docID offset within the segment is 0 to keep the explanation simple.)
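A minimal sketch of how sortPos could be derived, assuming sortedDocID is a permutation of 0..n-1. The function name `build_sort_pos` is hypothetical and is not part of Pinot; the array is simply the inverse of the permutation.

```python
def build_sort_pos(sorted_doc_ids):
    """Invert the permutation: sort_pos[doc_id] = position of doc_id in sorted_doc_ids.

    Hypothetical helper for illustration only; assumes sorted_doc_ids
    contains every docID in 0..n-1 exactly once.
    """
    sort_pos = [0] * len(sorted_doc_ids)
    for position, doc_id in enumerate(sorted_doc_ids):
        sort_pos[doc_id] = position
    return sort_pos


sort_pos = build_sort_pos([2, 3, 4, 1, 0])
print(sort_pos)  # [4, 3, 0, 1, 2]
```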
   
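   The flattenDocID start offset of originalDocID then changes from 
prefixSumOfMutableDocID[originalDocID] to 
prefixSumOfSortedDocID[sortPos[originalDocID]], where each prefix sum is the 
total flattenDocIDLength of all entries that precede the given one in that 
order.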
   
   Take docID 3 for example: its flattened IDs fall in [3+5+7, 3+5+7+2) = 
[15, 17) in the mutable index and are shifted to [7, 7+2) = [7, 9) in the 
immutable index.
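The docID 3 example can be checked with a short sketch. It assumes exclusive prefix sums (the entry at index i is the sum of all lengths before i, plus one trailing total); the names `prefix_sums` and `flatten_range` are hypothetical and only illustrate the arithmetic.

```python
from itertools import accumulate


def prefix_sums(lengths):
    """Return n+1 offsets: entry i is the start offset of item i, the last entry is the total."""
    return [0] + list(accumulate(lengths))


def flatten_range(prefix, index):
    """Half-open [start, end) range of flattened IDs for the item at `index`."""
    return prefix[index], prefix[index + 1]


mutable_lengths = [3, 5, 7, 2, 10]   # flattenDocIDLength, indexed by mutable docID
sorted_doc_ids = [2, 3, 4, 1, 0]     # docID order in the immutable index
sort_pos = [4, 3, 0, 1, 2]           # inverse of sorted_doc_ids

prefix_mutable = prefix_sums(mutable_lengths)
prefix_sorted = prefix_sums([mutable_lengths[d] for d in sorted_doc_ids])

print(flatten_range(prefix_mutable, 3))           # (15, 17)
print(flatten_range(prefix_sorted, sort_pos[3]))  # (7, 9)
```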
   
   Given a flattened doc ID from the mutable segment, we can binary search 
prefixSumOfMutableDocID to find the docID it belongs to in the MutableSegment, 
and then compute the corresponding flattenDocID in the immutable segment.
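The lookup above could be sketched as follows. This is an illustration under stated assumptions, not Pinot code: the function name `remap_flatten_doc_id` is hypothetical, prefix sums are exclusive with a trailing total, and the relative order of flattened IDs within a single docID is assumed to be unchanged by sorting.

```python
from bisect import bisect_right
from itertools import accumulate


def remap_flatten_doc_id(flat_id, prefix_mutable, prefix_sorted, sort_pos):
    """Map a flattened doc ID from the mutable layout to the sorted (immutable) layout.

    Assumption: flattened IDs of one docID keep their relative order after sorting.
    """
    # Binary search for the last docID whose start offset is <= flat_id.
    doc_id = bisect_right(prefix_mutable, flat_id) - 1
    # Position of this flattened ID inside its owning docID.
    offset_in_doc = flat_id - prefix_mutable[doc_id]
    return prefix_sorted[sort_pos[doc_id]] + offset_in_doc


mutable_lengths = [3, 5, 7, 2, 10]
sorted_doc_ids = [2, 3, 4, 1, 0]
sort_pos = [4, 3, 0, 1, 2]

prefix_mutable = [0] + list(accumulate(mutable_lengths))
prefix_sorted = [0] + list(accumulate(mutable_lengths[d] for d in sorted_doc_ids))

# Flattened IDs 15 and 16 belong to docID 3 and move to 7 and 8.
print(remap_flatten_doc_id(15, prefix_mutable, prefix_sorted, sort_pos))  # 7
print(remap_flatten_doc_id(16, prefix_mutable, prefix_sorted, sort_pos))  # 8
```

Using `bisect_right` rather than `bisect_left` means a flattened ID that equals a start offset resolves to the docID that starts there, and a docID with zero flattened IDs is skipped.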


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

