benwtrent commented on issue #14758:
URL: https://github.com/apache/lucene/issues/14758#issuecomment-4253615422

   Why can't we use another field? I know it's weird for a format to be aware of 
another field. But presumably, these labels already exist in the data corpus 
and are stored as another field already (likely for filtering on text data, 
etc.). It seems really weird to effectively index the field twice when we 
likely already have the field.
   
   Something like 
   
   ```
   HnswKnnFormat(String partitionField,...)
   ```
   
   Then during index building, it would create bitsets for each value present 
in `partitionField`, and use those bitsets to create the N graphs. 
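   
   To make the bitset step concrete, here is a rough sketch using only 
`java.util.BitSet`. It is purely illustrative: `PartitionBitsets` and `build` 
are hypothetical names, not Lucene APIs, and it assumes the per-document values 
of the partition field have already been read into an array (how to get them 
is still an open question).

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Lucene. */
public final class PartitionBitsets {
  private PartitionBitsets() {}

  /**
   * Builds one bitset per distinct partition value.
   *
   * @param partitionValues assumed input: partitionValues[doc] is the value of
   *     the partition field for that doc, or null if the doc has no value
   * @return map from partition value to the set of doc ids carrying that value
   */
  public static Map<String, BitSet> build(String[] partitionValues) {
    Map<String, BitSet> bitsets = new LinkedHashMap<>();
    for (int doc = 0; doc < partitionValues.length; doc++) {
      String value = partitionValues[doc];
      if (value == null) {
        continue; // docs without a value belong to no partition
      }
      bitsets
          .computeIfAbsent(value, v -> new BitSet(partitionValues.length))
          .set(doc);
    }
    return bitsets;
  }
}
```

   Each resulting bitset would then select the vectors that go into one of the 
N graphs. The map size is the field's cardinality, which is where a limit 
could be checked.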
   
   The biggest concerns would be:
   
    - How to get that field value
    - What cardinality limits should be enforced
    - Should there also be an "all" graph?
   
   
   Actually, thinking more... I wonder if the writer could just be given a 
bitset provider or something...
   
   Purely brainstorming, maybe what I am writing here is too extreme.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

