shaie commented on PR #841:
URL: https://github.com/apache/lucene/pull/841#issuecomment-1151926798

   > So unless I'm totally missing something, I think we should keep the 
`matches(byte[])` method.
   
   I don't think you're missing something, but perhaps to give a different POV, 
I am thinking about users who will want to extend the API. As a user, it feels 
more intuitive to deal with a `long[]` than ` byte[]`. Take a look at the 
impls, they are much simpler and intuitive (IMO). It also _feels_ less expert 
and approachable.
   
   Another point which may change our minds about it -- when we give users a 
`byte[]`, we force them to know how the data was encoded. The two current 
`FSM`s need to iterate on the array with `Long.BYTES` increments. If we decided 
to encode the data in the BDV differently (maybe compress it), then a `byte[]` 
API will need to be accompanied with this information as well.
   
   So again, **purely from an API perspective**, we tell the user "You give us 
`long[]` at indexing time, we'll give it you back at aggregation time". It's 
simple, readable, intuitive.
   
   Perhaps we should include a comparison of these two variants in our 
benchmark? E.g. if there's no downside to using `long[]` then I assume we'll be 
fine with keeping it? But if it performs slower, than we can stick with the 
`byte[]` to keep things well optimized.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to