shaie commented on PR #841: URL: https://github.com/apache/lucene/pull/841#issuecomment-1153077923
> I feel like having 2 `matches` functions in this case would make the API unnecessarily complex

As I wrote before: "_After we decide whether to stick w/ the long[] or byte[] API we'll remove the unneeded variant._" We won't have 2 APIs in the end.

> [...] and then benchmark to see if the byte-based approach is _actually_ more optimal

We will need to benchmark this of course, but I thought about this a bit and I feel like `long[]` will perform better for these reasons:

1. The `FSM` will store a `long[]` and iterate over it instead of a `byte[]`. That means fewer array accesses, and since it happens **for every** facet set in every matching doc, I feel like it's going to be more efficient than re-iterating the `byte[]`.
2. Likewise, `MFSC` will convert the BDV values **once** to `long[]`, and then all `FSMs` will be given that `long[]` to match.

It just feels more optimal, unless you have a single `FSM`, but I don't think that will be a common case.

Anyway, let's benchmark it, but with the analysis above, I also agree we should actually start with the `long[]` API, and replace it with a `byte[]` one only if it actually performs better.
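
To make the "decode once, match many" argument concrete, here is a minimal sketch of the `long[]` variant. It is not the code from the PR: the class, interface, and method names are hypothetical, and the packed layout (8 big-endian bytes per dimension, facet sets laid out back to back) is an assumption made only for illustration. The point is that the counter decodes each facet set into a reusable `long[]` once, and every matcher then reads that same array.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; names and byte layout are assumptions, not the Lucene API. */
public class FacetSetDecodeSketch {

  /** Hypothetical stand-in for an FSM that works on decoded values. */
  public interface LongMatcher {
    boolean matches(long[] dimValues);
  }

  /** Matches a facet set whose dimensions all equal the expected values. */
  public static final class ExactMatcher implements LongMatcher {
    private final long[] expected;

    public ExactMatcher(long... expected) {
      this.expected = expected;
    }

    @Override
    public boolean matches(long[] dimValues) {
      return Arrays.equals(expected, dimValues);
    }
  }

  /** Packs facet sets as consecutive big-endian longs (assumed layout). */
  public static byte[] pack(long[]... sets) {
    int total = 0;
    for (long[] set : sets) {
      total += set.length * Long.BYTES;
    }
    ByteBuffer buf = ByteBuffer.allocate(total);
    for (long[] set : sets) {
      for (long v : set) {
        buf.putLong(v);
      }
    }
    return buf.array();
  }

  /** Decodes one facet set starting at offset into the reusable out array. */
  static void decode(byte[] packed, int offset, long[] out) {
    ByteBuffer buf = ByteBuffer.wrap(packed, offset, out.length * Long.BYTES);
    for (int i = 0; i < out.length; i++) {
      out[i] = buf.getLong();
    }
  }

  /** Hypothetical stand-in for the counter: decode once per set, then ask every matcher. */
  public static int[] countMatches(
      byte[] packed, int numDims, List<? extends LongMatcher> matchers) {
    int[] counts = new int[matchers.size()];
    long[] dimValues = new long[numDims]; // reused across facet sets
    int setSize = numDims * Long.BYTES;
    for (int offset = 0; offset + setSize <= packed.length; offset += setSize) {
      decode(packed, offset, dimValues); // the only place bytes are read
      for (int m = 0; m < matchers.size(); m++) {
        if (matchers.get(m).matches(dimValues)) {
          counts[m]++;
        }
      }
    }
    return counts;
  }
}
```

With a `byte[]`-based `matches`, each matcher would have to interpret the bytes itself, so the decoding work would scale with the number of matchers rather than being paid once per facet set. Whether that difference is measurable is exactly what the benchmark would need to show.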