[ https://issues.apache.org/jira/browse/LUCENE-10297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Feng Guo updated LUCENE-10297:
------------------------------
Description:
We already have a bitset optimization for low cardinality fields, but it only kicks in for extremely low cardinality fields (doc count > 1/16 of total docs); medium cardinality cases, like 32 or 128 distinct values, rarely qualify for it.

In [https://github.com/apache/lucene-solr/pull/1538], we made an effort to use readLELongs to speed up BKD id blocks, but did not see an obvious gain from that approach. I think the reason could be that we were trying to optimize the unsorted situation (typical for high cardinality fields), and the bottleneck of queries on high cardinality fields is {{visitDocValues}} rather than {{readDocIds}}. (Not sure; I'm running more benchmarks on this.) Medium cardinality fields, however, are good candidates for this optimization because they need to read many ids for each term. The basic idea is that we can compute the deltas of the sorted ids and encode/decode them the way we do in {{StoredFieldsInts}}. I benchmarked the optimization by mocking some random LongPoint values and querying them with {{PointInSetQuery}}. As expected, the medium cardinality fields sped up, and the high cardinality fields got even results.
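The delta idea above can be sketched as follows. This is a minimal illustration, not Lucene's actual codec code: the class and method names are hypothetical, and the real implementation would additionally bit-pack the deltas and read them back with bulk little-endian long reads.

```java
import java.util.Arrays;

// Hypothetical sketch: sort the doc ids of a block, store the deltas between
// consecutive ids, and rebuild the ids with a running sum on read. Sorted
// deltas are small non-negative numbers, so they pack into far fewer bits
// than raw doc ids.
public class DeltaDocIdCodec {

    // Encode: sort a copy of the ids, then replace each id with its delta
    // from the previous id (the first delta is taken from 0).
    public static int[] encode(int[] docIds) {
        int[] sorted = docIds.clone();
        Arrays.sort(sorted);
        int[] deltas = new int[sorted.length];
        int prev = 0;
        for (int i = 0; i < sorted.length; i++) {
            deltas[i] = sorted[i] - prev;
            prev = sorted[i];
        }
        return deltas;
    }

    // Decode: prefix-sum the deltas to recover the sorted doc ids.
    public static int[] decode(int[] deltas) {
        int[] docIds = new int[deltas.length];
        int sum = 0;
        for (int i = 0; i < deltas.length; i++) {
            sum += deltas[i];
            docIds[i] = sum;
        }
        return docIds;
    }
}
```

Note that decoding gives back the ids in sorted order, which is fine for a {{PointInSetQuery}} that only collects matching docs.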
*Benchmark Result*
|doc count|field cardinality|query point|baseline(ms)|candidate(ms)|diff percentage|baseline(QPS)|candidate(QPS)|diff percentage|
|100000000|32|1|19|16|-15.79%|52.63|62.50|18.75%|
|100000000|32|2|34|14|-58.82%|29.41|71.43|142.86%|
|100000000|32|4|76|22|-71.05%|13.16|45.45|245.45%|
|100000000|32|8|139|42|-69.78%|7.19|23.81|230.95%|
|100000000|32|16|279|82|-70.61%|3.58|12.20|240.24%|
|100000000|128|1|17|11|-35.29%|58.82|90.91|54.55%|
|100000000|128|8|75|23|-69.33%|13.33|43.48|226.09%|
|100000000|128|16|126|25|-80.16%|7.94|40.00|404.00%|
|100000000|128|32|245|50|-79.59%|4.08|20.00|390.00%|
|100000000|128|64|528|97|-81.63%|1.89|10.31|444.33%|
|100000000|1024|1|3|2|-33.33%|333.33|500.00|50.00%|
|100000000|1024|8|13|8|-38.46%|76.92|125.00|62.50%|
|100000000|1024|32|31|19|-38.71%|32.26|52.63|63.16%|
|100000000|1024|128|120|67|-44.17%|8.33|14.93|79.10%|
|100000000|1024|512|480|133|-72.29%|2.08|7.52|260.90%|
|100000000|8192|1|3|3|0.00%|333.33|333.33|0.00%|
|100000000|8192|16|18|15|-16.67%|55.56|66.67|20.00%|
|100000000|8192|64|19|14|-26.32%|52.63|71.43|35.71%|
|100000000|8192|512|69|43|-37.68%|14.49|23.26|60.47%|
|100000000|8192|2048|236|134|-43.22%|4.24|7.46|76.12%|
|100000000|1048576|1|3|2|-33.33%|333.33|500.00|50.00%|
|100000000|1048576|16|18|19|5.56%|55.56|52.63|-5.26%|
|100000000|1048576|64|17|17|0.00%|58.82|58.82|0.00%|
|100000000|1048576|512|34|32|-5.88%|29.41|31.25|6.25%|
|100000000|1048576|2048|89|93|4.49%|11.24|10.75|-4.30%|
> Speed up medium cardinality fields with readLongs and SIMD
> ----------------------------------------------------------
>
>                 Key: LUCENE-10297
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10297
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Feng Guo
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
--
This message was sent by Atlassian Jira
(v8.20.1#820001)