[ https://issues.apache.org/jira/browse/LUCENE-10297 ]


    Feng Guo deleted comment on LUCENE-10297:
    -----------------------------------

was (Author: gf2121):
Several side findings and thoughts that need confirmation:

* This optimization can probably make the storage larger, because we store all 
deltas with 16 bits if any large delta is found. Since the {{readInts}} method in 
{{StoredFieldsInts}} treats 128 ints as a block to get auto-vectorization, I 
wonder if we could compute the bitsRequired for each block of 128 ints instead of 
512 ints? That would probably reduce the storage size, but at the cost of not 
being able to reuse the logic in {{StoredFieldsInts}}. I'm not sure which way is 
better here, or whether there is another way that trades off storage and speed 
better. (A rough sketch of this idea follows the code block below.)

* The {{readInts8}} method in {{StoredFieldsInts}} does not seem to be the most 
efficient implementation: the following method runs faster. With it, the 
[cardinality=32, query point=16] task in the benchmark mentioned above decreased 
from 82ms to 67ms. Maybe we should use this for 8-bit ints in this case?

{code:java}
// write side: first doc as a vInt, then one byte per delta, assuming
// delta[i] = docIds[start + i] - docIds[start + i - 1]; the byte at index 0 is
// ignored on the read side, it just keeps the block at exactly 'count' bytes
out.writeByte((byte) 1);
out.writeVInt(docIds[start]);
for (int i = 0; i < count; i++) {
  out.writeByte((byte) delta[i]);
}

// read side: one bulk readBytes call, then a simple scalar prefix sum
int previous = in.readVInt();
in.readBytes(scratch, 0, count);
visitor.visit(previous);
for (int i = 1; i < count; i++) {
  previous += scratch[i] & 0xff;
  visitor.visit(previous);
}
{code}
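
To make the first point concrete, here is a minimal sketch of choosing the bit 
width per block of 128 deltas instead of per 512. This is not the actual 
{{StoredFieldsInts}} code: the method names, the {{BLOCK_SIZE}} constant and the 
rounding of the width to 8/16/32 bits are just for illustration, and 
{{DataOutput}} is assumed to be {{org.apache.lucene.store.DataOutput}}.

{code:java}
// Sketch only: pick the bit width once per 128 deltas instead of once per 512.
static final int BLOCK_SIZE = 128; // same block size that readInts vectorizes over

static int bitsRequired(int[] deltas, int from, int to) {
  int or = 0;
  for (int i = from; i < to; i++) {
    or |= deltas[i];
  }
  return Math.max(1, 32 - Integer.numberOfLeadingZeros(or));
}

static void writeBlocks(DataOutput out, int[] deltas, int count) throws IOException {
  for (int blockStart = 0; blockStart < count; blockStart += BLOCK_SIZE) {
    int blockEnd = Math.min(blockStart + BLOCK_SIZE, count);
    int bpv = bitsRequired(deltas, blockStart, blockEnd); // width for this block only
    out.writeByte((byte) bpv);                            // small per-block header
    // round up to the 8/16/32-bit cases that can still be decoded in bulk
    if (bpv <= 8) {
      for (int i = blockStart; i < blockEnd; i++) out.writeByte((byte) deltas[i]);
    } else if (bpv <= 16) {
      for (int i = blockStart; i < blockEnd; i++) out.writeShort((short) deltas[i]);
    } else {
      for (int i = blockStart; i < blockEnd; i++) out.writeInt(deltas[i]);
    }
  }
}
{code}

The trade-off then becomes a few extra header bytes and less reuse of the 
existing {{StoredFieldsInts}} logic, in exchange for not paying 16 bits per delta 
across a whole 512-int run just because of one outlier.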



> Speed up medium cardinality fields with readLongs and SIMD
> ----------------------------------------------------------
>
>                 Key: LUCENE-10297
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10297
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Feng Guo
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We introduced a bitset optimization for extremely low cardinality fields in 
> [LUCENE-10233|https://issues.apache.org/jira/browse/LUCENE-10233], but medium 
> cardinality fields (like 32/128) can rarely trigger this optimization, so I'm 
> trying to find a way to speed them up.
> In [https://github.com/apache/lucene-solr/pull/1538], we made some effort to 
> use readLELongs to speed up BKD id blocks, but did not get an obvious gain from 
> this approach. I think the reason could be that we were trying to optimize the 
> unsorted situation (which typically happens for high cardinality fields), and 
> the bottleneck of queries on high cardinality fields is {{visitDocValues}}, not 
> {{readDocIds}}. _(Not sure, I'm doing some more benchmarking on this.)_
> However, medium cardinality fields may be tempting for this optimization 
> because they need to read lots of ids for each term. The basic idea is that we 
> can compute the deltas of the sorted ids and encode/decode them like we do in 
> {{StoredFieldsInts}}. I benchmarked the optimization by mocking some random 
> LongPoint fields and querying them with {{PointInSetQuery}} (a rough sketch of 
> this setup follows the table below). As expected, the medium cardinality fields 
> got sped up and the high cardinality fields got even results.
> *Benchmark Result*
> |doc count|field cardinality|query point|baseline(ms)|candidate(ms)|diff 
> percentage|baseline(QPS)|candidate(QPS)|diff percentage|
> |100000000|32|1|19|16|-15.79%|52.63|62.50|18.75%|
> |100000000|32|2|34|14|-58.82%|29.41|71.43|142.86%|
> |100000000|32|4|76|22|-71.05%|13.16|45.45|245.45%|
> |100000000|32|8|139|42|-69.78%|7.19|23.81|230.95%|
> |100000000|32|16|279|82|-70.61%|3.58|12.20|240.24%|
> |100000000|128|1|17|11|-35.29%|58.82|90.91|54.55%|
> |100000000|128|8|75|23|-69.33%|13.33|43.48|226.09%|
> |100000000|128|16|126|25|-80.16%|7.94|40.00|404.00%|
> |100000000|128|32|245|50|-79.59%|4.08|20.00|390.00%|
> |100000000|128|64|528|97|-81.63%|1.89|10.31|444.33%|
> |100000000|1024|1|3|2|-33.33%|333.33|500.00|50.00%|
> |100000000|1024|8|13|8|-38.46%|76.92|125.00|62.50%|
> |100000000|1024|32|31|19|-38.71%|32.26|52.63|63.16%|
> |100000000|1024|128|120|67|-44.17%|8.33|14.93|79.10%|
> |100000000|1024|512|480|133|-72.29%|2.08|7.52|260.90%|
> |100000000|8192|1|3|3|0.00%|333.33|333.33|0.00%|
> |100000000|8192|16|18|15|-16.67%|55.56|66.67|20.00%|
> |100000000|8192|64|19|14|-26.32%|52.63|71.43|35.71%|
> |100000000|8192|512|69|43|-37.68%|14.49|23.26|60.47%|
> |100000000|8192|2048|236|134|-43.22%|4.24|7.46|76.12%|
> |100000000|1048576|1|3|2|-33.33%|333.33|500.00|50.00%|
> |100000000|1048576|16|18|19|5.56%|55.56|52.63|-5.26%|
> |100000000|1048576|64|17|17|0.00%|58.82|58.82|0.00%|
> |100000000|1048576|512|34|32|-5.88%|29.41|31.25|6.25%|
> |100000000|1048576|2048|89|93|4.49%|11.24|10.75|-4.30%|
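
For reference, a rough sketch of the benchmark setup described above. This is not 
the actual harness: the field name, index path, seed and the 
{{docCount}}/{{cardinality}}/{{queryPoints}} values are illustrative, taken from 
one row of the table.

{code:java}
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Random;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class PointInSetBench {
  public static void main(String[] args) throws Exception {
    int docCount = 100_000_000; // "doc count" column
    int cardinality = 128;      // "field cardinality" column
    int queryPoints = 16;       // "query point" column
    Random random = new Random(42);

    Path path = Paths.get("pointbench-index");
    try (Directory dir = FSDirectory.open(path)) {
      // index random values drawn from [0, cardinality)
      try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
        for (int i = 0; i < docCount; i++) {
          Document doc = new Document();
          doc.add(new LongPoint("field", random.nextInt(cardinality)));
          writer.addDocument(doc);
        }
        writer.forceMerge(1);
      }

      // query a random set of points; LongPoint#newSetQuery builds a PointInSetQuery
      long[] points = new long[queryPoints];
      for (int i = 0; i < queryPoints; i++) {
        points[i] = random.nextInt(cardinality);
      }
      Query query = LongPoint.newSetQuery("field", points);

      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        IndexSearcher searcher = new IndexSearcher(reader);
        long start = System.nanoTime();
        int hits = searcher.count(query);
        System.out.println(hits + " hits in " + (System.nanoTime() - start) / 1_000_000 + "ms");
      }
    }
  }
}
{code}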



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
