Jackie-Jiang commented on PR #11557:
URL: https://github.com/apache/pinot/pull/11557#issuecomment-1718853082

   Added a benchmark and here are the results:
   
   Cardinality | Percentage | DivideBS | PlainBS | Scan | Divide vs Plain | 
Divide vs Scan
   -- | -- | -- | -- | -- | -- | --
   100 | 1 | 7659947.834 | 7193881.873 | 2590606.363 | 1.064786435 | 2.956816575
   100 | 2 | 4527399.93 | 3702032.137 | 2297514.754 | 1.222949927 | 1.970564029
   100 | 4 | 2298448.048 | 2219638.52 | 884871.898 | 1.035505569 | 2.59749242
   100 | 8 | 1624385.705 | 1188408.158 | 730888.462 | 1.366858427 | 2.22248098
   100 | 16 | 789605.698 | 561158.422 | 466697.802 | 1.407099434 | 1.691899329
   100 | 32 | 451038.526 | 270074.742 | 414345.056 | 1.670050752 | 1.08855776
   100 | 64 | 249953.379 | 145491.186 | 306353.403 | 1.717996711 | 0.8158988167
   100 | 100 | 193388.569 | 81558.625 | 252231.273 | 2.371160242 | 0.766711307
   1000 | 1 | 673035.271 | 604210.035 | 53069.877 | 1.113909455 | 12.68205824
   1000 | 2 | 331272.455 | 296964.057 | 56142.286 | 1.115530473 | 5.900587215
   1000 | 4 | 192433.934 | 154232.986 | 53796.693 | 1.247683385 | 3.577058798
   1000 | 8 | 120568.939 | 67105.009 | 46336.664 | 1.796720406 | 2.602020271
   1000 | 16 | 71785.317 | 33548.834 | 41131.107 | 2.139726138 | 1.745280452
   1000 | 32 | 42476.434 | 13445.678 | 33515.651 | 3.159114327 | 1.26736115
   1000 | 64 | 22895.909 | 7281.486 | 26235.134 | 3.144400607 | 0.8727193465
   1000 | 100 | 16424.202 | 4369.1 | 25628.041 | 3.759172827 | 0.6408684144
   10000 | 1 | 56327.402 | 36695.237 | 5402.282 | 1.53500581 | 10.42659417
   10000 | 2 | 34808.144 | 19821.462 | 5119.112 | 1.756083583 | 6.799644938
   10000 | 4 | 17176.683 | 8525.903 | 4878.576 | 2.014646777 | 3.520839483
   10000 | 8 | 7889.65 | 4484.064 | 4283.167 | 1.759486484 | 1.842013165
   10000 | 16 | 4291.993 | 2175.905 | 3454.031 | 1.97250937 | 1.242604076
   10000 | 32 | 2414.918 | 1111.743 | 2897.232 | 2.172190875 | 0.833525931
   10000 | 64 | 1514.006 | 569.737 | 1955.941 | 2.657377 | 0.7740550456
   10000 | 100 | 1319.788 | 380.992 | 2126.729 | 3.464083235 | 0.6205717795
   100000 | 1 | 3995.728 | 2898.882 | 500.751 | 1.378368626 | 7.979470835
   100000 | 2 | 2205.853 | 1299.111 | 496.293 | 1.697971151 | 4.4446587
   100000 | 4 | 1309.562 | 752.165 | 453.279 | 1.741056816 | 2.889085971
   100000 | 8 | 759.235 | 355.94 | 416.158 | 2.133042086 | 1.824391217
   100000 | 16 | 413.461 | 168.189 | 338.497 | 2.45831178 | 1.221461342
   100000 | 32 | 210.711 | 74.033 | 234.106 | 2.846176705 | 0.9000666365
   100000 | 64 | 103.367 | 38.865 | 134.722 | 2.659642352 | 0.7672614718
   100000 | 100 | 70.296 | 24.18 | 96.128 | 2.90719603 | 0.7312749667
   1000000 | 1 | 330.142 | 184.781 | 47.947 | 1.786666378 | 6.88556114
   1000000 | 2 | 147.552 | 89.482 | 48.158 | 1.648957332 | 3.063914614
   1000000 | 4 | 75.214 | 48.161 | 38.61 | 1.561720064 | 1.948044548
   1000000 | 8 | 39.693 | 23.755 | 30.018 | 1.670932435 | 1.322306616
   1000000 | 16 | 23.65 | 12.317 | 23.324 | 1.920110416 | 1.013977019
   1000000 | 32 | 12.367 | 5.821 | 13.327 | 2.124549047 | 0.9279657837
   1000000 | 64 | 6.577 | 2.959 | 7.576 | 2.222710375 | 0.8681362196
   1000000 | 100 | 4.734 | 1.932 | 5.345 | 2.450310559 | 0.8856875585
   
   As shown above, divide binary search always outperforms plain binary search. 
When the lookup value percentage is very high (over half of the dictionary), 
scan start outperforming divide binary search, but not by much.
   Based on this, I've decided to keep all 3 algorithms and use divide binary 
search as the default


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to