neoremind opened a new pull request, #16280: URL: https://github.com/apache/lucene/pull/16280
## Background

In #13863, `ByteBuffersDataOutput.writeString()` was optimized to avoid allocating a `BytesRef` and copying bytes to the destination buffer; instead, it encodes directly in place. However, this requires two passes over the input string's chars: first `calcUTF16toUTF8Length` to compute the VInt length prefix, then `UTF16toUTF8` for the UTF-8 encoding. The opportunity: for short strings, we can skip that first pass.

## What this PR does

This PR adds a single-pass fast path for short strings (charCount <= 42). For these, the maximum UTF-8 byte length is `42 * 3 = 126`, which always fits in a 1-byte VInt, so we know the size of the VInt prefix without scanning the string's chars up front. We reserve 1 byte, encode directly into the destination buffer, then backfill the length. Strings that don't qualify for the fast path fall back to the existing logic.

To my understanding, this could benefit stored-field writes of short strings such as business-related keywords, IDs, and titles, as well as short strings like field infos, codec metadata, and segment names.

## Benchmarks

I added a JMH benchmark comparing the new impl against the current one across ASCII, CJK, and Latin-extended strings at various lengths. See [here](https://github.com/apache/lucene/compare/main...neoremind:lucene:bbo_writestring_fast_path_bench?expand=1#diff-d6793a43f462bc34205113c143695761c1fbe50e7197494de9bc4686569fc8c6R451) for the copy of the current impl that is kept to allow an apples-to-apples comparison. The target written byte sizes match stored fields chunk sizes: 80KB (BEST_SPEED default), 480KB (BEST_COMPRESSION default), and 2MB (imagine a customized larger chunk in the stored fields .fdt file). The benchmark uses a resettable `ByteBuffersDataOutput` starting with 1KB blocks to mimic a real-world workload.

Results show notable gains on short strings, with no regressions on medium/long/very large strings (only acceptable jitter, from what I saw), which fall through to the unchanged logic.

Throughput is in ops/s. Each run writes the target byte size into the buffer. Measured on an EC2 m5.2xlarge.
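For reference, the reserve/encode/backfill idea described under "What this PR does" can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this PR: it writes into a flat `byte[]` rather than the block list of `ByteBuffersDataOutput`, the class and method names (`ShortStringFastPathSketch`, `encodeUtf8`, `MAX_FAST_PATH_CHARS`) are hypothetical, the slow path is a stand-in built on `String.getBytes`, and the input is assumed to be well-formed UTF-16.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the reserve/encode/backfill idea; not Lucene's implementation. */
final class ShortStringFastPathSketch {
  // 42 chars * at most 3 UTF-8 bytes per UTF-16 char = 126, which fits in a 1-byte VInt (max 127).
  static final int MAX_FAST_PATH_CHARS = 42;

  /** Writes VInt(utf8Length) followed by the UTF-8 bytes of s; returns the new write position. */
  static int writeString(byte[] buf, int pos, String s) {
    if (s.length() <= MAX_FAST_PATH_CHARS) {
      int lengthPos = pos;                            // reserve one byte for the length prefix
      int end = encodeUtf8(s, buf, pos + 1);          // single pass over the chars
      buf[lengthPos] = (byte) (end - lengthPos - 1);  // backfill; value <= 126, so the high bit is clear
      return end;
    }
    // Stand-in for the existing path: determine the length first, then write the bytes.
    byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
    pos = writeVInt(buf, pos, utf8.length);
    System.arraycopy(utf8, 0, buf, pos, utf8.length);
    return pos + utf8.length;
  }

  /** Encodes s as UTF-8 starting at out[pos]; assumes well-formed UTF-16 (no unpaired surrogates). */
  static int encodeUtf8(String s, byte[] out, int pos) {
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      i += Character.charCount(cp);
      if (cp < 0x80) {
        out[pos++] = (byte) cp;
      } else if (cp < 0x800) {
        out[pos++] = (byte) (0xC0 | (cp >> 6));
        out[pos++] = (byte) (0x80 | (cp & 0x3F));
      } else if (cp < 0x10000) {
        out[pos++] = (byte) (0xE0 | (cp >> 12));
        out[pos++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
        out[pos++] = (byte) (0x80 | (cp & 0x3F));
      } else {
        // Supplementary code point: 2 chars in, 4 bytes out (still at most 3 bytes per char).
        out[pos++] = (byte) (0xF0 | (cp >> 18));
        out[pos++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
        out[pos++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
        out[pos++] = (byte) (0x80 | (cp & 0x3F));
      }
    }
    return pos;
  }

  /** Variable-length int: 7 payload bits per byte, high bit set on every byte except the last. */
  static int writeVInt(byte[] buf, int pos, int v) {
    while ((v & ~0x7F) != 0) {
      buf[pos++] = (byte) ((v & 0x7F) | 0x80);
      v >>>= 7;
    }
    buf[pos++] = (byte) v;
    return pos;
  }

  public static void main(String[] args) {
    byte[] buf = new byte[256];
    int end = writeString(buf, 0, "segment_1");
    System.out.println("prefix=" + buf[0] + ", bytesWritten=" + end);
  }
}
```

The point of the sketch is only that the fast path touches each char once: the prefix position is known before encoding starts, so the length can be filled in afterwards.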
<details>
<summary> See detailed results </summary>

```
Benchmark (stringType) (targetBytes) Mode Cnt Score Error Units
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_1 81920 thrpt 15 1924.154 ± 3.998 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_1 491520 thrpt 15 325.054 ± 0.712 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_1 2097152 thrpt 15 77.335 ± 0.249 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_10 81920 thrpt 15 5127.397 ± 124.657 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_10 491520 thrpt 15 894.737 ± 4.701 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_10 2097152 thrpt 15 206.414 ± 2.523 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_20 81920 thrpt 15 7907.056 ± 28.022 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_20 491520 thrpt 15 1374.817 ± 4.420 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_20 2097152 thrpt 15 325.101 ± 0.932 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_30 81920 thrpt 15 9654.601 ± 40.498 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_30 491520 thrpt 15 1764.192 ± 6.306 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_30 2097152 thrpt 15 416.434 ± 1.790 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_40 81920 thrpt 15 10563.802 ± 30.043 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_40 491520 thrpt 15 1891.552 ± 4.140 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_40 2097152 thrpt 15 449.588 ± 4.443 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_medium 81920 thrpt 15 9263.776 ± 98.204 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_medium 491520 thrpt 15 1514.433 ± 0.863 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_medium 2097152 thrpt 15 356.831 ± 0.588 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_long 81920 thrpt 15 12117.442 ± 424.084 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_long 491520 thrpt 15 2114.019 ± 2.865 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_long 2097152 thrpt 15 503.861 ± 5.616 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_vlarge 81920 thrpt 15 11603.539 ± 28.604 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_vlarge 491520 thrpt 15 2050.525 ± 1.159 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl ascii_vlarge 2097152 thrpt 15 519.435 ± 5.892 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_1 81920 thrpt 15 3598.613 ± 27.463 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_1 491520 thrpt 15 589.760 ± 2.930 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_1 2097152 thrpt 15 142.267 ± 1.822 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_10 81920 thrpt 15 6516.930 ± 155.093 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_10 491520 thrpt 15 1124.501 ± 51.999 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_10 2097152 thrpt 15 268.392 ± 10.699 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_20 81920 thrpt 15 7444.068 ± 28.467 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_20 491520 thrpt 15 1251.821 ± 63.880 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_20 2097152 thrpt 15 316.346 ± 4.879 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_30 81920 thrpt 15 7735.062 ± 33.040 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_30 491520 thrpt 15 1369.589 ± 23.248 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_30 2097152 thrpt 15 310.114 ± 12.392 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_40 81920 thrpt 15 7861.299 ± 44.006 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_40 491520 thrpt 15 1426.798 ± 1.373 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_40 2097152 thrpt 15 328.560 ± 8.392 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_medium 81920 thrpt 15 5302.579 ± 67.898 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_medium 491520 thrpt 15 829.204 ± 5.262 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_medium 2097152 thrpt 15 210.442 ± 0.308 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_long 81920 thrpt 15 5704.934 ± 119.140 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_long 491520 thrpt 15 934.739 ± 31.456 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_long 2097152 thrpt 15 211.968 ± 3.531 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_vlarge 81920 thrpt 15 6736.329 ± 244.534 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_vlarge 491520 thrpt 15 927.611 ± 12.725 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl cjk_vlarge 2097152 thrpt 15 231.230 ± 4.009 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_1 81920 thrpt 15 2330.881 ± 32.202 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_1 491520 thrpt 15 398.409 ± 5.090 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_1 2097152 thrpt 15 93.175 ± 1.428 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_10 81920 thrpt 15 4296.039 ± 48.292 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_10 491520 thrpt 15 748.831 ± 5.288 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_10 2097152 thrpt 15 178.731 ± 2.817 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_20 81920 thrpt 15 4953.465 ± 80.963 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_20 491520 thrpt 15 859.932 ± 27.221 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_20 2097152 thrpt 15 206.179 ± 6.109 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_30 81920 thrpt 15 5053.684 ± 232.941 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_30 491520 thrpt 15 878.187 ± 10.097 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_30 2097152 thrpt 15 208.340 ± 1.234 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_40 81920 thrpt 15 4932.669 ± 9.067 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_40 491520 thrpt 15 962.194 ± 57.633 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_40 2097152 thrpt 15 216.052 ± 2.011 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_medium 81920 thrpt 15 3523.366 ± 14.522 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_medium 491520 thrpt 15 593.160 ± 3.174 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_medium 2097152 thrpt 15 138.684 ± 0.154 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_long 81920 thrpt 15 3652.496 ± 86.858 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_long 491520 thrpt 15 630.856 ± 23.506 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_long 2097152 thrpt 15 152.758 ± 5.463 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_vlarge 81920 thrpt 15 4227.879 ± 7.569 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_vlarge 491520 thrpt 15 633.812 ± 1.601 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl latin_ext_vlarge 2097152 thrpt 15 148.096 ± 0.526 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl mixed 81920 thrpt 15 2610.423 ± 8.035 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl mixed 491520 thrpt 15 526.189 ± 11.442 ops/s
ByteBuffersDataOutputWriteStringBenchmark.newImpl mixed 2097152 thrpt 15 117.501 ± 5.147 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_1 81920 thrpt 15 1449.904 ± 0.730 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_1 491520 thrpt 15 237.547 ± 0.981 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_1 2097152 thrpt 15 55.849 ± 0.035 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_10 81920 thrpt 15 3632.715 ± 7.330 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_10 491520 thrpt 15 608.009 ± 1.032 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_10 2097152 thrpt 15 143.089 ± 0.086 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_20 81920 thrpt 15 5513.255 ± 16.047 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_20 491520 thrpt 15 939.471 ± 0.893 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_20 2097152 thrpt 15 221.746 ± 0.437 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_30 81920 thrpt 15 6810.637 ± 33.651 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_30 491520 thrpt 15 1180.119 ± 2.552 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_30 2097152 thrpt 15 276.847 ± 0.688 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_40 81920 thrpt 15 7800.776 ± 14.315 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_40 491520 thrpt 15 1310.465 ± 2.490 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_40 2097152 thrpt 15 311.610 ± 0.348 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_medium 81920 thrpt 15 9042.239 ± 37.124 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_medium 491520 thrpt 15 1470.004 ± 5.105 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_medium 2097152 thrpt 15 346.409 ± 0.763 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_long 81920 thrpt 15 10884.157 ± 32.714 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_long 491520 thrpt 15 2047.124 ± 3.786 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_long 2097152 thrpt 15 485.906 ± 0.356 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_vlarge 81920 thrpt 15 11570.370 ± 10.070 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_vlarge 491520 thrpt 15 2070.484 ± 1.673 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl ascii_vlarge 2097152 thrpt 15 506.705 ± 11.358 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_1 81920 thrpt 15 2732.453 ± 18.110 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_1 491520 thrpt 15 473.930 ± 11.438 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_1 2097152 thrpt 15 109.360 ± 2.644 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_10 81920 thrpt 15 4078.860 ± 229.551 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_10 491520 thrpt 15 729.199 ± 42.046 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_10 2097152 thrpt 15 163.849 ± 0.211 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_20 81920 thrpt 15 4728.439 ± 108.248 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_20 491520 thrpt 15 756.027 ± 28.522 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_20 2097152 thrpt 15 180.958 ± 11.565 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_30 81920 thrpt 15 4945.852 ± 123.435 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_30 491520 thrpt 15 853.268 ± 4.967 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_30 2097152 thrpt 15 199.801 ± 0.083 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_40 81920 thrpt 15 5080.684 ± 114.575 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_40 491520 thrpt 15 872.155 ± 0.935 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_40 2097152 thrpt 15 198.099 ± 5.012 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_medium 81920 thrpt 15 5114.304 ± 16.729 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_medium 491520 thrpt 15 836.790 ± 3.880 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_medium 2097152 thrpt 15 193.791 ± 14.359 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_long 81920 thrpt 15 5636.091 ± 96.048 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_long 491520 thrpt 15 899.898 ± 4.430 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_long 2097152 thrpt 15 211.120 ± 0.845 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_vlarge 81920 thrpt 15 6610.988 ± 368.882 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_vlarge 491520 thrpt 15 897.061 ± 15.893 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl cjk_vlarge 2097152 thrpt 15 226.848 ± 9.797 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_1 81920 thrpt 15 1707.395 ± 20.488 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_1 491520 thrpt 15 290.791 ± 0.661 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_1 2097152 thrpt 15 68.084 ± 0.438 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_10 81920 thrpt 15 2562.599 ± 27.365 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_10 491520 thrpt 15 437.844 ± 3.480 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_10 2097152 thrpt 15 103.573 ± 0.355 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_20 81920 thrpt 15 2849.567 ± 5.463 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_20 491520 thrpt 15 488.922 ± 4.148 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_20 2097152 thrpt 15 114.500 ± 0.159 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_30 81920 thrpt 15 3112.005 ± 104.903 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_30 491520 thrpt 15 519.170 ± 1.386 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_30 2097152 thrpt 15 125.173 ± 4.172 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_40 81920 thrpt 15 3159.485 ± 13.467 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_40 491520 thrpt 15 545.461 ± 10.699 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_40 2097152 thrpt 15 129.708 ± 4.595 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_medium 81920 thrpt 15 3521.568 ± 4.052 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_medium 491520 thrpt 15 604.327 ± 17.521 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_medium 2097152 thrpt 15 138.913 ± 0.268 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_long 81920 thrpt 15 3583.787 ± 28.151 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_long 491520 thrpt 15 619.880 ± 9.109 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_long 2097152 thrpt 15 156.162 ± 0.251 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_vlarge 81920 thrpt 15 4230.539 ± 11.689 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_vlarge 491520 thrpt 15 636.914 ± 1.179 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl latin_ext_vlarge 2097152 thrpt 15 147.291 ± 0.189 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl mixed 81920 thrpt 15 2569.503 ± 34.528 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl mixed 491520 thrpt 15 471.877 ± 13.853 ops/s
ByteBuffersDataOutputWriteStringBenchmark.prevImpl mixed 2097152 thrpt 15 111.679 ± 0.714 ops/s
```

</details>

### 80KB target (BEST_SPEED chunk size)

| String Type | New | Prev | Delta |
|---|---|---|---|
| ascii_1 | 1924 | 1450 | **+33%** |
| ascii_10 | 5127 | 3633 | **+41%** |
| ascii_20 | 7907 | 5513 | **+43%** |
| ascii_30 | 9655 | 6811 | **+42%** |
| ascii_40 | 10564 | 7801 | **+35%** |
| ascii_medium | 9264 | 9042 | +2% |
| ascii_long | 12117 | 10884 | +11% |
| ascii_vlarge | 11604 | 11570 | 0% |
| cjk_1 | 3599 | 2732 | **+32%** |
| cjk_10 | 6517 | 4079 | **+60%** |
| cjk_20 | 7444 | 4728 | **+57%** |
| cjk_30 | 7735 | 4946 | **+56%** |
| cjk_40 | 7861 | 5081 | **+55%** |
| cjk_medium | 5303 | 5114 | +4% |
| cjk_long | 5705 | 5636 | +1% |
| cjk_vlarge | 6736 | 6611 | +2% |
| latin_ext_1 | 2331 | 1707 | **+37%** |
| latin_ext_10 | 4296 | 2563 | **+68%** |
| latin_ext_20 | 4953 | 2850 | **+74%** |
| latin_ext_30 | 5054 | 3112 | **+62%** |
| latin_ext_40 | 4933 | 3159 | **+56%** |
| latin_ext_medium | 3523 | 3522 | 0% |
| latin_ext_long | 3652 | 3584 | +2% |
| latin_ext_vlarge | 4228 | 4231 | 0% |
| mixed | 2610 | 2570 | +2% |

### 480KB target (BEST_COMPRESSION chunk size)

| String Type | New | Prev | Delta |
|---|---|---|---|
| ascii_1 | 325 | 238 | **+37%** |
| ascii_10 | 895 | 608 | **+47%** |
| ascii_20 | 1375 | 939 | **+46%** |
| ascii_30 | 1764 | 1180 | **+49%** |
| ascii_40 | 1892 | 1310 | **+44%** |
| ascii_medium | 1514 | 1470 | +3% |
| ascii_long | 2114 | 2047 | +3% |
| ascii_vlarge | 2051 | 2070 | −1% |
| cjk_1 | 590 | 474 | **+24%** |
| cjk_10 | 1125 | 729 | **+54%** |
| cjk_20 | 1252 | 756 | **+66%** |
| cjk_30 | 1370 | 853 | **+61%** |
| cjk_40 | 1427 | 872 | **+64%** |
| cjk_medium | 829 | 837 | −1% |
| cjk_long | 935 | 900 | +4% |
| cjk_vlarge | 928 | 897 | +3% |
| latin_ext_1 | 398 | 291 | **+37%** |
| latin_ext_10 | 749 | 438 | **+71%** |
| latin_ext_20 | 860 | 489 | **+76%** |
| latin_ext_30 | 878 | 519 | **+69%** |
| latin_ext_40 | 962 | 545 | **+76%** |
| latin_ext_medium | 593 | 604 | −2% |
| latin_ext_long | 631 | 620 | +2% |
| latin_ext_vlarge | 634 | 637 | 0% |
| mixed | 526 | 472 | **+12%** |

### 2MB target (larger workload)

| String Type | New | Prev | Delta |
|---|---|---|---|
| ascii_1 | 77 | 56 | **+38%** |
| ascii_10 | 206 | 143 | **+44%** |
| ascii_20 | 325 | 222 | **+47%** |
| ascii_30 | 416 | 277 | **+50%** |
| ascii_40 | 450 | 312 | **+44%** |
| ascii_medium | 357 | 346 | +3% |
| ascii_long | 504 | 486 | +4% |
| ascii_vlarge | 519 | 507 | +3% |
| cjk_1 | 142 | 109 | **+30%** |
| cjk_10 | 268 | 164 | **+64%** |
| cjk_20 | 316 | 181 | **+75%** |
| cjk_30 | 310 | 200 | **+55%** |
| cjk_40 | 329 | 198 | **+66%** |
| cjk_medium | 210 | 194 | +9% |
| cjk_long | 212 | 211 | 0% |
| cjk_vlarge | 231 | 227 | +2% |
| latin_ext_1 | 93 | 68 | **+37%** |
| latin_ext_10 | 179 | 104 | **+73%** |
| latin_ext_20 | 206 | 115 | **+80%** |
| latin_ext_30 | 208 | 125 | **+66%** |
| latin_ext_40 | 216 | 130 | **+67%** |
| latin_ext_medium | 139 | 139 | 0% |
| latin_ext_long | 153 | 156 | −2% |
| latin_ext_vlarge | 148 | 147 | +1% |
| mixed | 118 | 112 | **+5%** |

## More thoughts

I initially attempted a more aggressive approach: adding a second fast path for the 2-byte VInt case (charCount 128–5461), plus a `calcVIntSizeForUTF8Length` utility method with early-exit scanning for the ambiguous ranges. This showed strong wins in almost all setups, especially configurations with larger block sizes or larger target written sizes (more docs per chunk or a larger chunk size). For the default settings (80KB chunk / 1024 docs), however, there was one ~5% regression on `ascii_medium`, and the change introduced extra branches and more complex logic.

So I kept it simple: only the 1-byte VInt fast path. The code is straightforward and easy to read, and it shows no regressions in any of the cases above.
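
The charCount ranges mentioned under "More thoughts" follow from two bounds: a string of `n` UTF-16 chars encodes to at least `n` and at most `3 * n` UTF-8 bytes, and a VInt stores 7 payload bits per byte. The prefix size is known without scanning only when both bounds need the same number of VInt bytes. The sketch below works through that arithmetic; the class and method names are hypothetical, and it is not the `calcVIntSizeForUTF8Length` method from the abandoned approach.

```java
/** Hypothetical helper illustrating the charCount ranges; names are not from the PR. */
final class VIntPrefixRangeSketch {

  /** Number of bytes a VInt needs for the given non-negative value (7 payload bits per byte). */
  static int vIntSize(int value) {
    int size = 1;
    while ((value & ~0x7F) != 0) {
      value >>>= 7;
      size++;
    }
    return size;
  }

  /**
   * Returns the VInt prefix size if it is the same for every possible UTF-8 length of a string
   * with charCount UTF-16 chars, or -1 if the size depends on the actual content.
   */
  static int prefixSizeIfKnown(int charCount) {
    int minBytes = charCount;                         // all ASCII: 1 byte per char
    int maxBytes = Math.multiplyExact(charCount, 3);  // worst case: 3 bytes per char
    int sizeForMin = vIntSize(minBytes);
    return sizeForMin == vIntSize(maxBytes) ? sizeForMin : -1;
  }

  public static void main(String[] args) {
    for (int n : new int[] {42, 43, 127, 128, 5461, 5462}) {
      System.out.println(n + " chars -> prefix size " + prefixSizeIfKnown(n));
    }
  }
}
```

Under these bounds, 0–42 chars always need a 1-byte prefix and 128–5461 chars always need a 2-byte prefix, while 43–127 chars can go either way, which is why that range would need a scan.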
