zacharymorn commented on PR #12194: URL: https://github.com/apache/lucene/pull/12194#issuecomment-1491226817
Thanks @mikemccand for the review! Yes, I did run the full 20 iterations with the `enwiki-20130102-lines.txt` corpus. I also tried running just a single `AndHighNotMonth` task to find out which one was giving the 800% improvement, but the highest improvement I saw was around 150%, with this single task:

```
AndHighNotMonth: +its -monthPostings:apr # freq=1160703
```

```
Task               QPS baseline    StdDev   QPS my_modified_version    StdDev   Pct diff                 p-value
PKLookup                 182.12   (28.8%)                     84.26    (8.4%)   -53.7% ( -70% - -23%)      0.000
AndHighNotMonth           64.22    (8.4%)                    160.51   (59.1%)   149.9% ( 75% - 237%)       0.000
```

I'm not sure whether that 800% actually comes from the JVM's JIT compilation. For the above task, being able to leverage skip data helps a lot, since it allows skipping far more docs than a typical block of 128 docs would:

```
First doc in block: 0 | Last doc in block: 127 | Furthest skip entry doc: -1
First doc in block: 128 | Last doc in block: 255 | Furthest skip entry doc: 524287 | Number of continuous matching docs: 524032
First doc in block: 524288 | Last doc in block: 524415 | Furthest skip entry doc: 589823 | Number of continuous matching docs: 65408
First doc in block: 589824 | Last doc in block: 589951 | Furthest skip entry doc: 655359 | Number of continuous matching docs: 65408
First doc in block: 655360 | Last doc in block: 655487 | Furthest skip entry doc: 720895 | Number of continuous matching docs: 65408
First doc in block: 720896 | Last doc in block: 721023 | Furthest skip entry doc: 786431 | Number of continuous matching docs: 65408
First doc in block: 786432 | Last doc in block: 786559 | Furthest skip entry doc: 851967 | Number of continuous matching docs: 65408
First doc in block: 851968 | Last doc in block: 852095 | Furthest skip entry doc: 917503 | Number of continuous matching docs: 65408
First doc in block: 917504 | Last doc in block: 917631 | Furthest skip entry doc: 983039 | Number of continuous matching docs: 65408
First doc in block: 983040 | Last doc in block: 983167 | Furthest skip entry doc: 991231 | Number of continuous matching docs: 8064
First doc in block: 991232 | Last doc in block: 991359 | Furthest skip entry doc: 992255 | Number of continuous matching docs: 896
First doc in block: 992256 | Last doc in block: 992383 | Furthest skip entry doc: 993279 | Number of continuous matching docs: 896
First doc in block: 993280 | Last doc in block: 993407 | Furthest skip entry doc: 994303 | Number of continuous matching docs: 896
First doc in block: 994304 | Last doc in block: 994431 | Furthest skip entry doc: 995327 | Number of continuous matching docs: 896
First doc in block: 995328 | Last doc in block: 995455 | Furthest skip entry doc: 996351 | Number of continuous matching docs: 896
First doc in block: 996352 | Last doc in block: 996479 | Furthest skip entry doc: 997375 | Number of continuous matching docs: 896
First doc in block: 997376 | Last doc in block: 997503 | Furthest skip entry doc: 998399 | Number of continuous matching docs: 896
First doc in block: 998400 | Last doc in block: 998527 | Furthest skip entry doc: -1
First doc in block: 998528 | Last doc in block: 998655 | Furthest skip entry doc: -1
First doc in block: 998656 | Last doc in block: 998783 | Furthest skip entry doc: -1
First doc in block: 998784 | Last doc in block: 998911 | Furthest skip entry doc: -1
First doc in block: 998912 | Last doc in block: 999039 | Furthest skip entry doc: -1
```
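To make the numbers in the debug output easier to read, here is a minimal sketch of the arithmetic they appear to follow: each "Number of continuous matching docs" value equals the furthest skip entry doc minus the last doc in the current block. The class and method names are hypothetical and are not part of Lucene or of this PR. Treating `-1` as "no usable skip entry" is an assumption inferred from the rows that carry no count.

```java
// Hypothetical helper, NOT Lucene code: it only reproduces the arithmetic
// visible in the debug output of this comment.
final class SkipSpanSketch {
    // Block size of 128 docs, as mentioned in the comment.
    static final int BLOCK_SIZE = 128;

    /**
     * Docs between the end of the current block and the furthest skip entry.
     * Assumption: a furthest skip doc of -1 means no usable skip entry,
     * so nothing beyond the current block can be skipped.
     */
    static int continuousMatchingDocs(int lastDocInBlock, int furthestSkipDoc) {
        if (furthestSkipDoc < 0) {
            return 0;
        }
        return Math.max(0, furthestSkipDoc - lastDocInBlock);
    }

    /** The same span expressed in units of the 128-doc block size, for comparison only. */
    static int equivalentBlocks(int continuousDocs) {
        return continuousDocs / BLOCK_SIZE;
    }

    public static void main(String[] args) {
        // {last doc in block, furthest skip entry doc}, copied from the log rows.
        int[][] rows = {
            {255, 524287},
            {524415, 589823},
            {983167, 991231},
            {991359, 992255},
            {998527, -1},
        };
        for (int[] row : rows) {
            int docs = continuousMatchingDocs(row[0], row[1]);
            System.out.println("last=" + row[0] + " skip=" + row[1]
                + " continuous=" + docs + " blocks=" + equivalentBlocks(docs));
        }
    }
}
```

Under these assumptions, the second log row (last doc 255, skip entry 524287) gives 524032 docs, which is 4094 times the 128-doc block size; the 65408-doc rows correspond to 511 blocks' worth and the 896-doc rows to 7.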