mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1180836771
This is the comment that stack overflows during conversion: ``` A note on, and output from contrib/benchmark: I'm getting really poor results compared to my own test and live enviroment stats. At query time I expected maximum 1/6th time spent in InstantiatedIndex than RAMDirectory, but it turns out that in the benchmarker the speed is almost the same as RAMDirectory. Retrieving documents is only 1/5th of the speed rather than maximum 1/60th as expected. Investigated the code a bit and noticed that ReadTask creates a new instance of IndexReader and IndexSearcher for each query. Could this be the reason? Memory consumption is 3x of a RAMDirectory, but half of the memory is spent on keeping the Document instances in heap. Perhaps it would be interesting to use the same persistency for these as in the Directory implementations. The merge factor sweet spot is around 2500, where it turns out to be a little bit faster than the RAMDirectory sweet spot. At defualt 10 InstantiatedIndex consumes about 5x more time than a RAMDirectory. If I fix the locklessness as suggested in previous comment, it most probably will be much faster than a RAMDirectory at any setting. /** * The sweet spot for this implementation is at 2500.
* <p/> * Benchmark output: * <pre> * ------------> Report sum by Prefix (MAddDocs) and Round (8 about 8 out of 160153) * Operation round mrg buf cmpnd runCnt recsPerRun rec/s elapsedSec avgUsedMem avgTotalMem * MAddDocs_20000 0 10 10 true 1 20000 81,4 245,68 200 325 152 268 156 928 * MAddDocs_20000 - 1 1000 10 true - - 1 - - 20000 - - 494,1 - - 40,47 - 247 119 072 - 347 025 408 * MAddDocs_20000 2 10 100 true 1 20000 104,8 190,81 233 895 552 363 720 704 * MAddDocs_20000 - 3 2000 100 true - - 1 - - 20000 - - 527,2 - - 37,94 - 266 136 448 - 378 273 792 * MAddDocs_20000 4 10 10 false 1 20000 103,2 193,75 222 089 792 378 273 792 * MAddDocs_20000 - 5 3000 10 false - - 1 - - 20000 - - 545,2 - - 36,69 - 237 917 152 - 378 273 792 * MAddDocs_20000 6 10 100 false 1 20000 102,7 194,67 237 018 976 378 273 792 * MAddDocs_20000 - 7 4000 100 false - - 1 - - 20000 - - 535,8 - - 37,33 - 309 680 640 - 501 968 896 * </pre> * * @see org.apache.lucene.index.IndexWriterInterface#setMergeFactor(int) */ public void setMergeFactor(int mergeFactor) { I would not pay to much attention to the numbers below until I've got the benchmarker under control, but here are the stats: Output from InstantiatedIndex: [java] ------------> Report Sum By (any) Name (19 about 160153 out of 160153) [java] Operation round mrg buf cmpnd runCnt recsPerRun rec/s elapsedSec avgUsedMem avgTotalMem [java] Rounds_8 0 10 10 true 1 25142792 19?842,0 1?267,15 291?055?680 377?163?776 [java] Populate - - - - - - - - - - - - - - - - - - 8 - - 20003 - - 148,1 - 1?080,73 - 249?711?264 - 354?926?592 [java] CreateIndex - - - - 8 1 1?142,9 0,01 178?670?624 322?181?120 [java] MAddDocs_20000 - - - - - - - - - - - - - - - - 8 - - 20000 - - 148,0 - 1?080,72 - 249?706?256 - 354?926?592 [java] AddDoc - - - - 160000 1 156,2 1?024,02 228?890?976 339?588?384 [java] Optimize - - - - - - - - - - - - - - - - - - 8 - - - - 1 - - 8?000,0 - - 0,00 - 249?679?056 - 354?926?592 [java] CloseIndex - - - - 8 1 2?666,7 0,00 249?689?056 354?926?592 
[java] OpenReader - - - - - - - - - - - - - - - - - 16 - - - - 1 - 16?000,0 - - 0,00 - 246?507?072 - 354?926?592 [java] SearchSameRdr_5000 - - - - 8 5000 806,6 49,59 250?121?728 354?926?592 [java] CloseReader - - - - - - - - - - - - - - - - - 16 - - - - 1 - 16?000,0 - - 0,00 - 249?146?336 - 354?971?648 [java] WarmNewRdr_50 - - - - 8 1000000 3?118?908,5 2,57 249?616?272 354?926?592 [java] SrchNewRdr_500 - - - - - - - - - - - - - - - - 8 - - - 500 - - 806,5 - - 4,96 - 252?762?128 - 354?926?592 [java] SrchTrvNewRdr_300 - - - - 8 335500 135?891,9 19,75 250?484?240 354?926?592 [java] SrchTrvRetNewRdr_100 - - - - - - - - - - - - - - 8 - - 209216 - 267?326,0 - - 6,26 - 245?991?776 - 354?926?592 [java] SearchSameRdr_5000_2500/sec_Par - - - - 8 5000 1?163,3 34,39 250?892?304 355?016?704 [java] WarmNewRdr_50_25/sec_Par - - - - - - - - - - - - - 8 - - 1000000 - 507?872,0 - - 15,75 - 250?855?648 - 355?016?704 [java] SrchNewRdr_50_25/sec_Par - - - - 8 50 25,5 15,69 254?289?584 355?016?704 [java] SrchTrvNewRdr_300_150/sec_Par - - - - - - - - - - - 8 - - 335500 - 177?807,2 - - 15,10 - 251?699?584 - 355?016?704 [java] SrchTrvRetNewRdr_100_50/sec_Par - - - - 8 232076 117?106,6 15,85 252?423?376 355?016?704 Output from RAMDirectory: [java] ------------> Report Sum By (any) Name (19 about 160153 out of 160153) [java] Operation round mrg buf cmpnd runCnt recsPerRun rec/s elapsedSec avgUsedMem avgTotalMem [java] Rounds_8 0 10 10 true 1 25142792 36?177,3 694,99 119?427?680 182?538?240 [java] Populate - - - - - - - - - - - - - - - - - - 8 - - 20003 - - 482,0 - - 331,99 - 114?288?472 - 140?156?416 [java] CreateIndex - - - - 8 1 2?666,7 0,00 48?867?204 124?752?384 [java] MAddDocs_20000 - - - - - - - - - - - - - - - - 8 - - 20000 - - 499,2 - - 320,51 - 111?734?320 - 135?969?280 [java] AddDoc - - - - 160000 1 604,9 264,49 90?860?048 130?812?488 [java] Optimize - - - - - - - - - - - - - - - - - - 8 - - - - 1 - - - 0,7 - - 11,48 - 123?532?104 - 140?156?416 [java] CloseIndex - - - - 8 1 8?000,0 
0,00 114?288?472 140?156?416 [java] OpenReader - - - - - - - - - - - - - - - - - 16 - - - - 1 - - 197,5 - - 0,08 - 113?600?096 - 143?475?712 [java] SearchSameRdr_5000 - - - - 8 5000 1?209,4 33,07 115?720?920 143?314?944 [java] CloseReader - - - - - - - - - - - - - - - - - 16 - - - - 1 - 16?000,0 - - 0,00 - 102?590?368 - 145?079?552 [java] WarmNewRdr_50 - - - - 8 1000000 65?734,9 121,70 105?734?472 143?314?944 [java] SrchNewRdr_500 - - - - - - - - - - - - - - - - 8 - - - 500 - - 417,4 - - 9,58 - 104?480?168 - 146?795?008 [java] SrchTrvNewRdr_300 - - - - 8 335500 133?532,3 20,10 116?353?456 146?795?008 [java] SrchTrvRetNewRdr_100 - - - - - - - - - - - - - - 8 - - 209216 - 60?686,3 - - 27,58 - 124?211?040 - 146?795?008 [java] SearchSameRdr_5000_2500/sec_Par - - - - 8 5000 1?596,0 25,06 114?145?856 146?844?160 [java] WarmNewRdr_50_25/sec_Par - - - - - - - - - - - - - 8 - - 1000000 - 105?678,9 - - 75,70 - 104?830?320 - 146?844?160 [java] SrchNewRdr_50_25/sec_Par - - - - 8 50 25,5 15,70 107?417?728 146?844?160 [java] SrchTrvNewRdr_300_150/sec_Par - - - - - - - - - - - 8 - - 335500 - 178?635,6 - - 15,02 - 116?779?312 - 146?835?968 [java] SrchTrvRetNewRdr_100_50/sec_Par - - - - 8 232076 100?569,2 18,46 111?881?152 146?819?584 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org