mayya-sharipova edited a comment on pull request #11: URL: https://github.com/apache/lucene/pull/11#issuecomment-805296177
I've run indexing benchmarks using [luceneutil](https://github.com/mikemccand/luceneutil). Here are the results:

- indexing time in ms
- baseline: master branch
- candidate: this PR

| Dataset | Baseline | Candidate | Difference |
| :--- | ---: | ---: | ---: |
| wikimedium500k | 98387 | 106189 | 7.9% |
| wikimedium1m | 174246 | 177075 | 1.6% |
| wikimedium10m | 1356184 | 1359149 | 0.2% |

---

[wikimedium1m profiler](https://gist.github.com/mayya-sharipova/b4c8f47165a4bde8d2625487d2132319)

| CPU profile % samples, Baseline | CPU profile % samples, Candidate |
| :--- | :--- |
| 0.73% 783 `IndexingChain$PerField#invert` | 0.80% 864 `IndexingChain#getOrAddPerField` |
| | 0.61% 658 `IndexingChain$FieldSchema#<init>` |
| | 0.58% 633 `IndexingChain#processField` |

---

[wikimedium10m profiler](https://gist.github.com/mayya-sharipova/68cf6d543863029777ad3028c662ccd1):

Extracting everything related to `IndexingChain` from the CPU profile, we can see that in **Candidate** there is overhead spent in `assertSameSchema`, which is a part of `processDocument`.

| CPU profile % samples, Baseline | CPU profile % samples, Candidate |
| :--- | :--- |
| 0.90% 8259 `IndexingChain$PerField#invert` | 1.00% 9162 `IndexingChain#getOrAddPerField` |
| 0.65% 5956 `IndexingChain#getOrAddField` | 0.89% 8091 `IndexingChain#processDocument` |
| 0.56% 5161 `IndexingChain#processField` | 0.69% 6255 `IndexingChain$PerField#invert` |
| | 0.55% 5044 `IndexingChain$FieldSchema#<init>` |
| | 0.52% 4744 `IndexingChain$FieldSchema#assertSameSchema` |

cc @jpountz
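For anyone wanting to double-check the Difference column: a minimal sketch, assuming the column is the candidate's slowdown relative to baseline, i.e. `(candidate - baseline) / baseline`, rounded to one decimal place. The helper names `relative_slowdown_pct` and `slowdown_table` are made up for illustration and are not part of luceneutil.

```python
# Indexing times in ms, copied from the benchmark table: (baseline, candidate).
timings_ms = {
    "wikimedium500k": (98387, 106189),
    "wikimedium1m": (174246, 177075),
    "wikimedium10m": (1356184, 1359149),
}


def relative_slowdown_pct(baseline_ms, candidate_ms):
    """Candidate slowdown relative to baseline, in percent (assumed formula)."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0


def slowdown_table(timings):
    """Map each dataset name to its slowdown, rounded to one decimal place."""
    return {
        name: round(relative_slowdown_pct(baseline, candidate), 1)
        for name, (baseline, candidate) in timings.items()
    }


if __name__ == "__main__":
    for name, pct in slowdown_table(timings_ms).items():
        print(f"{name}: {pct:+.1f}%")
```

Under that assumption the reported 7.9%, 1.6% and 0.2% are reproduced, so the relative overhead shrinks as the dataset grows.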