glawson0 commented on pull request #157: URL: https://github.com/apache/lucene/pull/157#issuecomment-865954873
For additional testing, I built a few indexes and ran some queries against them. I didn't find any change in latency or memory usage. The impact on results depended on the data set. The English data sets didn't show any impact. Using the change with our Japanese analyzer had a larger impact. In one index, 10% of queries were affected, with those queries matching more documents. Previously, those documents had lost tokens in `FlattenGraphFilter`. The primary cause of the dropped tokens was graphs with gaps in them: the `JapaneseTokenizer` would create graphs, `FilteringTokenFilter`s would remove tokens from them, and `FlattenGraphFilter` would then flatten the result incorrectly.
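
To illustrate how a gap can appear in a token graph, here is a minimal, self-contained sketch. It does not use the Lucene API. `Token`, `removeTerm`, and `gapNodes` are hypothetical names, and the token values are made-up romanized placeholders for a decompounded Japanese term, not output captured from a real analyzer. It assumes a removed token's position increment is carried over to the next kept token, in the spirit of a filtering token filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TokenGraphGapSketch {

    /** Toy token: term text plus position increment and position length. */
    static final class Token {
        final String term;
        final int posInc;
        final int posLen;

        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    /**
     * Drops every token whose term equals {@code unwanted} and adds the
     * skipped position increments to the next kept token (assumed behavior).
     */
    static List<Token> removeTerm(List<Token> in, String unwanted) {
        List<Token> out = new ArrayList<>();
        int skipped = 0;
        for (Token t : in) {
            if (t.term.equals(unwanted)) {
                skipped += t.posInc;
            } else {
                out.add(new Token(t.term, t.posInc + skipped, t.posLen));
                skipped = 0;
            }
        }
        return out;
    }

    /**
     * Returns the start nodes (other than node 0) that no token arrives at.
     * A token starts at the running sum of position increments minus one
     * and ends at start + posLen.
     */
    static List<Integer> gapNodes(List<Token> tokens) {
        Set<Integer> starts = new TreeSet<>();
        Set<Integer> ends = new TreeSet<>();
        int pos = -1;
        for (Token t : tokens) {
            pos += t.posInc;
            starts.add(pos);
            ends.add(pos + t.posLen);
        }
        List<Integer> gaps = new ArrayList<>();
        for (int node : starts) {
            if (node > 0 && !ends.contains(node)) {
                gaps.add(node);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Hypothetical graph: a compound spanning three positions plus its parts.
        List<Token> graph = List.of(
            new Token("kansai", 1, 1),
            new Token("kansaikokusaikuukou", 0, 3),
            new Token("kokusai", 1, 1),
            new Token("kuukou", 1, 1));
        System.out.println("gaps before removal: " + gapNodes(graph));
        System.out.println("gaps after removal:  " + gapNodes(removeTerm(graph, "kokusai")));
    }
}
```

In this toy model, removing the middle part leaves the last part starting at a node that nothing arrives at, while the compound token still spans all three positions. That is the kind of input a flattening step has to handle without dropping tokens.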