[GitHub] [lucene] mikemccand commented on pull request #15: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation

2022-10-09 Thread GitBox
mikemccand commented on PR #15: URL: https://github.com/apache/lucene/pull/15#issuecomment-1272530502 I have not looked closely at this PR but it sounds very useful (enabling ICU transformations pre-tokenization), looks like the requested change from @uschindler was addressed, and `precommi

[GitHub] [lucene] mdmarshmallow opened a new pull request, #11841: GITHUB-11761 (part 2): Fix unit tests to cleanly work with new TierMer…

2022-10-09 Thread GitBox
mdmarshmallow opened a new pull request, #11841: URL: https://github.com/apache/lucene/pull/11841 …gePolicy delete pct default ### Description We recently changed the default delete percentage of `TieredMergePolicy`, which broke 2 unit tests. To fix this, we originally changed the

[GitHub] [lucene] mdmarshmallow commented on pull request #11831: GITHUB-11761: Move minimum TieredMergePolicy delete percentage from 2…

2022-10-09 Thread GitBox
mdmarshmallow commented on PR #11831: URL: https://github.com/apache/lucene/pull/11831#issuecomment-1272635717 These new tests did fail, so I created a new PR (tied to the same issue as this one) to fix the unit tests more cleanly without having to change the default percentage: https://github

[GitHub] [lucene] rmuir commented on pull request #15: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation

2022-10-09 Thread GitBox
rmuir commented on PR #15: URL: https://github.com/apache/lucene/pull/15#issuecomment-1272840381 @mikemccand I had 2 remaining concerns: 1. legal: there is a lot of copied/forked ICU code here and any of that should be done correctly 2. maintenance: consequence of the above, it is actu