dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423575442
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software F
gf2121 commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423547781
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnumFrame.java:
##
@@ -89,6 +89,9 @@ final class IntersectTermsEnumFrame {
final
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851433406
Could we consider not changing `MemorySegmentIndexInput` for Java 19 and
Java 20? As a preview feature, it seems reasonable that we only do
optimizations in higher versions, and they ar
kaivalnp opened a new pull request, #12922:
URL: https://github.com/apache/lucene/pull/12922
Discovered in #12921, and introduced in #12679
The first issue is that we weren't advancing the `VectorScorer`
[here](https://github.com/apache/lucene/blob/cf13a9295052288b748ed8f279f05ee26f3
gf2121 commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423505676
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnumFrame.java:
##
@@ -89,6 +89,9 @@ final class IntersectTermsEnumFrame {
final
daixque commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423470044
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Fo
daixque commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423469570
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseKatakanaUppercaseFilter.java:
##
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Fo
gf2121 commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423493276
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnum.java:
##
@@ -198,6 +204,7 @@ private IntersectTermsEnumFrame pushFrame(int state)
dweiss commented on issue #12907:
URL: https://github.com/apache/lucene/issues/12907#issuecomment-1851380728
I agree, I think it's a poor solution to something that is probably not a
problem in the first place...
--
This is an automated message from the Apache Git Service.
To respond to t
daixque commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423482123
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Fo
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851319229
> I have a better idea. Let's keep the 2 different methods, but do another
trick:
With this great idea, the performance comes back!
Java 21
```
Benchmark
dungba88 commented on code in PR #12879:
URL: https://github.com/apache/lucene/pull/12879#discussion_r1423403578
##
lucene/core/src/java/org/apache/lucene/util/fst/ReadWriteDataOutput.java:
##
@@ -56,14 +66,59 @@ public long ramBytesUsed() {
public void freeze() {
froz
dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423402789
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software F
dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423384044
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseKatakanaUppercaseFilter.java:
##
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software F
dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423383461
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseKatakanaUppercaseFilter.java:
##
@@ -0,0 +1,83 @@
+package org.apache.lucene.analysis.ja;
+
dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423382747
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software F
dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423381320
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software F
dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423380431
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software F
easyice opened a new issue, #12921:
URL: https://github.com/apache/lucene/issues/12921
### Description
Seems to be related to https://github.com/apache/lucene/pull/12679
### Gradle command to reproduce
./gradlew :lucene:core:test --tests
"org.apache.lucene.search.TestF
dungba88 commented on code in PR #12885:
URL: https://github.com/apache/lucene/pull/12885#discussion_r1423374521
##
lucene/analysis/kuromoji/src/test/org/apache/lucene/analysis/ja/TestJapaneseReadingFormFilter.java:
##
@@ -88,6 +88,11 @@ protected TokenStreamComponents createCom
dungba88 commented on code in PR #12885:
URL: https://github.com/apache/lucene/pull/12885#discussion_r1423370558
##
lucene/analysis/kuromoji/src/test/org/apache/lucene/analysis/ja/TestJapaneseReadingFormFilter.java:
##
@@ -88,6 +88,11 @@ protected TokenStreamComponents createCom
dungba88 commented on code in PR #12885:
URL: https://github.com/apache/lucene/pull/12885#discussion_r1423365449
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseReadingFormFilter.java:
##
@@ -43,10 +43,38 @@ public JapaneseReadingFormFilter(TokenStream
gsmiller merged PR #12920:
URL: https://github.com/apache/lucene/pull/12920
dungba88 commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1423317624
##
lucene/core/src/java/org/apache/lucene/util/ArrayUtil.java:
##
@@ -330,15 +330,36 @@ public static int[] growExact(int[] array, int newLength)
{
return copy;
daixque commented on PR #12915:
URL: https://github.com/apache/lucene/pull/12915#issuecomment-1851177452
Hi @mikemccand and @kojisekig, thank you for your reviews.
I updated the code in line with the comments and added lines to module-info
and resources to make `gradle check` green.
daixque commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423277326
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseKatakanaUppercaseFilter.java:
##
@@ -0,0 +1,83 @@
+package org.apache.lucene.analysis.ja;
+
daixque commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423277099
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,65 @@
+package org.apache.lucene.analysis.ja;
R
daixque commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423277455
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,65 @@
+package org.apache.lucene.analysis.ja;
+
kojisekig commented on PR #12915:
URL: https://github.com/apache/lucene/pull/12915#issuecomment-1851116639
From a Japanese perspective, the necessity sounds reasonable. Thank you for
the contribution!
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851115440
Thanks for the detailed description, got it! Looks better :)
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851098019
Hi @easyice,
I have a better idea. Let's keep the 2 different methods, but do another trick:
- The base class DataInput implements the public outer loop as a final
implementation,
epotyom commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1851065973
@mikemccand ,
> can you open a new issue for the followon tasks?
epotyom commented on PR #12679:
URL: https://github.com/apache/lucene/pull/12679#issuecomment-1851062374
I see random test failures that could be related to this change:
```
> java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds
for length 123
> at
uschindler commented on issue #12907:
URL: https://github.com/apache/lucene/issues/12907#issuecomment-1851052636
I don't understand the problem they'd like to solve. The DirectoryScanner
class of Ant is able to find those loops. Otherwise Gradle or Ant would have
the same problem.
Se
epotyom opened a new issue, #12919:
URL: https://github.com/apache/lucene/issues/12919
### Description
In #12180 we added TaxonomyReader#getBulkOrdinals method.
Opening separate issue for 2 remaining tasks from #12180:
1. (#12862) Add `Facets#getBulkSpecificValues` method
dweiss commented on issue #12907:
URL: https://github.com/apache/lucene/issues/12907#issuecomment-1850924293
I asked Infra, here: https://issues.apache.org/jira/browse/INFRA-25269
uschindler commented on issue #12907:
URL: https://github.com/apache/lucene/issues/12907#issuecomment-1850883618
On Policeman Jenkins I would be able to pass sysprops using the shell
script, but that's non-standard. By default it works with SSH automatically.

(thank you @singh264), copied below:
T
jpountz commented on code in PR #12908:
URL: https://github.com/apache/lucene/pull/12908#discussion_r1422609478
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene90/Lucene90RWPostingsFormat.java:
##
@@ -75,7 +75,11 @@ public FieldsConsumer fieldsConsumer
jpountz commented on PR #12908:
URL: https://github.com/apache/lucene/pull/12908#issuecomment-1850255060
> OK, let me try feeding LineFileDocs into this test case.
FTR I will look into it, but it's probably best done in a follow-up PR
rather than this one. Let's merge this PR first?
mikemccand commented on code in PR #12908:
URL: https://github.com/apache/lucene/pull/12908#discussion_r1422599871
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene90/Lucene90RWPostingsFormat.java:
##
@@ -75,7 +75,11 @@ public FieldsConsumer fieldsConsu
daixque opened a new pull request, #12915:
URL: https://github.com/apache/lucene/pull/12915
### Description
Sutegana (捨て仮名) are the small letters of hiragana and katakana in Japanese. In
old Japanese text, sutegana (捨て仮名) are not used, unlike in modern text. For
example:
- "ストップウ
jpountz commented on code in PR #12908:
URL: https://github.com/apache/lucene/pull/12908#discussion_r1422549348
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene90/Lucene90RWPostingsFormat.java:
##
@@ -75,7 +75,11 @@ public FieldsConsumer fieldsConsumer
mikemccand commented on code in PR #12908:
URL: https://github.com/apache/lucene/pull/12908#discussion_r1422541990
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_codecs/lucene90/Lucene90RWPostingsFormat.java:
##
@@ -75,7 +75,11 @@ public FieldsConsumer fieldsConsu
mikemccand commented on code in PR #12908:
URL: https://github.com/apache/lucene/pull/12908#discussion_r1422538272
##
lucene/core/src/java/org/apache/lucene/util/fst/FST.java:
##
@@ -109,10 +109,20 @@ public enum INPUT_TYPE {
// Increment version to change it
private sta
mikemccand commented on PR #12908:
URL: https://github.com/apache/lucene/pull/12908#issuecomment-1850181921
> Hmmm... I agree we can't expect `BasePostingsFormatTestCase` to catch all
bw compat problems, but the `TestLucene90PostingsFormat` from this PR writes
data in the 9.8 format of the
jpountz commented on PR #12908:
URL: https://github.com/apache/lucene/pull/12908#issuecomment-1850184681
OK, let me try feeding LineFileDocs into this test case.
ChrisHegarty commented on PR #12888:
URL: https://github.com/apache/lucene/pull/12888#issuecomment-1850178914
relates: #12753
stefanvodita commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1422523818
##
lucene/core/src/java/org/apache/lucene/util/ArrayUtil.java:
##
@@ -330,15 +330,36 @@ public static int[] growExact(int[] array, int newLength)
{
return c
stefanvodita commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1422523360
##
lucene/core/src/test/org/apache/lucene/util/hnsw/HnswGraphTestCase.java:
##
@@ -757,6 +757,30 @@ public void testRamUsageEstimate() throws IOException {
l
jpountz commented on code in PR #12908:
URL: https://github.com/apache/lucene/pull/12908#discussion_r1422523889
##
lucene/core/src/java/org/apache/lucene/util/fst/FST.java:
##
@@ -109,10 +109,20 @@ public enum INPUT_TYPE {
// Increment version to change it
private static
mikemccand commented on PR #12900:
URL: https://github.com/apache/lucene/pull/12900#issuecomment-1850151623
I've confirmed the new (failing) BWC test from #12901 now passes with this
PR. I'll review ...
gf2121 commented on code in PR #12912:
URL: https://github.com/apache/lucene/pull/12912#discussion_r1422514730
##
lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestBackwardsCompatibility.java:
##
@@ -2265,4 +2268,47 @@ public void testReadNMinusTwoSegmentInfos
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1422507453
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException {
}
mikemccand commented on issue #12901:
URL: https://github.com/apache/lucene/issues/12901#issuecomment-1850135758
> Actually I can un-@Ignore at least in main. I'll go do that.
D'oh! No, I cannot -- it will still fail in main, 9.x and 9.9.x until we
get the fix (#12900) in.
jpountz commented on PR #12908:
URL: https://github.com/apache/lucene/pull/12908#issuecomment-1850129755
Hmmm... I agree we can't expect `BasePostingsFormatTestCase` to catch all bw
compat problems, but the `TestLucene90PostingsFormat` from this PR writes data
in the 9.8 format of the terms
mikemccand commented on issue #12901:
URL: https://github.com/apache/lucene/issues/12901#issuecomment-1850129073
OK this is done -- I pushed the new BWC test case (@Ignore'd) to 9.9.x, 9.x
and main.
Actually I can un-@Ignore at least in main. I'll go do that.
mikemccand merged PR #12913:
URL: https://github.com/apache/lucene/pull/12913
easyice commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1422494949
##
lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java:
##
@@ -303,6 +304,30 @@ public byte readByte(long pos) throws IOException {
}
}
mikemccand commented on PR #12908:
URL: https://github.com/apache/lucene/pull/12908#issuecomment-1850117651
> It's actually a bad news that all tests pass here, as this means that our
`BasePostingsFormatTestCase` is not good enough to uncover the recent problem
with `Terms#intersect`... So