dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423585285
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseKatakanaUppercaseFilter.java:
##
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software F
dungba88 commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1423583689
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,65 @@
+package org.apache.lucene.analysis.ja;
+
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1423606084
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws IOE
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1423637103
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws IOE
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1851570021
> Could we consider not changing `MemorySegmentIndexInput` for Java 19 and
Java 20? As a preview feature, it seems reasonable that we only do
optimizations in higher versions, and the
gf2121 commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423704366
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -495,7 +495,7 @@ public boolean seekExact(BytesRef target) throws
IOEx
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1423722848
##
lucene/core/src/java/org/apache/lucene/store/ByteBuffersIndexInput.java:
##
@@ -205,6 +205,12 @@ public void readLongs(long[] dst, int offset, int length)
throw
easyice commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1423744250
##
lucene/core/src/java/org/apache/lucene/store/ByteBuffersIndexInput.java:
##
@@ -205,6 +205,12 @@ public void readLongs(long[] dst, int offset, int length)
throws I
shubhamvishu commented on issue #12918:
URL: https://github.com/apache/lucene/issues/12918#issuecomment-1851719328
Nice! This would be really helpful!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to
mikemccand commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423806163
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnum.java:
##
@@ -198,6 +204,7 @@ private IntersectTermsEnumFrame pushFrame(int st
mikemccand commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423808624
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnum.java:
##
@@ -198,9 +199,11 @@ private IntersectTermsEnumFrame pushFrame(int s
mikemccand commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423822953
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -495,7 +495,7 @@ public boolean seekExact(BytesRef target) throws
mikemccand commented on code in PR #12900:
URL: https://github.com/apache/lucene/pull/12900#discussion_r1423827461
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -495,7 +495,7 @@ public boolean seekExact(BytesRef target) throws
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1851816034
Since we will respin 9.9.1 shortly, should we do this specifically for 9.9.1
now? And leave this issue open for failing (some part of) the build when the
generated FSTs are stale?
slow-J opened a new issue, #12923:
URL: https://github.com/apache/lucene/issues/12923
### Description
ForUtil generated from the gradle task is out of sync with spotless, forcing
a run of ./gradlew tidy.
This is not a large issue, but the code should be consistent in my opinion.
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1851827035
> Hmm is there a top-level `gradle` target to do this ...
Yes, yes there is: `./gradlew regenerate`.
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1851829283
I'll open a PR shortly with the regenerated FSTs ... seems only `kuromoji`
and `nori` build FSTs.
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1851845215
Hmm this is a bit of a spooky crab. If I use OpenJDK 17 (`openjdk full
version "17.0.9+8"`) to `./gradlew regenerate` on current `branch_9_9_0` I get
this horrifying failure:
mikemccand opened a new pull request, #12924:
URL: https://github.com/apache/lucene/pull/12924
This is the result of top-level `./gradlew regenerate` to rewrite all
generated stuff in our source tree. The only resulting `git diff` changes were the
Nori and Kuromoji FST dictionaries.
Note th
jpountz opened a new pull request, #12925:
URL: https://github.com/apache/lucene/pull/12925
This commit adds coverage of `Terms#intersect` to `CheckIndex` and indexes
`LineFileDocs` in `BasePostingsFormatTestCase` to get some coverage with
real-world data.
With this change, `TestLuce
jpountz commented on PR #12900:
URL: https://github.com/apache/lucene/pull/12900#issuecomment-1851928485
FWIW I confirmed that this change makes the new test in #12925 pass.
gf2121 commented on PR #12900:
URL: https://github.com/apache/lucene/pull/12900#issuecomment-1851927851
Thanks for the review and great advice, @mikemccand!
> I think we should merge this to main, 9x and 99x and let's let CI chew on
it for a bit (day or so) before cutting 9.9.1?
+1
slow-J commented on issue #12923:
URL: https://github.com/apache/lucene/issues/12923#issuecomment-1851934034
Ah OK, answering myself: running `./gradlew generateForUtil` will apply
spotless.
slow-J commented on issue #12923:
URL: https://github.com/apache/lucene/issues/12923#issuecomment-1851936071
An actual valid nuance related to regeneration:
running `./gradlew regenerate`
will modify
```
modified:
lucene/analysis/kuromoji/src/resources/org/apache/lucene/a
gf2121 merged PR #12900:
URL: https://github.com/apache/lucene/pull/12900
dweiss commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1851942814
`./gradlew regenerate` regenerates everything, everywhere. I think you can be
more selective by passing `-p` (project path).
dweiss commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1851945354
https://github.com/apache/lucene/blob/main/help/regeneration.txt#L64-L76
benwtrent commented on code in PR #12922:
URL: https://github.com/apache/lucene/pull/12922#discussion_r1423943933
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -255,6 +255,11 @@ static VectorSimilarityScorer fromAcceptDocs(
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1851980193
Thanks @dweiss -- I regenerated everything, but only the two dicts showed up
in `git diff`, which I took to be a good sign (we haven't missed regenerating
any of the other many things)
mikemccand commented on PR #12925:
URL: https://github.com/apache/lucene/pull/12925#issuecomment-1851987006
> With this change, `TestLucene90PostingsFormat` now exhibits #12895.
Oooh that's awesome! I'll review.
jpountz merged PR #12925:
URL: https://github.com/apache/lucene/pull/12925
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852004695
> Hmm this is a bit of a spooky crab. If I use OpenJDK 17 (`openjdk full
version "17.0.9+8"`) to `./gradlew regenerate` on current `branch_9_9_0` I get
this horrifying failure:
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852013823
Hi, I added the implementation for `ByteBufferIndexInput`. Unfortunately,
the benchmark shows a bit of a regression:
java17
```
Benchmark
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852016535
I checked that file; there are no such special characters? Or am I missing
something? I only checked the main branch.
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852023328
> Can you post in which exact task name this happened?
Thanks @uschindler. It happened when I ran `./gradlew regenerate` on 9.9.x
branch with OpenJDK 17 (`openjdk full ver
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852025005
I regenerated locally, no issues:
```
openjdk version "17.0.9" 2023-10-17
OpenJDK Runtime Environment Temurin-17.0.9+9 (build 17.0.9+9)
OpenJDK 64-Bit Server VM Temurin
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852026627
And you're right -- I don't see any special characters here:
https://github.com/apache/lucene/blob/main/gradle/generation/extract-jdk-apis/ExtractJdkApis.java#L192
Not sure
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852033091
I can explain why you see a difference with Java 17 vs 21. Did the job fail
on `:lucene:core:generateJdkApiJar21`? If yes, the following happened:
All those three tasks are
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852034874
Hmm actually I think there may be a zero-width space character U+200B on
this line before the opening (?
https://github.com/apache/lucene/blob/branch_9_9/gradle/generation/extra
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852037166
OK I gotta drop off ... will try to root cause this later today.
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852036299
This is where it configures the language version:
https://github.com/apache/lucene/blob/e0f4321b40fd06a556ff4a11f137a3fc0f67b5bb/gradle/generation/extract-jdk-apis.gradle#L46-L48
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852036564
> may be a zero-width space character
on `main` and `branch_9_9`.
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852042380
> Hmm actually I think there may be a zero-width space character U+200B on
this line before the opening (?
https://github.com/apache/lucene/blob/branch_9_9/gradle/generation/extr
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852045851
Let me commit a fix for that file to all 3 branches. It really has a hidden
character.
And there's a second problem: When invoking the JVM it does not pass a
character set
zhaih commented on code in PR #12909:
URL: https://github.com/apache/lucene/pull/12909#discussion_r1424011845
##
lucene/core/src/test/org/apache/lucene/util/automaton/TestNFARunAutomaton.java:
##
@@ -73,6 +73,37 @@ public void testWithRandomRegex() {
}
}
+ public void
zhaih commented on code in PR #12910:
URL: https://github.com/apache/lucene/pull/12910#discussion_r1424022318
##
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##
@@ -34,14 +34,14 @@
public class NeighborArray {
private final boolean scoresDescOrder;
zhaih commented on code in PR #12910:
URL: https://github.com/apache/lucene/pull/12910#discussion_r1424029197
##
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##
@@ -201,9 +225,69 @@ private int descSortFindRightMostInsertionPoint(float
newScore, int boun
zhaih commented on code in PR #12910:
URL: https://github.com/apache/lucene/pull/12910#discussion_r1424031699
##
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##
@@ -51,45 +51,61 @@ public NeighborArray(int maxSize, boolean descOrder) {
*/
public vo
jpountz opened a new pull request, #12927:
URL: https://github.com/apache/lucene/pull/12927
Real-world data exhibits patterns that the compression logic takes advantage
of, but that are hard to reproduce in a randomized way. This is why the new
test adds interesting coverage.
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852090987
OK, I fixed the file and also enforced UTF-8 when executing the JDK version.
The general problem was: If I enforce locally another charset than UTF-8 (see
the sysprops passed to
msokolov opened a new pull request, #12928:
URL: https://github.com/apache/lucene/pull/12928
(no comment)
msokolov commented on issue #12896:
URL: https://github.com/apache/lucene/issues/12896#issuecomment-1852094137
I think it's simply that the test writer chooses to flush randomly, creating
two segments instead of one. I was able to fix it by adding a call to
`forceMerge(1)`. Opened https://github
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852094139
Fixed in 10387f136ffa88ad9e86c526aa52908829a01ad3.
zhaih commented on code in PR #12910:
URL: https://github.com/apache/lucene/pull/12910#discussion_r1424036233
##
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##
@@ -51,45 +51,61 @@ public NeighborArray(int maxSize, boolean descOrder) {
*/
public vo
zhaih commented on code in PR #12910:
URL: https://github.com/apache/lucene/pull/12910#discussion_r1424037388
##
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##
@@ -51,45 +51,61 @@ public NeighborArray(int maxSize, boolean descOrder) {
*/
public vo
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852101176
Sorry for this. My fault, looks like while copy-pasting Stack Overflow code
I introduced the hidden character.
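
A hidden character like this can be found mechanically. The following is only a minimal sketch, not the fix that was committed and not an existing Lucene or Apache RAT check: the class `HiddenCharScanner` and its method `findHiddenChars` are hypothetical names made up for illustration. It assumes the source text has already been decoded (e.g. as UTF-8) into a `String`, and it flags every code point in Unicode category Cf (format), which includes the zero-width space U+200B.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Lucene or Apache RAT.
public class HiddenCharScanner {

  /** Returns one "line:column U+XXXX" entry per invisible format character (category Cf). */
  public static List<String> findHiddenChars(String source) {
    List<String> hits = new ArrayList<>();
    int line = 1;
    int col = 1;
    for (int i = 0; i < source.length(); ) {
      int cp = source.codePointAt(i);
      if (cp == '\n') {
        // Start counting columns again on the next line.
        line++;
        col = 1;
      } else {
        if (Character.getType(cp) == Character.FORMAT) {
          hits.add(String.format(Locale.ROOT, "%d:%d U+%04X", line, col, cp));
        }
        col++;
      }
      i += Character.charCount(cp);
    }
    return hits;
  }

  public static void main(String[] args) {
    // A zero-width space sits between "foo" and the opening parenthesis.
    String sample = "int x = foo\u200B(1);\nint y = 2;\n";
    System.out.println(findHiddenChars(sample)); // prints [1:12 U+200B]
  }
}
```

Checking the general category rather than a fixed list of code points also catches relatives such as U+200C, U+200D and U+FEFF, at the cost of flagging files that use those characters on purpose.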
msokolov commented on PR #12928:
URL: https://github.com/apache/lucene/pull/12928#issuecomment-1852113022
Heh, got an unrelated test failure there:
org.apache.lucene.search.TestByteVectorSimilarityQuery > testApproximate FAILED
java.lang.UnsupportedOperationException
at
msokolov merged PR #12928:
URL: https://github.com/apache/lucene/pull/12928
msokolov commented on PR #12928:
URL: https://github.com/apache/lucene/pull/12928#issuecomment-1852116447
I also cherry-picked to branch_9x
msokolov closed issue #12896: Reproducible failure of
TestParentBlockJoinByteKnnVectorQuery.testScoringWithMultipleChildren
URL: https://github.com/apache/lucene/issues/12896
jpountz opened a new pull request, #12929:
URL: https://github.com/apache/lucene/pull/12929
This replaces `StringField`/`SortedDocValuesField` with `KeywordField` and
`IntPoint`/`NumericDocValuesField` with `IntField`.
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852141369
To come back to the original issue: There is no easy way to fail the build,
because the FST files are heavy to generate and stay in the resources folder.
It is hard to detect a p
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852165482
There is no lambda capturing problem. I have no idea why it complains. It
really looks like it is fully inlined. It seems that it is not happy about
those ByteBuffers at all.
`ix()`
jpountz commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852167681
> There is no easy way to fail the build, because the FST files are heavy to
generate and stay in the resources folder. It is hard to detect a problem if
you don't regenerate them,
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852173027
> > There is no easy way to fail the build, because the FST files are heavy
to generate and stay in the resources folder. It is hard to detect a problem if
you don't regenerate t
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852253188
Even when I used the copy-code approach (avoiding the lambda, for testing
purposes), it was only 15%-20% faster.
Like this:
```
@Override
public void readGroupVInts(lo
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852257293
I have the feeling that for direct buffers (this is what MMap and NIO use),
`getInt()` seems more expensive than the sequential reads.
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1852261778
If you have an on-heap ByteBuffer (like for `ByteBuffersIndexInput`), it
executes completely different code when reading from the underlying data
structure.
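
To make the two read styles under discussion concrete, here is a small sketch. It is not the code from the PR and it is not a benchmark: the class `BufferReadStyles` and both method names are hypothetical, and little-endian byte order is an assumption chosen for the example. It only shows that one absolute `getInt()` and four single-byte `get()` calls return the same value on an on-heap and on a direct `ByteBuffer`, even though the JDK backs the two buffer kinds with different implementations.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration; not taken from Lucene's IndexInput implementations.
public class BufferReadStyles {

  /** Reads a little-endian int with a single absolute getInt() call. */
  public static int readIntBulk(ByteBuffer buf, int pos) {
    return buf.getInt(pos);
  }

  /** Reads the same little-endian int as four single-byte reads. */
  public static int readIntSequential(ByteBuffer buf, int pos) {
    return (buf.get(pos) & 0xFF)
        | (buf.get(pos + 1) & 0xFF) << 8
        | (buf.get(pos + 2) & 0xFF) << 16
        | (buf.get(pos + 3) & 0xFF) << 24;
  }

  public static void main(String[] args) {
    ByteBuffer heap = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
    ByteBuffer direct = ByteBuffer.allocateDirect(8).order(ByteOrder.LITTLE_ENDIAN);
    for (ByteBuffer buf : new ByteBuffer[] {heap, direct}) {
      buf.putInt(0, 0x12345678);
      // hasArray() is true for the heap buffer and false for the direct one.
      System.out.println(
          (buf.isDirect() ? "direct" : "heap")
              + " hasArray=" + buf.hasArray()
              + " bulk=" + Integer.toHexString(readIntBulk(buf, 0))
              + " sequential=" + Integer.toHexString(readIntSequential(buf, 0)));
    }
  }
}
```

Which style is faster for which buffer kind is exactly what the benchmarks in this thread are trying to establish, so the sketch makes no claim about that.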
msokolov commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1424209861
##
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState
state) throws I
slow-J opened a new pull request, #12930:
URL: https://github.com/apache/lucene/pull/12930
More detail in https://github.com/apache/lucene/issues/12918.
Changing PFOR encoding to FOR for doc blocks in #12741 required bumping the
codec version. The codec upgrade process has numerous movi
dweiss commented on issue #12923:
URL: https://github.com/apache/lucene/issues/12923#issuecomment-1852401144
I think it's because of changes in the fst code. So seems legitimate.
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1852410786
@mikemccand I agree we should not add this to sort but rather treat it the
same way we treat the softDeletes field. It's essentially the same thing from
an IW perspective. I will go ahead
dweiss commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852414564
> Sorry for this. My fault, looks like while copy-pasting Stack Overflow code I introduced the hidden character.
Maybe apache rat check should actually scan for those, it'
mikemccand commented on PR #12924:
URL: https://github.com/apache/lucene/pull/12924#issuecomment-1852434257
Thanks @ChrisHegarty and @dweiss -- I'll merge to 9.9.x, 9.x and main
shortly. Hmm, rather, I'll regen on the other two branches (not certain FST
format is identical everywhere), and
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852429248
> So it may fail in _some_ configurations, e.g. when your default charset does not match the encoding of that special character.
Argh! I was worried about this. I mu
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852430721
> > Sorry for this. My fault, looks like while copy-pasting Stack Overflow code I introduced the hidden character.
>
> Maybe apache rat check should actually scan for
rmuir commented on PR #12924:
URL: https://github.com/apache/lucene/pull/12924#issuecomment-1852434323
if it regenerates successfully and the test suite passes afterwards (which
it seems they do), that's basically it
mikemccand merged PR #12924:
URL: https://github.com/apache/lucene/pull/12924
slow-J commented on issue #12923:
URL: https://github.com/apache/lucene/issues/12923#issuecomment-1852471268
> I think it's because of changes in the fst code. So seems legitimate.
Thanks @dweiss, does this mean that we should regenerate these 2
`TokenInfoDictionary`?
Or does it s
dweiss commented on issue #12923:
URL: https://github.com/apache/lucene/issues/12923#issuecomment-1852482593
See https://github.com/apache/lucene/pull/12924 - I think it's something
that fixes your problem. Also, take a look at:
https://github.com/apache/lucene/blob/main/help/regeneration
mikemccand commented on PR #12924:
URL: https://github.com/apache/lucene/pull/12924#issuecomment-1852489631
OK I regenerated these two FSTs on 9.9.x, 9.x, and main.
kaivalnp commented on code in PR #12922:
URL: https://github.com/apache/lucene/pull/12922#discussion_r1424377622
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -255,6 +255,11 @@ static VectorSimilarityScorer fromAcceptDocs(
kaivalnp commented on code in PR #12922:
URL: https://github.com/apache/lucene/pull/12922#discussion_r1424385137
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -255,6 +255,11 @@ static VectorSimilarityScorer fromAcceptDocs(
slow-J closed issue #12923: [Minor] Regeneration investigation
URL: https://github.com/apache/lucene/issues/12923
slow-J commented on issue #12923:
URL: https://github.com/apache/lucene/issues/12923#issuecomment-1852542026
> See #12924 - I think it's something that fixes your problem. Also, take a
look at:
> https://github.com/apache/lucene/blob/main/help/regeneration.txt unless
you have already - t
mikemccand opened a new issue, #12932:
URL: https://github.com/apache/lucene/issues/12932
### Description
I don't think our monster tests even run?
I think we should run them at least for the 9.9.1 release. I'm especially
interested in ensuring `Test2BTerms` is happy with our
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852574170
> > Sorry for this. My fault, looks like while copy-pasting Stack Overflow code I introduced the hidden character.
>
> Maybe apache rat check should actually scan for
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852574526
> I will commit and push the change to 3 branches.
Thanks @uschindler.
And thank you @rmuir for helping me fix my home dev box to put the default
charset back to UTF-
mikemccand commented on PR #12926:
URL: https://github.com/apache/lucene/pull/12926#issuecomment-1852610006
> This found bugs in `DirectPostingsFormat`
Whoa, awesome!
mikemccand commented on code in PR #12926:
URL: https://github.com/apache/lucene/pull/12926#discussion_r1424442691
##
lucene/codecs/src/java/org/apache/lucene/codecs/memory/DirectPostingsFormat.java:
##
@@ -1119,12 +1122,14 @@ public DirectIntersectTermsEnum(CompiledAutomaton
c
mikemccand commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852640427
> > There is no easy way to fail the build, because the FST files are heavy
to generate and stay in the resources folder. It is hard to detect a problem if
you don't regenerate t
uschindler commented on issue #12911:
URL: https://github.com/apache/lucene/issues/12911#issuecomment-1852647559
> > > There is no easy way to fail the build, because the FST files are
heavy to generate and stay in the resources folder. It is hard to detect a
problem if you don't regenerate
Tony-X commented on code in PR #12909:
URL: https://github.com/apache/lucene/pull/12909#discussion_r1424472195
##
lucene/core/src/test/org/apache/lucene/util/automaton/TestNFARunAutomaton.java:
##
@@ -73,6 +73,37 @@ public void testWithRandomRegex() {
}
}
+ public voi
mikemccand opened a new pull request, #12933:
URL: https://github.com/apache/lucene/pull/12933
Closes #12911
This just adds specific unit tests for the binary FST for Nori's and
Kuromoji's `TokenInfoDictionary`. I had to promote some APIs from private ->
package private for test vis
mikemccand commented on code in PR #12915:
URL: https://github.com/apache/lucene/pull/12915#discussion_r1424520399
##
lucene/analysis/kuromoji/src/java/org/apache/lucene/analysis/ja/JapaneseHiraganaUppercaseFilter.java:
##
@@ -0,0 +1,65 @@
+package org.apache.lucene.analysis.ja;
slow-J opened a new issue, #12934:
URL: https://github.com/apache/lucene/issues/12934
### Description
I was looking at a very old issue for a failing unit test with an ant test
command. When I tried to run it in gradle (incorrectly), I was greeted with a
message that stood out to me:
benwtrent commented on code in PR #12922:
URL: https://github.com/apache/lucene/pull/12922#discussion_r1424532683
##
lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java:
##
@@ -255,6 +255,11 @@ static VectorSimilarityScorer fromAcceptDocs(
mikemccand commented on PR #12912:
URL: https://github.com/apache/lucene/pull/12912#issuecomment-1852755681
Oh thanks for working on this @gf2121 -- sorry that we both did it at the
same time!
uschindler commented on PR #12914:
URL: https://github.com/apache/lucene/pull/12914#issuecomment-1852773264
> I don't get it -- why aren't we seeing that `TestBackwardsCompatibility`
times out every time we run it?
The magic is here:
https://github.com/apache/lucene/blob/2acf76e9e2f9
uschindler commented on PR #12933:
URL: https://github.com/apache/lucene/pull/12933#issuecomment-1852783783
It would be better to give a more detailed Gradle line like:
```sh
./gradlew :lucene:analysis:nori:regenerate
```
The test is not the nicest looking thing, but I acc