uschindler commented on PR #12899:
URL: https://github.com/apache/lucene/pull/12899#issuecomment-1848910834
Are we sure that this did not affect already existing indexes, e.g. while
merging?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log
mikemccand commented on PR #12899:
URL: https://github.com/apache/lucene/pull/12899#issuecomment-1848925405
> Are we sure that this did not affect already existing indexes, e.g. while
merging?
Great question @uschindler ("blast radius") -- I'm pretty sure newly 9.9.0
written indices
mikemccand commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1848934191
OK I think I understand why we see the bug on 9.8.x indices but NOT 9.9.x.
indices. And I'm quite sure blast radius of the bug is contained solely to
read-time, i.e. newly writt
mikemccand commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1848934726
> > I don't think this assumption is valid @gf2121? Because that floor data
first contains the file pointer of the on-disk block that this prefix points to
(in MSB order as of 9.
mikemccand opened a new issue, #12901:
URL: https://github.com/apache/lucene/issues/12901
### Description
Spinoff from #12895 where we inadvertently introduced read-time exceptions
in `MultiTermQueries` (e.g. `WildcardQuery` `*fo*`) in 9.9.0 when reading
pre-9.9.0 written indices.
mikemccand commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1848936838
OK I opened #12901 to create a test -- can we do that, first, and then
confirm #12900 indeed fixes it, then push the test, then push the fix? I can
try to work on creating that
mikemccand commented on PR #12900:
URL: https://github.com/apache/lucene/pull/12900#issuecomment-1848936948
OK I opened #12901 to create a better BWC test revealing these read-time
exceptions -- can we do that, first, and then confirm #12900 indeed fixes it,
then push the test, then push th
benwtrent commented on issue #12895:
URL: https://github.com/apache/lucene/issues/12895#issuecomment-1848961163
> and so sorry for introducing this bug!
I am with @mikemccand on this @gf2121!
The optimization proved valuable in benchmarks and all testing indicated it
was good t
dungba88 commented on PR #12879:
URL: https://github.com/apache/lucene/pull/12879#issuecomment-1848988292
Note: We already merged #12624 , but this PR can be used as benchmark
candidate for https://github.com/apache/lucene/issues/12884
--
This is an automated message from the Apache Git S
dungba88 commented on issue #12884:
URL: https://github.com/apache/lucene/issues/12884#issuecomment-1848988593
We can use https://github.com/apache/lucene/pull/12879 as a benchmark
candidate to compare against the current baseline.
--
This is an automated message from the Apache Git Servi
dungba88 commented on PR #12831:
URL: https://github.com/apache/lucene/pull/12831#issuecomment-184998
One of the point I'm unsure about this PR is that there is now 2 (obscured)
ways to construct the FST, one using the DataInput+FSTStore and one using
FSTReader returned by the on-heap D
dungba88 opened a new issue, #12902:
URL: https://github.com/apache/lucene/issues/12902
### Description
Most of the use cases with FST seems to be writing the FST eventually to a
DataOutput (is it IndexOutput?). In that case we are currently writing the FST
to an on heap DataOutput (
dungba88 commented on issue #12902:
URL: https://github.com/apache/lucene/issues/12902#issuecomment-1849015292
A point for reference, Tantivy saves the metadata to the end of file, and it
will first jump to the end to know the size and starting node. But we couldn't
do it as one file might
dungba88 commented on code in PR #12844:
URL: https://github.com/apache/lucene/pull/12844#discussion_r1421782211
##
lucene/core/src/java/org/apache/lucene/util/ArrayUtil.java:
##
@@ -330,15 +330,36 @@ public static int[] growExact(int[] array, int newLength)
{
return copy;
uschindler commented on code in PR #12841:
URL: https://github.com/apache/lucene/pull/12841#discussion_r1421801910
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java:
##
@@ -140,10 +155,12 @@ public void init() throws Exception {
}
uschindler commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1849057360
Hi, I was busy this Sunday, sorry for delay. Will check tomorrow.
In general: My idea is to have a single static utility method (like the
default one) in the util class that doe
jpountz commented on issue #12901:
URL: https://github.com/apache/lucene/issues/12901#issuecomment-1849073645
IMO one thing to improve here is that we changed file formats for the MSB
vlong optimization by bumping the file format version number, changing the
writer to write the new version
ChrisHegarty commented on PR #12903:
URL: https://github.com/apache/lucene/pull/12903#issuecomment-1849079207
For reviewers or other interested parties. I run the test as follows:
```
export CI=true; export x=; while ./gradlew --no-build-cache cleanTest
:lucene:core:test --rerun --
jpountz opened a new pull request, #12904:
URL: https://github.com/apache/lucene/pull/12904
I just noticed that the move from FOR to PFOR did all the work to make the
old format (FOR) writeable, but missed keeping an instance of
`BasePostingsFormatTestCase` for this format.
--
This is an
jpountz commented on code in PR #12903:
URL: https://github.com/apache/lucene/pull/12903#discussion_r1421819068
##
lucene/core/src/java/org/apache/lucene/util/MSBRadixSorter.java:
##
@@ -214,6 +214,12 @@ protected int getBucket(int i, int k) {
* @see #buildHistogram
*/
ChrisHegarty commented on code in PR #12903:
URL: https://github.com/apache/lucene/pull/12903#discussion_r1421825869
##
lucene/core/src/test/org/apache/lucene/util/bkd/TestDocIdsWriter.java:
##
@@ -150,4 +154,18 @@ public Relation compare(byte[] minPackedValue, byte[]
maxPacked
ChrisHegarty commented on code in PR #12903:
URL: https://github.com/apache/lucene/pull/12903#discussion_r1421825998
##
lucene/core/src/java/org/apache/lucene/util/MSBRadixSorter.java:
##
@@ -214,6 +214,12 @@ protected int getBucket(int i, int k) {
* @see #buildHistogram
Tony-X opened a new issue, #12906:
URL: https://github.com/apache/lucene/issues/12906
### Description
as part of https://github.com/apache/lucene/pull/12688,
I'm trying to develop an algorithm that can intersect FST and FSA
efficiently. For FSA this means we can leverage the sort
easyice commented on PR #12841:
URL: https://github.com/apache/lucene/pull/12841#issuecomment-1849275229
> 3rd party implementations and our own MMap/NIO/... would just call this
static method if they support random access or call super(), if they can't.
Thanks @uschindler , it's ok!,
dweiss commented on code in PR #12903:
URL: https://github.com/apache/lucene/pull/12903#discussion_r1422008879
##
lucene/core/src/test/org/apache/lucene/util/bkd/TestDocIdsWriter.java:
##
@@ -150,4 +154,18 @@ public Relation compare(byte[] minPackedValue, byte[]
maxPackedValue)
dweiss commented on PR #12903:
URL: https://github.com/apache/lucene/pull/12903#issuecomment-1849478590
> export CI=true; export x=; while ./gradlew --no-build-cache cleanTest
:lucene:core:test --rerun --tests
"org.apache.lucene.util.bkd.TestDocIdsWriter.testCrash"; do echo $x | wc -c;
exp
jpountz commented on issue #12901:
URL: https://github.com/apache/lucene/issues/12901#issuecomment-1849481891
> A benefit of the latter approach is that we keep existing tests in place,
specifically BasePostingsFormatTestCase
Actually, it seems like we lost the test as part of refacto
dweiss opened a new issue, #12907:
URL: https://github.com/apache/lucene/issues/12907
### Description
These stack traces at the end of jenkins runs are super annoying:
```
...
Archiving artifacts
hudson.FilePath$ValidateAntFileMask$1Cancel
at
hudson.FilePath$Val
28 matches
Mail list logo