jpountz commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2262631387
Things got a bit better later on
(https://github.com/apache/lucene/pull/13585#issuecomment-2246112137), but your
reading is correct that some queries get slower. This seems to especially
epotyom commented on code in PR #13559:
URL: https://github.com/apache/lucene/pull/13559#discussion_r1699871277
##
lucene/core/src/java/org/apache/lucene/util/BitSet.java:
##
@@ -92,6 +92,12 @@ public void clear() {
*/
public abstract int nextSetBit(int index);
+ /**
+
dungba88 commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2262677876
The `advance` will keep reducing the array size and we will generally
advance in small steps, right? Then I think exponential search makes sense.
I'll try to use `IntArrayDocIdSetIter
mikemccand commented on PR #13576:
URL: https://github.com/apache/lucene/pull/13576#issuecomment-2262801593
This looks correct to me too -- we know the max number of skip levels for
this particular segment and should size the arrays based on that, not based on
the global worst case max ever
mikemccand commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2262812551
Nice pop in the nightly benchmarks from this!
[`OrHighMedium`](https://home.apache.org/~mikemccand/lucenebench/OrHighMed.html)
jumped. Even
[`Phrase`](https://home.apache.org/~mike
bugmakerr commented on PR #13576:
URL: https://github.com/apache/lucene/pull/13576#issuecomment-2262823557
@mikemccand ok, I will close this PR. Btw, I think we can do the same
optimization for the skip reader, and we usually need to read old indices. I
want to know if I should o
jpountz commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2262842204
Hmm,
[`CombinedHighHigh`](https://home.apache.org/~mikemccand/lucenebench/CombinedHighHigh.html)
is angry. I had not benchmarked it while developing, I'll check it out.
Some spee
gsmiller merged PR #13559:
URL: https://github.com/apache/lucene/pull/13559
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.ap
gsmiller commented on PR #13559:
URL: https://github.com/apache/lucene/pull/13559#issuecomment-2263190898
@epotyom just merged. Thanks for the change! I just now noticed you put the
changes entry under 10.0 but I don't see any reason we can't backport to 9.12.
I'm going to backport now and
msokolov commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2263206986
I'm going to merge as-is and we can follow up with the additional safety
measure
msokolov merged PR #13581:
URL: https://github.com/apache/lucene/pull/13581
jpountz commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2263216720
I found the problem with `CombinedHighHigh`: the logic for lazily decoding
frequencies was broken and we'd decode the whole block of frequencies on every
`freq()` call. It's now fixed so
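As an illustration of the kind of bug described in this comment, here is a minimal sketch of lazy per-block frequency decoding. It is not the code from the PR: the class `LazyFreqBlockSketch`, its fields, and the toy "freq minus one" encoding are all hypothetical. The point is only that a `decoded` flag lets the block be decoded once on the first `freq()` call instead of on every call.

```java
// Hypothetical sketch, not Lucene's postings reader.
final class LazyFreqBlockSketch {
  private final int[] encoded; // stands in for an encoded block (assumed: freq - 1)
  private final int[] freqs;   // decoded frequencies, filled lazily
  private boolean decoded;     // true once the current block has been decoded
  int decodeCount;             // how many times the block was decoded

  LazyFreqBlockSketch(int[] encoded) {
    this.encoded = encoded.clone();
    this.freqs = new int[encoded.length];
  }

  private void decodeBlock() {
    for (int i = 0; i < encoded.length; i++) {
      freqs[i] = encoded[i] + 1; // toy decoding step
    }
    decodeCount++;
  }

  /** Returns the frequency at position i, decoding the block at most once. */
  int freq(int i) {
    if (!decoded) {
      decodeBlock();
      decoded = true; // forgetting this flag would re-decode on every call
    }
    return freqs[i];
  }
}
```

If the flag were never set, each `freq()` call would pay the full block-decoding cost, which matches the symptom described above.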
gsmiller commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2263219440
I _think_ exponential search will only outperform binary search in this case
if we expect the next target to be relatively close to the "min" we're
constantly "pushing up" (thanks to yo
msokolov commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2263229682
Also backported to 9x
mikemccand commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2263239458
Phew, thanks for catching the performance regression and tracking it down
@jpountz. GO BENCHMARKING!
jpountz commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2263244861
If `DocIdSetIterator#advance` gets called on large increments, then there
are only so many calls that can be done because the doc ID space is quickly
exhausted. However, if you only adva
gsmiller commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2263350507
Ah yeah, OK thanks @jpountz. Makes sense.
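For readers following the exponential-versus-binary search discussion on #13595, here is a minimal sketch of an exponential (galloping) `advance` over a sorted array of doc IDs. It is not taken from the PR: the class `ExponentialSearchSketch` and the method signature are hypothetical. The search doubles its step from the current position until it overshoots the target, then binary-searches the last gap, so the cost grows with the logarithm of the distance advanced rather than of the array length.

```java
// Hypothetical sketch, not Lucene's IntArrayDocIdSet.
final class ExponentialSearchSketch {
  /**
   * Returns the smallest index i in [from, docs.length) with docs[i] >= target,
   * or docs.length if there is none. docs must be sorted in ascending order.
   */
  static int advance(int[] docs, int from, int target) {
    int n = docs.length;
    if (from >= n) {
      return n;
    }
    if (docs[from] >= target) {
      return from;
    }
    // Invariant from here on: docs[lo] < target.
    int lo = from;
    int step = 1;
    // Gallop: double the step while the probed entry is still below target.
    while (step < n - lo && docs[lo + step] < target) {
      lo += step;
      step <<= 1;
    }
    // The answer lies in (lo, hi]; hi == n means "possibly not found".
    int hi = Math.min(lo + step, n);
    // Binary search within the final gap.
    while (hi - lo > 1) {
      int mid = (lo + hi) >>> 1;
      if (docs[mid] < target) {
        lo = mid;
      } else {
        hi = mid;
      }
    }
    return hi;
  }
}
```

When targets tend to be close to the current position, the galloping phase ends after a few probes; when they are far away, the cost approaches that of a plain binary search.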
gsmiller opened a new pull request, #13625:
URL: https://github.com/apache/lucene/pull/13625
After merging #13559 I noticed an opportunity to remove some redundant code
in the `nextSetBit` implementations.
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700548368
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700550981
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700550122
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700551363
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700552308
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on PR #13604:
URL: https://github.com/apache/lucene/pull/13604#issuecomment-2263589226
@tveasey Thank you for your detailed feedback, it was addressed in the last
commit.
benwtrent commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700553805
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
benwtrent merged PR #13613:
URL: https://github.com/apache/lucene/pull/13613
msokolov closed issue #13580: Are we properly accounting for
`NeighborArray.rwlock`?
URL: https://github.com/apache/lucene/issues/13580
msokolov commented on issue #13580:
URL: https://github.com/apache/lucene/issues/13580#issuecomment-2264000352
There's still a question about whether to add more centralized lock
enforcement since it is up to the caller to decide how to place locks around
`OnHeapHnswGraph.getNeighbors()` bu
gsmiller commented on PR #13625:
URL: https://github.com/apache/lucene/pull/13625#issuecomment-2264023189
@epotyom this minor refactoring occurred to me after merging your recent
work. WDYT?
benwtrent commented on issue #13571:
URL: https://github.com/apache/lucene/issues/13571#issuecomment-2264033141
I dug a little bit into this. I tried protecting by putting `synchronized`
on `getNextSequenceNumber`, but that didn't work.
I tried putting `synchronized` on the DW when it flush
aoli-al commented on issue #13571:
URL: https://github.com/apache/lucene/issues/13571#issuecomment-2264062711
Thanks for confirming this! Yes, I found the bug extremely tricky to trigger
while trying to reproduce.
Making `DocumentsWriterFlushControl:obtainAndLock` synchronized will m
epotyom commented on code in PR #13625:
URL: https://github.com/apache/lucene/pull/13625#discussion_r1700906253
##
lucene/core/src/java/org/apache/lucene/util/SparseFixedBitSet.java:
##
@@ -337,34 +337,23 @@ private int firstDoc(int i4096, int i4096upper) {
@Override
pub
uschindler commented on PR #13570:
URL: https://github.com/apache/lucene/pull/13570#issuecomment-2264150313
Do we have a backport PR? Should I work on it?
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1700987077
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1700999887
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMap.java:
##
@@ -291,11 +306,35 @@ public SynonymMap build() throws IOException {
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1701014637
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1701025192
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1701032411
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264272195
OK, I added a bunch of logging and it seems like the issue is around
`DWPTP#getAndLock`.
I can see the following occurring, new DWPTs being created, each with the
first ge
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264278817
Yeah, looking at `markForFullFlush`, it seems like we mark the generation to
gather that `seqNo`, then unlock DWFC, and this allows new DWPT to be returned
with the old generation
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264281268
Another option is to continue to lock after the deleteQueue generation creation
until the DWPT are removed.
david-sitsky opened a new issue, #13626:
URL: https://github.com/apache/lucene/issues/13626
### Description
One of our internal users hit this error when merging their index after
loading all documents which contain some vector fields. I couldn't reproduce
this myself. Is there a c
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264299983
Ah, another option is to switch the logic that is used to mark it free and
send it back to the freeList. We can check if the deleteQueue is advanced, and
just unlock it instead of
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701070122
##
lucene/core/src/c/dotProduct.h:
##
@@ -0,0 +1,4 @@
+
+int32_t vdot8s_sve(int8_t* vec1[], int8_t* vec2, int32_t limit);
+int32_t vdot8s_neon(int8_t* vec1[], int8_t*
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701070880
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701071456
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
benwtrent opened a new pull request, #13627:
URL: https://github.com/apache/lucene/pull/13627
There is a tricky race condition with DWPT threads. It is possible that a
flush starts by advancing the deleteQueue (in charge of creating seqNo). Thus,
the referenced deleteQueue, there should be
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701088228
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701099939
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701104686
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
gsmiller merged PR #13625:
URL: https://github.com/apache/lucene/pull/13625
gsmiller commented on PR #13625:
URL: https://github.com/apache/lucene/pull/13625#issuecomment-2264403347
Thanks @epotyom for the feedback!
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701126777
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2264420154
> But I think it makes the build more straightforward: it builds native as
you expect, if you want to use different compiler set `CC` env vars etc
differently.
I still get compila
dungba88 commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2264454968
@jpountz I was reading `IntArrayDocIdSetIterator`, it is a private class
only exposed through `IntArrayDocIdSet`. I think we need to extend the
capability here (storing the score, havin
vsop-479 commented on code in PR #13596:
URL: https://github.com/apache/lucene/pull/13596#discussion_r1701382048
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/PerThreadPKLookup.java:
##
@@ -97,5 +111,82 @@ public int lookup(BytesRef id) throws IOException {