vsop-479 commented on code in PR #13596:
URL: https://github.com/apache/lucene/pull/13596#discussion_r1701382048
##
lucene/test-framework/src/java/org/apache/lucene/tests/index/PerThreadPKLookup.java:
##
@@ -97,5 +111,82 @@ public int lookup(BytesRef id) throws IOException {
dungba88 commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2264454968
@jpountz I was reading `IntArrayDocIdSetIterator`; it is a private class
only exposed through `IntArrayDocIdSet`. I think we need to extend the
capability here (storing the score, havin
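Presumably the idea is an iterator backed by parallel doc/score arrays; a minimal sketch under that assumption (class name, `score()` accessor and the linear `advance()` are hypothetical, not the actual Lucene class):

```java
import java.io.IOException;
import org.apache.lucene.search.DocIdSetIterator;

/** Hypothetical sketch: parallel doc/score arrays exposed through the DocIdSetIterator API. */
final class DocAndScoreArrayIterator extends DocIdSetIterator {
  private final int[] docs;     // sorted doc IDs
  private final float[] scores; // parallel scores
  private final int length;
  private int i = -1;
  private int doc = -1;

  DocAndScoreArrayIterator(int[] docs, float[] scores, int length) {
    this.docs = docs;
    this.scores = scores;
    this.length = length;
  }

  @Override
  public int docID() {
    return doc;
  }

  @Override
  public int nextDoc() throws IOException {
    return advance(doc + 1);
  }

  @Override
  public int advance(int target) throws IOException {
    // linear scan for brevity; the real class would use binary/exponential search
    while (++i < length) {
      if (docs[i] >= target) {
        return doc = docs[i];
      }
    }
    i = length; // keep the cursor in range if called again after exhaustion
    return doc = NO_MORE_DOCS;
  }

  /** Score of the current doc; only valid after docID() returned a real doc. */
  public float score() {
    return scores[i];
  }

  @Override
  public long cost() {
    return length;
  }
}
```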
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2264420154
> But I think it makes the build more straightforward: it builds the native code as
you expect; if you want to use a different compiler, set the `CC` env vars etc.
differently.

I still get compila
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701126777
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
gsmiller commented on PR #13625:
URL: https://github.com/apache/lucene/pull/13625#issuecomment-2264403347
Thanks @epotyom for the feedback!
gsmiller merged PR #13625:
URL: https://github.com/apache/lucene/pull/13625
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701104686
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701088228
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701099939
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
benwtrent opened a new pull request, #13627:
URL: https://github.com/apache/lucene/pull/13627
There is a tricky race condition between DWPT threads. It is possible that a
flush starts by advancing the deleteQueue (which is in charge of creating seqNos). Thus,
the referenced deleteQueue, there should be
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701071456
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701070880
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701070122
##
lucene/core/src/c/dotProduct.h:
##
@@ -0,0 +1,4 @@
+
+int32_t vdot8s_sve(int8_t* vec1[], int8_t* vec2, int32_t limit);
+int32_t vdot8s_neon(int8_t* vec1[], int8_t*
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264299983
Ah, another option is to switch the logic that is used to mark it free and
send it back to the freeList. We can check whether the deleteQueue has advanced, and
just unlock it instead of
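A self-contained illustration of that idea, with made-up names (this is not the actual DocumentsWriter/DWPT code): before a worker goes back on the free list, check whether the generation has advanced and simply drop it if so.

```java
import java.util.ArrayDeque;

// Hypothetical pool: a flush advances the generation; any worker created under
// an older generation must not simply go back on the free list, or getAndLock()
// could hand it out again carrying stale state.
final class WorkerPool {
  static final class Worker {
    final long generation;
    Worker(long generation) { this.generation = generation; }
  }

  private long generation; // advanced when a full flush is marked
  private final ArrayDeque<Worker> freeList = new ArrayDeque<>();

  synchronized void markForFullFlush() {
    generation++; // flush starts: everything created before this is "old"
  }

  synchronized Worker getAndLock() {
    Worker w = freeList.poll();
    return w != null ? w : new Worker(generation);
  }

  synchronized void release(Worker w) {
    if (w.generation == generation) {
      freeList.push(w); // still current: safe to reuse
    }
    // else: the generation advanced while the worker was out; just drop it
    // instead of returning it to the free list.
  }
}
```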
david-sitsky opened a new issue, #13626:
URL: https://github.com/apache/lucene/issues/13626
### Description
One of our internal users hit this error when merging their index after
loading all documents which contain some vector fields. I couldn't reproduce
this myself. Is there a c
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264281268
Another option is to continue holding the lock after the deleteQueue generation is
created, until the DWPTs are removed.
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264278817
Yeah, looking at `markForFullFlush`, it seems like we mark the generation to
gather that `seqNo`, then unlock DWFC, and this allows a new DWPT to be returned
with the old generation
benwtrent commented on issue #13127:
URL: https://github.com/apache/lucene/issues/13127#issuecomment-2264272195
OK, I added a bunch of logging and it seems like the issue is around
`DWPTP#getAndLock`.
I can see the following occurring: new DWPTs being created, each with the
first ge
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1701032411
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1701025192
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1701014637
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1700987077
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMapDirectory.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation
msfroh commented on code in PR #13054:
URL: https://github.com/apache/lucene/pull/13054#discussion_r1700999887
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymMap.java:
##
@@ -291,11 +306,35 @@ public SynonymMap build() throws IOException {
uschindler commented on PR #13570:
URL: https://github.com/apache/lucene/pull/13570#issuecomment-2264150313
Do we have a backport PR? Should I work on it?
epotyom commented on code in PR #13625:
URL: https://github.com/apache/lucene/pull/13625#discussion_r1700906253
##
lucene/core/src/java/org/apache/lucene/util/SparseFixedBitSet.java:
##
@@ -337,34 +337,23 @@ private int firstDoc(int i4096, int i4096upper) {
@Override
pub
aoli-al commented on issue #13571:
URL: https://github.com/apache/lucene/issues/13571#issuecomment-2264062711
Thanks for confirming this! Yes, I found the bug extremely tricky to trigger
while trying to reproduce it.
Making `DocumentsWriterFlushControl:obtainAndLock` synchronized will m
benwtrent commented on issue #13571:
URL: https://github.com/apache/lucene/issues/13571#issuecomment-2264033141
I dug a little bit into this. I tried protecting it by putting `synchronized`
on `getNextSequenceNumber`, but that didn't work.
I tried putting `synchronized` on the DW when it flush
gsmiller commented on PR #13625:
URL: https://github.com/apache/lucene/pull/13625#issuecomment-2264023189
@epotyom this minor refactoring occurred to me after merging your recent
work. WDYT?
msokolov closed issue #13580: Are we properly accounting for
`NeighborArray.rwlock`?
URL: https://github.com/apache/lucene/issues/13580
msokolov commented on issue #13580:
URL: https://github.com/apache/lucene/issues/13580#issuecomment-2264000352
There's still a question about whether to add more centralized lock
enforcement since it is up to the caller to decide how to place locks around
`OnHeapHnswGraph.getNeighbors()` bu
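For what it's worth, the caller-side discipline being discussed looks roughly like the following sketch, which uses a plain `ReentrantReadWriteLock` and hypothetical names rather than the real `NeighborArray`/`OnHeapHnswGraph` API:

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustration only: readers take the read lock around neighbor access, writers
// take the write lock while mutating, and nothing in the structure itself
// enforces this -- it is up to the caller.
final class LockedNeighbors {
  private final ReadWriteLock rwlock = new ReentrantReadWriteLock();
  private int[] neighbors = new int[0];

  int[] snapshotNeighbors() {
    rwlock.readLock().lock();
    try {
      return neighbors.clone(); // caller-side read locking
    } finally {
      rwlock.readLock().unlock();
    }
  }

  void addNeighbor(int node) {
    rwlock.writeLock().lock();
    try {
      int[] grown = Arrays.copyOf(neighbors, neighbors.length + 1);
      grown[neighbors.length] = node;
      neighbors = grown;
    } finally {
      rwlock.writeLock().unlock();
    }
  }
}
```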
benwtrent merged PR #13613:
URL: https://github.com/apache/lucene/pull/13613
benwtrent commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700553805
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) u
mayya-sharipova commented on PR #13604:
URL: https://github.com/apache/lucene/pull/13604#issuecomment-2263589226
@tveasey Thank you for your detailed feedback; it was addressed in the last
commit.
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700552308
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700551363
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700550122
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700550981
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
mayya-sharipova commented on code in PR #13604:
URL: https://github.com/apache/lucene/pull/13604#discussion_r1700548368
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/quantization/KMeans.java:
##
@@ -0,0 +1,348 @@
+/*
+ * Licensed to the Apache Software Foundation (
gsmiller opened a new pull request, #13625:
URL: https://github.com/apache/lucene/pull/13625
After merging #13559 I noticed an opportunity to remove some redundant code
in the `nextSetBit` implementations.
gsmiller commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2263350507
Ah yeah, OK thanks @jpountz. Makes sense.
jpountz commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2263244861
If `DocIdSetIterator#advance` gets called with large increments, then there
are only so many calls that can be made, because the doc ID space is quickly
exhausted. However, if you only adva
mikemccand commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2263239458
Phew, thanks for catching the performance regression and tracking it down
@jpountz. GO BENCHMARKING!
msokolov commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2263229682
Also backported to 9x
gsmiller commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2263219440
I _think_ exponential search will only outperform binary search in this case
if we expect the next target to be relatively close to the "min" we're
constantly "pushing up" (thanks to yo
jpountz commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2263216720
I found the problem with `CombinedHighHigh`: the logic for lazily decoding
frequencies was broken and we'd decode the whole block of frequencies on every
`freq()` call. It's now fixed so
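A minimal sketch of the lazy-decoding pattern being described, with hypothetical field names (not the actual Lucene postings reader): the block of frequencies is decoded at most once, on the first `freq()` call for that block.

```java
// Illustration only: the point of the fix is the freqsDecoded guard, which
// keeps the block from being re-decoded on every freq() call.
final class LazyFreqBlock {
  private final int[] freqBuffer = new int[128];
  private long[] encodedFreqs;   // packed freqs for the current block
  private boolean freqsDecoded;  // the guard that was effectively missing

  void loadBlock(long[] encoded) {
    this.encodedFreqs = encoded;
    this.freqsDecoded = false;   // invalidate, but do not decode eagerly
  }

  int freq(int indexInBlock) {
    if (freqsDecoded == false) { // decode lazily, once per block
      decode(encodedFreqs, freqBuffer);
      freqsDecoded = true;
    }
    return freqBuffer[indexInBlock];
  }

  private static void decode(long[] encoded, int[] out) {
    // stand-in for the real bit-unpacking routine: two ints per long
    for (int i = 0; i < out.length; i++) {
      out[i] = (int) (encoded[i / 2] >>> ((i & 1) * 32));
    }
  }
}
```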
msokolov merged PR #13581:
URL: https://github.com/apache/lucene/pull/13581
msokolov commented on PR #13581:
URL: https://github.com/apache/lucene/pull/13581#issuecomment-2263206986
I'm going to merge as-is and we can follow up with the additional safety
measure
gsmiller commented on PR #13559:
URL: https://github.com/apache/lucene/pull/13559#issuecomment-2263190898
@epotyom just merged. Thanks for the change! I just now noticed you put the
changes entry under 10.0, but I don't see any reason we can't backport to 9.12.
I'm going to backport now and
gsmiller merged PR #13559:
URL: https://github.com/apache/lucene/pull/13559
jpountz commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2262842204
Hmm,
[`CombinedHighHigh`](https://home.apache.org/~mikemccand/lucenebench/CombinedHighHigh.html)
is angry. I had not benchmarked it while developing; I'll check it out.
Some spee
bugmakerr commented on PR #13576:
URL: https://github.com/apache/lucene/pull/13576#issuecomment-2262823557
@mikemccand OK, I will close this PR. Btw, I think we can do the same
optimization for the skip reader, and we usually need to read old indices. I
want to know if I should o
mikemccand commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2262812551
Nice pop in the nightly benchmarks from this!
[`OrHighMedium`](https://home.apache.org/~mikemccand/lucenebench/OrHighMed.html)
jumped. Even
[`Phrase`](https://home.apache.org/~mike
mikemccand commented on PR #13576:
URL: https://github.com/apache/lucene/pull/13576#issuecomment-2262801593
This looks correct to me too -- we know the max number of skip levels for
this particular segment and should size the arrays based on that, not based on
the global worst case max ever
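A tiny sketch of the sizing idea (illustrative names, not the actual skip-reader fields): allocate per-level state from the number of skip levels the segment actually wrote, rather than the format's global maximum.

```java
// Illustration only: per-level skip state sized by what this segment wrote.
static long[][] allocateSkipPointers(int numberOfSkipLevelsInSegment) {
  // before: new long[GLOBAL_MAX_SKIP_LEVELS][] regardless of the segment
  long[][] skipPointers = new long[numberOfSkipLevelsInSegment][];
  for (int level = 0; level < numberOfSkipLevelsInSegment; level++) {
    skipPointers[level] = new long[16]; // per-level buffer; size is illustrative
  }
  return skipPointers;
}
```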
dungba88 commented on PR #13595:
URL: https://github.com/apache/lucene/pull/13595#issuecomment-2262677876
The `advance` will keep reducing the array size, and we will generally
advance by small steps, right? Then I think exponential search makes sense.
I'll try to use `IntArrayDocIdSetIter
epotyom commented on code in PR #13559:
URL: https://github.com/apache/lucene/pull/13559#discussion_r1699871277
##
lucene/core/src/java/org/apache/lucene/util/BitSet.java:
##
@@ -92,6 +92,12 @@ public void clear() {
*/
public abstract int nextSetBit(int index);
+ /**
+
jpountz commented on PR #13585:
URL: https://github.com/apache/lucene/pull/13585#issuecomment-2262631387
Things got a bit better later on
(https://github.com/apache/lucene/pull/13585#issuecomment-2246112137), but your
reading is correct that some queries get slower. This seems to especially