ChrisHegarty commented on PR #12703:
URL: https://github.com/apache/lucene/pull/12703#issuecomment-1778821678
> > Well... as simple wrapping of float[] into MemorySegment is not going to
work out, the Vector API does not like it due to alignment constraints (which
seems overly pedantic sinc
s1monw commented on PR #12549:
URL: https://github.com/apache/lucene/pull/12549#issuecomment-1778953836
I think this is only triggered because of your change but the problem was
already there. We hold the lock in MDW#close() so that we cannot run a
concurrent merge. We could either preve
s1monw commented on code in PR #12718:
URL: https://github.com/apache/lucene/pull/12718#discussion_r1371505561
##
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##
@@ -425,11 +425,12 @@ public int count(Query query) throws IOException {
}
/**
- * Ret
s1monw commented on PR #12711:
URL: https://github.com/apache/lucene/pull/12711#issuecomment-1778962096
> Another question: do we have any testing around this sort-stability /
block-preservation today? I'm getting nervous now that we are relying on an
undocumented feature that just happens
gashutos opened a new issue, #12720:
URL: https://github.com/apache/lucene/issues/12720
### Description
### Problem
With a higher number of deleted entries in a segment, sort queries show up
to `10x` degradation beyond a certain point. We did this experiment using
[nyc_taxis](https://gi
ChrisHegarty commented on PR #12703:
URL: https://github.com/apache/lucene/pull/12703#issuecomment-1779052160
For what it's worth, the changes currently in this PR do not generally
perform well, since we can have a mix of how we represent the underlying
vector values, and where they come fr
gf2121 opened a new issue, #12721:
URL: https://github.com/apache/lucene/issues/12721
### Description
An immature idea! :)
I noticed that `BPIndexReorderer$ComputeGainsTask#computeGain()` took a lot
of time in the CPU profile:
```
PERCENT CPU SAMPLES STACK
4.75%
jpountz commented on PR #12549:
URL: https://github.com/apache/lucene/pull/12549#issuecomment-1779155292
Thanks! I was thinking of something along the lines of the diff you shared,
I had not thought of the SerialMergeScheduler approach. I'll check it works and
push this change.
--
This i
jpountz commented on issue #12700:
URL: https://github.com/apache/lucene/issues/12700#issuecomment-1779179636
We made changes to similarities to guarantee monotonicity with tf and norm
(e.g. https://github.com/apache/lucene/issues/9063) despite floating-point
rounding errors. I think we sho
jpountz commented on issue #12696:
URL: https://github.com/apache/lucene/issues/12696#issuecomment-1779221543
For reference, Lucene used to use FOR for postings and PFOR for positions in
8.x. This was changed in 9.0 via #69 to use PFOR for both postings and
positions. This PR says it made t
jpountz commented on issue #12720:
URL: https://github.com/apache/lucene/issues/12720#issuecomment-1779265288
Having many competitive deleted documents is definitely a worst-case
scenario for any kind of dynamic pruning that Lucene does. I'm not sure if
there is something that we can do abo
jpountz commented on issue #12675:
URL: https://github.com/apache/lucene/issues/12675#issuecomment-1779267844
This has been addressed by #12682. Thanks @KunalSanghvi for contributing and
@benwtrent for merging!
--
This is an automated message from the Apache Git Service.
To respond to the
gf2121 commented on issue #12721:
URL: https://github.com/apache/lucene/issues/12721#issuecomment-1779316042
> did something like intVector = intVector.max(BROAD_1)
Great idea! Here are the benchmark results:
```
Benchmark (maxTerm) (termsNum) Mode
gf2121 commented on issue #12702:
URL: https://github.com/apache/lucene/issues/12702#issuecomment-1779338184
on `wikimediumall`
**Queries (Nothing changed obviously):**
```
TaskQPS baseline StdDevQPS
my_modified_version StdDev
dweiss commented on issue #12704:
URL: https://github.com/apache/lucene/issues/12704#issuecomment-1779342648
If you'd like to do so, I'd suggest moving such a "scattering remix" utility
to a separate class and reusing it elsewhere, much like here:
https://github.com/carrotsearch/hppc/blo
gf2121 opened a new pull request, #12722:
URL: https://github.com/apache/lucene/pull/12722
closes https://github.com/apache/lucene/issues/12702
mikemccand commented on code in PR #12722:
URL: https://github.com/apache/lucene/pull/12722#discussion_r1371870374
##
lucene/CHANGES.txt:
##
@@ -227,6 +227,8 @@ Optimizations
* GITHUB#12712: Speed up sorting postings file with an offline radix sorter in
BPIndexReader. (Guo F
mikemccand commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-1779416642
> Thank you so much for the help Mike !
Thank you!
> I have never run the FST benchmark, but it seems like it's a straightforward java
script? I could give it a try as well (so
mikemccand commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-1779443135
OK I ran it twice on `main`:
```
saved FST to "fst.bin": 294815624 bytes; 59.874 sec
saved FST to "fst.bin": 294815624 bytes; 60.255 sec
```
And twice with th
benwtrent commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1371954603
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsWriter.java:
##
@@ -557,6 +566,12 @@ public void close() throws IOException {
RS146BIJAY commented on issue #12720:
URL: https://github.com/apache/lucene/issues/12720#issuecomment-1779548832
@jpountz so is it safe to conclude that the user increasing the merge rate (which
will remove these obsolete entries) (by tuning some parameters or doing a force
merge) is the only way
bruno-roustant commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-1779668350
Oh, the numbers are disappointing. I expected it to be both a little more
compact and a little faster.
I wonder what is the cause, the rehash threshold, the linear scan, or the
mult
dweiss commented on issue #12708:
URL: https://github.com/apache/lucene/issues/12708#issuecomment-1779762455
Should we add an assumption to this test so that it is ignored on JDK22, at
least until the issue is resolved? It causes some noise on the builds mailing list.
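As a rough illustration of the "assumption" idea above, here is a minimal, self-contained sketch of a JDK version guard. The class `JdkAssumptions`, its method names, and the hard-coded feature version 22 are hypothetical and not taken from the Lucene code base; in a real JUnit 4 test the same condition would normally be passed to `org.junit.Assume.assumeTrue(String, boolean)` so the test is reported as ignored rather than failed.

```java
/** Hypothetical helper, not part of Lucene: decides whether to skip a test on a given JDK. */
public final class JdkAssumptions {
  private JdkAssumptions() {}

  /** Returns true if the given JDK feature version matches the version we want to skip. */
  public static boolean isAffected(int featureVersion, int affectedVersion) {
    return featureVersion == affectedVersion;
  }

  /** Feature version of the running JVM (for example 21 or 22); requires Java 10+. */
  public static int runningFeatureVersion() {
    return Runtime.version().feature();
  }

  public static void main(String[] args) {
    int affected = 22; // assumption: the JDK release on which the test is known to fail
    if (isAffected(runningFeatureVersion(), affected)) {
      // In a JUnit test this branch would be an assumption failure (test ignored).
      System.out.println("Skipping: known issue on JDK " + affected);
      return;
    }
    System.out.println("Running test body on JDK " + runningFeatureVersion());
  }
}
```

Keeping the version check in one small helper makes it easy to delete the guard once the underlying JDK issue is fixed.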
kaivalnp commented on PR #12679:
URL: https://github.com/apache/lucene/pull/12679#issuecomment-1779796454
Hi @benwtrent! Curious to hear if you've been able to reproduce the
benchmark?
jpountz commented on PR #12549:
URL: https://github.com/apache/lucene/pull/12549#issuecomment-1779816530
I ended up implementing your other suggestion. MDW generally expects that
this IndexWriter instantiation will not do merges.
jpountz commented on PR #12685:
URL: https://github.com/apache/lucene/pull/12685#issuecomment-1779837662
FYI we've seen failures on TestIndexWriter recently, which are reproducible
(e.g.
https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-9.x/720/). I
ran git bisect and it poin
benwtrent commented on PR #12679:
URL: https://github.com/apache/lucene/pull/12679#issuecomment-1779866529
@kaivalnp I have been busy doing other things. I hope to look into this in
the next week or so.
uschindler commented on issue #12708:
URL: https://github.com/apache/lucene/issues/12708#issuecomment-1780013296
I will update JDK tomorrow or Friday and the issue should be gone.
s1monw commented on PR #12685:
URL: https://github.com/apache/lucene/pull/12685#issuecomment-1780091805
I pushed fixes... thanks @jpountz
kaivalnp commented on PR #12679:
URL: https://github.com/apache/lucene/pull/12679#issuecomment-1780186180
Thank you! I'll try to incorporate earlier suggestions in the meantime.
mikemccand merged PR #12709:
URL: https://github.com/apache/lucene/pull/12709
mikemccand commented on PR #12709:
URL: https://github.com/apache/lucene/pull/12709#issuecomment-1780274592
Thanks @dungba88 -- I just merged. We can open a new PR when it's time to
backport ...
mikemccand commented on issue #12714:
URL: https://github.com/apache/lucene/issues/12714#issuecomment-1780282563
I made a quick hackity change, just to measure the number of additional
bytes we'd "typically" have to copy in order to duplicate suffix bytes from the
growing (forced append-onl
gf2121 commented on code in PR #12722:
URL: https://github.com/apache/lucene/pull/12722#discussion_r1372542501
##
lucene/CHANGES.txt:
##
@@ -227,6 +227,8 @@ Optimizations
* GITHUB#12712: Speed up sorting postings file with an offline radix sorter in
BPIndexReader. (Guo Feng)
gf2121 merged PR #12722:
URL: https://github.com/apache/lucene/pull/12722
jmazanec15 commented on code in PR #12582:
URL: https://github.com/apache/lucene/pull/12582#discussion_r1372593680
##
lucene/core/src/java/org/apache/lucene/util/ScalarQuantizer.java:
##
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
zhaih commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1372595375
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##
@@ -151,61 +159,128 @@ public OnHeapHnswGraph build(int maxOrd) throws
IOException {
zhaih commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1372606067
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsFormat.java:
##
@@ -198,14 +218,25 @@ public Lucene99HnswVectorsFormat(
+ ";