zhaih commented on PR #12660:
URL: https://github.com/apache/lucene/pull/12660#issuecomment-1778626211
@msokolov @benwtrent I removed almost all `nocommit` (except the renaming
one) and rebased to main, please take a look if you have time.
@benwtrent Please check whether the rebase an
zhaih commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1371235805
##
lucene/core/src/java/org/apache/lucene/codecs/lucene95/Lucene95HnswVectorsFormat.java:
##
@@ -146,18 +148,24 @@ public final class Lucene95HnswVectorsFormat extends
zhaih commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1371211783
##
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##
@@ -35,6 +38,9 @@ public class NeighborArray {
float[] score;
int[] node;
private int
zhaih commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1371210641
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##
@@ -221,34 +296,50 @@ private long printGraphBuildStatus(int node, long start,
long t) {
zhaih commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1371207682
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##
@@ -221,34 +296,50 @@ private long printGraphBuildStatus(int node, long start,
long t) {
gf2121 merged PR #12712:
URL: https://github.com/apache/lucene/pull/12712
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apac
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1778455090
One thing that baffled me is that we are writing the metadata,
including the numBytes & start node in the beginning of the DataOutput. That
means once the FST is completed, we
gf2121 merged PR #12710:
URL: https://github.com/apache/lucene/pull/12710
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1778336652
(A small note: Tantivy uses a value-based LRU cache with 2-item buckets; items
will be evicted per bucket:
https://github.com/BurntSushi/fst/blob/a0936e9b25a888a0d5b9f94b91997216253e7088/
benwtrent merged PR #12582:
URL: https://github.com/apache/lucene/pull/12582
shubhamvishu commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-130978
Thanks for the quick review @mikemccand! I have addressed the comments in
the new revision.
> Could you also change the probe from quadratic (what it is now) to a
simple line
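The quoted request above is cut off, so the following is only a sketch under the assumption that it contrasts quadratic probing with linear probing in an open-addressing hash table. The class and method names are hypothetical and are not taken from the PR or from Lucene; the table size is assumed to be a power of two so that masking wraps the index.

```java
// Illustrative sketch only: names are hypothetical, not Lucene API.
final class ProbeSketch {

  /** Returns the first {@code steps} slots visited when stepping by 1 each time. */
  static int[] linearProbe(int hash, int tableSize, int steps) {
    int mask = tableSize - 1; // assumes tableSize is a power of two
    int[] slots = new int[steps];
    int pos = hash & mask;
    for (int i = 0; i < steps; i++) {
      slots[i] = pos;
      pos = (pos + 1) & mask; // constant stride of one slot
    }
    return slots;
  }

  /** Returns the first {@code steps} slots visited when the stride grows by 1 each time. */
  static int[] quadraticProbe(int hash, int tableSize, int steps) {
    int mask = tableSize - 1; // assumes tableSize is a power of two
    int[] slots = new int[steps];
    int pos = hash & mask;
    for (int i = 0; i < steps; i++) {
      slots[i] = pos;
      pos = (pos + i + 1) & mask; // strides 1, 2, 3, ... (triangular offsets)
    }
    return slots;
  }
}
```

Linear probing touches adjacent slots, which tends to be friendlier to CPU caches, while the growing stride of quadratic probing spreads collisions further apart. Which of the two is preferable for the hash table in this PR is a question for the review thread, not something this sketch settles.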
shubhamvishu commented on code in PR #12716:
URL: https://github.com/apache/lucene/pull/12716#discussion_r1370588249
##
lucene/CHANGES.txt:
##
@@ -190,6 +190,8 @@ Improvements
* GITHUB#12705, GITHUB#12705: Improve handling of NullPointerException and
IllegalStateException
i
cavorite commented on code in PR #12715:
URL: https://github.com/apache/lucene/pull/12715#discussion_r1370542730
##
lucene/core/src/java/org/apache/lucene/util/fst/FSTCompiler.java:
##
@@ -122,8 +122,11 @@ public class FSTCompiler {
/**
* Instantiates an FST/FSA builder w
gf2121 commented on PR #12712:
URL: https://github.com/apache/lucene/pull/12712#issuecomment-1777640134
> I sent you an email to your Apache address, can you check it out?
Sorry for missing the email. Thank you so much for reminding me here!
dsmiley merged PR #12293:
URL: https://github.com/apache/lucene/pull/12293
jpountz commented on PR #12712:
URL: https://github.com/apache/lucene/pull/12712#issuecomment-1777605586
Unrelated to this change: I sent you an email to your Apache address, can
you check it out? (Sorry for the noise on this PR, I don't know how else to
contact you).
dsmiley commented on PR #12293:
URL: https://github.com/apache/lucene/pull/12293#issuecomment-1777598613
Here's [my build I ran
locally](https://ge.apache.org/s/nsaeazuf3tkcu/timeline) for anyone who is
interested. I observe that Lucene tests seem well balanced (judging from the
cool time
jpountz merged PR #12589:
URL: https://github.com/apache/lucene/pull/12589
jpountz commented on PR #12549:
URL: https://github.com/apache/lucene/pull/12549#issuecomment-1777530886
I reverted as this causes a deadlock in TestStressIndexing:
```
"TEST-TestStressIndexing.testStressIndexAndSearching-seed#[D4B60FA81EB58FF3]"
#42 [282516] prio=5 os_prio=0 cpu=
javanna commented on PR #12689:
URL: https://github.com/apache/lucene/pull/12689#issuecomment-1777505662
Thanks for the review @jpountz !
javanna merged PR #12689:
URL: https://github.com/apache/lucene/pull/12689
clayburn commented on PR #12293:
URL: https://github.com/apache/lucene/pull/12293#issuecomment-1777496879
@dsmiley - Excellent questions:
> An espoused benefit to this PR is that, as an Apache committer, I could do
builds on my own machine and have the analysis be published for viewin
jpountz commented on code in PR #12712:
URL: https://github.com/apache/lucene/pull/12712#discussion_r1370397139
##
lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java:
##
@@ -991,4 +939,166 @@ static int readMonotonicInts(DataInput in, int[] ints)
throws IOE
gf2121 commented on code in PR #12712:
URL: https://github.com/apache/lucene/pull/12712#discussion_r1370377164
##
lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java:
##
@@ -991,4 +939,233 @@ static int readMonotonicInts(DataInput in, int[] ints)
throws IOEx
jpountz merged PR #12549:
URL: https://github.com/apache/lucene/pull/12549
dungba88 commented on PR #12709:
URL: https://github.com/apache/lucene/pull/12709#issuecomment-1777396050
I added an entry in the CHANGES.txt under Lucene 10.0 (as we are not
backporting)
javanna commented on code in PR #12606:
URL: https://github.com/apache/lucene/pull/12606#discussion_r1370308080
##
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##
@@ -420,13 +418,12 @@ public int count(Query query) throws IOException {
}
/**
- * Re
dsmiley commented on PR #12293:
URL: https://github.com/apache/lucene/pull/12293#issuecomment-1777343300
@clayburn An espoused benefit to this PR is that, as an Apache committer, I
could do builds on my own machine and have the analysis be published for
viewing on ge.apache.org. How do I d
jpountz commented on code in PR #12712:
URL: https://github.com/apache/lucene/pull/12712#discussion_r1370176150
##
lucene/misc/src/java/org/apache/lucene/misc/index/BPIndexReorderer.java:
##
@@ -991,4 +939,233 @@ static int readMonotonicInts(DataInput in, int[] ints)
throws IOE
mikemccand commented on code in PR #12716:
URL: https://github.com/apache/lucene/pull/12716#discussion_r1370148505
##
lucene/CHANGES.txt:
##
@@ -190,6 +190,8 @@ Improvements
* GITHUB#12705, GITHUB#12705: Improve handling of NullPointerException and
IllegalStateException
in
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1777157731
@mikemccand
I rebased and created some implementation of DataOutput-based FSTWriter. I
think I need to write tests, but let me know what you think.
msokolov commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1370054483
##
lucene/core/src/java/org/apache/lucene/codecs/lucene95/Lucene95HnswVectorsFormat.java:
##
@@ -146,18 +148,24 @@ public final class Lucene95HnswVectorsFormat extend
jpountz commented on PR #12719:
URL: https://github.com/apache/lucene/pull/12719#issuecomment-1777078920
Wikibigall:
```
Task    QPS baseline    StdDev    QPS my_modified_version    StdDev    Pct diff    p-value
Prefix3
jpountz opened a new pull request, #12719:
URL: https://github.com/apache/lucene/pull/12719
PR #12382 added a bulk scorer for top-k hits on conjunctions that yielded a
significant speedup (annotation
[FP](http://people.apache.org/~mikemccand/lucenebench/AndHighHigh.html)).
This change pr
msokolov commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1370046773
##
lucene/core/src/java/org/apache/lucene/codecs/lucene95/Lucene95HnswVectorsWriter.java:
##
@@ -635,17 +667,31 @@ private static DocsWithFieldSet writeVectorData(
msokolov commented on PR #12711:
URL: https://github.com/apache/lucene/pull/12711#issuecomment-1777062768
Another question: do we have any testing around this sort-stability /
block-preservation today? I'm getting nervous now that we are relying on an
undocumented feature that just happens
msokolov commented on PR #12711:
URL: https://github.com/apache/lucene/pull/12711#issuecomment-1777060875
The idea of more explicitly modeling doc-blocks makes sense to me, but I
wonder if it would really enable "That way we can sort only on the parent
document". What about the case (suppor
gf2121 commented on PR #12712:
URL: https://github.com/apache/lucene/pull/12712#issuecomment-1777056056
I indexed `wikimidumall` with:
* BPIndexReorder config mentioned
[here](https://github.com/apache/lucene/issues/12665#issuecomment-1770827026).
* BPMergePolicy on this
[commit](htt
benwtrent commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1370034532
##
lucene/core/src/java/org/apache/lucene/util/hnsw/NeighborArray.java:
##
@@ -35,6 +38,9 @@ public class NeighborArray {
float[] score;
int[] node;
private
benwtrent commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1370028382
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java:
##
@@ -221,34 +296,50 @@ private long printGraphBuildStatus(int node, long start,
long t)
benwtrent commented on code in PR #12660:
URL: https://github.com/apache/lucene/pull/12660#discussion_r1370023343
##
lucene/core/src/java/org/apache/lucene/codecs/lucene95/Lucene95HnswVectorsFormat.java:
##
@@ -146,18 +148,24 @@ public final class Lucene95HnswVectorsFormat exten
iverase commented on code in PR #12506:
URL: https://github.com/apache/lucene/pull/12506#discussion_r1369987969
##
lucene/core/src/java/org/apache/lucene/util/ByteSlicePool.java:
##
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ *
gf2121 commented on PR #12712:
URL: https://github.com/apache/lucene/pull/12712#issuecomment-1776994927
```
BPIndexReorderer reorderer = new BPIndexReorderer();
reorderer.setMinDocFreq(16384);
reorderer.setMaxIters(3);
reorderer.setMinPartitionSize(8192);
mp = new BPReorderingM
iverase commented on PR #12506:
URL: https://github.com/apache/lucene/pull/12506#issuecomment-1776973771
I like the introduction of `ByteSlicePool` but I wonder if the naming is
correct as it does not feel like a generic slicer class but very tied to the format
used by TermsHashPerField. Just
javanna commented on code in PR #12718:
URL: https://github.com/apache/lucene/pull/12718#discussion_r1369871845
##
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##
@@ -115,10 +115,10 @@ public class IndexSearcher {
protected final List leafContexts;
/
javanna commented on code in PR #12718:
URL: https://github.com/apache/lucene/pull/12718#discussion_r1369871488
##
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:
##
@@ -425,11 +425,12 @@ public int count(Query query) throws IOException {
}
/**
- * Re
s1monw commented on PR #12711:
URL: https://github.com/apache/lucene/pull/12711#issuecomment-1776831638
I spoke to @jpountz about this topic and we discussed a different approach.
We could get away with not having the check at all and make blocks a first
class citizen by recording the paren
easyice opened a new issue, #12717:
URL: https://github.com/apache/lucene/issues/12717
### Description
Currently we use vint encoding for the doc IDs if the doc buffer < 128, then
decode in `Lucene90PostingsReader#readVIntBlock`. In the high cardinality
field, it is possibly slow to
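For readers unfamiliar with the vint encoding mentioned in this issue, here is a minimal, self-contained sketch of the general variable-length integer scheme: 7 data bits per byte, with the high bit set when more bytes follow. The class and method names are hypothetical; this is not the code in `Lucene90PostingsReader` and it does not model how doc IDs are laid out in the postings format.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch only: names are hypothetical, not Lucene API.
final class VIntSketch {

  /** Encodes a non-negative int, low-order 7-bit groups first. */
  static byte[] encode(int value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // high bit set: another byte follows
      value >>>= 7;
    }
    out.write(value); // last byte has the high bit clear
    return out.toByteArray();
  }

  /** Decodes one value from the start of {@code bytes}. */
  static int decode(byte[] bytes) {
    int value = 0;
    int shift = 0;
    for (byte b : bytes) {
      value |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        break; // high bit clear marks the final byte
      }
      shift += 7;
    }
    return value;
  }
}
```

Decoding needs a data-dependent branch per byte, which is one plausible reason a per-value loop of this kind could be slow on large inputs; whether that is the bottleneck described in the issue is not something the truncated text confirms.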
shubhamvishu commented on issue #12704:
URL: https://github.com/apache/lucene/issues/12704#issuecomment-1776661739
I opened a PR to make use of this constant : #12716
Also, I was wondering whether this constant could be utilised in other hash
function implementations as well in the codebas
shubhamvishu opened a new pull request, #12716:
URL: https://github.com/apache/lucene/pull/12716
### Description
Addresses #12704. Below is the comment that inspired this
([link](https://github.com/apache/lucene/pull/12633#discussion_r1366847986)),
```
Instead, we shoul