gaoj0017 commented on PR #14078:
URL: https://github.com/apache/lucene/pull/14078#issuecomment-2550510539
Hi @benwtrent , I am the first author of the [RaBitQ
paper](https://arxiv.org/abs/2405.12497) and [its extended
version](https://arxiv.org/abs/2409.09913). As your team have known, our
easyice closed issue #14020:
TestSoftDeletesDirectoryReaderWrapper.testAvoidWrappingReadersWithoutSoftDeletes
AssertionError: expected:<5> but was:<3>
URL: https://github.com/apache/lucene/issues/14020
--
This is an automated message from the Apache Git Service.
To respond to the message, pl
easyice merged PR #14057:
URL: https://github.com/apache/lucene/pull/14057
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
dsmiley commented on PR #13178:
URL: https://github.com/apache/lucene/pull/13178#issuecomment-2550388713
Happy to do so.
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2550380846
I took a stab at hacking around this on our side as well:
https://github.com/apache/lucene/pull/14079
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2550335515
Thank you @eusousu
I tried to make some progress, for now at least I have an open bug report:
https://bugs.documentfoundation.org/show_bug.cgi?id=164366
navneet1v commented on code in PR #14076:
URL: https://github.com/apache/lucene/pull/14076#discussion_r1889600647
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsFormat.java:
##
@@ -78,21 +79,23 @@ public final class Lucene99FlatVectorsFormat extends
github-actions[bot] commented on PR #13521:
URL: https://github.com/apache/lucene/pull/13521#issuecomment-2549977673
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
github-actions[bot] commented on PR #12517:
URL: https://github.com/apache/lucene/pull/12517#issuecomment-2549978746
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
github-actions[bot] commented on PR #14035:
URL: https://github.com/apache/lucene/pull/14035#issuecomment-2549976714
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
gsmiller commented on PR #14074:
URL: https://github.com/apache/lucene/pull/14074#issuecomment-2549785261
Ah great catch. Thanks @uschindler!
benwtrent closed pull request #13651: Add a Better Binary Quantizer format for
dense vectors
URL: https://github.com/apache/lucene/pull/13651
benwtrent commented on PR #13651:
URL: https://github.com/apache/lucene/pull/13651#issuecomment-2549771070
Closing this PR in deference to this one:
https://github.com/apache/lucene/pull/14078
An evolution of scalar quantization proved more flexible and provided better
recall in our
benwtrent opened a new pull request, #14078:
URL: https://github.com/apache/lucene/pull/14078
This provides a binary vector format for vectors. The key ideas are:
- Centroid centered vectors
- Asymmetric quantization
- Individually optimized scalar quantization
This all
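To make the first two bullets concrete, here is a rough sketch and explicitly not the code from the PR. It centers vectors on their centroid, stores one sign bit per dimension, and scores a full-precision query against those bits (one simple reading of "asymmetric"). The class name `CentroidSignQuantizer`, its methods, and the scoring formula are assumptions made for illustration; the "individually optimized scalar quantization" step is not shown.

```java
/** Hypothetical illustration only; names and scoring are assumptions, not the PR's format. */
final class CentroidSignQuantizer {
  /** Mean of all vectors, computed per dimension. */
  static float[] centroid(float[][] vectors) {
    float[] mean = new float[vectors[0].length];
    for (float[] v : vectors) {
      for (int i = 0; i < mean.length; i++) {
        mean[i] += v[i];
      }
    }
    for (int i = 0; i < mean.length; i++) {
      mean[i] /= vectors.length;
    }
    return mean;
  }

  /** One bit per dimension: set when the centered component is positive. */
  static long[] quantize(float[] vector, float[] centroid) {
    long[] bits = new long[(vector.length + 63) >>> 6];
    for (int i = 0; i < vector.length; i++) {
      if (vector[i] - centroid[i] > 0f) {
        bits[i >>> 6] |= 1L << i; // long shifts use only the low 6 bits of i
      }
    }
    return bits;
  }

  /** Asymmetric score: full-precision centered query against the stored sign bits. */
  static float score(float[] query, float[] centroid, long[] docBits) {
    float sum = 0f;
    for (int i = 0; i < query.length; i++) {
      boolean positive = (docBits[i >>> 6] & (1L << i)) != 0;
      float centered = query[i] - centroid[i];
      sum += positive ? centered : -centered;
    }
    return sum;
  }
}
```

The point of the sketch is only the shape of the pipeline: the stored side is tiny (one bit per dimension) while the query side keeps more precision.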
rmuir commented on PR #14072:
URL: https://github.com/apache/lucene/pull/14072#issuecomment-2549721537
@ChrisHegarty take another look?
rmuir commented on PR #14072:
URL: https://github.com/apache/lucene/pull/14072#issuecomment-2549694829
Yeah, I commented out the forwarding and I think I reproduced your issue:
```
fatal: [graviton4]: FAILED! =>
changed: false
cmd: /usr/bin/git ls-remote g...@github.c
jpountz merged PR #14069:
URL: https://github.com/apache/lucene/pull/14069
rmuir commented on PR #14072:
URL: https://github.com/apache/lucene/pull/14072#issuecomment-2549619333
OK, we can change that to https. I'm pretty sure it doesn't work for you
because no agent was forwarded. I have in my ssh config:
```
AddKeysToAgent yes
ForwardAgent yes
```
eusousu commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549438980
> https://github.com/LibreOffice/dictionaries/pull/46
Their contribution process seems to be elsewhere, but I could not understand
it fully 😅
I tried sending the issue to the
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549192152
https://github.com/LibreOffice/dictionaries/pull/46
jpountz commented on PR #14077:
URL: https://github.com/apache/lucene/pull/14077#issuecomment-2549190865
To benchmark this change, I applied a (quick and dirty) patch to luceneutil
to have a mix of 3 `Bits` implementations to represent live docs, using a
`FixedBitSet` on 75% of segments:
jpountz opened a new pull request, #14077:
URL: https://github.com/apache/lucene/pull/14077
This helps make call sites of `Bits#get` bimorphic at most when checking
live docs. This helps because calls to `FixedBitSet#get` can then be inlined
when live docs are stored in a `FixedBitSet`. An
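A minimal sketch of the call-site shape under discussion, not Lucene's code. `LiveBits`, `DenseBits`, `AllSetBits` and `LiveDocsCounter` are hypothetical stand-ins (the first two play the roles of `Bits` and `FixedBitSet`). The premise that the JIT can inline through a call site seeing at most two receiver types is taken from the description above.

```java
/** Hypothetical stand-in for Lucene's Bits interface. */
interface LiveBits {
  boolean get(int index);

  int length();
}

/** Hypothetical dense implementation backed by a long[] (plays the role of FixedBitSet). */
final class DenseBits implements LiveBits {
  private final long[] words;
  private final int length;

  DenseBits(int length) {
    this.length = length;
    this.words = new long[(length + 63) >>> 6];
  }

  void set(int index) {
    words[index >>> 6] |= 1L << index; // long shifts use only the low 6 bits of index
  }

  @Override
  public boolean get(int index) {
    return (words[index >>> 6] & (1L << index)) != 0;
  }

  @Override
  public int length() {
    return length;
  }
}

/** Hypothetical implementation for a segment where every doc is live. */
final class AllSetBits implements LiveBits {
  private final int length;

  AllSetBits(int length) {
    this.length = length;
  }

  @Override
  public boolean get(int index) {
    return true;
  }

  @Override
  public int length() {
    return length;
  }
}

final class LiveDocsCounter {
  /** The liveDocs.get(doc) call below is the call site whose receiver types matter. */
  static int countLive(LiveBits liveDocs) {
    int count = 0;
    for (int doc = 0; doc < liveDocs.length(); doc++) {
      if (liveDocs.get(doc)) {
        count++;
      }
    }
    return count;
  }
}
```

If only `DenseBits` and `AllSetBits` ever reach `countLive`, the call site stays bimorphic. Passing a third implementation through the same loop is what would make it megamorphic.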
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549163081
I will send them a one-liner PR explaining the situation, we can take it
from there.
We may want to separately try to be more lenient about this part of the
parsing. Have not looked
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549147744
Yes, that's it. The `REP 3619` should be changed to `REP 3621`. I guess we
could send the PR to libreoffice, since the "upstream" dictionary looks totally
different here: https://github.co
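For anyone who wants to check a dictionary locally, here is a small sketch that compares the declared count with the number of entries. It assumes the layout quoted in this thread, a `REP <count>` header followed by `REP <from> <to>` lines; `RepCountCheck` is a hypothetical helper, not part of Lucene or Hunspell.

```java
import java.util.List;

/** Hypothetical helper: compares the declared REP count with the REP entries actually present. */
final class RepCountCheck {
  final int declared; // value from the "REP <count>" header, or -1 if no header was found
  final int actual; // number of "REP <from> <to>" entries seen

  private RepCountCheck(int declared, int actual) {
    this.declared = declared;
    this.actual = actual;
  }

  boolean consistent() {
    return declared == actual;
  }

  static RepCountCheck check(List<String> affLines) {
    int declared = -1;
    int actual = 0;
    for (String raw : affLines) {
      String[] parts = raw.trim().split("\\s+");
      if (!parts[0].equals("REP")) {
        continue; // not a REP line
      }
      if (declared < 0 && parts.length == 2 && parts[1].matches("\\d+")) {
        declared = Integer.parseInt(parts[1]); // first numeric REP line is taken as the header
      } else if (parts.length >= 3) {
        actual++; // a replacement entry
      }
    }
    return new RepCountCheck(declared, actual);
  }
}
```

Feeding it the lines of an `.aff` file (for example via `Files.readAllLines`) and printing `declared` and `actual` is enough to spot a mismatch like the one described here.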
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549123863
I'm happy to try to debug this but it might be a few days. Issue may be with
REP rules in the referenced commit.
The way these rules work is:
```
REP 3619
REP a а
REP c с
eusousu commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549064565
> If you were to capture the full "reproduce with" command line that is
output by the test framework
You mean this?
```
Reproduce with: gradlew :lucene:analysis:common:test -
dweiss commented on PR #13178:
URL: https://github.com/apache/lucene/pull/13178#issuecomment-2549067219
I just ran into this issue. Do you think we could revisit this and maybe
merge it in? I've hit it with a large query consisting of multiple intervals -
there are no "clauses" as such and
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549010607
And yeah, I see the commit date, but that's not the push date. So I suspect
this issue has nothing to do with your PR and may fail all PRs until we address
it.
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2549005107
It fails again. Maybe the problem comes from
https://github.com/LibreOffice/dictionaries/commit/d1696029d8923ae697cb2d6d4d7d69791b1943f2
?
rmuir commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2548996704
I reran that check to see what happens, if it reproduces.
This "extra regressions" check is also doing an unpinned shallow `git clone` of
external dictionaries repositories, so they co
msokolov commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2548911336
> I am unable to understand how my change could impact the analysis on the
Mongolian language.
I don't understand the connection to your change either, but it looks to me
as if M
jimczi commented on code in PR #14076:
URL: https://github.com/apache/lucene/pull/14076#discussion_r1888721064
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java:
##
@@ -282,7 +285,7 @@ public CloseableRandomVectorScorerSupplier
mergeOneFie
ChrisHegarty commented on code in PR #14076:
URL: https://github.com/apache/lucene/pull/14076#discussion_r1888686757
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java:
##
@@ -282,7 +285,7 @@ public CloseableRandomVectorScorerSupplier
merge
uschindler commented on PR #14074:
URL: https://github.com/apache/lucene/pull/14074#issuecomment-2548724672
> > For the future: If people submit PRs about making private members which
are collections public, always check if the immutability could be violated.
This is a major problem in Java
eusousu commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2548665777
Is there something I can investigate further? I tried parsing the error and
it seems to relate to a mn_MN dictionary that refers to Mongolian 🤔
```
While checking
/home/runner/
mikemccand commented on PR #14074:
URL: https://github.com/apache/lucene/pull/14074#issuecomment-2548665493
> For the future: If people submit PRs about making private members which
are collections public, always check if the immutability could be violated.
This is a major problem in Java w
jpountz commented on code in PR #13948:
URL: https://github.com/apache/lucene/pull/13948#discussion_r188862
##
lucene/core/src/test/org/apache/lucene/util/TestBytesRefArray.java:
##
@@ -43,8 +44,17 @@ public void testAppend() throws IOException {
for (int i = 0; i < e
benwtrent commented on PR #14075:
URL: https://github.com/apache/lucene/pull/14075#issuecomment-2548624976
While it's a simple change, it does change the analysis chain. I wonder if it
should stick to Lucene 11 (admittedly, that will not be shipped for a LONG
time).
I wonder what othe
uschindler commented on PR #14074:
URL: https://github.com/apache/lucene/pull/14074#issuecomment-2548529464
I cherrypicked in 10.x and 10.1 branch.
uschindler merged PR #14074:
URL: https://github.com/apache/lucene/pull/14074
jimczi opened a new pull request, #14076:
URL: https://github.com/apache/lucene/pull/14076
This change reverts #13985 and makes sure each knn format sticks to a single
read advice consistently.
Switching read advice during merges might help some use cases, but it can
also hurt others—e.
eusousu opened a new pull request, #14075:
URL: https://github.com/apache/lucene/pull/14075
### Description
In Brazilian Portuguese the conjunction
"em(preposition)+(article)" takes the forms "na, nas, no, nos", which are
common stop words.
For some reason the "nas" conjunction appea
uschindler commented on PR #14074:
URL: https://github.com/apache/lucene/pull/14074#issuecomment-2548460187
P.S.: I added the `Collections.unmodifiableCollection` only in the getter,
because making `EnumMap clauseSets` does not work:
- The inner values of the map are custom classes. As we ha
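A rough sketch of the "wrap only in the getter" pattern, not the actual `BooleanQuery` code. `ClauseHolder`, its `Occur` enum, and the `String` element type are hypothetical simplifications; the JDK call used for the read-only view is `Collections.unmodifiableCollection`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical holder that keeps mutable collections internally and exposes read-only views. */
final class ClauseHolder {
  enum Occur {
    MUST,
    SHOULD
  }

  // Internal state stays mutable so the class itself can keep adding clauses.
  private final Map<Occur, Collection<String>> clauseSets = new EnumMap<>(Occur.class);

  ClauseHolder() {
    for (Occur occur : Occur.values()) {
      clauseSets.put(occur, new ArrayList<>());
    }
  }

  void add(Occur occur, String clause) {
    clauseSets.get(occur).add(clause);
  }

  /** Wraps on the way out: callers get a view that rejects mutation. */
  Collection<String> getClauses(Occur occur) {
    return Collections.unmodifiableCollection(clauseSets.get(occur));
  }
}
```

Callers that try `getClauses(...).add(...)` get an `UnsupportedOperationException`, while later internal additions remain visible through the same view.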
eusousu commented on issue #14065:
URL: https://github.com/apache/lucene/issues/14065#issuecomment-2548452970
It's included on the Portuguese
[stopwords.txt](https://github.com/apache/lucene/blob/0203815/lucene/analysis/common/src/resources/org/apache/lucene/analysis/snowball/portuguese_stop
uschindler commented on code in PR #13950:
URL: https://github.com/apache/lucene/pull/13950#discussion_r1888513047
##
lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java:
##
@@ -136,7 +152,7 @@ public List clauses() {
}
/** Return the collection of queries fo
rmuir commented on issue #14042:
URL: https://github.com/apache/lucene/issues/14042#issuecomment-2548442349
I think we are having a communication issue over terminology. I don't care
about unrolling, I care about superscalar execution. JVM doesn't allow it,
which means the hardware sits the
jpountz opened a new pull request, #14073:
URL: https://github.com/apache/lucene/pull/14073
I did not know it when I checked in the code, but this is almost exactly the
v1 intersection algorithm from the "SIMD compression and the intersection of
sorted integers" paper.
uschindler commented on code in PR #13950:
URL: https://github.com/apache/lucene/pull/13950#discussion_r1888434442
##
lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java:
##
@@ -136,7 +152,7 @@ public List clauses() {
}
/** Return the collection of queries fo
uschindler commented on code in PR #13950:
URL: https://github.com/apache/lucene/pull/13950#discussion_r1888404168
##
lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java:
##
@@ -136,7 +152,7 @@ public List clauses() {
}
/** Return the collection of queries fo
uschindler commented on code in PR #13950:
URL: https://github.com/apache/lucene/pull/13950#discussion_r1888387748
##
lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java:
##
@@ -136,7 +152,7 @@ public List clauses() {
}
/** Return the collection of queries fo