msokolov commented on issue #13565:
URL: https://github.com/apache/lucene/issues/13565#issuecomment-2301788507
Hi thanks for that @jpountz, no worries; this was something we all agreed
on. I'm able to continue with the "research" part of this by simply increasing
heap size - it's not a bloc
msokolov commented on issue #13565:
URL: https://github.com/apache/lucene/issues/13565#issuecomment-2301804631
In the meantime, just to let you know I do have a dirt path implementation
of this (multithreading not yet working, totally recomputes centroids on every
iteration, etc), but it is
jpountz commented on PR #13670:
URL: https://github.com/apache/lucene/pull/13670#issuecomment-2301811869
Nightly benchmarks picked it up, see e.g.
https://home.apache.org/~mikemccand/lucenebench/2024.08.20.18.04.44.html#profiler_searching_4_cpu.
cc @mikemccand
--
This is an automated me
msokolov commented on PR #13543:
URL: https://github.com/apache/lucene/pull/13543#issuecomment-2301813096
> This change should also help with that by cutting the number of calls to
BufferedOutputStream#write(int) by a factor of 8192, which cuts the number of
calls to growIfNeeded by the sam
msokolov commented on issue #13675:
URL: https://github.com/apache/lucene/issues/13675#issuecomment-2301818862
What I wonder is: how can Lucene help with this? I feel like we have all the
primitives available to enable Splade-style search and retrieval, but maybe
there is something missing?
benwtrent commented on issue #13675:
URL: https://github.com/apache/lucene/issues/13675#issuecomment-2301866689
There might be a better format than just terms. But I would assume the
bipartite graph stuff would help here.
Additionally, I would expect the most benefits to be made at qu
dweiss commented on issue #13676:
URL: https://github.com/apache/lucene/issues/13676#issuecomment-2302007706
How did you import the project into Eclipse? It should be "Import as an
existing project" or something like this. When I run gradlew eclipse, the
.classpath file doesn't mention thos
mikemccand commented on PR #13568:
URL: https://github.com/apache/lucene/pull/13568#issuecomment-2302206543
> Perhaps this is something we'd want to fix for Lucene 10 if it requires
breaking changes?
+1, thanks @javanna and @gsmiller.
aoli-al commented on issue #13547:
URL: https://github.com/apache/lucene/issues/13547#issuecomment-2302212349
Please use this fork to reproduce the failure:
https://github.com/aoli-al/lucene/tree/LUCENE-13547
Command: `./gradlew test --tests "*testSubclassConcurrentMergeScheduler*"`
aoli-al commented on issue #13552:
URL: https://github.com/apache/lucene/issues/13552#issuecomment-2302320506
Please use the following fork to reproduce the failure:
https://github.com/aoli-al/lucene/tree/LUCENE-13552
Command: `./gradlew test --tests
"*testIOExceptionDuringWriteSegmentWi
aoli-al commented on issue #13593:
URL: https://github.com/apache/lucene/issues/13593#issuecomment-2302344991
Please use the following fork to reproduce the failure:
https://github.com/aoli-al/lucene/tree/LUCENE-13593
Note that the patch adds an infinite loop at the end of the test. S
iverase commented on code in PR #13592:
URL: https://github.com/apache/lucene/pull/13592#discussion_r1725315073
##
lucene/core/src/java/org/apache/lucene/index/DocValuesSkipper.java:
##
@@ -98,4 +98,29 @@ public abstract class DocValuesSkipper {
/** Return the global number
NavidMitchell commented on issue #3534:
URL: https://github.com/apache/lucene/issues/3534#issuecomment-2302405519
It would be nice to have this feature supported.
gsmiller commented on code in PR #13592:
URL: https://github.com/apache/lucene/pull/13592#discussion_r1725418294
##
lucene/core/src/java/org/apache/lucene/index/DocValuesSkipper.java:
##
@@ -98,4 +98,29 @@ public abstract class DocValuesSkipper {
/** Return the global numbe
gsmiller commented on PR #13672:
URL: https://github.com/apache/lucene/pull/13672#issuecomment-2302551656
Cleaned up this PR a bit and added testing. Also addressed Robert's feedback
(thanks @rmuir). Should be ready for another review if anyone is interested.
Thanks!
msokolov merged PR #13674:
URL: https://github.com/apache/lucene/pull/13674
mayya-sharipova commented on PR #13651:
URL: https://github.com/apache/lucene/pull/13651#issuecomment-2302685585
> possibly switch to LongValues for storing vectorOrd -> centroidOrd mapping
I was thinking about adding centroids mappings as LongValues at the end of
meta file, but this
original-brownbear commented on PR #13609:
URL: https://github.com/apache/lucene/pull/13609#issuecomment-2302854261
Thanks Luca!
original-brownbear merged PR #13609:
URL: https://github.com/apache/lucene/pull/13609
benwtrent commented on PR #13651:
URL: https://github.com/apache/lucene/pull/13651#issuecomment-2302857403
100MB assumes that even when compressed, it's a single byte per centroid.
100M vectors might only have 2 centroids and thus only need two bits to store.
Also, I would expect the
gsmiller commented on issue #3534:
URL: https://github.com/apache/lucene/issues/3534#issuecomment-2302926537
@NavidMitchell I'm not sure if there's a more convenient way to do this, but
note that you can do this using `Expressions` compiled from
`JavascriptCompiler` since the compiler suppo
jpountz commented on PR #13658:
URL: https://github.com/apache/lucene/pull/13658#issuecomment-2302929875
I ran wikibigall on a M3, which is interesting because it does inline the
splitLongs call, both before and after the change (presumably because the
generated native code is smaller thus
jpountz commented on code in PR #13672:
URL: https://github.com/apache/lucene/pull/13672#discussion_r1725706940
##
lucene/core/src/test/org/apache/lucene/search/TestDocValuesRewriteMethod.java:
##
@@ -61,14 +61,19 @@ public void setUp() throws Exception {
.setMa
jpountz commented on issue #13675:
URL: https://github.com/apache/lucene/issues/13675#issuecomment-230374
I found this recent paper by well-known people in the IR efficiency space
quite interesting: https://arxiv.org/pdf/2405.01117. It builds on inverted
indexes and simple/intuitive ide
jpountz commented on code in PR #13592:
URL: https://github.com/apache/lucene/pull/13592#discussion_r1725906610
##
lucene/core/src/test/org/apache/lucene/search/TestDocValuesQueries.java:
##
@@ -42,34 +45,100 @@
public class TestDocValuesQueries extends LuceneTestCase {
+
jpountz commented on issue #13565:
URL: https://github.com/apache/lucene/issues/13565#issuecomment-2303226288
> readVInt is also a hotspot at search time for *VectorQuery
We should use group-varint, like for tail postings?
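For context, group-varint as generally described stores integers in groups of four behind a single tag byte, whose four 2-bit fields give the byte length (1 to 4) of each value. The following is a minimal Java sketch under that assumption. It is not Lucene's implementation, its byte layout may differ from what Lucene writes, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative group-varint codec; names and byte layout are assumptions. */
public class GroupVarintSketch {

  /** Number of bytes (1..4) needed to hold v, treated as an unsigned 32-bit value. */
  static int numBytes(int v) {
    if ((v & 0xFFFFFF00) == 0) return 1;
    if ((v & 0xFFFF0000) == 0) return 2;
    if ((v & 0xFF000000) == 0) return 3;
    return 4;
  }

  /** Encodes values four at a time: one tag byte, then each value in little-endian order. */
  static byte[] encode(int[] values) {
    if (values.length % 4 != 0) {
      throw new IllegalArgumentException("length must be a multiple of 4");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < values.length; i += 4) {
      int tag = 0;
      for (int j = 0; j < 4; j++) {
        // Store (length - 1) in two bits; the first value uses the highest bits.
        tag |= (numBytes(values[i + j]) - 1) << (6 - 2 * j);
      }
      out.write(tag);
      for (int j = 0; j < 4; j++) {
        int v = values[i + j];
        int n = numBytes(v);
        for (int b = 0; b < n; b++) {
          out.write(v >>> (8 * b)); // write(int) keeps only the low 8 bits
        }
      }
    }
    return out.toByteArray();
  }

  /** Decodes count values (a multiple of 4) produced by encode. */
  static int[] decode(byte[] bytes, int count) {
    int[] values = new int[count];
    int pos = 0;
    for (int i = 0; i < count; i += 4) {
      int tag = bytes[pos++] & 0xFF;
      for (int j = 0; j < 4; j++) {
        int n = ((tag >>> (6 - 2 * j)) & 0x3) + 1;
        int v = 0;
        for (int b = 0; b < n; b++) {
          v |= (bytes[pos++] & 0xFF) << (8 * b);
        }
        values[i + j] = v;
      }
    }
    return values;
  }
}
```

The point of the layout is that the decoder reads one tag byte and then knows all four lengths up front, instead of testing a continuation bit on every byte as a plain VInt reader does.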
github-actions[bot] commented on PR #13594:
URL: https://github.com/apache/lucene/pull/13594#issuecomment-2303335718
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
msokolov commented on issue #13565:
URL: https://github.com/apache/lucene/issues/13565#issuecomment-2303338851
> We should use group-varint, like for tail postings?
Does this choose a single bit-width for a group of postings? That sounds
like it would produce savings here, yes. Also i
thomasli9895 commented on issue #13452:
URL: https://github.com/apache/lucene/issues/13452#issuecomment-2303379046
I also encountered this problem, I upgraded from version 7.8.1 to version
7.17.9 and then fell back to version 7.8.1 and this glitch occurred
![image](https://github.com/use
thomasli9895 commented on issue #13452:
URL: https://github.com/apache/lucene/issues/13452#issuecomment-2303380395
I also encountered this problem, I upgraded from version 7.8.1 to version
7.17.9 and then fell back to version 7.8.1 and this glitch occurred
![image](https://github.com/use
thomasli9895 commented on issue #13452:
URL: https://github.com/apache/lucene/issues/13452#issuecomment-2303384472
I also encountered this problem, I upgraded from version 7.8.1 to version
7.17.9 and then fell back to version 7.8.1 and this glitch occurred
`org.elasticsearch.bootstrap.Sta
vsop-479 commented on PR #13398:
URL: https://github.com/apache/lucene/pull/13398#issuecomment-2303599920
@jpountz
Please take a look when you get a chance.
vsop-479 commented on PR #13486:
URL: https://github.com/apache/lucene/pull/13486#issuecomment-2303600507
> Could Lucene maybe track that a field is actually unique internally and
then apply this optimization automatically / always correctly?
@jpountz
Do you have any idea about t