github-actions[bot] commented on PR #13163:
URL: https://github.com/apache/lucene/pull/13163#issuecomment-2010952942
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
epotyom commented on PR #12862:
URL: https://github.com/apache/lucene/pull/12862#issuecomment-2010909252
Thank you for reviewing @gsmiller ! I've addressed your comments and updated
the branch.
--
This is an automated message from the Apache Git Service.
To respond to the message, please
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1533045914
##
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyWriter.java:
##
@@ -32,8 +32,8 @@
import org.apache.lucene.document.Field;
impor
epotyom commented on code in PR #12862:
URL: https://github.com/apache/lucene/pull/12862#discussion_r1533045638
##
lucene/facet/src/java/org/apache/lucene/facet/LongValueFacetCounts.java:
##
@@ -568,6 +568,12 @@ public Number getSpecificValue(String dim, String... path)
{
benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2010582889
@jpountz added comments to the creation of the ratelimiter in CMS and the
limiter itself. Also updated the close interaction.
--
This is an automated message from the Apache Git Serv
jpountz commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2010548779
@benwtrent FYI I left a few minor comments: on
`ConcurrentMergeScheduler#close`, and about adding a comment for the limitation
you identified that a merge may not get throttled if the su
jpountz commented on PR #13186:
URL: https://github.com/apache/lucene/pull/13186#issuecomment-2010540255
> it isn't clean and a fragile logic
Your idea may work but fragility is indeed my concern. I guess it's similar
for the idea about tracking information about whether a doc crossed
rishabhmaurya commented on PR #13186:
URL: https://github.com/apache/lucene/pull/13186#issuecomment-2010431193
> Your reasoning about why the swap logic is inefficient seems to assume
that there would be lots of equal elements, but actually since we tie break on
docID, there would never be
uschindler commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2010319895
@ChrisHegarty: For MMapDirectory there's another way for madvise to work (at
least for reading index files): When the user has enabled native Panama access,
we can also mmap our
rmuir commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2010295899
I think the Direct-IO solution is hacky and trappy myself. This is a really
slow way to do perform IO, with the hope? that it will somehow make some
separate NRT search process faster
uschindler commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2010291706
Once we have that we can add some code to lookup the MethodHandle for the
madvise / fadvise from Panama and then call it with the file handle/descriptor
returned by the native AP
uschindler commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2010284960
> > The problem that directio tries to solve is the issue of not allowing
fadvise/madvise when opening index files. I am planning to have some optional
panama foreign in Mmapdir
benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2010260555
If y'all are good with it, I will merge this tomorrow.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL
hossman commented on PR #13162:
URL: https://github.com/apache/lucene/pull/13162#issuecomment-2010107912
FWIW: This commit seems to have duplicated `checkField` logic that was
previously added in #13105 -- compare `VectorFieldFunction.checkField` with
`FloatVectorValues.checkField` and `Byt
mikemccand commented on code in PR #13190:
URL: https://github.com/apache/lucene/pull/13190#discussion_r1532364018
##
lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java:
##
@@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws
MergePolicy.
benwtrent merged PR #13195:
URL: https://github.com/apache/lucene/pull/13195
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.a
ChrisHegarty commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2009877399
> The problem that directio tries to solve is the issue of not allowing
fadvise/madvise when opening index files. I am planning to have some optional
panama foreign in Mmapdir
jpountz commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2009851177
> If Elasticsearch or Solr wants to do this, ok. But lower level lucene
should have all options.
That works for me. I'm not worried about Elastisearch and Solr, they have
peo
benwtrent commented on PR #13195:
URL: https://github.com/apache/lucene/pull/13195#issuecomment-2009853169
@ChrisHegarty added a test :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specif
ChrisHegarty commented on PR #13195:
URL: https://github.com/apache/lucene/pull/13195#issuecomment-2009812909
Was this behaviour regressed by https://github.com/apache/lucene/pull/13162?
I wonder if a test would be warranted to ensure that it does not reoccur.
Also, in some areas `fi.getVe
uschindler commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2009732724
If you have enough memory on your machine, not using directio is still
recommended.
If you are indexing and reading index at same time and you merge segments,
why do you w
uschindler commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2009717292
In my opinion, we can add a separate series of factory methods to
FSDirectory to also create NRTCaching wrappers. This would make it easier for
beginners.
--
This is an automa
uschindler commented on issue #13194:
URL: https://github.com/apache/lucene/issues/13194#issuecomment-2009709463
We can fold it in, but make it optional. Mike's investigation is fine but
not the only truth. Circumventing OS cache is not always a good idea and you
should only carefully do it
jpountz commented on issue #13193:
URL: https://github.com/apache/lucene/issues/13193#issuecomment-2009569950
> The instant approach we take today gives no credit for a longish period of
time when no/few bytes were written.
My mental model is that the goal of merge throttling is to pr
jpountz commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009561079
> Maybe we don't need the IO write rate limiter anymore?
Nevermind, I see that you added more details on #13193.
--
This is an automated message from the Apache Git Service.
To r
jpountz commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009512368
> Maybe we don't need the IO write rate limiter anymore?
I'm curious what you have in mind. Are you considering throttling merges
purely based on merge concurrency then? E.g. slow
benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009500202
I re-ran `IndexGeoNames` with the latest commits on this PR. I didn't
override CMS settings at all & I allowed merging to occur while indexing. This
was done on my macbook, so the numb
mikemccand commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009418841
> Selfishly, I wish for the second option as its the simplest.
+1 to keep with the second approach. It is best effort, and, the "instant
vs burst-bucket" bug above is yet more
mikemccand opened a new issue, #13193:
URL: https://github.com/apache/lucene/issues/13193
### Description
(Spinoff from tricky discussions on
https://github.com/apache/lucene/pull/13190).
Lucene's `ConcurrentMergeScheduler` allows users to set a `mbPerSec` rate
limit on bytes
benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009457766
> How do we control the risk that a massive merge with KNN vectors soaks up
all available concurrency from the shared Executor for intra-merge concurrency
(all threads doing HNSW mergi
mikemccand commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009439976
I opened https://github.com/apache/lucene/issues/13193 specifically about
the "instant vs burst bucket" IO rate limiting approaches.
--
This is an automated message from the Apache
mikemccand commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009413290
Phew, it took me two hours of strongly caffeinated time to catch up on this!
Thank you @benwtrent.
What an awesome change, modernizing Lucene's merge concurrency so that
intra
mikemccand commented on code in PR #13190:
URL: https://github.com/apache/lucene/pull/13190#discussion_r1531945845
##
lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java:
##
@@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws
MergePolicy.
benwtrent commented on code in PR #13190:
URL: https://github.com/apache/lucene/pull/13190#discussion_r1531964164
##
lucene/core/src/java/org/apache/lucene/index/MergeRateLimiter.java:
##
@@ -118,24 +118,32 @@ private long maybePause(long bytes, long curNS) throws
MergePolicy.M
benwtrent commented on PR #13190:
URL: https://github.com/apache/lucene/pull/13190#issuecomment-2009404137
@mikemccand @jpountz
So, thinking about this more as I fell asleep.
This is how throttling will work as it is in this PR:
- Throttling is per thread. Meaning, a in
jpountz commented on code in PR #13190:
URL: https://github.com/apache/lucene/pull/13190#discussion_r1531735553
##
lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java:
##
@@ -446,7 +454,13 @@ private static String rateToString(double mbPerSec) {
@Over
jpountz merged PR #13066:
URL: https://github.com/apache/lucene/pull/13066
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
mrkm4ntr commented on code in PR #13066:
URL: https://github.com/apache/lucene/pull/13066#discussion_r1531738206
##
lucene/core/src/java/org/apache/lucene/search/DisjunctionSumScorer.java:
##
@@ -43,10 +47,25 @@ protected float score(DisiWrapper topList) throws
IOException {
jpountz commented on code in PR #13066:
URL: https://github.com/apache/lucene/pull/13066#discussion_r1531728006
##
lucene/core/src/java/org/apache/lucene/search/DisjunctionSumScorer.java:
##
@@ -43,10 +47,25 @@ protected float score(DisiWrapper topList) throws
IOException {
jpountz commented on PR #13186:
URL: https://github.com/apache/lucene/pull/13186#issuecomment-2009068628
You are anticipating my next question, which was going to be about whether
it helps speed up doc ID reordering.
Your reasoning about why the swap logic is inefficient seems to assu
40 matches
Mail list logo