[GitHub] [lucene] jpountz opened a new pull request #724: LUCENE-10311: Remove pop_XXX helpers from `BitUtil`.
jpountz opened a new pull request #724: URL: https://github.com/apache/lucene/pull/724

As @rmuir noted, it would be just as simple, and create less cognitive overhead, to use `Long#bitCount` directly.
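For illustration, a minimal sketch (hypothetical caller, not code from the PR) of what the change means at call sites: summing `Long#bitCount` over the words of a bit set directly, instead of going through a `BitUtil.pop_array`-style helper.

```java
public class BitCountDemo {
  public static void main(String[] args) {
    long[] bits = {0b1011L, -1L, 0L}; // hypothetical bit-set words
    long cardinality = 0;
    for (long word : bits) {
      // Long#bitCount is a JVM intrinsic (POPCNT on x86), so a wrapper adds
      // no performance, only indirection.
      cardinality += Long.bitCount(word);
    }
    System.out.println(cardinality); // 3 + 64 + 0 = 67
  }
}
```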
[jira] [Created] (LUCENE-10453) Speed up VectorUtil#squareDistance
Adrien Grand created LUCENE-10453:

Summary: Speed up VectorUtil#squareDistance
Key: LUCENE-10453
URL: https://issues.apache.org/jira/browse/LUCENE-10453
Project: Lucene - Core
Issue Type: Task
Reporter: Adrien Grand

{{VectorUtil#squareDistance}} is used in conjunction with {{VectorSimilarityFunction#EUCLIDEAN}}. It didn't get as much love as dot products (LUCENE-9837), yet there seems to be room for improvement. I wrote a quick JMH benchmark to run some comparisons: https://github.com/jpountz/vector-similarity-benchmarks. While it's not as fast as using the vector API (which makes squareDistance computations more than 2x faster), we can get a ~25% speedup by unrolling the loop in a similar way to what dot product does.
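For illustration, a hedged sketch of the kind of 4-way unrolling the issue describes (this mirrors the dot-product-style unrolling in spirit; it is not the committed patch):

```java
public class SquareDistanceSketch {
  /** Sum of squared differences, unrolled 4x. */
  static float squareDistance(float[] a, float[] b) {
    float acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    int i = 0;
    int bound = a.length & ~3; // largest multiple of 4 <= length
    for (; i < bound; i += 4) {
      float d0 = a[i] - b[i];
      float d1 = a[i + 1] - b[i + 1];
      float d2 = a[i + 2] - b[i + 2];
      float d3 = a[i + 3] - b[i + 3];
      // four independent accumulators expose instruction-level parallelism
      acc0 += d0 * d0;
      acc1 += d1 * d1;
      acc2 += d2 * d2;
      acc3 += d3 * d3;
    }
    for (; i < a.length; i++) { // scalar tail
      float d = a[i] - b[i];
      acc0 += d * d;
    }
    return acc0 + acc1 + acc2 + acc3;
  }

  public static void main(String[] args) {
    float[] a = {1, 2, 3, 4, 5};
    float[] b = {0, 0, 0, 0, 0};
    System.out.println(squareDistance(a, b)); // 1 + 4 + 9 + 16 + 25 = 55.0
  }
}
```

The separate accumulators matter: a single running sum creates a loop-carried dependency that serializes the additions, which is the same reason the dot-product code unrolls.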
[GitHub] [lucene] jpountz merged pull request #711: LUCENE-10428: Avoid infinite loop under error conditions.
jpountz merged pull request #711: URL: https://github.com/apache/lucene/pull/711
[GitHub] [lucene] jpountz commented on pull request #711: LUCENE-10428: Avoid infinite loop under error conditions.
jpountz commented on pull request #711: URL: https://github.com/apache/lucene/pull/711#issuecomment-1057807344

Thanks all for the feedback and contributions!
[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500591#comment-17500591 ]

ASF subversion and git services commented on LUCENE-10428:

Commit 44a2a82319c9d375f3399a4b36abf2c3c7e229d6 in lucene's branch refs/heads/main from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=44a2a82 ]

LUCENE-10428: Avoid infinite loop under error conditions. (#711) Co-authored-by: dblock

> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/query/scoring, core/search
> Reporter: Ankit Jain
> Priority: Major
> Attachments: Flame_graph.png
> Time Spent: 5h 20m
> Remaining Estimate: 0h
>
> Customers complained about high CPU for an Elasticsearch cluster in production. We noticed that a few search requests were stuck for a long time:
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205 AmMLzDQ4RrOJievRDeGFZw:569204 direct 1645195007282 14:36:47 6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075 emjWc5bUTG6lgnCGLulq-Q:502074 direct 1645195037259 14:37:17 6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270 emjWc5bUTG6lgnCGLulq-Q:583269 direct 1645201316981 16:21:56 4.5h
> {code}
> Flame graphs indicated that CPU time was mostly going into *getMinCompetitiveScore in MaxScoreSumPropagator*. After doing some live JVM debugging, we found that org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound had around 4 million invocations every second.
> Values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
> Example code snippet:
> {code:java}
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
>   minScore -= Math.ulp(minScoreSum);
>   System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}
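For context, the fix merged via #711 bounds this loop rather than changing the float math. Below is a minimal sketch of that safeguard pattern; the iteration cap, the stand-in scoreSumUpperBound, and the exception message are all illustrative assumptions, not the committed code:

```java
public class ConvergenceGuard {
  // Stand-in for MaxScoreSumPropagator#scoreSumUpperBound (illustrative only).
  static float scoreSumUpperBound(double sum) {
    return Math.nextUp((float) sum);
  }

  static float minCompetitiveScore(double minScoreSum, double sumOfOtherMaxScores) {
    float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
    int iters = 0;
    // Subtract one ulp at a time, but bail out after a fixed number of
    // iterations instead of spinning forever when rounding never converges.
    while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
      if (++iters > 64) { // hypothetical bound
        throw new IllegalStateException("minCompetitiveScore failed to converge");
      }
      minScore -= Math.ulp((float) minScoreSum);
    }
    return minScore;
  }

  public static void main(String[] args) {
    try {
      System.out.println(minCompetitiveScore(3.5541441, 3.554144322872162));
    } catch (IllegalStateException e) {
      System.out.println("bounded instead of spinning: " + e.getMessage());
    }
  }
}
```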
[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500594#comment-17500594 ]

ASF subversion and git services commented on LUCENE-10428:

Commit 0d35e38b93d4c394aee691f308092cb9cfa792a2 in lucene's branch refs/heads/branch_9x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=0d35e38 ]

LUCENE-10428: Avoid infinite loop under error conditions. (#711) Co-authored-by: dblock
[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500595#comment-17500595 ]

Adrien Grand commented on LUCENE-10428:

[~akjain] While the underlying issue has not been fixed, the infinite loop should be fixed now, so I'm leaning towards marking this issue as resolved and opening a new one if/when we get the new IllegalStateException to trip. What do you think?
[jira] [Commented] (LUCENE-10002) Remove IndexSearcher#search(Query,Collector) in favor of IndexSearcher#search(Query,CollectorManager)
[ https://issues.apache.org/jira/browse/LUCENE-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500606#comment-17500606 ]

ASF subversion and git services commented on LUCENE-10002:

Commit 2a6b2ca1435ddb719bf0834d035ec38b7401c931 in lucene's branch refs/heads/branch_9x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=2a6b2ca ]

LUCENE-10002: Fix test failure. When IndexSearcher is created with a threadpool it becomes impossible to assert on the number of evaluated hits overall.

> Remove IndexSearcher#search(Query,Collector) in favor of IndexSearcher#search(Query,CollectorManager)
>
> Key: LUCENE-10002
> URL: https://issues.apache.org/jira/browse/LUCENE-10002
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Time Spent: 13.5h
> Remaining Estimate: 0h
>
> It's a bit trappy that you can create an IndexSearcher with an executor, but that it would always search on the caller thread when calling {{IndexSearcher#search(Query,Collector)}}.
> Let's remove {{IndexSearcher#search(Query,Collector)}}, point our users to {{IndexSearcher#search(Query,CollectorManager)}} instead, and change the factory methods of our main collectors (e.g. {{TopScoreDocCollector#create}}) to return a {{CollectorManager}} instead of a {{Collector}}?
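For illustration, a hedged sketch of the migration this issue asks users to make (the index path, field, and term are hypothetical; `TopScoreDocCollector.createSharedManager` is the 9.x factory that returns a `CollectorManager`):

```java
import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.*;
import org.apache.lucene.store.FSDirectory;

public class CollectorManagerDemo {
  public static void main(String[] args) throws Exception {
    try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("index")))) {
      IndexSearcher searcher = new IndexSearcher(reader);
      Query query = new TermQuery(new Term("field", "value"));

      // Old, deprecated style: searcher.search(query, collector) always ran on
      // the caller thread, even when the searcher was built with an executor.
      // New style: a CollectorManager lets the searcher parallelize per segment.
      CollectorManager<TopScoreDocCollector, TopDocs> manager =
          TopScoreDocCollector.createSharedManager(10, null, Integer.MAX_VALUE);
      TopDocs topDocs = searcher.search(query, manager);
      System.out.println("hits: " + topDocs.totalHits);
    }
  }
}
```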
[jira] [Commented] (LUCENE-10002) Remove IndexSearcher#search(Query,Collector) in favor of IndexSearcher#search(Query,CollectorManager)
[ https://issues.apache.org/jira/browse/LUCENE-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500605#comment-17500605 ]

ASF subversion and git services commented on LUCENE-10002:

Commit bff4246476d860942e1b20dae2540b5caae2eda9 in lucene's branch refs/heads/main from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=bff4246 ]

LUCENE-10002: Fix test failure. When IndexSearcher is created with a threadpool it becomes impossible to assert on the number of evaluated hits overall.
[GitHub] [lucene] romseygeek commented on a change in pull request #722: LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod()
romseygeek commented on a change in pull request #722: URL: https://github.com/apache/lucene/pull/722#discussion_r818496740

File path: lucene/core/src/java/org/apache/lucene/search/FuzzyQuery.java

```diff
@@ -113,11 +138,40 @@ public FuzzyQuery(Term term, int maxEdits) {
     this(term, maxEdits, defaultPrefixLength);
   }
 
+  /**
+   * Calls {@link #FuzzyQuery(Term, int, int, int, boolean,
+   * org.apache.lucene.search.MultiTermQuery.RewriteMethod) FuzzyQuery(term, maxEdits,
+   * defaultPrefixLength, defaultMaxExpansions, defaultTranspositions, rewriteMethod)}.
+   */
+  public FuzzyQuery(Term term, int maxEdits, RewriteMethod rewriteMethod) {
+    this(
+        term,
+        maxEdits,
+        defaultPrefixLength,
+        defaultMaxExpansions,
+        defaultTranspositions,
+        rewriteMethod);
+  }
+
   /** Calls {@link #FuzzyQuery(Term, int) FuzzyQuery(term, defaultMaxEdits)}. */
   public FuzzyQuery(Term term) {
     this(term, defaultMaxEdits);
   }
 
+  /**
+   * Creates a new FuzzyQuery with default max edits, prefix length, max expansions and
+   * transpositions.
+   */
+  public FuzzyQuery(Term term, RewriteMethod rewriteMethod) {
+    this(
+        term,
+        defaultMaxEdits,
+        defaultPrefixLength,
+        defaultMaxExpansions,
+        defaultTranspositions,
+        rewriteMethod);
+  }
```

Review comment: I'm adding about four but there are already about eight :) I quite like the Builder idea but we can argue over that separately; for now I'll just extend the full constructor.
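For context, a small usage sketch of the constructor-based style this change moves toward (the field, term, and choice of rewrite method are arbitrary examples):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.MultiTermQuery;

public class FuzzyRewriteDemo {
  public static void main(String[] args) {
    // The rewrite method is passed at construction time instead of through
    // the deprecated, mutable setRewriteMethod() setter.
    FuzzyQuery q = new FuzzyQuery(
        new Term("title", "lucene"),
        2, // maxEdits
        MultiTermQuery.SCORING_BOOLEAN_REWRITE);
    System.out.println(q);
  }
}
```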
[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500655#comment-17500655 ]

kkewwei commented on LUCENE-10448:

[~vigyas] you missed nothing; there is just a small subtlety: if after 50s a merge wants to write, say, 500MB, the instant rate (50MB/s) is much higher than what we set (10MB/s), and the instant write puts pressure on IO. According to my statistics, the frequency of no-pause bytes is 2%-20%; this proportion of no-pause is especially high when IO pressure is already high, and too high an instant rate leads to even higher IO pressure.

> MergeRateLimiter doesn't always limit instant rate.
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/other
> Affects Versions: 8.11.1
> Reporter: kkewwei
> Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ...
> }
> {code}
> If a segment is being merged and *maybePause* is called at 7:00 (lastNS=7:00) and then again at 7:05, the value of *targetNS = lastNS + (long) (1000000000 * secondsToPause)* will be smaller than *curNS* no matter how large bytes is, so we return -1 and skip the pause.
> I counted the total number of calls to *maybePause* (callTimes), the number of skipped pauses (ignorePauseTimes), and the skipped byte counts (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], [callTimes=857], [ignorePauseTimes=25], [detailBytes(mb) = [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> There are 857 calls to *maybePause*, of which 25 skipped pausing; we can see that the skipped byte counts (such as 0.28125MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long, the pause that should be executed will not be executed.
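To make the failure mode concrete, a small self-contained sketch of the arithmetic above (all numbers hypothetical): once enough wall-clock time has elapsed since lastNS, curPauseNS goes negative and the write proceeds unthrottled regardless of its size.

```java
public class RateLimiterGapDemo {
  public static void main(String[] args) {
    double mbPerSec = 10.0;              // configured throttle
    long lastNS = 0L;                    // previous bookkeeping point, t = 0
    long curNS  = 300_000_000_000L;      // maybePause() called again 300s later
    long bytes  = 500L * 1024 * 1024;    // 500MB about to be written at once

    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec; // 50s at 10MB/s
    long targetNS = lastNS + (long) (1_000_000_000 * secondsToPause);
    long curPauseNS = targetNS - curNS;  // 50s - 300s = -250s

    // Negative: the long idle gap "pays for" the whole burst, so no pause is
    // taken and the 500MB hits the disk at full speed.
    System.out.println("curPauseNS = " + curPauseNS);
  }
}
```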
[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500655#comment-17500655 ]

kkewwei edited comment on LUCENE-10448 at 3/3/22, 10:44 AM, 10:45 AM, and 10:46 AM, making minor wording fixes to the comment quoted above.
[GitHub] [lucene] romseygeek merged pull request #722: LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod()
romseygeek merged pull request #722: URL: https://github.com/apache/lucene/pull/722
[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()
[ https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500662#comment-17500662 ]

ASF subversion and git services commented on LUCENE-10431:

Commit 3f994dec53a1f45c27be9f577a01f20516461b3e in lucene's branch refs/heads/main from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=3f994de ]

LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod() (#722)

Allowing users to mutate MultiTermQuery can give rise to odd bugs, for example in wrapper queries such as BooleanQuery which lazily calculate their hashcodes and then cache the result. This commit deprecates the setRewriteMethod() method on MultiTermQuery, in preparation for removing it entirely, and adds constructor parameters to the various MTQ implementations as the preferred way to set the rewrite method.

> AssertionError in BooleanQuery.hashCode()
>
> Key: LUCENE-10431
> URL: https://issues.apache.org/jira/browse/LUCENE-10431
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 8.11.1
> Reporter: Michael Bien
> Priority: Major
>
> Hello devs,
> the constructor of BooleanQuery can under some circumstances trigger a hash code computation before "clauseSets" is fully filled. Since BooleanClause uses its query field for its hash code too, it can happen that the "wrong" hash code is stored, since adding the clause to the set triggers its hashCode().
> If assertions are enabled, the check in BooleanQuery, which recomputes the hash code, will notice it and throw an error.
> Exception:
> {code:java}
> java.lang.AssertionError
>   at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:614)
>   at java.base/java.util.Objects.hashCode(Objects.java:103)
>   at java.base/java.util.HashMap$Node.hashCode(HashMap.java:298)
>   at java.base/java.util.AbstractMap.hashCode(AbstractMap.java:527)
>   at org.apache.lucene.search.Multiset.hashCode(Multiset.java:119)
>   at java.base/java.util.EnumMap.entryHashCode(EnumMap.java:717)
>   at java.base/java.util.EnumMap.hashCode(EnumMap.java:709)
>   at java.base/java.util.Arrays.hashCode(Arrays.java:4498)
>   at java.base/java.util.Objects.hash(Objects.java:133)
>   at org.apache.lucene.search.BooleanQuery.computeHashCode(BooleanQuery.java:597)
>   at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:611)
>   at java.base/java.util.HashMap.hash(HashMap.java:340)
>   at java.base/java.util.HashMap.put(HashMap.java:612)
>   at org.apache.lucene.search.Multiset.add(Multiset.java:82)
>   at org.apache.lucene.search.BooleanQuery.<init>(BooleanQuery.java:154)
>   at org.apache.lucene.search.BooleanQuery.<init>(BooleanQuery.java:42)
>   at org.apache.lucene.search.BooleanQuery$Builder.build(BooleanQuery.java:133)
> {code}
> I noticed this while trying to upgrade the NetBeans maven indexer modules from lucene 5.x to 8.x: https://github.com/apache/netbeans/pull/3558
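For illustration, a sketch of the pitfall the commit message describes, as it behaves on versions before this change (queries are hypothetical; this reproduces the shape of the bug, not the exact NetBeans trigger):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.MultiTermQuery;
import org.apache.lucene.search.PrefixQuery;

public class MutableHashDemo {
  public static void main(String[] args) {
    PrefixQuery prefix = new PrefixQuery(new Term("f", "abc"));
    BooleanQuery bq = new BooleanQuery.Builder()
        .add(prefix, BooleanClause.Occur.MUST)
        .build();
    int before = bq.hashCode(); // computed lazily and cached inside BooleanQuery

    // Mutating the child changes what its hashCode() returns, so the parent's
    // cached value goes stale; with -ea the recomputation assertion trips.
    prefix.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_REWRITE);
    try {
      System.out.println(before == bq.hashCode());
    } catch (AssertionError e) {
      System.out.println("stale cached hash detected (run with -ea)");
    }
  }
}
```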
[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()
[ https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500678#comment-17500678 ]

ASF subversion and git services commented on LUCENE-10431:

Commit 63454b83ad3cea3bae7c70f4b6276fce60d81672 in lucene's branch refs/heads/branch_9x from Alan Woodward [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=63454b8 ]

LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod() (#722)
[GitHub] [lucene] romseygeek opened a new pull request #727: LUCENE-10431: Don't include rewriteMethod in MTQ hash calculation
romseygeek opened a new pull request #727: URL: https://github.com/apache/lucene/pull/727

BooleanQuery assumes that its children's hashcodes are stable, and has some assertions to this effect. This did not apply to MultiTermQuery, which has a mutable RewriteMethod member variable that was included in its hash calculation. Changing the rewrite method would change the hash, leading to assertion failures being tripped.

This commit removes rewriteMethod from the hash calculation, meaning that the hashcode will be stable even under mutation.
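A minimal sketch of the invariant this PR restores (the class and fields below are hypothetical stand-ins, not Lucene code): hashCode must depend only on state that cannot change after construction.

```java
import java.util.Objects;

// Hypothetical stand-in for the MultiTermQuery change: the mutable rewrite
// method is excluded from hashCode, so caches held by wrapper queries that
// hashed this object earlier remain valid after mutation.
class QuerySketch {
  private final String field;   // immutable: safe to hash
  private Object rewriteMethod; // mutable: excluded from hashCode

  QuerySketch(String field, Object rewriteMethod) {
    this.field = field;
    this.rewriteMethod = rewriteMethod;
  }

  void setRewriteMethod(Object rm) {
    this.rewriteMethod = rm; // does not affect hashCode below
  }

  @Override
  public int hashCode() {
    return Objects.hash(getClass(), field); // stable under mutation
  }
}
```

Note that excluding a field from hashCode while keeping it in equals is legal under the Object contract: equal objects must share a hash, but unequal objects may too.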
[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()
[ https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500706#comment-17500706 ]

Alan Woodward commented on LUCENE-10431:

Follow-up PRs:
* For main, removing the deprecated `setRewriteMethod()` method: https://github.com/apache/lucene/pull/726
* For 9x, removing rewriteMethod from MTQ's hashCode calculation: https://github.com/apache/lucene/pull/727
[GitHub] [lucene] dblock commented on pull request #711: LUCENE-10428: Avoid infinite loop under error conditions.
dblock commented on pull request #711: URL: https://github.com/apache/lucene/pull/711#issuecomment-1058085879

Thanks for fixing the issue @jpountz and everyone for your support and ideas!
[jira] [Comment Edited] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500789#comment-17500789 ]

Daniel Doubrovkine edited comment on LUCENE-10428 at 3/3/22, 2:27 PM:

I agree with closing. The loop can't happen anymore, and we can open a new issue when we see new data pointing to a bug elsewhere.
[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500789#comment-17500789 ]

Daniel Doubrovkine commented on LUCENE-10428:

I agree with the above. The loop can't happen anymore, and we can open a new issue when we see new data pointing to a bug elsewhere.
[jira] [Updated] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify
[ https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated LUCENE-10302:

Attachment: LUCENE_PriorityQueue_Builder_with_heapify.patch

> PriorityQueue: optimize where we collect then iterate by using O(N) heapify
>
> Key: LUCENE-10302
> URL: https://issues.apache.org/jira/browse/LUCENE-10302
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: David Smiley
> Priority: Major
> Attachments: LUCENE_PriorityQueue_Builder_with_heapify.patch
>
> Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java), I got to wondering if there was a faster-than-O(N log N) way of loading a PriorityQueue when we provide a bulk array to initialize the heap. It turns out there is: the JDK's PriorityQueue supports this in its constructors via its heapify() method, referring to "This classic algorithm due to Floyd (1964) is known to be O(size)". There's [another|https://www.geeksforgeeks.org/building-heap-from-array/] that may or may not be the same; I didn't look too closely yet. I see a number of uses of Lucene's PriorityQueue that first collect values and only afterwards do something with the results (typical / unsurprising). This lends itself to a builder pattern that can look similar to LargeNumHitsTopDocsCollector: first use an array like a list, then move over to the PriorityQueue if/when it gets full (it may not).
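For reference, a hedged sketch of Floyd's bottom-up heapify on a 1-based array, in the spirit of Lucene's PriorityQueue layout (a generic min-heap, not Lucene's actual implementation):

```java
public class HeapifyDemo {
  /** Floyd (1964): build a binary min-heap in O(n) by sifting down from the
   *  last internal node; heap[1..size] is used and heap[0] is unused. */
  static void heapify(int[] heap, int size) {
    for (int i = size / 2; i >= 1; i--) {
      siftDown(heap, i, size);
    }
  }

  static void siftDown(int[] heap, int i, int size) {
    while (2 * i <= size) {
      int child = 2 * i;
      if (child < size && heap[child + 1] < heap[child]) {
        child++; // pick the smaller of the two children
      }
      if (heap[i] <= heap[child]) {
        break; // heap property already holds here
      }
      int tmp = heap[i]; heap[i] = heap[child]; heap[child] = tmp;
      i = child;
    }
  }

  public static void main(String[] args) {
    int[] heap = {0 /* unused */, 5, 3, 8, 1, 9, 2};
    heapify(heap, 6);
    System.out.println(heap[1]); // 1: the minimum is at the root
  }
}
```

The construction is O(n) because most nodes sift down only a constant distance, whereas n successive add() calls cost O(n log n).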
[jira] [Commented] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify
[ https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500803#comment-17500803 ]

David Smiley commented on LUCENE-10302:

I attached my WIP as a patch file. Looking back, I started with defining the Builder code. I didn't yet implement "heapify". Nobody calls any of it.
[GitHub] [lucene] mayya-sharipova opened a new pull request #728: LUCENE-10194 Buffer KNN vectors on disk
mayya-sharipova opened a new pull request #728: URL: https://github.com/apache/lucene/pull/728

Currently VectorValuesWriter buffers vectors in memory. The problem is that because multi-dimensional vectors consume a lot of memory, many flushes are triggered, and each flush is very expensive, involving the construction of an HNSW graph. This patch instead buffers KNN vectors on disk, which prevents the construction of new segments solely because RAM is full.

Also unset RAMBufferSizeMB in KnnGraphTester, so it now defaults to 16MB.
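For illustration, a hedged sketch of the buffering idea (class name, file naming, and wiring are assumptions, not the patch): append vectors to a temp file through the Directory API as they arrive, instead of accumulating float[]s on the heap.

```java
import java.io.IOException;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexOutput;

// Illustrative only: each incoming vector is written straight to a temp file,
// so indexing-buffer accounting no longer forces a flush (and an expensive
// HNSW build) every time a modest number of vectors has accumulated in RAM.
class OnDiskVectorBuffer {
  private final IndexOutput out;
  private final int dim;
  private int count;

  OnDiskVectorBuffer(Directory dir, int dim) throws IOException {
    this.out = dir.createTempOutput("buffered_vectors", "tmp", IOContext.DEFAULT);
    this.dim = dim;
  }

  void add(float[] vector) throws IOException {
    for (float v : vector) {
      out.writeInt(Float.floatToIntBits(v)); // 4 bytes per dimension on disk
    }
    count++;
  }

  int size() {
    return count; // vectors buffered so far; streamed back at flush time
  }
}
```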
[GitHub] [lucene] mayya-sharipova commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk
mayya-sharipova commented on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1058148842

I've benchmarked the results with ann-benchmarks on glove-100-angular (M: 16, efConstruction: 100).

- baseline: main branch where we unset RAMBufferSizeMB, which defaults to **16MB**, with segments force-merged to 1.
- candidate: this PR, where RAMBufferSizeMB is similarly set to **16MB**, also force-merged at the end.

**Indexing**
- baseline took 1099 secs, around **18 mins**
- candidate took 586 secs, around **10 mins**
- search performance is the same.

Details on the candidate indexing output:

```txt
IW 0 [2022-03-03T14:30:49.413950Z; main]: init: create=true reader=null ramBufferSizeMB=16.0 maxBufferedDocs=-1
IW 0 [2022-03-03T14:30:49.424202Z; main]: MMapDirectory.UNMAP_SUPPORTED=true
Done indexing 1183514 documents; now flush
IW 0 [2022-03-03T14:30:50.824200Z; main]: now flush at close
IW 0 [2022-03-03T14:30:50.824401Z; main]: start flush: applyAllDeletes=true
IW 0 [2022-03-03T14:30:50.824515Z; main]: index before flush
DW 0 [2022-03-03T14:30:50.824557Z; main]: startFullFlush
DW 0 [2022-03-03T14:30:50.827209Z; main]: anyChanges? numDocsInRam=1183514 deletes=false hasTickets:false pendingChangesInFullFlush: false
DWPT 0 [2022-03-03T14:30:50.831053Z; main]: flush postings as segment _0 numDocs=1183514
HNSW 0 [2022-03-03T14:30:52.334343Z; main]: build graph from 1183514 vectors
...
HNSW 0 [2022-03-03T14:40:31.049504Z; main]: built 118 in 5585/578724 ms
...
IW 0 [2022-03-03T14:40:33.492318Z; main]: 582671 msec to write vectors
IFD 0 [2022-03-03T14:40:34.655718Z; main]: 20 msec to checkpoint
Indexed 1183514 documents in 585s
Force merge index in luceneknn-100-16-100.train-16-100.index
IFD 1 [2022-03-03T14:40:34.671943Z; main]: 0 msec to checkpoint
Built index in 586.944657087326
```

Files in the candidate index:

```txt
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0.fdm
 10080 -rw-r--r--  1 mayyasharipova  staff   4.6M  3 Mar 14:30 _0.fdt
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene90FieldsIndex-doc_ids_0.tmp
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene90FieldsIndexfile_pointers_1.tmp
929304 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vec
924624 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vec_temp_3.tmp
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vem
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vex
953168 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 _0_knn_buffered_vectors_temp_2.tmp
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 write.lock
```

Details on the baseline indexing output:

```txt
Built index in 1099.0846738815308
```

Files in the baseline index:

```txt
drwxr-xr-x  12 mayyasharipova  staff   384B  3 Mar 15:14 .
drwxr-xr-x  42 mayyasharipova  staff   1.3K  3 Mar 15:14 ..
-rw-r--r--   1 mayyasharipova  staff   201B  3 Mar 15:03 _w.fdm
-rw-r--r--   1 mayyasharipova  staff   4.6M  3 Mar 15:03 _w.fdt
-rw-r--r--   1 mayyasharipova  staff   3.5K  3 Mar 15:03 _w.fdx
-rw-r--r--   1 mayyasharipova  staff   192B  3 Mar 15:14 _w.fnm
-rw-r--r--   1 mayyasharipova  staff   532B  3 Mar 15:14 _w.si
-rw-r--r--   1 mayyasharipova  staff   451M  3 Mar 15:14 _w_Lucene91HnswVectorsFormat_0.vec
-rw-r--r--   1 mayyasharipova  staff   309K  3 Mar 15:14 _w_Lucene91HnswVectorsFormat_0.vem
-rw-r--r--   1 mayyasharipova  staff    82M  3 Mar 15:14 _w_Lucene91HnswVectorsFormat_0.vex
-rw-r--r--   1 mayyasharipova  staff   154B  3 Mar 15:14 segments_2
-rw-r--r--   1 mayyasharipova  staff     0B  3 Mar 14:56 write.lock
```
[GitHub] [lucene-solr] thelabdude commented on a change in pull request #2165: SOLR-15059: Improve query performance monitoring
thelabdude commented on a change in pull request #2165: URL: https://github.com/apache/lucene-solr/pull/2165#discussion_r818795935

## File path: solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml

## @@ -315,88 +477,22 @@ node metrics -->

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_client_errors_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(client_errors_total, select(.key | endswith(".clientErrors")), count)

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_errors_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(errors_total, select(.key | endswith(".errors")), count)

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".requestTimes")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_requests_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(requests_total, select(.key | endswith(".local.requestTimes")), count)

Review comment: @dsmiley it's to support the query charts added in this PR, which show core-level query metrics vs. top-level distributed query metrics. I like knowing whether there's an imbalance of core-level query requests going to certain replicas, or whether the load across all of my replicas is balanced. So you're skeptical, but you haven't said why exactly. If you want to change it, then change it.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude commented on a change in pull request #2165: SOLR-15059: Improve query performance monitoring
thelabdude commented on a change in pull request #2165: URL: https://github.com/apache/lucene-solr/pull/2165#discussion_r818797045

## File path: solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml

## @@ -315,88 +477,22 @@ node metrics -->

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_client_errors_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(client_errors_total, select(.key | endswith(".clientErrors")), count)

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_errors_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(errors_total, select(.key | endswith(".errors")), count)

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".requestTimes")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_requests_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(requests_total, select(.key | endswith(".local.requestTimes")), count)

Review comment: FWIW, have you actually looked at the charts added in this PR in Grafana with query load running? If there's a problem there, then let's fix it and move forward, but rehashing old decisions seems unproductive at this point.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify
[ https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500898#comment-17500898 ] Greg Miller commented on LUCENE-10302: -- Thanks [~dsmiley]. I'd sketched something out as well and have it sitting over here on a branch: [https://github.com/gsmiller/lucene/tree/LUCENE-10302-pq-builder-sketch]. I'll have a look at your patch file and see where the ideas are similar/different. [~vigyas] sounds good!

> PriorityQueue: optimize where we collect then iterate by using O(N) heapify
> ---
>
> Key: LUCENE-10302
> URL: https://issues.apache.org/jira/browse/LUCENE-10302
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: David Smiley
> Priority: Major
> Attachments: LUCENE_PriorityQueue_Builder_with_heapify.patch
>
> Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java) I got to wondering if there was a faster-than-O(N*log(N)) way of loading a PriorityQueue when we provide a bulk array to initialize the heap/PriorityQueue. It turns out there is: the JDK's PriorityQueue supports this in its constructors, noting that "This classic algorithm due to Floyd (1964) is known to be O(size)" in its heapify() method. There's [another|https://www.geeksforgeeks.org/building-heap-from-array/] that may or may not be the same; I didn't look too closely yet. I see a number of uses of Lucene's PriorityQueue that first collect values and only after collecting do something with the results (typical / unsurprising). This lends itself to a builder pattern that can look similar to LargeNumHitsTopDocsCollector in terms of first having an array used like a list and then moving over to the PriorityQueue if/when it gets full (it may not).

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
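For context, the O(N) construction being referenced is Floyd's bottom-up heapify. Here is a minimal standalone sketch over a plain int array (a generic 0-based layout, not Lucene's PriorityQueue API): pushing N elements one at a time costs O(N*log(N)) in the worst case, while sifting down the internal nodes bottom-up totals O(N).

{code:java}
/** Builds a binary min-heap in place in O(n), Floyd (1964) style. */
static void heapify(int[] heap) {
  // Leaves are already valid one-element heaps; fix internal nodes bottom-up.
  for (int i = heap.length / 2 - 1; i >= 0; i--) {
    siftDown(heap, i, heap.length);
  }
}

static void siftDown(int[] heap, int i, int size) {
  while (true) {
    int left = 2 * i + 1;
    int right = left + 1;
    int smallest = i;
    if (left < size && heap[left] < heap[smallest]) smallest = left;
    if (right < size && heap[right] < heap[smallest]) smallest = right;
    if (smallest == i) return; // heap property restored at this subtree
    int tmp = heap[i];
    heap[i] = heap[smallest];
    heap[smallest] = tmp;
    i = smallest; // continue sifting the swapped element down
  }
}
{code}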
[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500900#comment-17500900 ] Ankit Jain commented on LUCENE-10428: - I am fine with closing this issue. I will open another one in case I see the queries failing with this error > getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge > leading to busy threads in infinite loop > - > > Key: LUCENE-10428 > URL: https://issues.apache.org/jira/browse/LUCENE-10428 > Project: Lucene - Core > Issue Type: Bug > Components: core/query/scoring, core/search >Reporter: Ankit Jain >Priority: Major > Attachments: Flame_graph.png > > Time Spent: 5.5h > Remaining Estimate: 0h > > Customers complained about high CPU for Elasticsearch cluster in production. > We noticed that few search requests were stuck for long time > {code:java} > % curl -s localhost:9200/_cat/tasks?v > indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205 > AmMLzDQ4RrOJievRDeGFZw:569204 direct1645195007282 14:36:47 6.2h > indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075 > emjWc5bUTG6lgnCGLulq-Q:502074 direct1645195037259 14:37:17 6.2h > indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270 > emjWc5bUTG6lgnCGLulq-Q:583269 direct1645201316981 16:21:56 4.5h > {code} > Flame graphs indicated that CPU time is mostly going into > *getMinCompetitiveScore method in MaxScoreSumPropagator*. After doing some > live JVM debugging found that > org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had > around 4 million invocations every second > Figured out the values of some parameters from live debugging: > {code:java} > minScoreSum = 3.5541441 > minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = > 3.554144322872162 > returnObj scoreSumUpperBound = 3.5541444 > Math.ulp(minScoreSum) = 2.3841858E-7 > {code} > Example code snippet: > {code:java} > double sumOfOtherMaxScores = 3.554144322872162; > double minScoreSum = 3.5541441; > float minScore = (float) (minScoreSum - sumOfOtherMaxScores); > while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) { > minScore -= Math.ulp(minScoreSum); > System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum)); > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-10428. --- Fix Version/s: 9.1 Resolution: Fixed Thanks both! > getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge > leading to busy threads in infinite loop > - > > Key: LUCENE-10428 > URL: https://issues.apache.org/jira/browse/LUCENE-10428 > Project: Lucene - Core > Issue Type: Bug > Components: core/query/scoring, core/search >Reporter: Ankit Jain >Priority: Major > Fix For: 9.1 > > Attachments: Flame_graph.png > > Time Spent: 5.5h > Remaining Estimate: 0h > > Customers complained about high CPU for Elasticsearch cluster in production. > We noticed that few search requests were stuck for long time > {code:java} > % curl -s localhost:9200/_cat/tasks?v > indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205 > AmMLzDQ4RrOJievRDeGFZw:569204 direct1645195007282 14:36:47 6.2h > indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075 > emjWc5bUTG6lgnCGLulq-Q:502074 direct1645195037259 14:37:17 6.2h > indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270 > emjWc5bUTG6lgnCGLulq-Q:583269 direct1645201316981 16:21:56 4.5h > {code} > Flame graphs indicated that CPU time is mostly going into > *getMinCompetitiveScore method in MaxScoreSumPropagator*. After doing some > live JVM debugging found that > org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had > around 4 million invocations every second > Figured out the values of some parameters from live debugging: > {code:java} > minScoreSum = 3.5541441 > minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = > 3.554144322872162 > returnObj scoreSumUpperBound = 3.5541444 > Math.ulp(minScoreSum) = 2.3841858E-7 > {code} > Example code snippet: > {code:java} > double sumOfOtherMaxScores = 3.554144322872162; > double minScoreSum = 3.5541441; > float minScore = (float) (minScoreSum - sumOfOtherMaxScores); > while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) { > minScore -= Math.ulp(minScoreSum); > System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum)); > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10078) Enable merge-on-refresh by default?
[ https://issues.apache.org/jira/browse/LUCENE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500911#comment-17500911 ] Adrien Grand commented on LUCENE-10078: --- LUCENE-10237 introduced a very simple MergeOnFlushMergePolicy which merges together all small segments upon flush. I'd be interested in getting opinions about making it the default implementation for LogMergePolicy and TieredMergePolicy. Any thoughts? cc [~anakot]

> Enable merge-on-refresh by default?
> ---
>
> Key: LUCENE-10078
> URL: https://issues.apache.org/jira/browse/LUCENE-10078
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Priority: Major
>
> This is a spinoff from the discussion in LUCENE-10073.
> The newish merge-on-refresh ([crazy origin story|https://blog.mikemccandless.com/2021/03/open-source-collaboration-or-how-we.html]) feature is a powerful way to reduce searched segment counts, especially helpful for applications using many indexing threads. Such usage will write many tiny segments on each refresh, which could quickly be merged up during the {{refresh}} operation.
> We would have to implement a default for {{findFullFlushMerges}} (LUCENE-10064 is open for this), and then we would need {{IndexWriterConfig.getMaxFullFlushMergeWaitMillis}} to default to a non-zero value (this issue).

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
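For context, a minimal sketch of what opting in to merge-on-refresh looks like on the configuration side today; the 500 ms wait is an arbitrary illustrative value, and the stock merge policies do not yet select merges in findFullFlushMerges, which is exactly what this issue proposes to change.

{code:java}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;

public class MergeOnRefreshConfigSketch {
  static IndexWriterConfig configure() {
    IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
    // A non-zero wait lets a full flush (e.g. on refresh) block briefly while
    // the merges selected by MergePolicy#findFullFlushMerges complete.
    iwc.setMaxFullFlushMergeWaitMillis(500);
    // The merge policy must override findFullFlushMerges to pick up the tiny
    // flushed segments; the stock policies currently select none there.
    iwc.setMergePolicy(new TieredMergePolicy());
    return iwc;
  }
}
{code}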
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2165: SOLR-15059: Improve query performance monitoring
dsmiley commented on a change in pull request #2165: URL: https://github.com/apache/lucene-solr/pull/2165#discussion_r81389

## File path: solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml

## @@ -315,88 +477,22 @@ node metrics -->

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_client_errors_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(client_errors_total, select(.key | endswith(".clientErrors")), count)

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_errors_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(errors_total, select(.key | endswith(".errors")), count)

-.metrics["solr.node"] | to_entries | .[] | select(.key | endswith(".requestTimes")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_requests_total",
-  type : "COUNTER",
-  help : "See following URL: https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(requests_total, select(.key | endswith(".local.requestTimes")), count)

Review comment: I meant to comment on the "totalTime" metric w.r.t. its usefulness; sorry for the confusion. It's some massive number of course... it'd need to be divided by something else to be useful? Also, totalTime is in nanoseconds lately! https://issues.apache.org/jira/browse/SOLR-16073 I understand the overarching objective of top-level vs core-level -- makes sense. I'm a bit unclear on the distinction between the node-level "$jq:node" metrics and the "Local (non-distrib) query metrics", both of which are using ".local.". Re: Grafana, I haven't seen our official one in use live; I use our own/custom one at work.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-10441) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated LUCENE-10441: - Affects Version/s: 8.10

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-10441
> URL: https://issues.apache.org/jira/browse/LUCENE-10441
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 8.10
> Reporter: Peixin Li
> Priority: Major
>
> Hi experts! I am facing an ArrayIndexOutOfBoundsException during indexing and committing documents. The exception gives me no clue about what happened, so I have little information for debugging. Can I get some suggestions about what the cause could be and how to fix this error? I'm using Lucene 8.10.0.
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
> at org.apache.lucene.util.BytesRefHash$1.get(BytesRefHash.java:179)
> at org.apache.lucene.util.StringMSBRadixSorter$1.get(StringMSBRadixSorter.java:42)
> at org.apache.lucene.util.StringMSBRadixSorter$1.setPivot(StringMSBRadixSorter.java:63)
> at org.apache.lucene.util.Sorter.binarySort(Sorter.java:192)
> at org.apache.lucene.util.Sorter.binarySort(Sorter.java:187)
> at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:41)
> at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:83)
> at org.apache.lucene.util.IntroSorter.sort(IntroSorter.java:36)
> at org.apache.lucene.util.MSBRadixSorter.introSort(MSBRadixSorter.java:133)
> at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:126)
> at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:121)
> at org.apache.lucene.util.BytesRefHash.sort(BytesRefHash.java:183)
> at org.apache.lucene.index.SortedSetDocValuesWriter.flush(SortedSetDocValuesWriter.java:171)
> at org.apache.lucene.index.DefaultIndexingChain.writeDocValues(DefaultIndexingChain.java:348)
> at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:228)
> at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:350)
> at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:476)
> at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:656)
> at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3364)
> at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3770)
> at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3728)
> {code}

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10441) ArrayIndexOutOfBoundsException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500933#comment-17500933 ] Christine Poerschke commented on LUCENE-10441: -- line 179 from the stack trace above is https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.10.0/lucene/core/src/java/org/apache/lucene/util/BytesRefHash.java#L179 i.e.

{code}
pool.setBytesRef(scratch, bytesStart[compact[i]]);
{code}

and {{pool}}, as per https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.10.0/lucene/core/src/java/org/apache/lucene/util/BytesRefHash.java#L55, is of {{ByteBlockPool}} type. So this issue could be similar to, or the same as, the LUCENE-8614 issue.

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-10441
> URL: https://issues.apache.org/jira/browse/LUCENE-10441
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Peixin Li
> Priority: Major
>
> Hi experts! I am facing an ArrayIndexOutOfBoundsException during indexing and committing documents. The exception gives me no clue about what happened, so I have little information for debugging. Can I get some suggestions about what the cause could be and how to fix this error? I'm using Lucene 8.10.0.
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
> at org.apache.lucene.util.BytesRefHash$1.get(BytesRefHash.java:179)
> at org.apache.lucene.util.StringMSBRadixSorter$1.get(StringMSBRadixSorter.java:42)
> at org.apache.lucene.util.StringMSBRadixSorter$1.setPivot(StringMSBRadixSorter.java:63)
> at org.apache.lucene.util.Sorter.binarySort(Sorter.java:192)
> at org.apache.lucene.util.Sorter.binarySort(Sorter.java:187)
> at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:41)
> at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:83)
> at org.apache.lucene.util.IntroSorter.sort(IntroSorter.java:36)
> at org.apache.lucene.util.MSBRadixSorter.introSort(MSBRadixSorter.java:133)
> at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:126)
> at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:121)
> at org.apache.lucene.util.BytesRefHash.sort(BytesRefHash.java:183)
> at org.apache.lucene.index.SortedSetDocValuesWriter.flush(SortedSetDocValuesWriter.java:171)
> at org.apache.lucene.index.DefaultIndexingChain.writeDocValues(DefaultIndexingChain.java:348)
> at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:228)
> at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:350)
> at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:476)
> at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:656)
> at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3364)
> at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3770)
> at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3728)
> {code}

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-10454) UnifiedHighlighter can miss terms because of query rewrites
Julie Tibshirani created LUCENE-10454: - Summary: UnifiedHighlighter can miss terms because of query rewrites Key: LUCENE-10454 URL: https://issues.apache.org/jira/browse/LUCENE-10454 Project: Lucene - Core Issue Type: Bug Reporter: Julie Tibshirani Before extracting terms from a query, UnifiedHighlighter rewrites the query using an empty searcher. If the query rewrites to MatchNoDocsQuery when the reader is empty, then the highlighter will fail to extract terms. This is more of an issue now that we rewrite BooleanQuery to MatchNoDocsQuery when any of its required clauses is MatchNoDocsQuery (https://issues.apache.org/jira/browse/LUCENE-10412). I attached a patch showing the problem. This feels like a pretty esoteric issue, but I figured it was worth raising for awareness. I think it only applies when weightMatches=false, which isn't the default. I couldn't find any existing queries in Lucene that would be affected. We ran into it while upgrading Elasticsearch to the latest Lucene snapshot, since a couple custom queries rewrite to MatchNoDocsQuery when the reader is empty. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
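To make the failure mode concrete, here is a minimal sketch (assumptions: Lucene 9.x API; the MatchNoDocsQuery clause stands in for a custom query that only rewrites that way when the reader is empty -- this is not taken from the attached patch):

{code:java}
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchNoDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class EmptyRewriteSketch {
  public static void main(String[] args) throws Exception {
    // UnifiedHighlighter extracts terms after rewriting against an empty
    // searcher, conceptually like this one:
    IndexSearcher emptySearcher = new IndexSearcher(new MultiReader());
    Query query =
        new BooleanQuery.Builder()
            .add(new TermQuery(new Term("body", "highlight")), BooleanClause.Occur.MUST)
            // stand-in for a custom query that rewrites to MatchNoDocsQuery
            // when the reader is empty:
            .add(new MatchNoDocsQuery(), BooleanClause.Occur.MUST)
            .build();
    // Since LUCENE-10412, the whole conjunction rewrites to MatchNoDocsQuery,
    // so the TermQuery's terms are gone before term extraction happens:
    System.out.println(emptySearcher.rewrite(query));
  }
}
{code}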
[jira] [Updated] (LUCENE-10454) UnifiedHighlighter can miss terms because of query rewrites
[ https://issues.apache.org/jira/browse/LUCENE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julie Tibshirani updated LUCENE-10454: -- Attachment: LUCENE-10454.patch > UnifiedHighlighter can miss terms because of query rewrites > --- > > Key: LUCENE-10454 > URL: https://issues.apache.org/jira/browse/LUCENE-10454 > Project: Lucene - Core > Issue Type: Bug >Reporter: Julie Tibshirani >Priority: Minor > Attachments: LUCENE-10454.patch > > > Before extracting terms from a query, UnifiedHighlighter rewrites the query > using an empty searcher. If the query rewrites to MatchNoDocsQuery when the > reader is empty, then the highlighter will fail to extract terms. This is > more of an issue now that we rewrite BooleanQuery to MatchNoDocsQuery when > any of its required clauses is MatchNoDocsQuery > (https://issues.apache.org/jira/browse/LUCENE-10412). I attached a patch > showing the problem. > This feels like a pretty esoteric issue, but I figured it was worth raising > for awareness. I think it only applies when weightMatches=false, which isn't > the default. I couldn't find any existing queries in Lucene that would be > affected. > We ran into it while upgrading Elasticsearch to the latest Lucene snapshot, > since a couple custom queries rewrite to MatchNoDocsQuery when the reader is > empty. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500982#comment-17500982 ] Vigya Sharma commented on LUCENE-10448: ---

The only API which can lead to unexpectedly big write bursts seems to be the {{writeBytes(byte[] b, int offset, int length)}} API in RateLimitedIndexOutput. We could potentially add an upper bound on the bytes that writeBytes attempts to write in one shot in RateLimitedIndexOutput - break the byte array into chunks and check for rate limiting between each chunk. Would that be desirable in the wider Lucene context?

All other APIs check for rate before every write, so the instant burst rate is really determined by the configured {{mbPerSec}} and {{MIN_PAUSE_CHECK_MSEC}} values. I think this is what makes all the burst writes in this JIRA log ~0.28 MB.

> According to my statistics, the frequency of no-pause bytes is [2%-20%],

What is the high instant burst rate you see during these no-pause writes? From the logs above, it should still be less than 11.2 MB/s. Maybe we should look at the burst write rate (in addition to/instead of) the no-pause-write frequency?

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/other
> Affects Versions: 8.11.1
> Reporter: kkewwei
> Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>   ...
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> While a segment is being merged, if *maybePause* is called at 7:00 (lastNS=7:00) and then called again at 7:05, the value of *targetNS = lastNS + (long) (1000000000 * secondsToPause)* must be smaller than *curNS*, so no matter how big the bytes are, we will return -1 and skip the pause.
> I counted the total number of calls to *maybePause* (callTimes), the number of skipped pauses (ignorePauseTimes), and the detailed skipped bytes (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], [callTimes=857], [ignorePauseTimes=25], [detailBytes(mb) = [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, including 25 calls where the pause was skipped; we can see that the skipped byte counts (such as 0.28125 MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long, the pause action that should be executed will not be executed.
> -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
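For readers following the chunking suggestion above, here is a self-contained sketch of the idea; the chunk size and class names are assumptions (not Lucene's), and the pause logic is a simplified stand-in for MergeRateLimiter#maybePause. The chunk is deliberately sized so that, at the 11.2 MB/s rate from the logs, each per-chunk pause exceeds the 2 msec threshold and actually throttles:

{code:java}
import java.util.concurrent.TimeUnit;

public class ChunkedWriteSketch {
  static final int CHUNK_SIZE = 64 * 1024;     // assumed chunk size
  static final double MB_PER_SEC = 11.2;       // throttle rate from the logs above
  static final long MIN_PAUSE_NS = 2_000_000;  // 2 msec, as in MergeRateLimiter
  static long lastNS = System.nanoTime();

  // Simplified stand-in for MergeRateLimiter#maybePause.
  static void maybePause(long bytes) throws InterruptedException {
    double secondsToPause = (bytes / 1024. / 1024.) / MB_PER_SEC;
    long targetNS = lastNS + (long) (1_000_000_000 * secondsToPause);
    long curPauseNS = targetNS - System.nanoTime();
    if (curPauseNS <= MIN_PAUSE_NS) {
      lastNS = System.nanoTime(); // enforce the instant rate, as Lucene does
      return;
    }
    TimeUnit.NANOSECONDS.sleep(curPauseNS);
    lastNS = targetNS;
  }

  // The proposed change: rate-check once per chunk instead of once per call.
  static void writeBytes(byte[] b, int offset, int length) throws InterruptedException {
    while (length > 0) {
      int chunk = Math.min(length, CHUNK_SIZE);
      maybePause(chunk);
      // a real implementation would delegate: out.writeBytes(b, offset, chunk)
      offset += chunk;
      length -= chunk;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    writeBytes(new byte[1 << 20], 0, 1 << 20); // a 1 MB burst, now throttled
  }
}
{code}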
[GitHub] [lucene] dweiss merged pull request #717: LUCENE-10447: always use utf8 for forked process encoding. Use the sa…
dweiss merged pull request #717: URL: https://github.com/apache/lucene/pull/717 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10447) Charset issue in TestScripts#testLukeCanBeLaunched()
[ https://issues.apache.org/jira/browse/LUCENE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501009#comment-17501009 ] ASF subversion and git services commented on LUCENE-10447: --

Commit 81ab1e598f4e2a6f16de312614823c9eccb7abe2 in lucene's branch refs/heads/main from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=81ab1e5 ]

LUCENE-10447: always use utf8 for forked process encoding. Use the sa… (#717)

> Charset issue in TestScripts#testLukeCanBeLaunched()
> 
>
> Key: LUCENE-10447
> URL: https://issues.apache.org/jira/browse/LUCENE-10447
> Project: Lucene - Core
> Issue Type: Bug
> Components: luke
> Reporter: Lu Xugang
> Assignee: Dawid Weiss
> Priority: Minor
> Attachments: 1.png, 2.png, process-10536545874299101128.out
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> When running TestScripts#testLukeCanBeLaunched(), a temp file is created at
> lucene/distribution.tests/build/tmp/tests-tmp/process-*.out. This
> process-*.out file may contain content outside StandardCharsets.US_ASCII,
> depending on the operating system language, and an exception is then thrown
> because the test later reads this temp file with StandardCharsets.US_ASCII.

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
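The gist of the fix, as a hedged standalone sketch (the path below is illustrative, and the actual commit also forces the forked process itself to emit UTF-8 on the write side):

{code:java}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8ProcessOutputSketch {
  public static void main(String[] args) throws Exception {
    Path out = Path.of("build/tmp/tests-tmp/process-123.out"); // illustrative path
    // Decoding with US_ASCII throws java.nio.charset.MalformedInputException
    // as soon as locale-dependent output contains a non-ASCII byte:
    // Files.readString(out, StandardCharsets.US_ASCII);
    // Decoding as UTF-8 (what the commit switches both sides to) handles it:
    System.out.println(Files.readString(out, StandardCharsets.UTF_8));
  }
}
{code}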
[jira] [Resolved] (LUCENE-10447) Charset issue in TestScripts#testLukeCanBeLaunched()
[ https://issues.apache.org/jira/browse/LUCENE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-10447. -- Fix Version/s: 9.1 Resolution: Fixed

> Charset issue in TestScripts#testLukeCanBeLaunched()
> 
>
> Key: LUCENE-10447
> URL: https://issues.apache.org/jira/browse/LUCENE-10447
> Project: Lucene - Core
> Issue Type: Bug
> Components: luke
> Reporter: Lu Xugang
> Assignee: Dawid Weiss
> Priority: Minor
> Fix For: 9.1
>
> Attachments: 1.png, 2.png, process-10536545874299101128.out
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> When running TestScripts#testLukeCanBeLaunched(), a temp file is created at
> lucene/distribution.tests/build/tmp/tests-tmp/process-*.out. This
> process-*.out file may contain content outside StandardCharsets.US_ASCII,
> depending on the operating system language, and an exception is then thrown
> because the test later reads this temp file with StandardCharsets.US_ASCII.

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10447) Charset issue in TestScripts#testLukeCanBeLaunched()
[ https://issues.apache.org/jira/browse/LUCENE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501013#comment-17501013 ] ASF subversion and git services commented on LUCENE-10447: --

Commit 8f92ec157f1a01e7903186da8607e3d1003b1829 in lucene's branch refs/heads/branch_9x from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=8f92ec1 ]

LUCENE-10447: always use utf8 for forked process encoding. Use the sa… (#717)

> Charset issue in TestScripts#testLukeCanBeLaunched()
> 
>
> Key: LUCENE-10447
> URL: https://issues.apache.org/jira/browse/LUCENE-10447
> Project: Lucene - Core
> Issue Type: Bug
> Components: luke
> Reporter: Lu Xugang
> Assignee: Dawid Weiss
> Priority: Minor
> Fix For: 9.1
>
> Attachments: 1.png, 2.png, process-10536545874299101128.out
>
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> When running TestScripts#testLukeCanBeLaunched(), a temp file is created at
> lucene/distribution.tests/build/tmp/tests-tmp/process-*.out. This
> process-*.out file may contain content outside StandardCharsets.US_ASCII,
> depending on the operating system language, and an exception is then thrown
> because the test later reads this temp file with StandardCharsets.US_ASCII.

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] msokolov commented on a change in pull request #718: LUCENE-10444: Support alternate aggregation functions in association facets
msokolov commented on a change in pull request #718: URL: https://github.com/apache/lucene/pull/718#discussion_r819043542

## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/IntTaxonomyFacets.java
## @@ -173,17 +185,17 @@ public FacetResult getTopChildren(int topN, String dim, String... path) throws I

     if (sparseValues != null) {
       for (IntIntCursor c : sparseValues) {
-        int count = c.value;
+        int value = c.value;
         int ord = c.key;
-        if (parents[ord] == dimOrd && count > 0) {
-          totValue += count;
+        if (parents[ord] == dimOrd && value > 0) {
+          aggregatedValue = aggregationFunction.aggregate(aggregatedValue, value);
           childCount++;
-          if (count > bottomValue) {
+          if (value > bottomValue) {

Review comment: I guess we need to ensure that aggregation functions are nondecreasing? I mean `min` wouldn't work very well here

## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FloatTaxonomyFacets.java
## @@ -130,16 +140,16 @@ public FacetResult getTopChildren(int topN, String dim, String... path) throws I

       ord = siblings[ord];
     }
-    if (sumValues == 0) {
+    if (aggregatedValue == 0) {
       return null;
     }
     if (dimConfig.multiValued) {
       if (dimConfig.requireDimCount) {
-        sumValues = values[dimOrd];
+        aggregatedValue = values[dimOrd];
       } else {
         // Our sum'd count is not correct, in general:

Review comment: our "aggregated" count?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
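For context, a compact, hypothetical sketch of the aggregation abstraction under review (names assumed, not the PR's exact API). It also illustrates the nondecreasing concern: SUM over non-negative values and MAX only grow as child values are folded in, while MIN could fall below a bound already used for pruning:

{code:java}
interface AggregationFunction {
  int aggregate(int current, int value);
}

public class AggregationSketch {
  static final AggregationFunction SUM = (current, value) -> current + value;
  static final AggregationFunction MAX = Math::max;
  static final AggregationFunction MIN = Math::min; // problematic for pruning

  public static void main(String[] args) {
    int[] childValues = {3, 1, 4};
    int sum = 0;
    int max = Integer.MIN_VALUE;
    for (int v : childValues) {
      sum = SUM.aggregate(sum, v); // nondecreasing for non-negative values
      max = MAX.aggregate(max, v); // nondecreasing always
    }
    System.out.println(sum + " " + max); // 8 4
  }
}
{code}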
[GitHub] [lucene] msokolov commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk
msokolov commented on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1058487021

This makes sense to me, but I'm a little confused about how you described the test condition:

> baseline: main branch where we unset RAMBufferSizeMB, which defaults to 16Mb with force merge at the end of indexing.

I'm not sure what "unset" means? I guess it goes to the default 16MB, but I assume you must be doing the same in the other test condition? Is there some difference in how the IndexWriter is configured between the two conditions? Or maybe I'm misunderstanding, and you are allowing the entire set of vectors to buffer in RAM (in the baseline case)? But if that's the case, the results are truly astounding! Actually, I would like to understand the difference between that case and buffering on disk. Do we pay any penalty for buffering on disk? How much?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] msokolov commented on a change in pull request #728: LUCENE-10194 Buffer KNN vectors on disk
msokolov commented on a change in pull request #728: URL: https://github.com/apache/lucene/pull/728#discussion_r819056510

## File path: lucene/core/src/java/org/apache/lucene/index/VectorValuesWriter.java
## @@ -20,39 +20,53 @@

 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;
-import java.util.ArrayList;
-import java.util.List;
+import org.apache.lucene.codecs.CodecUtil;
 import org.apache.lucene.codecs.KnnVectorsReader;
 import org.apache.lucene.codecs.KnnVectorsWriter;
 import org.apache.lucene.search.DocIdSetIterator;
 import org.apache.lucene.search.TopDocs;
-import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.IOContext;
+import org.apache.lucene.store.IndexInput;
+import org.apache.lucene.store.IndexOutput;
 import org.apache.lucene.util.Bits;
 import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.util.Counter;
-import org.apache.lucene.util.RamUsageEstimator;
+import org.apache.lucene.util.IOUtils;

 /**
- * Buffers up pending vector value(s) per doc, then flushes when segment flushes.
+ * Buffers up pending vector value per doc on disk until segment flushes.
  *
  * @lucene.experimental
  */
 class VectorValuesWriter {

   private final FieldInfo fieldInfo;
   private final Counter iwBytesUsed;
-  private final List<float[]> vectors = new ArrayList<>();
   private final DocsWithFieldSet docsWithField;
+  private final int dim;
+  private final int byteSize;
+  private final ByteBuffer buffer;
+  private final Directory directory;
+  private final IndexOutput dataOut;

   private int lastDocID = -1;

   private long bytesUsed;

-  VectorValuesWriter(FieldInfo fieldInfo, Counter iwBytesUsed) {
+  VectorValuesWriter(
+      FieldInfo fieldInfo, Counter iwBytesUsed, Directory directory, String segmentName)
+      throws IOException {
     this.fieldInfo = fieldInfo;
     this.iwBytesUsed = iwBytesUsed;
-    this.docsWithField = new DocsWithFieldSet();
-    this.bytesUsed = docsWithField.ramBytesUsed();
+    docsWithField = new DocsWithFieldSet();
+    this.directory = directory;
+    String fileName = segmentName + "_" + fieldInfo.getName() + "_buffered_vectors";

Review comment: I think fields can have pretty much any character in their name. Perhaps instead of using the field name, we should use its number in the filename?

## File path: lucene/core/src/java/org/apache/lucene/index/IndexingChain.java
## @@ -522,6 +526,18 @@ void abort() throws IOException {

   // finalizer will e.g. close any open files in the term vectors writer:

Review comment: maybe this comment should be updated?

## File path: lucene/core/src/java/org/apache/lucene/index/VectorValuesWriter.java
## @@ -20,39 +20,53 @@

 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;
-import java.util.ArrayList;
-import java.util.List;
+import org.apache.lucene.codecs.CodecUtil;
 import org.apache.lucene.codecs.KnnVectorsReader;
 import org.apache.lucene.codecs.KnnVectorsWriter;
 import org.apache.lucene.search.DocIdSetIterator;
 import org.apache.lucene.search.TopDocs;
-import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.IOContext;
+import org.apache.lucene.store.IndexInput;
+import org.apache.lucene.store.IndexOutput;
 import org.apache.lucene.util.Bits;
 import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.util.Counter;
-import org.apache.lucene.util.RamUsageEstimator;
+import org.apache.lucene.util.IOUtils;

 /**
- * Buffers up pending vector value(s) per doc, then flushes when segment flushes.
+ * Buffers up pending vector value per doc on disk until segment flushes.
  *
  * @lucene.experimental
  */
 class VectorValuesWriter {

   private final FieldInfo fieldInfo;
   private final Counter iwBytesUsed;
-  private final List<float[]> vectors = new ArrayList<>();
   private final DocsWithFieldSet docsWithField;
+  private final int dim;
+  private final int byteSize;
+  private final ByteBuffer buffer;
+  private final Directory directory;
+  private final IndexOutput dataOut;

   private int lastDocID = -1;

   private long bytesUsed;

-  VectorValuesWriter(FieldInfo fieldInfo, Counter iwBytesUsed) {
+  VectorValuesWriter(
+      FieldInfo fieldInfo, Counter iwBytesUsed, Directory directory, String segmentName)
+      throws IOException {
     this.fieldInfo = fieldInfo;
     this.iwBytesUsed = iwBytesUsed;
-    this.docsWithField = new DocsWithFieldSet();
-    this.bytesUsed = docsWithField.ramBytesUsed();
+    docsWithField = new DocsWithFieldSet();
+    this.directory = directory;
+    String fileName = segmentName + "_" + fieldInfo.getName() + "_buffered_vectors";
+    dataOut = directory.createTempOutput(fileName, "temp", IOContext.DEFAULT);

Review comment: I'm curious what does `createTempOutput` do? Does it mean if we crash these files w
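Setting Lucene's internals aside, the core buffering trick in the diff above is small. A self-contained sketch (not the PR's code) of spooling float vectors through a reusable little-endian ByteBuffer into a temp file, which keeps heap usage constant regardless of how many vectors arrive:

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class VectorSpoolSketch {
  public static void main(String[] args) throws IOException {
    int dim = 4;
    Path tmp = Files.createTempFile("buffered_vectors", ".tmp");
    // One reusable buffer per field, sized to a single vector:
    ByteBuffer buffer =
        ByteBuffer.allocate(dim * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    try (FileChannel dataOut = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
      float[] vector = {0.1f, 0.2f, 0.3f, 0.4f};
      buffer.clear();
      buffer.asFloatBuffer().put(vector); // encode without per-vector garbage
      dataOut.write(buffer);              // write one fixed-size record
    }
    System.out.println(Files.size(tmp) + " bytes spooled to " + tmp);
  }
}
{code}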
[GitHub] [lucene] msokolov commented on pull request #718: LUCENE-10444: Support alternate aggregation functions in association facets
msokolov commented on pull request #718: URL: https://github.com/apache/lucene/pull/718#issuecomment-1058501380 > It makes sense to me to add this capability. I wonder if the extra abstraction hurts us though in these tight loops summing up values in an array? If it does, we might want to provide a specialization for such loops as well? Oh I missed the benchmarking you did - I guess it was on the backport PR. Looked like no significant change there, good. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it
rmuir commented on pull request #709: URL: https://github.com/apache/lucene/pull/709#issuecomment-1058571689

@iverase @jpountz I "undrafted" the PR and added a commit with the `grow(long)` that just truncates-n-forwards. It seems like the best compromise based on the discussion above. I also made some minor tweaks to the javadoc to try to simplify the explanation of what the grow parameter means. Again, it is kind of academic when you think about it: values larger than `maxDoc >> 8` are not really needed by any code because we switch to the `FixedBitSet`. But the one-liner method doesn't bother me that much; I am just after keeping logic simple and abstractions minimal.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
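The "truncates-n-forwards" one-liner reads roughly like this sketch (class and method names simplified; not the PR's exact code):

{code:java}
class DocIdSetBuilderSketch {
  void grow(int numDocs) {
    // existing int-based growth logic lives here
  }

  // The long overload just saturates and delegates; the clamp is harmless
  // because sizes above maxDoc >> 8 switch to a FixedBitSet anyway.
  void grow(long numDocs) {
    grow((int) Math.min(Integer.MAX_VALUE, numDocs));
  }
}
{code}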
[jira] [Commented] (LUCENE-10430) Literal double quotes cause exception in class RegExp
[ https://issues.apache.org/jira/browse/LUCENE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501072#comment-17501072 ] Holger Rehn commented on LUCENE-10430: --

Thanks for the feedback! But why do I need to escape double quotes? A double quote isn't a regex meta character and doesn't have a special meaning in regular expressions, so it should be treated as a literal, right? And

{code:java}
Pattern.compile( "\"" ).matcher( "\"" ).matches()
{code}

simply returns true, as expected.

Btw. - are you sure escaping double quotes really works as expected? I seem to remember having already tried that without getting the expected result... but I'm not sure.

> Literal double quotes cause exception in class RegExp
> -
>
> Key: LUCENE-10430
> URL: https://issues.apache.org/jira/browse/LUCENE-10430
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/other
> Affects Versions: 9.0
> Reporter: Holger Rehn
> Priority: Major
>
> Class org.apache.lucene.util.automaton.RegExp fails to parse valid regular
> expressions that contain double quotes (except in character classes). This of
> course affects corresponding RegexpQuerys as well.
> Example:
> {code:java}
> Query q = new RegexpQuery( new Term( "field", "a\"b" ) );
> RegExp r = new RegExp( "a\"b" );
> {code}
> Both fail with:
> {code:java}
> java.lang.IllegalArgumentException: expected '"' at position 3
>     at org.apache.lucene.util.automaton.RegExp.parseSimpleExp(RegExp.java:1299)
>     at org.apache.lucene.util.automaton.RegExp.parseCharClassExp(RegExp.java:1229)
>     at org.apache.lucene.util.automaton.RegExp.parseComplExp(RegExp.java:1218)
>     at org.apache.lucene.util.automaton.RegExp.parseRepeatExp(RegExp.java:1192)
>     at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1185)
>     at org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1187)
>     at org.apache.lucene.util.automaton.RegExp.parseInterExp(RegExp.java:1179)
>     at org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:1173)
>     at org.apache.lucene.util.automaton.RegExp.<init>(RegExp.java:496)
>     ...
> {code}
> As a workaround we currently replace all double quotes with a dot.

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
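For anyone following along, a minimal reproduction plus the backslash escape under discussion; whether the escaped form really parses is exactly the open question above, so treat the last line as something to verify rather than settled behavior:

{code:java}
import org.apache.lucene.util.automaton.RegExp;

public class RegexpQuoteRepro {
  public static void main(String[] args) {
    try {
      new RegExp("a\"b"); // throws IllegalArgumentException, as reported
    } catch (IllegalArgumentException e) {
      System.out.println("unescaped: " + e.getMessage());
    }
    // RegExp's grammar treats \x as the literal character x, so this form
    // is expected to parse the quote as a literal:
    System.out.println(new RegExp("a\\\"b").toString());
  }
}
{code}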
[GitHub] [lucene] jtibshirani commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk
jtibshirani commented on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1058674634

Great that we're exploring this! I had a couple of high-level thoughts:
* If a user had 100 vector fields, then we might now have 100+ files being written concurrently, multiplied by the number of segments we're writing at the same time. It seems like this could cause problems -- should we only use this strategy if there are a relatively small number of vector fields? Having 100 vector fields sounds far-fetched, but I could imagine it happening as users experiment with ways to model long text documents.
* It feels wasteful to be writing the vectors to a temp file in `IndexingChain`, then immediately reading and writing them to a temp file again in `Lucene91HnswVectorsWriter`. I wonder if we could make a top-level `OffHeapVectorValues` class that's more broadly visible, so that `Lucene91HnswVectorsWriter` could just check if it's dealing with file-backed vector values and create another one?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jtibshirani edited a comment on pull request #728: LUCENE-10194 Buffer KNN vectors on disk
jtibshirani edited a comment on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1058674634

Great that we're exploring this! I had a couple of high-level thoughts:
* If a user had 100 vector fields, then we might now have 100+ files being written concurrently, multiplied by the number of segments we're writing at the same time. It seems like this could cause problems -- should we only use this strategy if there are a relatively small number of vector fields? Having 100 vector fields sounds far-fetched, but I could imagine it happening as users experiment with ways to model long text documents.
* It feels wasteful to be writing the vectors to a temp file in `IndexingChain`, then immediately reading and writing them to a temp file again in `Lucene91HnswVectorsWriter`. I wonder if we could make a top-level `OffHeapVectorValues` class that's more broadly visible, so that `Lucene91HnswVectorsWriter` could just check if it's dealing with file-backed vector values and avoid creating another one?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify
[ https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501101#comment-17501101 ] Vigya Sharma commented on LUCENE-10302: ---

[~gsmiller] Is the link you pasted working? I get a 404 when I try to open it.

> PriorityQueue: optimize where we collect then iterate by using O(N) heapify
> ---
>
> Key: LUCENE-10302
> URL: https://issues.apache.org/jira/browse/LUCENE-10302
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: David Smiley
> Priority: Major
> Attachments: LUCENE_PriorityQueue_Builder_with_heapify.patch
>
> Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java) I got to wondering
> if there was a faster-than-O(N*log(N)) way of loading a PriorityQueue when we
> provide a bulk array to initialize the heap/PriorityQueue. It turns out there
> is: the JDK's PriorityQueue supports this in its constructors, referring to
> "This classic algorithm due to Floyd (1964) is known to be O(size)" -- the
> heapify() method. There's
> [another|https://www.geeksforgeeks.org/building-heap-from-array/] that may
> or may not be the same; I didn't look too closely yet. I see a number of
> uses of Lucene's PriorityQueue that first collect values and only after
> collecting want to do something with the results (typical / unsurprising).
> This lends itself to a builder pattern that can look similar to
> LargeNumHitsTopDocsCollector in terms of first having an array used like a
> list and then moving over to the PriorityQueue if/when it gets full (it may
> not).

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
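The JDK behavior the issue description refers to, shown concretely: the collection constructor bulk-loads with Floyd's O(n) heapify, whereas n successive offers cost O(n log n) overall:

{code:java}
import java.util.Arrays;
import java.util.PriorityQueue;

public class HeapifyDemo {
  public static void main(String[] args) {
    // Bulk load: PriorityQueue(Collection) heapifies the backing array in
    // O(n) (Floyd, 1964) for plain, unsorted collections.
    PriorityQueue<Integer> bulk = new PriorityQueue<>(Arrays.asList(5, 1, 4, 2, 3));

    // Incremental load: n offers, each O(log n).
    PriorityQueue<Integer> incremental = new PriorityQueue<>();
    for (int v : new int[] {5, 1, 4, 2, 3}) {
      incremental.offer(v);
    }
    System.out.println(bulk.poll() + " " + incremental.poll()); // 1 1
  }
}
{code}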
[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500982#comment-17500982 ] Vigya Sharma edited comment on LUCENE-10448 at 3/4/22, 2:06 AM:

The only API which can lead to unexpectedly big write bursts seems to be the {{writeBytes(byte[] b, int offset, int length)}} API in RateLimitedIndexOutput. We could potentially add an upper bound on the bytes that writeBytes attempts to write in one shot in RateLimitedIndexOutput - break the byte array into chunks and check for rate limiting between each chunk. Would that be desirable in the wider Lucene context?

All other APIs check for rate before every write, so the instant burst rate is really determined by the configured {{mbPerSec}} and {{MIN_PAUSE_CHECK_MSEC}} values. I think this is what makes all the burst writes in this JIRA log ~0.28 MB.

{quote}According to my statistics, the frequency of no-pause bytes is [2%-20%],
{quote}
What is the high instant burst rate you see during these no-pause writes? From the logs above, it should still be less than 11.2 MB/s. Maybe we should look at the burst write rate (in addition to/instead of) the no-pause-write frequency?

was (Author: vigyas):
The only API which can lead to unexpectedly big write bursts seems to be the {{writeBytes(byte[] b, int offset, int length)}} API in RateLimitedIndexOutput. We could potentially add an upper bound on the bytes that writeBytes attempts to write in one shot in RateLimitedIndexOutput - break the byte array into chunks and check for rate limiting between each chunk. Would that be desirable in the wider Lucene context?

All other APIs check for rate before every write, so the instant burst rate is really determined by the configured {{mbPerSec}} and {{MIN_PAUSE_CHECK_MSEC}} values. I think this is what makes all the burst writes in this JIRA log ~0.28 MB.

> According to my statistics, the frequency of no-pause bytes is [2%-20%],

What is the high instant burst rate you see during these no-pause writes? From the logs above, it should still be less than 11.2 MB/s. Maybe we should look at the burst write rate (in addition to/instead of) the no-pause-write frequency?

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/other
> Affects Versions: 8.11.1
> Reporter: kkewwei
> Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>   ...
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> While a segment is being merged, if *maybePause* is called at 7:00 (lastNS=7:00) and then called again at 7:05, the value of *targetNS = lastNS + (long) (1000000000 * secondsToPause)* must be smaller than *curNS*, so no matter how big the bytes are, we will return -1 and skip the pause.
> I counted the total number of calls to *maybePause* (callTimes), the number of skipped pauses (ignorePauseTimes), and the detailed skipped bytes (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], [callTimes=857], [ignorePauseTimes=25], [detailBytes(mb) = [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, including 25 calls where the pause was skipped; we can see that the skipped byte counts (such as 0.28125 MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long, the pause action that should be executed will not be executed.

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
[jira] [Commented] (LUCENE-10162) Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values
[ https://issues.apache.org/jira/browse/LUCENE-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501122#comment-17501122 ] Lu Xugang commented on LUCENE-10162:

Moving the conversation about the current issue from LUCENE-10446 to one place:

{quote}I think that one way we could make the situation better would be by implementing LUCENE-10162 to create fields that index both points and doc values. Then factory methods on these fields would know exactly how the field is indexed and they could make the best decision without having to hurt the API by merging what PointRangeQuery, IndexOrDocValuesQuery and IndexSortSortedNumericDocValuesRangeQuery do:
- If the points index tells us that all docs match, then return DocIdSetIterator#range(0,maxDoc).
- If the field is the primary index sort, then use the index to figure out the min and max values and return the appropriate range.
- Otherwise do what IndexOrDocValuesQuery is doing today.
One thought I had in mind when opening LUCENE-10162 was that we could return queries that can more easily do the right thing because they know both{quote}

> Add IntField, LongField, FloatField and DoubleField classes to index both
> points and doc values
> ---
>
> Key: LUCENE-10162
> URL: https://issues.apache.org/jira/browse/LUCENE-10162
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
>
> Currently we have IntPoint, LongPoint, FloatPoint and DoublePoint on the one
> hand, and NumericDocValuesField and SortedNumericDocValuesField on the other
> hand.
> When we introduced these classes, this distinction made sense: use the
> XXXPoint classes if you want your numeric fields to be searchable and the
> XXXDocValuesField classes if you want your numeric fields to be
> sortable/aggregatable.
> However since then, we introduced logic to take advantage of doc values for
> filtering (IndexOrDocValuesQuery) and enhanced sorting to take advantage of
> the Points index to skip non-competitive documents. So even if you only need
> searching, or if you only need sorting, it's likely a good idea to index both
> with points *and* doc values.
> Could we make this easier on users by having XXXField classes that
> automatically do it as opposed to requiring users to add both an XXXPoint and
> an XXXDocValuesField for every numeric field to their index? This could also
> make consuming these fields easier, e.g. factory methods for range queries
> could automatically use IndexOrDocValuesQuery.

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
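The two-field pattern that the proposed XXXField classes would bundle can already be spelled out by hand today; a sketch of what a hypothetical LongField and its range-query factory might hide:

{code:java}
import org.apache.lucene.document.Document;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.SortedNumericDocValuesField;
import org.apache.lucene.search.IndexOrDocValuesQuery;
import org.apache.lucene.search.Query;

public class DualIndexedLongSketch {
  public static void main(String[] args) {
    // Index the same value twice: as a point (for filtering) and as a doc
    // value (for sorting/aggregations) -- what a LongField would do in one add.
    Document doc = new Document();
    doc.add(new LongPoint("price", 42L));
    doc.add(new SortedNumericDocValuesField("price", 42L));

    // A range-query factory on such a field could return this directly:
    Query range =
        new IndexOrDocValuesQuery(
            LongPoint.newRangeQuery("price", 0L, 100L),
            SortedNumericDocValuesField.newSlowRangeQuery("price", 0L, 100L));
    System.out.println(range);
  }
}
{code}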
[jira] [Updated] (LUCENE-10162) Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values
[ https://issues.apache.org/jira/browse/LUCENE-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lu Xugang updated LUCENE-10162: --- Attachment: LUCENE-10162-1.patch > Add IntField, LongField, FloatField and DoubleField classes to index both > points and doc values > --- > > Key: LUCENE-10162 > URL: https://issues.apache.org/jira/browse/LUCENE-10162 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Attachments: LUCENE-10162-1.patch > > > Currently we have IntPoint, LongPoint, FloatPoint and DoublePoint on the one > hand, and NumericDocValuesField and SortedNumericDocValuesField on the other > hand. > When we introduced these classes, this distinction made sense: use the > XXXPoint classes if you want your numeric fields to be searchable and the > XXXDocValuesField classes if you want your numeric fields to be > sortable/aggregatable. > However since then, we introduced logic to take advantage of doc values for > filtering (IndexOrDocValuesQuery) and enhanced sorting to take advantage of > the Points index to skip non-competitive documents. So even if you only need > searching, or if you only need sorting, it's likely a good idea to index both > with points *and* doc values. > Could we make this easier on users by having XXXField classes that > automatically do it as opposed to requiring users to add both an XXXPoint and > an XXXDocValuesField for every numeric field to their index? This could also > make consuming these fields easier, e.g. factory methods for range queries > could automatically use IndexOrDocValuesQuery. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10162) Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values
[ https://issues.apache.org/jira/browse/LUCENE-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501124#comment-17501124 ] Lu Xugang commented on LUCENE-10162:

The attached quick patch adds a new query named NumericRangeQuery. Its implementation is simply a merge of how PointRangeQuery, IndexOrDocValuesQuery and IndexSortSortedNumericDocValuesRangeQuery supply a Scorer. I just want to confirm whether it is close to LUCENE-10162's intent. [^LUCENE-10162-1.patch]

> Add IntField, LongField, FloatField and DoubleField classes to index both
> points and doc values
> ---
>
> Key: LUCENE-10162
> URL: https://issues.apache.org/jira/browse/LUCENE-10162
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-10162-1.patch
>
> Currently we have IntPoint, LongPoint, FloatPoint and DoublePoint on the one
> hand, and NumericDocValuesField and SortedNumericDocValuesField on the other
> hand.
> When we introduced these classes, this distinction made sense: use the
> XXXPoint classes if you want your numeric fields to be searchable and the
> XXXDocValuesField classes if you want your numeric fields to be
> sortable/aggregatable.
> However since then, we introduced logic to take advantage of doc values for
> filtering (IndexOrDocValuesQuery) and enhanced sorting to take advantage of
> the Points index to skip non-competitive documents. So even if you only need
> searching, or if you only need sorting, it's likely a good idea to index both
> with points *and* doc values.
> Could we make this easier on users by having XXXField classes that
> automatically do it as opposed to requiring users to add both an XXXPoint and
> an XXXDocValuesField for every numeric field to their index? This could also
> make consuming these fields easier, e.g. factory methods for range queries
> could automatically use IndexOrDocValuesQuery.

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk
rmuir commented on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1058815901 Sorry I see it differently. I'm not a fan of IndexWriter handling the temporary files/encoding/decoding data, this seems to be in the wrong place. If IndexWriter shouldn't buffer vectors, then can it simply stream vectors to the codec api? This would be similar to how StoredFields and TermVectors work today (see e.g. StoredFieldsConsumer). The problem is, today we have two cases of IndexWriter behavior 1. Stuff that indexwriter buffers in memory and flushes in batches to the codec (terms, postings, docvalues, etc) 2. Stuff that indexwriter streams directly to the codec (stored fields, term vectors) For our own sanity, let's avoid adding a third case :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk
rmuir commented on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1058821263

I'm suspicious of the reported performance improvement based on looking at your benchmark output; I don't think it's realistic. It looks like nothing else was indexed in any other way (docvalues/postings/etc), and nobody ever called reopen() to force any flushes, so with the benchmark you ran, IW just wrote one big segment, avoiding all merging. So everything looks fantastic on paper, but this isn't realistic. It is easy to run into the same trap when benchmarking e.g. stored fields and other things. But it isn't really a performance improvement.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-10455) IndexSortSortedNumericDocValuesRangeQuery should implement Weight#scorerSupplier(LeafReaderContext)
Lu Xugang created LUCENE-10455: -- Summary: IndexSortSortedNumericDocValuesRangeQuery should implement Weight#scorerSupplier(LeafReaderContext) Key: LUCENE-10455 URL: https://issues.apache.org/jira/browse/LUCENE-10455 Project: Lucene - Core Issue Type: Improvement Reporter: Lu Xugang

IndexOrDocValuesQuery is used as the fallbackQuery of IndexSortSortedNumericDocValuesRangeQuery in Elasticsearch, but when IndexSortSortedNumericDocValuesRangeQuery can't take advantage of the index sort, the fallback query (IndexOrDocValuesQuery) always supplies a Scorer from its index query only, because IndexSortSortedNumericDocValuesRangeQuery does not implement Weight#scorerSupplier(LeafReaderContext).

-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
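A hedged sketch of the missing override (a fragment, imagined inside the query's Weight; member names like fallbackWeight and field are assumptions, and the index-sort branch is elided): return a lazy ScorerSupplier and forward the fallback's supplier when the segment's index sort can't help, so a wrapping IndexOrDocValuesQuery can still compare costs before committing to one scorer:

{code:java}
@Override
public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException {
  Sort indexSort = context.reader().getMetaData().getSort();
  if (indexSort == null
      || indexSort.getSort().length == 0
      || indexSort.getSort()[0].getField().equals(field) == false) {
    // Not sorted on this field: stay lazy by delegating to the fallback's
    // ScorerSupplier instead of eagerly turning it into a Scorer.
    return fallbackWeight.scorerSupplier(context);
  }
  // ... otherwise build the binary-search-based ScorerSupplier here,
  // as scorer() does today (elided in this sketch).
  return null;
}
{code}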
[jira] [Commented] (LUCENE-10078) Enable merge-on-refresh by default?
[ https://issues.apache.org/jira/browse/LUCENE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501193#comment-17501193 ] Anand Kotriwal commented on LUCENE-10078: - I like the idea of adding this feature to LogMergePolicy and TieredMergePolicy. If we agree on making it default I can work on a PR. > Enable merge-on-refresh by default? > --- > > Key: LUCENE-10078 > URL: https://issues.apache.org/jira/browse/LUCENE-10078 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless >Priority: Major > > This is a spinoff from the discussion in LUCENE-10073. > The newish merge-on-refresh ([crazy origin > story|https://blog.mikemccandless.com/2021/03/open-source-collaboration-or-how-we.html]) > feature is a powerful way to reduce searched segment counts, especially > helpful for applications using many indexing threads. Such usage will write > many tiny segments on each refresh, which could quickly be merged up during > the {{refresh}} operation. > We would have to implement a default for {{findFullFlushMerges}} > (LUCENE-10064 is open for this), and then we would need > {{IndexWriterConfig.getMaxFullFlushMergeWaitMillis}} a non-zero value (this > issue). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-10078) Enable merge-on-refresh by default?
[ https://issues.apache.org/jira/browse/LUCENE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501193#comment-17501193 ] Anand Kotriwal edited comment on LUCENE-10078 at 3/4/22, 7:19 AM: -- I like the idea of adding this feature to LogMergePolicy and TieredMergePolicy. If we agree on making it default I can work on a PR that does this. was (Author: anakot): I like the idea of adding this feature to LogMergePolicy and TieredMergePolicy. If we agree on making it default I can work on a PR. > Enable merge-on-refresh by default? > --- > > Key: LUCENE-10078 > URL: https://issues.apache.org/jira/browse/LUCENE-10078 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless >Priority: Major > > This is a spinoff from the discussion in LUCENE-10073. > The newish merge-on-refresh ([crazy origin > story|https://blog.mikemccandless.com/2021/03/open-source-collaboration-or-how-we.html]) > feature is a powerful way to reduce searched segment counts, especially > helpful for applications using many indexing threads. Such usage will write > many tiny segments on each refresh, which could quickly be merged up during > the {{refresh}} operation. > We would have to implement a default for {{findFullFlushMerges}} > (LUCENE-10064 is open for this), and then we would need > {{IndexWriterConfig.getMaxFullFlushMergeWaitMillis}} a non-zero value (this > issue). -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
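For reference, the opt-in wiring that exists today and that this issue proposes to enable by default (the analyzer and merge policy below are placeholders):

{code:java}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;

public class MergeOnRefreshConfigSketch {
  public static void main(String[] args) {
    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    // A non-zero wait enables merge-on-refresh/commit; 0 (the default) disables it.
    config.setMaxFullFlushMergeWaitMillis(500);
    // Also requires a MergePolicy whose findFullFlushMerges selects merges;
    // LUCENE-10064 tracks providing a default implementation for this.
    config.setMergePolicy(new TieredMergePolicy());
    System.out.println(config.getMaxFullFlushMergeWaitMillis());
  }
}
{code}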
[GitHub] [lucene] LuXugang opened a new pull request #729: LUCENE-10455: IndexSortSortedNumericDocValuesRangeQuery should implement Weight#scorerSupplier(LeafReaderContext)
LuXugang opened a new pull request #729: URL: https://github.com/apache/lucene/pull/729

IndexOrDocValuesQuery is used as the fallbackQuery of IndexSortSortedNumericDocValuesRangeQuery in Elasticsearch, but when IndexSortSortedNumericDocValuesRangeQuery can't take advantage of the index sort, the fallback query (IndexOrDocValuesQuery) always supplies a Scorer from its index query only, because IndexSortSortedNumericDocValuesRangeQuery does not implement Weight#scorerSupplier(LeafReaderContext).

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org