[GitHub] [lucene] jpountz opened a new pull request #724: LUCENE-10311: Remove pop_XXX helpers from `BitUtil`.

2022-03-03 Thread GitBox


jpountz opened a new pull request #724:
URL: https://github.com/apache/lucene/pull/724


   As @rmuir noted, it would be just as simple, and create less cognitive
   overhead, to use `Long#bitCount` directly.
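   Using `Long#bitCount` directly is straightforward; a minimal sketch of the
   kind of pop-count loop the removed helpers covered (class and method names
   here are illustrative, not Lucene's):

```java
public class PopCountExample {
  // Counts the set bits across a range of words, the way a BitUtil-style
  // pop helper would, but written directly against Long#bitCount.
  // On modern JVMs, Long.bitCount is intrinsified to a hardware POPCNT
  // instruction where available.
  static long popCount(long[] words, int from, int to) {
    long count = 0;
    for (int i = from; i < to; i++) {
      count += Long.bitCount(words[i]);
    }
    return count;
  }
}
```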
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-10453) Speed up VectorUtil#squareDistance

2022-03-03 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-10453:
-

 Summary: Speed up VectorUtil#squareDistance
 Key: LUCENE-10453
 URL: https://issues.apache.org/jira/browse/LUCENE-10453
 Project: Lucene - Core
  Issue Type: Task
Reporter: Adrien Grand


{{VectorUtil#squareDistance}} is used in conjunction with 
{{VectorSimilarityFunction#EUCLIDEAN}}.

It didn't get as much love as dot products (LUCENE-9837), yet there seems to be 
room for improvement. I wrote a quick JMH benchmark to run some comparisons: 
https://github.com/jpountz/vector-similarity-benchmarks.

While it's not as fast as using the vector API (which makes squareDistance 
computations more than 2x faster), we can get a ~25% speedup by unrolling the 
loop in a similar way to what dot product does.
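The unrolling described above can be sketched as follows; this is a standalone
illustration with multiple independent accumulators, not the actual Lucene
implementation:

```java
public class SquareDistanceExample {
  // Baseline: straightforward squared Euclidean distance.
  static float squareDistance(float[] a, float[] b) {
    float sum = 0;
    for (int i = 0; i < a.length; i++) {
      float diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  // Unrolled variant: four independent accumulators break the dependency
  // chain between additions, exposing more instruction-level parallelism.
  static float squareDistanceUnrolled(float[] a, float[] b) {
    float s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i = 0;
    int upTo = a.length & ~3; // round length down to a multiple of 4
    for (; i < upTo; i += 4) {
      float d0 = a[i] - b[i];
      float d1 = a[i + 1] - b[i + 1];
      float d2 = a[i + 2] - b[i + 2];
      float d3 = a[i + 3] - b[i + 3];
      s0 += d0 * d0;
      s1 += d1 * d1;
      s2 += d2 * d2;
      s3 += d3 * d3;
    }
    for (; i < a.length; i++) { // scalar tail
      float d = a[i] - b[i];
      s0 += d * d;
    }
    return s0 + s1 + s2 + s3;
  }
}
```

Note that float addition is not associative, so the unrolled variant can differ
from the baseline by rounding error on arbitrary inputs.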



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jpountz merged pull request #711: LUCENE-10428: Avoid infinite loop under error conditions.

2022-03-03 Thread GitBox


jpountz merged pull request #711:
URL: https://github.com/apache/lucene/pull/711


   





[GitHub] [lucene] jpountz commented on pull request #711: LUCENE-10428: Avoid infinite loop under error conditions.

2022-03-03 Thread GitBox


jpountz commented on pull request #711:
URL: https://github.com/apache/lucene/pull/711#issuecomment-1057807344


   Thanks all for the feedback and contributions!





[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500591#comment-17500591
 ] 

ASF subversion and git services commented on LUCENE-10428:
--

Commit 44a2a82319c9d375f3399a4b36abf2c3c7e229d6 in lucene's branch 
refs/heads/main from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=44a2a82 ]

LUCENE-10428: Avoid infinite loop under error conditions. (#711)

Co-authored-by: dblock 

> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge 
> leading to busy threads in infinite loop
> -
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring, core/search
>Reporter: Ankit Jain
>Priority: Major
> Attachments: Flame_graph.png
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Customers complained about high CPU usage for an Elasticsearch cluster in 
> production. We noticed that a few search requests were stuck for a long time:
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v   
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205  
> AmMLzDQ4RrOJievRDeGFZw:569204  direct1645195007282 14:36:47  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075  
> emjWc5bUTG6lgnCGLulq-Q:502074  direct1645195037259 14:37:17  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270  
> emjWc5bUTG6lgnCGLulq-Q:583269  direct1645201316981 16:21:56  4.5h
> {code}
> Flame graphs indicated that CPU time is mostly going into the 
> *getMinCompetitiveScore method in MaxScoreSumPropagator*. After some live JVM 
> debugging, we found that the 
> org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had 
> around 4 million invocations every second.
> We figured out the values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 
> 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
> Example code snippet:
> {code:java}
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
> minScore -= Math.ulp(minScoreSum);
> System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}
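The non-termination described above comes down to float rounding: a decrement
far smaller than half an ulp of the running sum cannot change the rounded
float at all, so a loop gated on that rounded value makes no progress. A
minimal standalone illustration of the rounding effect (not the Lucene code):

```java
public class FloatRoundingExample {
  // Returns true if subtracting `decrement` from `base` (in double
  // precision) produces a different value once rounded back to float.
  // Decrements well below half an ulp of `base` are absorbed by rounding,
  // leaving the float unchanged.
  static boolean decrementChangesRoundedValue(float base, double decrement) {
    double adjusted = (double) base - decrement;
    return (float) adjusted != base;
  }
}
```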






[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500594#comment-17500594
 ] 

ASF subversion and git services commented on LUCENE-10428:
--

Commit 0d35e38b93d4c394aee691f308092cb9cfa792a2 in lucene's branch 
refs/heads/branch_9x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=0d35e38 ]

LUCENE-10428: Avoid infinite loop under error conditions. (#711)

Co-authored-by: dblock 







[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-03-03 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500595#comment-17500595
 ] 

Adrien Grand commented on LUCENE-10428:
---

[~akjain] While the underlying issue has not been fixed, the infinite loop 
should be fixed now, so I'm leaning towards marking this issue as resolved and 
opening a new one if/when we get the new IllegalStateException to trip. What do 
you think?







[jira] [Commented] (LUCENE-10002) Remove IndexSearcher#search(Query,Collector) in favor of IndexSearcher#search(Query,CollectorManager)

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500606#comment-17500606
 ] 

ASF subversion and git services commented on LUCENE-10002:
--

Commit 2a6b2ca1435ddb719bf0834d035ec38b7401c931 in lucene's branch 
refs/heads/branch_9x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=2a6b2ca ]

LUCENE-10002: Fix test failure.

When IndexSearcher is created with a threadpool it becomes impossible to assert
on the number of evaluated hits overall.


> Remove IndexSearcher#search(Query,Collector) in favor of 
> IndexSearcher#search(Query,CollectorManager)
> -
>
> Key: LUCENE-10002
> URL: https://issues.apache.org/jira/browse/LUCENE-10002
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> It's a bit trappy that you can create an IndexSearcher with an executor, but 
> that it would always search on the caller thread when calling 
> {{IndexSearcher#search(Query,Collector)}}.
>  Let's remove {{IndexSearcher#search(Query,Collector)}}, point our users to 
> {{IndexSearcher#search(Query,CollectorManager)}} instead, and change factory 
> methods of our main collectors (e.g. {{TopScoreDocCollector#create}}) to 
> return a {{CollectorManager}} instead of a {{Collector}}?
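The CollectorManager contract, i.e. create one collector per slice and reduce
the per-slice results at the end, can be sketched generically. This is plain
Java illustrating the pattern, not Lucene's actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class CollectorManagerExample {

  // One collector per slice; collectors never share mutable state, so each
  // slice can be searched on its own thread.
  static class CountingCollector {
    long count;
    void collect(int doc) { count++; }
  }

  // Plays the role of a CollectorManager: creates a fresh collector per
  // slice, runs the slices on the executor, then reduces the results.
  static long parallelCount(int[][] slices, ExecutorService executor) {
    List<CountingCollector> collectors = new ArrayList<>();
    List<Future<?>> futures = new ArrayList<>();
    for (int[] slice : slices) {
      CountingCollector c = new CountingCollector(); // newCollector()
      collectors.add(c);
      futures.add(executor.submit(() -> {
        for (int doc : slice) {
          c.collect(doc);
        }
      }));
    }
    try {
      for (Future<?> f : futures) {
        f.get(); // wait for every slice; Future.get also publishes the counts
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    }
    long total = 0; // reduce(collectors)
    for (CountingCollector c : collectors) {
      total += c.count;
    }
    return total;
  }
}
```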






[jira] [Commented] (LUCENE-10002) Remove IndexSearcher#search(Query,Collector) in favor of IndexSearcher#search(Query,CollectorManager)

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500605#comment-17500605
 ] 

ASF subversion and git services commented on LUCENE-10002:
--

Commit bff4246476d860942e1b20dae2540b5caae2eda9 in lucene's branch 
refs/heads/main from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=bff4246 ]

LUCENE-10002: Fix test failure.

When IndexSearcher is created with a threadpool it becomes impossible to assert
on the number of evaluated hits overall.








[GitHub] [lucene] romseygeek commented on a change in pull request #722: LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod()

2022-03-03 Thread GitBox


romseygeek commented on a change in pull request #722:
URL: https://github.com/apache/lucene/pull/722#discussion_r818496740



##
File path: lucene/core/src/java/org/apache/lucene/search/FuzzyQuery.java
##
@@ -113,11 +138,40 @@ public FuzzyQuery(Term term, int maxEdits) {
 this(term, maxEdits, defaultPrefixLength);
   }
 
+  /**
+   * Calls {@link #FuzzyQuery(Term, int, int, int, boolean,
+   * org.apache.lucene.search.MultiTermQuery.RewriteMethod) FuzzyQuery(term, 
maxEdits,
+   * defaultPrefixLength, defaultMaxExpansions, defaultTransitions, 
defaultRewriteMethod)}.
+   */
+  public FuzzyQuery(Term term, int maxEdits, RewriteMethod rewriteMethod) {
+this(
+term,
+maxEdits,
+defaultPrefixLength,
+defaultMaxExpansions,
+defaultTranspositions,
+rewriteMethod);
+  }
+
   /** Calls {@link #FuzzyQuery(Term, int) FuzzyQuery(term, defaultMaxEdits)}. 
*/
   public FuzzyQuery(Term term) {
 this(term, defaultMaxEdits);
   }
 
+  /**
+   * Creates a new FuzzyQuery with default max edits, prefix length, max 
expansions and
+   * transpositions
+   */
+  public FuzzyQuery(Term term, RewriteMethod rewriteMethod) {
+this(
+term,
+defaultMaxEdits,
+defaultPrefixLength,
+defaultMaxExpansions,
+defaultTranspositions,
+rewriteMethod);
+  }

Review comment:
   I'm adding about four, but there are already about eight :) I quite like 
the Builder idea, but we can argue over that separately; for now I'll just 
extend the full constructor.
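The Builder idea mentioned in the comment would avoid the telescoping
constructors entirely. A hedged sketch with hypothetical names (not Lucene's
actual API; the default values shown are illustrative):

```java
// Hypothetical Builder for a FuzzyQuery-like class with one required and
// several optional parameters.
public class FuzzyQuerySketch {
  final String term;
  final int maxEdits;
  final int prefixLength;
  final int maxExpansions;
  final boolean transpositions;

  private FuzzyQuerySketch(Builder b) {
    this.term = b.term;
    this.maxEdits = b.maxEdits;
    this.prefixLength = b.prefixLength;
    this.maxExpansions = b.maxExpansions;
    this.transpositions = b.transpositions;
  }

  static class Builder {
    final String term; // the only required parameter
    int maxEdits = 2;
    int prefixLength = 0;
    int maxExpansions = 50;
    boolean transpositions = true;

    Builder(String term) { this.term = term; }
    Builder maxEdits(int v) { this.maxEdits = v; return this; }
    Builder prefixLength(int v) { this.prefixLength = v; return this; }
    Builder maxExpansions(int v) { this.maxExpansions = v; return this; }
    Builder transpositions(boolean v) { this.transpositions = v; return this; }
    FuzzyQuerySketch build() { return new FuzzyQuerySketch(this); }
  }
}
```

Callers then set only the parameters they care about, e.g.
`new FuzzyQuerySketch.Builder("lucene").maxEdits(1).build()`, instead of
picking among a dozen constructor overloads.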







[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-03 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500655#comment-17500655
 ] 

kkewwei commented on LUCENE-10448:
--

[~vigyas] You didn't miss anything; there is one small caveat: if after 50s the 
merge wants to write, say, 500MB, the instant rate (50MB/s) is far above the 
configured limit (10MB/s), and that burst puts pressure on IO. According to my 
statistics, the proportion of bytes written with no pause is 2%-20%, and it is 
especially high when IO pressure is already high; too high an instant rate 
leads to even higher IO pressure.

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (1000000000 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a segment is being merged and *maybePause* is called at 7:00 (so 
> lastNS=7:00), and *maybePause* is then called again at 7:05, the value of 
> *targetNS = lastNS + (long) (1000000000 * secondsToPause)* must be smaller 
> than *curNS* no matter how large bytes is, so we return -1 and skip the 
> pause. 
> I counted the total number of calls to *maybePause* (callTimes), the number 
> of ignored pauses (ignorePauseTimes), and the ignored byte counts 
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, 25 of which skipped the pause; the 
> ignored byte counts (such as 0.28125MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long, 
> the pause action that should be executed will not be executed.
>  
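The skip described above can be reproduced with the arithmetic alone. A
simplified standalone model of the quoted pause computation (not the actual
MergeRateLimiter class):

```java
public class MaybePauseSketch {
  static final long MIN_PAUSE_NS = 2_000_000; // 2 msec, as in the quoted comment

  // Returns the nanoseconds to pause, or -1 when the pause is skipped.
  static long pauseNanos(long lastNS, long curNS, long bytes, double mbPerSec) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long targetNS = lastNS + (long) (1_000_000_000 * secondsToPause);
    long curPauseNS = targetNS - curNS;
    if (curPauseNS <= MIN_PAUSE_NS) {
      return -1; // skipped: the elapsed interval already "paid" for the bytes
    }
    return curPauseNS;
  }
}
```

Writing 500MB at a 10MB/s limit should pause about 50s, but if more than 50s
have already elapsed since `lastNS`, the pause is skipped entirely, which is
the behavior the issue describes.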






[GitHub] [lucene] romseygeek merged pull request #722: LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod()

2022-03-03 Thread GitBox


romseygeek merged pull request #722:
URL: https://github.com/apache/lucene/pull/722


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500662#comment-17500662
 ] 

ASF subversion and git services commented on LUCENE-10431:
--

Commit 3f994dec53a1f45c27be9f577a01f20516461b3e in lucene's branch 
refs/heads/main from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=3f994de ]

LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod() (#722)

Allowing users to mutate MultiTermQuery can give rise to odd bugs, for example
in wrapper queries such as BooleanQuery which lazily calculate their hashcodes
and then cache the result. This commit deprecates the setRewriteMethod()
method on MultiTermQuery, in preparation for removing it entirely, and adds
constructor parameters to the various MTQ implementations as a preferred
way to set the rewrite method.

> AssertionError in BooleanQuery.hashCode()
> -
>
> Key: LUCENE-10431
> URL: https://issues.apache.org/jira/browse/LUCENE-10431
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.11.1
>Reporter: Michael Bien
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Hello devs,
> the constructor of BooleanQuery can under some circumstances trigger a hash 
> code computation before "clauseSets" is fully filled. Since BooleanClause is 
> using its query field for the hash code too, it can happen that the "wrong" 
> hash code is stored, since adding the clause to the set triggers its 
> hashCode().
> If assertions are enabled the check in BooleanQuery, which recomputes the 
> hash code, will notice it and throw an error.
> exception:
> {code:java}
> java.lang.AssertionError
>     at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:614)
>     at java.base/java.util.Objects.hashCode(Objects.java:103)
>     at java.base/java.util.HashMap$Node.hashCode(HashMap.java:298)
>     at java.base/java.util.AbstractMap.hashCode(AbstractMap.java:527)
>     at org.apache.lucene.search.Multiset.hashCode(Multiset.java:119)
>     at java.base/java.util.EnumMap.entryHashCode(EnumMap.java:717)
>     at java.base/java.util.EnumMap.hashCode(EnumMap.java:709)
>     at java.base/java.util.Arrays.hashCode(Arrays.java:4498)
>     at java.base/java.util.Objects.hash(Objects.java:133)
>     at 
> org.apache.lucene.search.BooleanQuery.computeHashCode(BooleanQuery.java:597)
>     at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:611)
>     at java.base/java.util.HashMap.hash(HashMap.java:340)
>     at java.base/java.util.HashMap.put(HashMap.java:612)
>     at org.apache.lucene.search.Multiset.add(Multiset.java:82)
>     at org.apache.lucene.search.BooleanQuery.(BooleanQuery.java:154)
>     at org.apache.lucene.search.BooleanQuery.(BooleanQuery.java:42)
>     at 
> org.apache.lucene.search.BooleanQuery$Builder.build(BooleanQuery.java:133)
> {code}
> I noticed this while trying to upgrade the NetBeans maven indexer modules 
> from lucene 5.x to 8.x https://github.com/apache/netbeans/pull/3558
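The failure mode generalizes beyond Lucene: any hash-based collection silently loses keys whose hashCode() changes after insertion, which is exactly why a lazily cached hash over a mutable MultiTermQuery is dangerous. A minimal self-contained sketch of the hazard (illustrative classes, not the Lucene code):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class MutableKeyHazard {
    // A key whose hashCode depends on a mutable field, like MTQ's rewriteMethod did.
    static final class Key {
        int state;
        Key(int state) { this.state = state; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).state == state;
        }
        @Override public int hashCode() { return Objects.hash(state); }
    }

    public static void main(String[] args) {
        Set<Key> set = new HashSet<>();
        Key key = new Key(1);
        set.add(key);   // stored in the bucket computed from hash(state=1)
        key.state = 2;  // mutation changes the hash...
        // ...so lookups now probe a different bucket and miss the entry:
        System.out.println(set.contains(key)); // false
    }
}
```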







[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500678#comment-17500678
 ] 

ASF subversion and git services commented on LUCENE-10431:
--

Commit 63454b83ad3cea3bae7c70f4b6276fce60d81672 in lucene's branch 
refs/heads/branch_9x from Alan Woodward
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=63454b8 ]

LUCENE-10431: Deprecate MultiTermQuery.setRewriteMethod() (#722)

Allowing users to mutate MultiTermQuery can give rise to odd bugs, for example
in wrapper queries such as BooleanQuery which lazily calculate their hashcodes
and then cache the result. This commit deprecates the setRewriteMethod()
method on MultiTermQuery, in preparation for removing it entirely, and adds
constructor parameters to the various MTQ implementations as a preferred
way to set the rewrite method.


> AssertionError in BooleanQuery.hashCode()
> -
>
> Key: LUCENE-10431
> URL: https://issues.apache.org/jira/browse/LUCENE-10431
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.11.1
>Reporter: Michael Bien
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Hello devs,
> the constructor of BooleanQuery can under some circumstances trigger a hash 
> code computation before "clauseSets" is fully filled. Since BooleanClause is 
> using its query field for the hash code too, it can happen that the "wrong" 
> hash code is stored, since adding the clause to the set triggers its 
> hashCode().
> If assertions are enabled the check in BooleanQuery, which recomputes the 
> hash code, will notice it and throw an error.
> exception:
> {code:java}
> java.lang.AssertionError
>     at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:614)
>     at java.base/java.util.Objects.hashCode(Objects.java:103)
>     at java.base/java.util.HashMap$Node.hashCode(HashMap.java:298)
>     at java.base/java.util.AbstractMap.hashCode(AbstractMap.java:527)
>     at org.apache.lucene.search.Multiset.hashCode(Multiset.java:119)
>     at java.base/java.util.EnumMap.entryHashCode(EnumMap.java:717)
>     at java.base/java.util.EnumMap.hashCode(EnumMap.java:709)
>     at java.base/java.util.Arrays.hashCode(Arrays.java:4498)
>     at java.base/java.util.Objects.hash(Objects.java:133)
>     at 
> org.apache.lucene.search.BooleanQuery.computeHashCode(BooleanQuery.java:597)
>     at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:611)
>     at java.base/java.util.HashMap.hash(HashMap.java:340)
>     at java.base/java.util.HashMap.put(HashMap.java:612)
>     at org.apache.lucene.search.Multiset.add(Multiset.java:82)
>     at org.apache.lucene.search.BooleanQuery.(BooleanQuery.java:154)
>     at org.apache.lucene.search.BooleanQuery.(BooleanQuery.java:42)
>     at 
> org.apache.lucene.search.BooleanQuery$Builder.build(BooleanQuery.java:133)
> {code}
> I noticed this while trying to upgrade the NetBeans maven indexer modules 
> from lucene 5.x to 8.x https://github.com/apache/netbeans/pull/3558







[GitHub] [lucene] romseygeek opened a new pull request #727: LUCENE-10431: Don't include rewriteMethod in MTQ hash calculation

2022-03-03 Thread GitBox


romseygeek opened a new pull request #727:
URL: https://github.com/apache/lucene/pull/727


   BooleanQuery assumes that its children's hashcodes are stable, and has some
   assertions to this effect.  This did not apply to MultiTermQuery, which has 
a 
   mutable RewriteMethod member variable that was included in its hash 
calculation.
   Changing the rewrite method would change the hash, leading to assertion 
failures
   being tripped.  This commit removes rewriteMethod from the hash calculation,
   meaning that the hashcode will be stable even under mutation.
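A sketch of the approach: keep the mutable field out of hashCode() so the hash stays stable under mutation. Equal objects still hash equally, and the hashCode contract permits unequal objects to collide, so dropping a field from the hash is always safe. (Class and field names below are illustrative, not the actual MultiTermQuery code.)

```java
public class StableHashSketch {
    // The mutable field participates in equals() but not in hashCode(),
    // so the hash stays stable even if the field is mutated after insertion.
    static final class Query {
        final String field;
        String rewriteMethod; // mutable
        Query(String field, String rewriteMethod) {
            this.field = field;
            this.rewriteMethod = rewriteMethod;
        }
        @Override public boolean equals(Object o) {
            return o instanceof Query
                && ((Query) o).field.equals(field)
                && ((Query) o).rewriteMethod.equals(rewriteMethod);
        }
        // Only the immutable part contributes to the hash.
        @Override public int hashCode() { return field.hashCode(); }
    }

    public static void main(String[] args) {
        Query q = new Query("title", "CONSTANT_SCORE");
        int before = q.hashCode();
        q.rewriteMethod = "SCORING_BOOLEAN";
        System.out.println(before == q.hashCode()); // true: hash unaffected by mutation
    }
}
```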








[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()

2022-03-03 Thread Alan Woodward (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500706#comment-17500706
 ] 

Alan Woodward commented on LUCENE-10431:


Follow up PRs:
 * For main, removing the deprecated `setRewriteMethod()` method: 
[https://github.com/apache/lucene/pull/726]
 * For 9x, removing rewriteMethod from MTQ's hashCode calculation: 
https://github.com/apache/lucene/pull/727

> AssertionError in BooleanQuery.hashCode()
> -
>
> Key: LUCENE-10431
> URL: https://issues.apache.org/jira/browse/LUCENE-10431
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.11.1
>Reporter: Michael Bien
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Hello devs,
> the constructor of BooleanQuery can under some circumstances trigger a hash 
> code computation before "clauseSets" is fully filled. Since BooleanClause is 
> using its query field for the hash code too, it can happen that the "wrong" 
> hash code is stored, since adding the clause to the set triggers its 
> hashCode().
> If assertions are enabled the check in BooleanQuery, which recomputes the 
> hash code, will notice it and throw an error.
> exception:
> {code:java}
> java.lang.AssertionError
>     at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:614)
>     at java.base/java.util.Objects.hashCode(Objects.java:103)
>     at java.base/java.util.HashMap$Node.hashCode(HashMap.java:298)
>     at java.base/java.util.AbstractMap.hashCode(AbstractMap.java:527)
>     at org.apache.lucene.search.Multiset.hashCode(Multiset.java:119)
>     at java.base/java.util.EnumMap.entryHashCode(EnumMap.java:717)
>     at java.base/java.util.EnumMap.hashCode(EnumMap.java:709)
>     at java.base/java.util.Arrays.hashCode(Arrays.java:4498)
>     at java.base/java.util.Objects.hash(Objects.java:133)
>     at 
> org.apache.lucene.search.BooleanQuery.computeHashCode(BooleanQuery.java:597)
>     at org.apache.lucene.search.BooleanQuery.hashCode(BooleanQuery.java:611)
>     at java.base/java.util.HashMap.hash(HashMap.java:340)
>     at java.base/java.util.HashMap.put(HashMap.java:612)
>     at org.apache.lucene.search.Multiset.add(Multiset.java:82)
>     at org.apache.lucene.search.BooleanQuery.(BooleanQuery.java:154)
>     at org.apache.lucene.search.BooleanQuery.(BooleanQuery.java:42)
>     at 
> org.apache.lucene.search.BooleanQuery$Builder.build(BooleanQuery.java:133)
> {code}
> I noticed this while trying to upgrade the NetBeans maven indexer modules 
> from lucene 5.x to 8.x https://github.com/apache/netbeans/pull/3558







[GitHub] [lucene] dblock commented on pull request #711: LUCENE-10428: Avoid infinite loop under error conditions.

2022-03-03 Thread GitBox


dblock commented on pull request #711:
URL: https://github.com/apache/lucene/pull/711#issuecomment-1058085879


   Thanks for fixing the issue @jpountz and everyone for your support and ideas!








[jira] [Comment Edited] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-03-03 Thread Daniel Doubrovkine (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500789#comment-17500789
 ] 

Daniel Doubrovkine edited comment on LUCENE-10428 at 3/3/22, 2:27 PM:
--

I agree with closing. The loop can't happen anymore, and we can open a new 
issue when we see new data pointing to a bug elsewhere.


was (Author: dblock):
I agree with the above. The loop can't happen anymore, and we can open a new 
issue when we see new data pointing to a bug elsewhere.

> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge 
> leading to busy threads in infinite loop
> -
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring, core/search
>Reporter: Ankit Jain
>Priority: Major
> Attachments: Flame_graph.png
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Customers complained about high CPU for an Elasticsearch cluster in production. 
> We noticed that a few search requests were stuck for a long time
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v   
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205  
> AmMLzDQ4RrOJievRDeGFZw:569204  direct1645195007282 14:36:47  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075  
> emjWc5bUTG6lgnCGLulq-Q:502074  direct1645195037259 14:37:17  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270  
> emjWc5bUTG6lgnCGLulq-Q:583269  direct1645201316981 16:21:56  4.5h
> {code}
> Flame graphs indicated that CPU time is mostly going into 
> *getMinCompetitiveScore method in MaxScoreSumPropagator*. After doing some 
> live JVM debugging, we found that the 
> org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had 
> around 4 million invocations per second.
> Figured out the values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 
> 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
> Example code snippet:
> {code:java}
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
> minScore -= Math.ulp(minScoreSum);
> System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}
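The non-termination comes down to float rounding: narrowing the double sum to float can round *upward* past the target value, and decrementing by one float ULP may not pull the rounded sum back below it. The two ingredients can be checked in isolation (a self-contained illustration, not the MaxScoreSumPropagator code):

```java
public class FloatRoundingSketch {
    public static void main(String[] args) {
        // 1) Narrowing a double to float can round up past the double value:
        double d = 0.1;
        System.out.println((double) (float) d > d); // true

        // 2) The float spacing near 3.55 matches the value observed in the issue
        //    (adjacent floats in [2, 4) are 2^-22 apart):
        float minScoreSum = 3.5541441f;
        System.out.println(Math.ulp(minScoreSum)); // 2.3841858E-7
    }
}
```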







[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-03-03 Thread Daniel Doubrovkine (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500789#comment-17500789
 ] 

Daniel Doubrovkine commented on LUCENE-10428:
-

I agree with the above. The loop can't happen anymore, and we can open a new 
issue when we see new data pointing to a bug elsewhere.

> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge 
> leading to busy threads in infinite loop
> -
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring, core/search
>Reporter: Ankit Jain
>Priority: Major
> Attachments: Flame_graph.png
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Customers complained about high CPU for an Elasticsearch cluster in production. 
> We noticed that a few search requests were stuck for a long time
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v   
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205  
> AmMLzDQ4RrOJievRDeGFZw:569204  direct1645195007282 14:36:47  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075  
> emjWc5bUTG6lgnCGLulq-Q:502074  direct1645195037259 14:37:17  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270  
> emjWc5bUTG6lgnCGLulq-Q:583269  direct1645201316981 16:21:56  4.5h
> {code}
> Flame graphs indicated that CPU time is mostly going into 
> *getMinCompetitiveScore method in MaxScoreSumPropagator*. After doing some 
> live JVM debugging, we found that the 
> org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had 
> around 4 million invocations per second.
> Figured out the values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 
> 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
> Example code snippet:
> {code:java}
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
> minScore -= Math.ulp(minScoreSum);
> System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}







[jira] [Updated] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify

2022-03-03 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-10302:
--
Attachment: LUCENE_PriorityQueue_Builder_with_heapify.patch

> PriorityQueue: optimize where we collect then iterate by using O(N) heapify
> ---
>
> Key: LUCENE-10302
> URL: https://issues.apache.org/jira/browse/LUCENE-10302
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Major
> Attachments: LUCENE_PriorityQueue_Builder_with_heapify.patch
>
>
> Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java ) I got to 
> wondering if there was a faster-than-O(N*log(N)) way of loading a PriorityQueue 
> when we provide a bulk array to initialize the heap/PriorityQueue.  It turns 
> out there is: the JDK's PriorityQueue supports this in its constructors, 
> referring to "This classic algorithm due to Floyd (1964) is known to be 
> O(size)" -- heapify() method.  There's 
> [another|https://www.geeksforgeeks.org/building-heap-from-array/]  that may 
> or may not be the same; I didn't look too closely yet.  I see a number of 
> uses of Lucene's PriorityQueue that first collects values and only after 
> collecting want to do something with the results (typical / unsurprising).  
> This lends itself to a builder pattern that can look similar to 
> LargeNumHitsTopDocsCollector in terms of first having an array used like a 
> list and then move over to the PriorityQueue if/when it gets full (it may 
> not).
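The JDK constructor referred to above is easy to exercise: java.util.PriorityQueue's collection-taking constructor heapifies the bulk input with Floyd's O(N) algorithm instead of performing N O(log N) insertions. A small sketch of the two initialization paths:

```java
import java.util.List;
import java.util.PriorityQueue;

public class HeapifySketch {
    public static void main(String[] args) {
        // Bulk initialization: the constructor heapifies the whole collection
        // in O(N) (Floyd's bottom-up sift-down).
        PriorityQueue<Integer> bulk = new PriorityQueue<>(List.of(9, 4, 7, 1, 8, 3));

        // Element-by-element initialization: N sift-up insertions, O(N log N).
        PriorityQueue<Integer> oneByOne = new PriorityQueue<>();
        for (int v : List.of(9, 4, 7, 1, 8, 3)) {
            oneByOne.offer(v);
        }

        // Both yield the same min-heap ordering on extraction.
        System.out.println(bulk.peek());     // 1
        System.out.println(oneByOne.peek()); // 1
    }
}
```

A builder as proposed in the issue could collect into a plain array and hand it to such a heapify step only once collection finishes, avoiding the per-insert log factor.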







[jira] [Commented] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify

2022-03-03 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500803#comment-17500803
 ] 

David Smiley commented on LUCENE-10302:
---

I attached my WIP as a patch file.  Looking back, I started with defining the 
Builder code.  I didn't yet implement "heapify".   Nobody calls any of it.

> PriorityQueue: optimize where we collect then iterate by using O(N) heapify
> ---
>
> Key: LUCENE-10302
> URL: https://issues.apache.org/jira/browse/LUCENE-10302
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Major
> Attachments: LUCENE_PriorityQueue_Builder_with_heapify.patch
>
>
> Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java ) I got to 
> wondering if there was a faster-than-O(N*log(N)) way of loading a PriorityQueue 
> when we provide a bulk array to initialize the heap/PriorityQueue.  It turns 
> out there is: the JDK's PriorityQueue supports this in its constructors, 
> referring to "This classic algorithm due to Floyd (1964) is known to be 
> O(size)" -- heapify() method.  There's 
> [another|https://www.geeksforgeeks.org/building-heap-from-array/]  that may 
> or may not be the same; I didn't look too closely yet.  I see a number of 
> uses of Lucene's PriorityQueue that first collects values and only after 
> collecting want to do something with the results (typical / unsurprising).  
> This lends itself to a builder pattern that can look similar to 
> LargeNumHitsTopDocsCollector in terms of first having an array used like a 
> list and then move over to the PriorityQueue if/when it gets full (it may 
> not).







[GitHub] [lucene] mayya-sharipova opened a new pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


mayya-sharipova opened a new pull request #728:
URL: https://github.com/apache/lucene/pull/728


   Currently VectorValuesWriter buffers vectors in memory.
   The problem is that because multi-dimensional vectors consume a lot of memory,
   many flushes are triggered, and each flush is very expensive, involving
   the construction of an HNSW graph.
   
   This patch instead buffers KNN vectors on disk, which prevents the
   construction of new segments merely because RAM is full.
   
   Also unset RAMBufferSizeMB in KnnGraphTester; it now defaults to 16MB.
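The buffering idea can be sketched with plain NIO: append raw vector bytes to a temp file during indexing, then read them back positionally when building the graph at flush time. (Illustrative only; the class name and the little-endian float encoding are assumptions, not the PR's actual on-disk format.)

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DiskVectorBuffer implements AutoCloseable {
    private final Path tmp;
    private final FileChannel channel;
    private final int dim;

    DiskVectorBuffer(Path tmp, int dim) throws IOException {
        this.tmp = tmp;
        this.channel = FileChannel.open(
            tmp, StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE);
        this.dim = dim;
    }

    /** Appends one vector to the temp file instead of holding it on the Java heap. */
    void add(float[] vector) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(dim * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        buf.asFloatBuffer().put(vector);
        channel.write(buf, channel.size());
    }

    /** Reads vector i back, e.g. to feed HNSW graph construction at flush time. */
    float[] get(int i) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(dim * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        channel.read(buf, (long) i * dim * Float.BYTES);
        buf.flip();
        float[] v = new float[dim];
        buf.asFloatBuffer().get(v);
        return v;
    }

    @Override
    public void close() throws IOException {
        channel.close();
        Files.deleteIfExists(tmp); // temp buffer is discarded once the segment is written
    }
}
```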








[GitHub] [lucene] mayya-sharipova commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


mayya-sharipova commented on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058148842


   I've benchmarked the results with ann-benchmarks on glove-100-angular (M:16, 
 efConstruction:100)
   
   - baseline: main branch where we unset RAMBufferSizeMB, which defaults to 
**16Mb** with segments force merged to 1.
   - candidate: this PR, where RAMBufferSizeMB similarly is set to **16Mb**, 
also force merge at the end.
   
   **Indexing**
   - baseline took 1099 secs, around **18 mins**
   - candidate took 586 secs, around **10 mins**
   - search performance is the same.
   
   
   
   
Details on the search performance 
   
   
   
   
Details on the candidate 
   
   Indexing output
   
```txt
   IW 0 [2022-03-03T14:30:49.413950Z; main]: init: create=true reader=null
  ramBufferSizeMB=16.0
   maxBufferedDocs=-1
   IW 0 [2022-03-03T14:30:49.424202Z; main]: MMapDirectory.UNMAP_SUPPORTED=true
   Done indexing 1183514 documents; now flush
   IW 0 [2022-03-03T14:30:50.824200Z; main]: now flush at close
   IW 0 [2022-03-03T14:30:50.824401Z; main]:   start flush: applyAllDeletes=true
   IW 0 [2022-03-03T14:30:50.824515Z; main]:   index before flush
   DW 0 [2022-03-03T14:30:50.824557Z; main]: startFullFlush
   DW 0 [2022-03-03T14:30:50.827209Z; main]: anyChanges? numDocsInRam=1183514 
deletes=false hasTickets:false pendingChangesInFullFlush: false
   DWPT 0 [2022-03-03T14:30:50.831053Z; main]: flush postings as segment _0 
numDocs=1183514
   HNSW 0 [2022-03-03T14:30:52.334343Z; main]: build graph from 1183514 vectors
   ...
   HNSW 0 [2022-03-03T14:40:31.049504Z; main]: built 118 in 5585/578724 ms
   ...
   IW 0 [2022-03-03T14:40:33.492318Z; main]: 582671 msec to write vectors
   IFD 0 [2022-03-03T14:40:34.655718Z; main]: 20 msec to checkpoint
   Indexed 1183514 documents in 585s
   Force merge index in luceneknn-100-16-100.train-16-100.index
   IFD 1 [2022-03-03T14:40:34.671943Z; main]: 0 msec to checkpoint
   Built index in 586.944657087326
   ```
   
   **Files in the index**
   
   ```txt
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 _0.fdm
10080 -rw-r--r--  1 mayyasharipova  staff   4.6M  3 Mar 14:30 _0.fdt
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene90FieldsIndex-doc_ids_0.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene90FieldsIndexfile_pointers_1.tmp
   929304 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vec
   924624 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vec_temp_3.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vem
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vex
   953168 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_knn_buffered_vectors_temp_2.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 write.lock
   ```
   
   
   
   
Details on the baseline 
   
   Indexing output
   
```txt
   Built index in 1099.0846738815308
   ```
   
   **Files in the index**
   
   ```txt
   drwxr-xr-x  12 mayyasharipova  staff   384B  3 Mar 15:14 .
   drwxr-xr-x  42 mayyasharipova  staff   1.3K  3 Mar 15:14 ..
   -rw-r--r--   1 mayyasharipova  staff   201B  3 Mar 15:03 _w.fdm
   -rw-r--r--   1 mayyasharipova  staff   4.6M  3 Mar 15:03 _w.fdt
   -rw-r--r--   1 mayyasharipova  staff   3.5K  3 Mar 15:03 _w.fdx
   -rw-r--r--   1 mayyasharipova  staff   192B  3 Mar 15:14 _w.fnm
   -rw-r--r--   1 mayyasharipova  staff   532B  3 Mar 15:14 _w.si
   -rw-r--r--   1 mayyasharipova  staff   451M  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vec
   -rw-r--r--   1 mayyasharipova  staff   309K  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vem
   -rw-r--r--   1 mayyasharipova  staff82M  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vex
   -rw-r--r--   1 mayyasharipova  staff   154B  3 Mar 15:14 segments_2
   -rw-r--r--   1 mayyasharipova  staff 0B  3 Mar 14:56 write.lock
   ```
   








[GitHub] [lucene] mayya-sharipova edited a comment on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


mayya-sharipova edited a comment on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058148842


   I've benchmarked the results with ann-benchmarks on glove-100-angular (M:16, 
 efConstruction:100)
   
   - baseline: main branch where we unset RAMBufferSizeMB, which defaults to 
**16Mb** with segments force merged to 1.
   - candidate: this PR, where RAMBufferSizeMB similarly is set to **16Mb**, 
also force merge at the end.
   
   **Indexing**
   - baseline took 1099 secs, around **18 mins**
   - candidate took 586 secs, around **10 mins**
   - search performance is the same.
   
   
   
   
Details on the search performance 
   
   
   
   
Details on the candidate 
   
   Indexing output
   
```txt
   IW 0 [2022-03-03T14:30:49.413950Z; main]: init: create=true reader=null
  ramBufferSizeMB=16.0
   maxBufferedDocs=-1
   IW 0 [2022-03-03T14:30:49.424202Z; main]: MMapDirectory.UNMAP_SUPPORTED=true
   Done indexing 1183514 documents; now flush
   IW 0 [2022-03-03T14:30:50.824200Z; main]: now flush at close
   IW 0 [2022-03-03T14:30:50.824401Z; main]:   start flush: applyAllDeletes=true
   IW 0 [2022-03-03T14:30:50.824515Z; main]:   index before flush
   DW 0 [2022-03-03T14:30:50.824557Z; main]: startFullFlush
   DW 0 [2022-03-03T14:30:50.827209Z; main]: anyChanges? numDocsInRam=1183514 
deletes=false hasTickets:false pendingChangesInFullFlush: false
   DWPT 0 [2022-03-03T14:30:50.831053Z; main]: flush postings as segment _0 
numDocs=1183514
   HNSW 0 [2022-03-03T14:30:52.334343Z; main]: build graph from 1183514 vectors
   ...
   HNSW 0 [2022-03-03T14:40:31.049504Z; main]: built 118 in 5585/578724 ms
   ...
   IW 0 [2022-03-03T14:40:33.492318Z; main]: 582671 msec to write vectors
   IFD 0 [2022-03-03T14:40:34.655718Z; main]: 20 msec to checkpoint
   Indexed 1183514 documents in 585s
   Force merge index in luceneknn-100-16-100.train-16-100.index
   IFD 1 [2022-03-03T14:40:34.671943Z; main]: 0 msec to checkpoint
   Built index in 586.944657087326
   ```
   
   **Files in the index**
   
   ```txt
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 _0.fdm
10080 -rw-r--r--  1 mayyasharipova  staff   4.6M  3 Mar 14:30 _0.fdt
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene90FieldsIndex-doc_ids_0.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene90FieldsIndexfile_pointers_1.tmp
   929304 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vec
   924624 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vec_temp_3.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vem
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vex
   953168 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_knn_buffered_vectors_temp_2.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 write.lock
   ```
   
   
   
   
Details on the baseline 
   
   Indexing output
   
```txt
   Built index in 1099.0846738815308
   ```
   
   **Files in the index**
   
   ```txt
   drwxr-xr-x  12 mayyasharipova  staff   384B  3 Mar 15:14 .
   drwxr-xr-x  42 mayyasharipova  staff   1.3K  3 Mar 15:14 ..
   -rw-r--r--   1 mayyasharipova  staff   201B  3 Mar 15:03 _w.fdm
   -rw-r--r--   1 mayyasharipova  staff   4.6M  3 Mar 15:03 _w.fdt
   -rw-r--r--   1 mayyasharipova  staff   3.5K  3 Mar 15:03 _w.fdx
   -rw-r--r--   1 mayyasharipova  staff   192B  3 Mar 15:14 _w.fnm
   -rw-r--r--   1 mayyasharipova  staff   532B  3 Mar 15:14 _w.si
   -rw-r--r--   1 mayyasharipova  staff   451M  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vec
   -rw-r--r--   1 mayyasharipova  staff   309K  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vem
   -rw-r--r--   1 mayyasharipova  staff82M  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vex
   -rw-r--r--   1 mayyasharipova  staff   154B  3 Mar 15:14 segments_2
   -rw-r--r--   1 mayyasharipova  staff 0B  3 Mar 14:56 write.lock
   ```
   





-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mayya-sharipova edited a comment on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


mayya-sharipova edited a comment on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058148842


   I've benchmarked the results with ann-benchmarks on glove-100-angular (M:16, 
 efConstruction:100)
   
   - baseline: main branch, where RAMBufferSizeMB is left unset and defaults to 
**16MB**, with segments force-merged to 1.
   - candidate: this PR, with RAMBufferSizeMB similarly set to **16MB**, 
also force-merged at the end.
   
   **Indexing**
   - baseline took 1099 secs, around **18 mins**
   - candidate took 586 secs, around **10 mins**
   - Search performance is the same.
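   A quick sanity check on the numbers above (standalone arithmetic only; this is not part of the ann-benchmarks harness): the reported wall-clock times imply roughly a 1.9x indexing speedup, and the infoStream output shows that nearly all of the candidate's time goes to building the HNSW graph and writing vectors.

```java
public class SpeedupCheck {
    /** Ratio of baseline to candidate wall-clock time. */
    static double speedup(double baselineSec, double candidateSec) {
        return baselineSec / candidateSec;
    }

    public static void main(String[] args) {
        // "Built index in ..." times reported for the two runs, in seconds.
        double baselineSec = 1099.0846738815308;
        double candidateSec = 586.944657087326;
        System.out.printf("indexing speedup: %.2fx%n", speedup(baselineSec, candidateSec)); // ~1.87x

        // The log reports "582671 msec to write vectors" out of ~586944 ms total,
        // i.e. graph construction dominates the candidate's indexing time.
        System.out.printf("vector write fraction: %.1f%%%n", 100.0 * 582_671 / 586_944); // ~99.3%
    }
}
```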
   
   
   
Details on the search performance 
   
   
   
   
Details on the candidate 
   
   Indexing output
   
```txt
   IW 0 [2022-03-03T14:30:49.413950Z; main]: init: create=true reader=null
  ramBufferSizeMB=16.0
   maxBufferedDocs=-1
   IW 0 [2022-03-03T14:30:49.424202Z; main]: MMapDirectory.UNMAP_SUPPORTED=true
   Done indexing 1183514 documents; now flush
   IW 0 [2022-03-03T14:30:50.824200Z; main]: now flush at close
   IW 0 [2022-03-03T14:30:50.824401Z; main]:   start flush: applyAllDeletes=true
   IW 0 [2022-03-03T14:30:50.824515Z; main]:   index before flush
   DW 0 [2022-03-03T14:30:50.824557Z; main]: startFullFlush
   DW 0 [2022-03-03T14:30:50.827209Z; main]: anyChanges? numDocsInRam=1183514 
deletes=false hasTickets:false pendingChangesInFullFlush: false
   DWPT 0 [2022-03-03T14:30:50.831053Z; main]: flush postings as segment _0 
numDocs=1183514
   HNSW 0 [2022-03-03T14:30:52.334343Z; main]: build graph from 1183514 vectors
   ...
   HNSW 0 [2022-03-03T14:40:31.049504Z; main]: built 118 in 5585/578724 ms
   ...
   IW 0 [2022-03-03T14:40:33.492318Z; main]: 582671 msec to write vectors
   IFD 0 [2022-03-03T14:40:34.655718Z; main]: 20 msec to checkpoint
   Indexed 1183514 documents in 585s
   Force merge index in luceneknn-100-16-100.train-16-100.index
   IFD 1 [2022-03-03T14:40:34.671943Z; main]: 0 msec to checkpoint
   Built index in 586.944657087326
   ```
   
   **Files in the index**
   
   ```txt
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 _0.fdm
10080 -rw-r--r--  1 mayyasharipova  staff   4.6M  3 Mar 14:30 _0.fdt
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene90FieldsIndex-doc_ids_0.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene90FieldsIndexfile_pointers_1.tmp
   929304 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vec
   924624 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vec_temp_3.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vem
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 
_0_Lucene91HnswVectorsFormat_0.vex
   953168 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 
_0_knn_buffered_vectors_temp_2.tmp
0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 write.lock
   ```
   
   
   
   
Details on the baseline 
   
   Indexing output
   
```txt
   Built index in 1099.0846738815308
   ```
   
   **Files in the index**
   
   ```txt
   drwxr-xr-x  12 mayyasharipova  staff   384B  3 Mar 15:14 .
   drwxr-xr-x  42 mayyasharipova  staff   1.3K  3 Mar 15:14 ..
   -rw-r--r--   1 mayyasharipova  staff   201B  3 Mar 15:03 _w.fdm
   -rw-r--r--   1 mayyasharipova  staff   4.6M  3 Mar 15:03 _w.fdt
   -rw-r--r--   1 mayyasharipova  staff   3.5K  3 Mar 15:03 _w.fdx
   -rw-r--r--   1 mayyasharipova  staff   192B  3 Mar 15:14 _w.fnm
   -rw-r--r--   1 mayyasharipova  staff   532B  3 Mar 15:14 _w.si
   -rw-r--r--   1 mayyasharipova  staff   451M  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vec
   -rw-r--r--   1 mayyasharipova  staff   309K  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vem
   -rw-r--r--   1 mayyasharipova  staff    82M  3 Mar 15:14 
_w_Lucene91HnswVectorsFormat_0.vex
   -rw-r--r--   1 mayyasharipova  staff   154B  3 Mar 15:14 segments_2
   -rw-r--r--   1 mayyasharipova  staff 0B  3 Mar 14:56 write.lock
   ```
   





[GitHub] [lucene] mayya-sharipova edited a comment on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


mayya-sharipova edited a comment on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058148842


   I've benchmarked the results with ann-benchmarks on glove-100-angular (M:16, 
 efConstruction:100)
   
   - baseline: main branch, where RAMBufferSizeMB is left unset and defaults to 
**16MB**, with a force merge at the end of indexing.
   - candidate: this PR, with RAMBufferSizeMB at the default **16MB** and a 
force merge at the end of indexing.
   
   
   **Results**
   - Indexing the baseline took 1099 secs, around **18 mins**
   - Indexing the candidate took 586 secs, around **10 mins**
   - Search performance is the same.
   
   
   
Details on the search performance 
   
   | | baseline recall | baseline QPS | candidate recall | candidate QPS |
   | --- | ---: | ---: | ---: | ---: |
   | n_cands=10  | 0.486 | 3995.468 | 0.463 | 3636.417 |
   | n_cands=20  | 0.532 | 3261.435 | 0.529 | 3356.358 |
   | n_cands=40  | 0.608 | 2685.442 | 0.603 | 2494.603 |
   | n_cands=80  | 0.683 | 1874.002 | 0.682 | 1884.534 |
   | n_cands=120 | 0.723 | 1474.137 | 0.721 | 1445.883 |
   | n_cands=200 | 0.766 | 1048.531 | 0.766 | 1070.614 |
   | n_cands=400 | 0.819 | 554.110 | 0.819 | 639.026 |
   | n_cands=600 | 0.844 | 464.523 | 0.845 | 435.123 |
   | n_cands=800 | 0.861 | 355.228 | 0.862 | 329.773 |
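   The QPS differences between the two runs are mostly within noise; the largest gap is the n_cands=400 row, where the candidate is faster at identical recall. A quick standalone check of that row (plain arithmetic, not harness output):

```java
public class QpsDelta {
    /** Relative change of candidate vs. baseline, e.g. +0.15 for a 15% gain. */
    static double relativeChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline;
    }

    public static void main(String[] args) {
        // n_cands=400 row above: recall 0.819 for both runs, QPS 554.110 vs 639.026.
        double delta = relativeChange(554.110, 639.026);
        System.out.printf("QPS change at n_cands=400: %+.1f%%%n", 100 * delta); // ~+15.3%
    }
}
```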
   
   
   
   
Candidate indexing details 
   
   Indexing output
   
```txt
   IW 0 [2022-03-03T14:30:49.413950Z; main]: init: create=true reader=null
  ramBufferSizeMB=16.0
   maxBufferedDocs=-1
   IW 0 [2022-03-03T14:30:49.424202Z; main]: MMapDirectory.UNMAP_SUPPORTED=true
   Done indexing 1183514 documents; now flush
   IW 0 [2022-03-03T14:30:50.824200Z; main]: now flush at close
   IW 0 [2022-03-03T14:30:50.824401Z; main]:   start flush: applyAllDeletes=true
   IW 0 [2022-03-03T14:30:50.824515Z; main]:   index before flush
   DW 0 [2022-03-03T14:30:50.824557Z; main]: startFullFlush
   DW 0 [2022-03-03T14:30:50.827209Z; main]: anyChanges? numDocsInRam=1183514 
deletes=false hasTickets:false pendingChangesInFullFlush: false
   DWPT 0 [2022-03-03T14:30:50.831053Z; main]: flush postings as segment _0 
numDocs=1183514
   HNSW 0 [2022-03-03T14:30:52.334343Z; main]: build graph from 1183514 vectors
   ...
   HNSW 0 [2022-03-03T14:40:31.049504Z; main]: built 118 in 5585/578724 ms
   ...
   IW 0 [2022-03-03T14:40:33.492318Z; main]: 582671 msec to write vectors
   IFD 0 [2022-03-03T14:40:34.655718Z; main]: 20 msec to checkpoint
   Indexed 1183514 documents in 585s
   Force merge index in luceneknn-100-16-100.train-16-100.index
   IFD 1 [2022-03-03T14:40:34.671943Z; main]: 0 msec to checkpoint
   Built index in 586.944657087326
   ```
   
   **Files in the index**
   
   ```txt
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0.fdm
 10080 -rw-r--r--  1 mayyasharipova  staff   4.6M  3 Mar 14:30 _0.fdt
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene90FieldsIndex-doc_ids_0.tmp
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene90FieldsIndexfile_pointers_1.tmp
929304 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vec
924624 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vec_temp_3.tmp
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vem
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 _0_Lucene91HnswVectorsFormat_0.vex
953168 -rw-r--r--  1 mayyasharipova  staff   451M  3 Mar 14:30 _0_knn_buffered_vectors_temp_2.tmp
     0 -rw-r--r--  1 mayyasharipova  staff     0B  3 Mar 14:30 write.lock
   ```
   
   **Files after force merge**
   ```txt
 8 -rw-r--r--  1 mayyasharipova  staff   297B  3 Mar 14:40 _0.cfe
   1105112 -rw-r--r--  1 mayyasharipova  staff   538M  3 Mar 14:40 _0.cfs
 8 -rw-r--r--  1 mayyasharipova  staff   376B  3 Mar 14:40 _0.si
 8 -rw-r--r--  1 mayyasharipova  staff   154B  3 Mar 14:40 segments_1
 0 -rw-r--r--  1 mayyasharipova  staff 0B  3 Mar 14:30 write.lock
   ```
   
   
   
   
   
   
Baseline indexing details
   
   Indexing output
   
```txt
   Built index in 1099.0846738815308
   ```
   
   **Files after force merge**
   
   ```txt
   drwxr-xr-x  12 mayyasharipova  staff   384B  3 Mar 15:14 .
   drwxr-xr-x  42 mayyasharipova  staff   1.3K  3 Mar 15:14 ..
   -rw-r--r--   1 mayyasharipova  staff   201B  3 Mar 15:03 _w.fdm
   -rw-r--r--   1 mayyasharipova  staff   4.6M  3 Mar 15:03 _w.fdt
   -rw-r--r--   1 mayyasharipova  staff   3.5K  3 Mar 15:03 _w.fdx
 

[GitHub] [lucene-solr] thelabdude commented on a change in pull request #2165: SOLR-15059: Improve query performance monitoring

2022-03-03 Thread GitBox


thelabdude commented on a change in pull request #2165:
URL: https://github.com/apache/lucene-solr/pull/2165#discussion_r818795935



##
File path: solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
##
@@ -315,88 +477,22 @@
 node metrics
   -->
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_client_errors_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(client_errors_total, select(.key | 
endswith(".clientErrors")), count)
   
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_errors_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(errors_total, select(.key | endswith(".errors")), count)
   
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".requestTimes")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_requests_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(requests_total, select(.key | 
endswith(".local.requestTimes")), count)

Review comment:
   @dsmiley it's to support the query charts showing core-level query 
metrics vs. top-level distributed query metrics added in this PR. I like 
knowing whether there's an imbalance of core-level query requests going to 
certain replicas or whether the load across all of my replicas is balanced. 
You're skeptical, but you haven't said why exactly. If you want to change it, 
then change it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] thelabdude commented on a change in pull request #2165: SOLR-15059: Improve query performance monitoring

2022-03-03 Thread GitBox


thelabdude commented on a change in pull request #2165:
URL: https://github.com/apache/lucene-solr/pull/2165#discussion_r818797045



##
File path: solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
##
@@ -315,88 +477,22 @@
 node metrics
   -->
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_client_errors_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(client_errors_total, select(.key | 
endswith(".clientErrors")), count)
   
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_errors_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(errors_total, select(.key | endswith(".errors")), count)
   
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".requestTimes")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_requests_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(requests_total, select(.key | 
endswith(".local.requestTimes")), count)

Review comment:
   fwiw ~ have you actually looked at the charts added in this PR in 
Grafana with query load running? If there's a problem there, then let's fix it 
and move forward but rehashing old decisions seems unproductive at this point.







[jira] [Commented] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify

2022-03-03 Thread Greg Miller (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500898#comment-17500898
 ] 

Greg Miller commented on LUCENE-10302:
--

Thanks [~dsmiley]. I'd sketched something out as well and have it sitting over 
here on a branch: 
[https://github.com/gsmiller/lucene/tree/LUCENE-10302-pq-builder-sketch.] I'll 
have a look at your patch file and see where ideas are similar/different.

 

[~vigyas] sounds good!

> PriorityQueue: optimize where we collect then iterate by using O(N) heapify
> ---
>
> Key: LUCENE-10302
> URL: https://issues.apache.org/jira/browse/LUCENE-10302
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Major
> Attachments: LUCENE_PriorityQueue_Builder_with_heapify.patch
>
>
> Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java ) I got to 
> wondering if there was faster-than O(N*log(N)) way of loading a PriorityQueue 
> when we provide a bulk array to initialize the heap/PriorityQueue.  It turns 
> out there is: the JDK's PriorityQueue supports this in its constructors, 
> referring to "This classic algorithm due to Floyd (1964) is known to be 
> O(size)" -- heapify() method.  There's 
> [another|https://www.geeksforgeeks.org/building-heap-from-array/]  that may 
> or may not be the same; I didn't look too closely yet.  I see a number of 
> uses of Lucene's PriorityQueue that first collects values and only after 
> collecting want to do something with the results (typical / unsurprising).  
> This lends itself to a builder pattern that can look similar to 
> LargeNumHitsTopDocsCollector in terms of first having an array used like a 
> list and then move over to the PriorityQueue if/when it gets full (it may 
> not).
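To make the complexity claim concrete, here is a minimal standalone sketch of Floyd's bottom-up heapify (illustrative only — this is not Lucene's PriorityQueue; class and method names are mine):

```java
public class HeapifyDemo {
  // Floyd's bottom-up build: sift down every internal node.
  // Sift costs shrink geometrically level by level, giving O(n) total.
  static void heapify(int[] a) {
    for (int i = a.length / 2 - 1; i >= 0; i--) {
      siftDown(a, i, a.length);
    }
  }

  static void siftDown(int[] a, int i, int n) {
    while (true) {
      int left = 2 * i + 1, right = left + 1, largest = i;
      if (left < n && a[left] > a[largest]) largest = left;
      if (right < n && a[right] > a[largest]) largest = right;
      if (largest == i) return;
      int tmp = a[i]; a[i] = a[largest]; a[largest] = tmp;
      i = largest;
    }
  }

  public static void main(String[] args) {
    int[] a = {3, 9, 2, 1, 4, 5};
    heapify(a);
    System.out.println(a[0]); // root of a max-heap: 9
  }
}
```

The JDK's `PriorityQueue(Collection)` constructor uses this same idea, which is why bulk-loading a heap is cheaper than offering elements one at a time.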



--
This message was sent by Atlassian Jira
(v8.20.1#820001)




[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-03-03 Thread Ankit Jain (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500900#comment-17500900
 ] 

Ankit Jain commented on LUCENE-10428:
-

I am fine with closing this issue. I will open another one if I see queries 
failing with this error again.

> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge 
> leading to busy threads in infinite loop
> -
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring, core/search
>Reporter: Ankit Jain
>Priority: Major
> Attachments: Flame_graph.png
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Customers complained about high CPU for Elasticsearch cluster in production. 
> We noticed that few search requests were stuck for long time
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v   
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205  
> AmMLzDQ4RrOJievRDeGFZw:569204  direct1645195007282 14:36:47  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075  
> emjWc5bUTG6lgnCGLulq-Q:502074  direct1645195037259 14:37:17  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270  
> emjWc5bUTG6lgnCGLulq-Q:583269  direct1645201316981 16:21:56  4.5h
> {code}
> Flame graphs indicated that CPU time is mostly going into 
> *getMinCompetitiveScore method in MaxScoreSumPropagator*. After doing some 
> live JVM debugging found that 
> org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had 
> around 4 million invocations every second
> Figured out the values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 
> 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
> Example code snippet:
> {code:java}
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
> minScore -= Math.ulp(minScoreSum);
> System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}
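The float-precision numbers quoted above can be reproduced with a standalone snippet (values copied from the debugging output; this only illustrates ulp sizes, not the actual Lucene code path):

```java
public class UlpDemo {
  public static void main(String[] args) {
    // Any float in [2, 4) has an ulp of 2^-22, matching the debugger value.
    System.out.println(Math.ulp(3.5541441f)); // 2.3841858E-7
    // A double of the same magnitude has a far smaller ulp (2^-51),
    // which is why mixing float and double granularity here is delicate.
    System.out.println(Math.ulp(3.5541441d));
  }
}
```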






[jira] [Resolved] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-03-03 Thread Adrien Grand (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-10428.
---
Fix Version/s: 9.1
   Resolution: Fixed

Thanks both!

> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge 
> leading to busy threads in infinite loop
> -
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/query/scoring, core/search
>Reporter: Ankit Jain
>Priority: Major
> Fix For: 9.1
>
> Attachments: Flame_graph.png
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Customers complained about high CPU for Elasticsearch cluster in production. 
> We noticed that few search requests were stuck for long time
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v   
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205  
> AmMLzDQ4RrOJievRDeGFZw:569204  direct1645195007282 14:36:47  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075  
> emjWc5bUTG6lgnCGLulq-Q:502074  direct1645195037259 14:37:17  6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270  
> emjWc5bUTG6lgnCGLulq-Q:583269  direct1645201316981 16:21:56  4.5h
> {code}
> Flame graphs indicated that CPU time is mostly going into 
> *getMinCompetitiveScore method in MaxScoreSumPropagator*. After doing some 
> live JVM debugging found that 
> org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method had 
> around 4 million invocations every second
> Figured out the values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 
> 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
> Example code snippet:
> {code:java}
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
> minScore -= Math.ulp(minScoreSum);
> System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}






[jira] [Commented] (LUCENE-10078) Enable merge-on-refresh by default?

2022-03-03 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500911#comment-17500911
 ] 

Adrien Grand commented on LUCENE-10078:
---

LUCENE-10237 introduced a very simple MergeOnFlushMergePolicy which merges 
together all small segments upon flush.

I'd be interested in getting opinions about making it the default 
implementation for LogMergePolicy and TieredMergePolicy. Any thoughts? cc 
[~anakot]

> Enable merge-on-refresh by default?
> ---
>
> Key: LUCENE-10078
> URL: https://issues.apache.org/jira/browse/LUCENE-10078
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> This is a spinoff from the discussion in LUCENE-10073.
> The newish merge-on-refresh ([crazy origin 
> story|https://blog.mikemccandless.com/2021/03/open-source-collaboration-or-how-we.html])
>  feature is a powerful way to reduce searched segment counts, especially 
> helpful for applications using many indexing threads.  Such usage will write 
> many tiny segments on each refresh, which could quickly be merged up during 
> the {{refresh}} operation.
> We would have to implement a default for {{findFullFlushMerges}} 
> (LUCENE-10064 is open for this), and then we would need 
> {{IndexWriterConfig.getMaxFullFlushMergeWaitMillis}} a non-zero value (this 
> issue).






[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2165: SOLR-15059: Improve query performance monitoring

2022-03-03 Thread GitBox


dsmiley commented on a change in pull request #2165:
URL: https://github.com/apache/lucene-solr/pull/2165#discussion_r81389



##
File path: solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
##
@@ -315,88 +477,22 @@
 node metrics
   -->
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_client_errors_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(client_errors_total, select(.key | 
endswith(".clientErrors")), count)
   
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".clientErrors")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_errors_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(errors_total, select(.key | endswith(".errors")), count)
   
   
-.metrics["solr.node"] | to_entries | .[] | select(.key | 
endswith(".requestTimes")) as $object |
-$object.key | split(".")[0] as $category |
-$object.key | split(".")[1] as $handler |
-$object.value.count as $value |
-{
-  name : "solr_metrics_node_requests_total",
-  type : "COUNTER",
-  help : "See following URL: 
https://lucene.apache.org/solr/guide/metrics-reporting.html",
-  label_names  : ["category", "handler"],
-  label_values : [$category, $handler],
-  value: $value
-}
+$jq:node(requests_total, select(.key | 
endswith(".local.requestTimes")), count)

Review comment:
   I meant to comment on the "totalTime" metric w.r.t. its usefulness; 
sorry for the confusion.  It's some massive number of course... it'd need to be 
divided by something else to be useful?  Also, totalTime is in nanoseconds 
lately!  https://issues.apache.org/jira/browse/SOLR-16073
   
   I understand the overarching objective of top-level vs core-level -- makes 
sense. 
   I'm a bit unclear on the distinction between the node level "$jq:node" 
metrics, and the "Local (non-distrib) query metrics", both of which are using 
".local.".
   
   RE Grafana; I haven't seen our official one in use live.  I use our 
own/custom one at work.







[jira] [Updated] (LUCENE-10441) ArrayIndexOutOfBoundsException during indexing

2022-03-03 Thread Christine Poerschke (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated LUCENE-10441:
-
Affects Version/s: 8.10

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-10441
> URL: https://issues.apache.org/jira/browse/LUCENE-10441
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 8.10
>Reporter: Peixin Li
>Priority: Major
>
> Hi experts! I am facing an ArrayIndexOutOfBoundsException during indexing and 
> committing documents. This exception gives me no clue about what happened, so 
> I have little information for debugging. Can I have some suggestions about 
> what the cause could be and how to fix this error? I'm using Lucene 8.10.0.
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
>     at org.apache.lucene.util.BytesRefHash$1.get(BytesRefHash.java:179)
>     at 
> org.apache.lucene.util.StringMSBRadixSorter$1.get(StringMSBRadixSorter.java:42)
>     at 
> org.apache.lucene.util.StringMSBRadixSorter$1.setPivot(StringMSBRadixSorter.java:63)
>     at org.apache.lucene.util.Sorter.binarySort(Sorter.java:192)
>     at org.apache.lucene.util.Sorter.binarySort(Sorter.java:187)
>     at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:41)
>     at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:83)
>     at org.apache.lucene.util.IntroSorter.sort(IntroSorter.java:36)
>     at 
> org.apache.lucene.util.MSBRadixSorter.introSort(MSBRadixSorter.java:133)
>     at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:126)
>     at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:121)
>     at org.apache.lucene.util.BytesRefHash.sort(BytesRefHash.java:183)
>     at 
> org.apache.lucene.index.SortedSetDocValuesWriter.flush(SortedSetDocValuesWriter.java:171)
>     at 
> org.apache.lucene.index.DefaultIndexingChain.writeDocValues(DefaultIndexingChain.java:348)
>     at 
> org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:228)
>     at 
> org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:350)
>     at 
> org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:476)
>     at 
> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:656)
>     at 
> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3364)
>     at 
> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3770)
>     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3728) 
> {code}






[jira] [Commented] (LUCENE-10441) ArrayIndexOutOfBoundsException during indexing

2022-03-03 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500933#comment-17500933
 ] 

Christine Poerschke commented on LUCENE-10441:
--

line 179 from the stacktrace above is 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.10.0/lucene/core/src/java/org/apache/lucene/util/BytesRefHash.java#L179
 i.e.
{code}
pool.setBytesRef(scratch, bytesStart[compact[i]]);
{code}
and {{pool}} as per 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.10.0/lucene/core/src/java/org/apache/lucene/util/BytesRefHash.java#L55
 is of {{ByteBlockPool}} type. So this issue could be similar to or same as the 
LUCENE-8614 issue.

> ArrayIndexOutOfBoundsException during indexing
> --
>
> Key: LUCENE-10441
> URL: https://issues.apache.org/jira/browse/LUCENE-10441
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Peixin Li
>Priority: Major
>
> Hi experts! I am facing an ArrayIndexOutOfBoundsException during indexing and 
> committing documents. This exception gives me no clue about what happened, so 
> I have little information for debugging. Can I have some suggestions about 
> what the cause could be and how to fix this error? I'm using Lucene 8.10.0.
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: -1
>     at org.apache.lucene.util.BytesRefHash$1.get(BytesRefHash.java:179)
>     at 
> org.apache.lucene.util.StringMSBRadixSorter$1.get(StringMSBRadixSorter.java:42)
>     at 
> org.apache.lucene.util.StringMSBRadixSorter$1.setPivot(StringMSBRadixSorter.java:63)
>     at org.apache.lucene.util.Sorter.binarySort(Sorter.java:192)
>     at org.apache.lucene.util.Sorter.binarySort(Sorter.java:187)
>     at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:41)
>     at org.apache.lucene.util.IntroSorter.quicksort(IntroSorter.java:83)
>     at org.apache.lucene.util.IntroSorter.sort(IntroSorter.java:36)
>     at 
> org.apache.lucene.util.MSBRadixSorter.introSort(MSBRadixSorter.java:133)
>     at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:126)
>     at org.apache.lucene.util.MSBRadixSorter.sort(MSBRadixSorter.java:121)
>     at org.apache.lucene.util.BytesRefHash.sort(BytesRefHash.java:183)
>     at 
> org.apache.lucene.index.SortedSetDocValuesWriter.flush(SortedSetDocValuesWriter.java:171)
>     at 
> org.apache.lucene.index.DefaultIndexingChain.writeDocValues(DefaultIndexingChain.java:348)
>     at 
> org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:228)
>     at 
> org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:350)
>     at 
> org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:476)
>     at 
> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:656)
>     at 
> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3364)
>     at 
> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3770)
>     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3728) 
> {code}






[jira] [Created] (LUCENE-10454) UnifiedHighlighter can miss terms because of query rewrites

2022-03-03 Thread Julie Tibshirani (Jira)
Julie Tibshirani created LUCENE-10454:
-

 Summary: UnifiedHighlighter can miss terms because of query 
rewrites
 Key: LUCENE-10454
 URL: https://issues.apache.org/jira/browse/LUCENE-10454
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Julie Tibshirani


Before extracting terms from a query, UnifiedHighlighter rewrites the query 
using an empty searcher. If the query rewrites to MatchNoDocsQuery when the 
reader is empty, then the highlighter will fail to extract terms. This is more 
of an issue now that we rewrite BooleanQuery to MatchNoDocsQuery when any of 
its required clauses is MatchNoDocsQuery 
(https://issues.apache.org/jira/browse/LUCENE-10412). I attached a patch 
showing the problem.

This feels like a pretty esoteric issue, but I figured it was worth raising for 
awareness. I think it only applies when weightMatches=false, which isn't the 
default. I couldn't find any existing queries in Lucene that would be affected.

We ran into it while upgrading Elasticsearch to the latest Lucene snapshot, 
since a couple custom queries rewrite to MatchNoDocsQuery when the reader is 
empty.






[jira] [Updated] (LUCENE-10454) UnifiedHighlighter can miss terms because of query rewrites

2022-03-03 Thread Julie Tibshirani (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julie Tibshirani updated LUCENE-10454:
--
Attachment: LUCENE-10454.patch

> UnifiedHighlighter can miss terms because of query rewrites
> ---
>
> Key: LUCENE-10454
> URL: https://issues.apache.org/jira/browse/LUCENE-10454
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Julie Tibshirani
>Priority: Minor
> Attachments: LUCENE-10454.patch
>
>
> Before extracting terms from a query, UnifiedHighlighter rewrites the query 
> using an empty searcher. If the query rewrites to MatchNoDocsQuery when the 
> reader is empty, then the highlighter will fail to extract terms. This is 
> more of an issue now that we rewrite BooleanQuery to MatchNoDocsQuery when 
> any of its required clauses is MatchNoDocsQuery 
> (https://issues.apache.org/jira/browse/LUCENE-10412). I attached a patch 
> showing the problem.
> This feels like a pretty esoteric issue, but I figured it was worth raising 
> for awareness. I think it only applies when weightMatches=false, which isn't 
> the default. I couldn't find any existing queries in Lucene that would be 
> affected.
> We ran into it while upgrading Elasticsearch to the latest Lucene snapshot, 
> since a couple custom queries rewrite to MatchNoDocsQuery when the reader is 
> empty.






[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-03 Thread Vigya Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500982#comment-17500982
 ] 

Vigya Sharma commented on LUCENE-10448:
---

The only API which can lead to unexpected big write bursts seems to be the 
{{writeBytes(byte[] b, int offset, int length)}} API in  
RateLimitedIndexOutput. We could potentially add an upper bound on the bytes 
that writeBytes attempts to write in one shot, in RateLimitedIndexOutput - 
break the byte array into chunks and check for rate limiting between each chunk. 
Would that be desirable in the wider Lucene context?

All other APIs check for rate before every write, so the instant burst rate is 
really determined by the configured {{mbPerSec}} and {{MIN_PAUSE_CHECK_MSEC}} 
values. I think this is what makes all the burst writes in this JIRA log ~0.28 
MB.

> According to my statistics, the frequency of no-pause bytes is [2%-20%],
What is the high instant burst rate you see during these no-pause writes? From 
the logs above, it should still be less than 11.2 MB/s. Maybe we should look at 
the burst write rate (in addition to/ instead of) the no-pause-write frequency?
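The chunking idea above could be sketched as follows — a standalone illustration with an invented 1 MB cap and a stubbed pause check, not the actual RateLimitedIndexOutput API:

```java
public class ChunkedRateLimitSketch {
  static final int MAX_CHUNK_BYTES = 1 << 20; // hypothetical per-write cap (1 MB)

  interface PauseCheck { void maybePause(int bytes); }

  // Bound each physical write so the limiter is consulted at least once
  // per MAX_CHUNK_BYTES instead of once per writeBytes call.
  static int writeBytes(byte[] b, int offset, int length, PauseCheck limiter) {
    int calls = 0;
    while (length > 0) {
      int chunk = Math.min(length, MAX_CHUNK_BYTES);
      limiter.maybePause(chunk);
      // delegate.writeBytes(b, offset, chunk) would go here in a real IndexOutput
      offset += chunk;
      length -= chunk;
      calls++;
    }
    return calls;
  }

  public static void main(String[] args) {
    int[] pauses = {0};
    int calls = writeBytes(new byte[0], 0, 5 << 20, bytes -> pauses[0]++);
    // A single 5 MB write becomes 5 rate-checked chunks.
    System.out.println(calls + " " + pauses[0]);
  }
}
```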

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (1000000000 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a segment is being merged and *maybePause* is called at 7:00, then 
> lastNS=7:00. When *maybePause* is called again at 7:05, the value of 
> *targetNS=lastNS + (long) (1000000000 * secondsToPause)* must be smaller than 
> *curNS*, so no matter how big the bytes are, we return -1 and skip the pause. 
> I counted the total number of *maybePause* calls (callTimes), the number of 
> skipped pauses (ignorePauseTimes), and the sizes of the skipped writes 
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, of which 25 calls skipped the pause; the 
> byte counts of the skipped pauses (such as 0.28125 MB) are not small.
> Whenever the interval between two *maybePause* calls is long enough, a pause 
> that should have been executed is silently skipped.
>  
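
To make the arithmetic concrete, here is a small self-contained sketch of the maybePause computation (illustrative only; the real logic lives in Lucene's MergeRateLimiter), showing that once the gap since lastNS exceeds the time needed to "pay for" the written bytes, the computed pause goes negative and is skipped:

```java
// Standalone sketch of the maybePause arithmetic quoted above.
final class PauseSketch {
  static final long MIN_PAUSE_NS = 2_000_000; // "smaller than 2 msec"

  // Returns nanoseconds to pause, or -1 when the pause is skipped.
  static long pauseNS(double mbPerSec, long bytes, long lastNS, long curNS) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long targetNS = lastNS + (long) (1_000_000_000 * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }
}
```

At 11.2 MB/sec, writing 0.28125 MB (294912 bytes) only "costs" about 25 ms; if the previous call happened more than 25 ms earlier, curPauseNS is already below the threshold and the write proceeds unthrottled, matching the skipped pauses in the log above.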






[GitHub] [lucene] dweiss merged pull request #717: LUCENE-10447: always use utf8 for forked process encoding. Use the sa…

2022-03-03 Thread GitBox


dweiss merged pull request #717:
URL: https://github.com/apache/lucene/pull/717


   





[jira] [Commented] (LUCENE-10447) Charset issue in TestScripts#testLukeCanBeLaunched()

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501009#comment-17501009
 ] 

ASF subversion and git services commented on LUCENE-10447:
--

Commit 81ab1e598f4e2a6f16de312614823c9eccb7abe2 in lucene's branch 
refs/heads/main from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=81ab1e5 ]

LUCENE-10447: always use utf8 for forked process encoding. Use the sa… (#717)



> Charset issue in TestScripts#testLukeCanBeLaunched()
> 
>
> Key: LUCENE-10447
> URL: https://issues.apache.org/jira/browse/LUCENE-10447
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: luke
>Reporter: Lu Xugang
>Assignee: Dawid Weiss
>Priority: Minor
> Attachments: 1.png, 2.png, process-10536545874299101128.out
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When TestScripts#testLukeCanBeLaunched() runs, a temp file is created at 
> lucene/distribution.tests/build/tmp/tests-tmp/process-*.out. This 
> process-*.out file may contain content outside StandardCharsets.US_ASCII, 
> depending on the operating system language, and an exception is then thrown 
> because the test later reads this temp file with StandardCharsets.US_ASCII.






[jira] [Resolved] (LUCENE-10447) Charset issue in TestScripts#testLukeCanBeLaunched()

2022-03-03 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-10447.
--
Fix Version/s: 9.1
   Resolution: Fixed

> Charset issue in TestScripts#testLukeCanBeLaunched()
> 
>
> Key: LUCENE-10447
> URL: https://issues.apache.org/jira/browse/LUCENE-10447
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: luke
>Reporter: Lu Xugang
>Assignee: Dawid Weiss
>Priority: Minor
> Fix For: 9.1
>
> Attachments: 1.png, 2.png, process-10536545874299101128.out
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When TestScripts#testLukeCanBeLaunched() runs, a temp file is created at 
> lucene/distribution.tests/build/tmp/tests-tmp/process-*.out. This 
> process-*.out file may contain content outside StandardCharsets.US_ASCII, 
> depending on the operating system language, and an exception is then thrown 
> because the test later reads this temp file with StandardCharsets.US_ASCII.






[jira] [Commented] (LUCENE-10447) Charset issue in TestScripts#testLukeCanBeLaunched()

2022-03-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501013#comment-17501013
 ] 

ASF subversion and git services commented on LUCENE-10447:
--

Commit 8f92ec157f1a01e7903186da8607e3d1003b1829 in lucene's branch 
refs/heads/branch_9x from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=8f92ec1 ]

LUCENE-10447: always use utf8 for forked process encoding. Use the sa… (#717)



> Charset issue in TestScripts#testLukeCanBeLaunched()
> 
>
> Key: LUCENE-10447
> URL: https://issues.apache.org/jira/browse/LUCENE-10447
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: luke
>Reporter: Lu Xugang
>Assignee: Dawid Weiss
>Priority: Minor
> Fix For: 9.1
>
> Attachments: 1.png, 2.png, process-10536545874299101128.out
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When TestScripts#testLukeCanBeLaunched() runs, a temp file is created at 
> lucene/distribution.tests/build/tmp/tests-tmp/process-*.out. This 
> process-*.out file may contain content outside StandardCharsets.US_ASCII, 
> depending on the operating system language, and an exception is then thrown 
> because the test later reads this temp file with StandardCharsets.US_ASCII.






[GitHub] [lucene] msokolov commented on a change in pull request #718: LUCENE-10444: Support alternate aggregation functions in association facets

2022-03-03 Thread GitBox


msokolov commented on a change in pull request #718:
URL: https://github.com/apache/lucene/pull/718#discussion_r819043542



##
File path: 
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/IntTaxonomyFacets.java
##
@@ -173,17 +185,17 @@ public FacetResult getTopChildren(int topN, String dim, 
String... path) throws I
 
 if (sparseValues != null) {
   for (IntIntCursor c : sparseValues) {
-int count = c.value;
+int value = c.value;
 int ord = c.key;
-if (parents[ord] == dimOrd && count > 0) {
-  totValue += count;
+if (parents[ord] == dimOrd && value > 0) {
+  aggregatedValue = aggregationFunction.aggregate(aggregatedValue, 
value);
   childCount++;
-  if (count > bottomValue) {
+  if (value > bottomValue) {

Review comment:
   I guess we need to ensure that aggregation functions are nondecreasing? 
I mean `min` wouldn't work very well here

##
File path: 
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FloatTaxonomyFacets.java
##
@@ -130,16 +140,16 @@ public FacetResult getTopChildren(int topN, String dim, 
String... path) throws I
   ord = siblings[ord];
 }
 
-if (sumValues == 0) {
+if (aggregatedValue == 0) {
   return null;
 }
 
 if (dimConfig.multiValued) {
   if (dimConfig.requireDimCount) {
-sumValues = values[dimOrd];
+aggregatedValue = values[dimOrd];
   } else {
 // Our sum'd count is not correct, in general:

Review comment:
   our "aggregated" count?
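
The concern about nondecreasing functions can be illustrated with a minimal sketch of such an aggregation interface (hypothetical names, not the actual Lucene API):

```java
// Illustrative sketch: an aggregation function applied while walking
// per-ordinal values. The top-N pruning above compares each value against
// bottomValue, which is only safe if aggregating more values can never
// decrease the result -- true for SUM over non-negative values and for
// MAX, but false for MIN, which is why MIN wouldn't work well here.
interface AggregationFunction {
  int aggregate(int aggregated, int value);
}

final class AggregationFunctions {
  static final AggregationFunction SUM = (a, v) -> a + v;
  static final AggregationFunction MAX = Math::max;
}
```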







[GitHub] [lucene] msokolov commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


msokolov commented on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058487021


   This makes sense to me, but I'm a little confused about how you described 
the test condition:
   
   > baseline: main branch where we unset RAMBufferSizeMB, which defaults to 
16Mb with force merge at the end of indexing.
   
   I'm not sure what unset means? I guess it goes to the default 16MB, but I 
assume you must be doing the same in the other test condition? Is there some 
difference in how the IndexWriter is configured between the two conditions? Or 
maybe I'm misunderstanding and you are allowing the entire set of vectors to 
buffer in RAM (in the baseline case)? But if that's the case the results are 
truly astounding! Actually I would like to understand the difference between 
that case and buffering on disk. Do we pay any penalty for buffering on disk? 
How much?





[GitHub] [lucene] msokolov commented on a change in pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


msokolov commented on a change in pull request #728:
URL: https://github.com/apache/lucene/pull/728#discussion_r819056510



##
File path: lucene/core/src/java/org/apache/lucene/index/VectorValuesWriter.java
##
@@ -20,39 +20,53 @@
 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;
-import java.util.ArrayList;
-import java.util.List;
+import org.apache.lucene.codecs.CodecUtil;
 import org.apache.lucene.codecs.KnnVectorsReader;
 import org.apache.lucene.codecs.KnnVectorsWriter;
 import org.apache.lucene.search.DocIdSetIterator;
 import org.apache.lucene.search.TopDocs;
-import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.IOContext;
+import org.apache.lucene.store.IndexInput;
+import org.apache.lucene.store.IndexOutput;
 import org.apache.lucene.util.Bits;
 import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.util.Counter;
-import org.apache.lucene.util.RamUsageEstimator;
+import org.apache.lucene.util.IOUtils;
 
 /**
- * Buffers up pending vector value(s) per doc, then flushes when segment 
flushes.
+ * Buffers up pending vector value per doc on disk until segment flushes.
  *
  * @lucene.experimental
  */
 class VectorValuesWriter {
 
   private final FieldInfo fieldInfo;
   private final Counter iwBytesUsed;
-  private final List<float[]> vectors = new ArrayList<>();
   private final DocsWithFieldSet docsWithField;
+  private final int dim;
+  private final int byteSize;
+  private final ByteBuffer buffer;
+  private final Directory directory;
+  private final IndexOutput dataOut;
 
   private int lastDocID = -1;
 
   private long bytesUsed;
 
-  VectorValuesWriter(FieldInfo fieldInfo, Counter iwBytesUsed) {
+  VectorValuesWriter(
+  FieldInfo fieldInfo, Counter iwBytesUsed, Directory directory, String 
segmentName)
+  throws IOException {
 this.fieldInfo = fieldInfo;
 this.iwBytesUsed = iwBytesUsed;
-this.docsWithField = new DocsWithFieldSet();
-this.bytesUsed = docsWithField.ramBytesUsed();
+docsWithField = new DocsWithFieldSet();
+this.directory = directory;
+String fileName = segmentName + "_" + fieldInfo.getName() + 
"_buffered_vectors";

Review comment:
   I think fields can have pretty much any character in their name. Perhaps 
instead of using the field name, we should use its number in the filename?
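
   The buffering approach under review -- streaming each vector through a 
reused fixed-size, little-endian ByteBuffer into a temp file instead of holding 
all vectors on heap -- can be sketched in isolation like this (plain java.nio; 
the class and file naming are illustrative, not the actual VectorValuesWriter):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: buffer float vectors on disk; heap stays O(dim), not O(numDocs*dim).
final class VectorBufferSketch {
  private final ByteBuffer buffer; // reused for every vector
  private final FileChannel out;

  VectorBufferSketch(int dim, Path file) throws IOException {
    this.buffer = ByteBuffer.allocate(dim * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    this.out = FileChannel.open(
        file, StandardOpenOption.WRITE, StandardOpenOption.CREATE,
        StandardOpenOption.TRUNCATE_EXISTING);
  }

  void addVector(float[] vector) throws IOException {
    buffer.clear();                      // position=0, limit=capacity
    buffer.asFloatBuffer().put(vector);  // fill bytes via a float view
    out.write(buffer);                   // append dim * 4 bytes to the file
  }

  void close() throws IOException {
    out.close();
  }
}
```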

##
File path: lucene/core/src/java/org/apache/lucene/index/IndexingChain.java
##
@@ -522,6 +526,18 @@ void abort() throws IOException {
 // finalizer will e.g. close any open files in the term vectors writer:

Review comment:
   maybe this comment should be updated?

##
File path: lucene/core/src/java/org/apache/lucene/index/VectorValuesWriter.java
##
@@ -20,39 +20,53 @@
 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;
-import java.util.ArrayList;
-import java.util.List;
+import org.apache.lucene.codecs.CodecUtil;
 import org.apache.lucene.codecs.KnnVectorsReader;
 import org.apache.lucene.codecs.KnnVectorsWriter;
 import org.apache.lucene.search.DocIdSetIterator;
 import org.apache.lucene.search.TopDocs;
-import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.IOContext;
+import org.apache.lucene.store.IndexInput;
+import org.apache.lucene.store.IndexOutput;
 import org.apache.lucene.util.Bits;
 import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.util.Counter;
-import org.apache.lucene.util.RamUsageEstimator;
+import org.apache.lucene.util.IOUtils;
 
 /**
- * Buffers up pending vector value(s) per doc, then flushes when segment 
flushes.
+ * Buffers up pending vector value per doc on disk until segment flushes.
  *
  * @lucene.experimental
  */
 class VectorValuesWriter {
 
   private final FieldInfo fieldInfo;
   private final Counter iwBytesUsed;
-  private final List<float[]> vectors = new ArrayList<>();
   private final DocsWithFieldSet docsWithField;
+  private final int dim;
+  private final int byteSize;
+  private final ByteBuffer buffer;
+  private final Directory directory;
+  private final IndexOutput dataOut;
 
   private int lastDocID = -1;
 
   private long bytesUsed;
 
-  VectorValuesWriter(FieldInfo fieldInfo, Counter iwBytesUsed) {
+  VectorValuesWriter(
+  FieldInfo fieldInfo, Counter iwBytesUsed, Directory directory, String 
segmentName)
+  throws IOException {
 this.fieldInfo = fieldInfo;
 this.iwBytesUsed = iwBytesUsed;
-this.docsWithField = new DocsWithFieldSet();
-this.bytesUsed = docsWithField.ramBytesUsed();
+docsWithField = new DocsWithFieldSet();
+this.directory = directory;
+String fileName = segmentName + "_" + fieldInfo.getName() + 
"_buffered_vectors";
+dataOut = directory.createTempOutput(fileName, "temp", IOContext.DEFAULT);

Review comment:
   I'm curious what does `createTempOutput` do? Does it mean if we crash 
these files w

[GitHub] [lucene] msokolov commented on pull request #718: LUCENE-10444: Support alternate aggregation functions in association facets

2022-03-03 Thread GitBox


msokolov commented on pull request #718:
URL: https://github.com/apache/lucene/pull/718#issuecomment-1058501380


   > It makes sense to me to add this capability. I wonder if the extra 
abstraction hurts us though in these tight loops summing up values in an array? 
If it does, we might want to provide a specialization for such loops as well?
   
   Oh I missed the benchmarking you did - I guess it was on the backport PR. 
Looked like no significant change there, good.





[GitHub] [lucene] rmuir commented on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-03-03 Thread GitBox


rmuir commented on pull request #709:
URL: https://github.com/apache/lucene/pull/709#issuecomment-1058571689


   @iverase @jpountz I "undrafted" the PR and added a commit with the 
`grow(long)` that just truncates-n-forwards. It seems like the best compromise 
based on discussion above. 
   
   I also made some minor tweaks to the javadoc to try to simplify the 
explanation about what the grow parameter means. Again, it is kind of academic 
when you think about it, values larger than `maxDoc >> 8` are not really needed 
by any code because we switch to the `FixedBitSet`. But the one-liner method 
doesn't bother me that much, i am just after keeping logic simple and 
abstractions minimal.
   





[jira] [Commented] (LUCENE-10430) Literal double quotes cause exception in class RegExp

2022-03-03 Thread Holger Rehn (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501072#comment-17501072
 ] 

Holger Rehn commented on LUCENE-10430:
--

Thanks for the feedback! But why do I need to escape double quotes? This isn't 
a regex meta character and doesn't have a special meaning in regular 
expressions, so should be treated as literal, right? And
{code:java}
Pattern.compile( "\"" ).matcher( "\"" ).matches(){code}
simply returns true, as expected. Btw. - are you sure escaping double quotes 
really works as expected? I seem to remember having already tried that without 
getting the expected result... but I'm not sure.

> Literal double quotes cause exception in class RegExp
> -
>
> Key: LUCENE-10430
> URL: https://issues.apache.org/jira/browse/LUCENE-10430
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 9.0
>Reporter: Holger Rehn
>Priority: Major
>
> Class org.apache.lucene.util.automaton.RegExp fails to parse valid regular 
> expressions that contain double quotes (except in character classes). This of 
> course affects corresponding RegexpQuerys, as well.
> Example: 
> {code:java}
> Query  q = new RegexpQuery( new Term( "field", "a\"b" ) );
> RegExp r = new RegExp( "a\"b" );{code}
> Both fail with:
> {code:java}
> java.lang.IllegalArgumentException: expected '"' at position 3
>     at 
> org.apache.lucene.util.automaton.RegExp.parseSimpleExp(RegExp.java:1299)
>     at 
> org.apache.lucene.util.automaton.RegExp.parseCharClassExp(RegExp.java:1229)
>     at org.apache.lucene.util.automaton.RegExp.parseComplExp(RegExp.java:1218)
>     at 
> org.apache.lucene.util.automaton.RegExp.parseRepeatExp(RegExp.java:1192)
>     at 
> org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1185)
>     at 
> org.apache.lucene.util.automaton.RegExp.parseConcatExp(RegExp.java:1187)
>     at org.apache.lucene.util.automaton.RegExp.parseInterExp(RegExp.java:1179)
>     at org.apache.lucene.util.automaton.RegExp.parseUnionExp(RegExp.java:1173)
>     at org.apache.lucene.util.automaton.RegExp.(RegExp.java:496)
> ...{code}
> As a workaround we currently replace all double quotes with a dot.






[GitHub] [lucene] jtibshirani commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


jtibshirani commented on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058674634


   Great that we're exploring this! I had a couple high-level thoughts:
   * If a user had 100 vector fields, then now we might have 100+ files being 
written concurrently, multiplied by the number of segments we're writing at the 
same time. It seems like this could cause problems -- should we only use this 
strategy if there are a relatively small number of vector fields? Having 100 
vector fields sounds farfetched, but I could imagine it happening as users 
experiment with ways to model long text documents.
   * It feels wasteful to be writing the vectors to a temp file in 
`IndexingChain`, then immediately reading and writing them to a temp file again 
`Lucene91HnswVectorsWriter`. I wonder if we could make a top-level 
`OffHeapVectorValues` class that's more broadly visible, so that 
`Lucene91HnswVectorsWriter` could just check if it's dealing with a file-backed 
vector values and create another one?





[GitHub] [lucene] jtibshirani edited a comment on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


jtibshirani edited a comment on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058674634


   Great that we're exploring this! I had a couple high-level thoughts:
   * If a user had 100 vector fields, then now we might have 100+ files being 
written concurrently, multiplied by the number of segments we're writing at the 
same time. It seems like this could cause problems -- should we only use this 
strategy if there are a relatively small number of vector fields? Having 100 
vector fields sounds farfetched, but I could imagine it happening as users 
experiment with ways to model long text documents.
   * It feels wasteful to be writing the vectors to a temp file in 
`IndexingChain`, then immediately reading and writing them to a temp file again 
`Lucene91HnswVectorsWriter`. I wonder if we could make a top-level 
`OffHeapVectorValues` class that's more broadly visible, so that 
`Lucene91HnswVectorsWriter` could just check if it's dealing with a file-backed 
vector values and avoid creating another one?





[jira] [Commented] (LUCENE-10302) PriorityQueue: optimize where we collect then iterate by using O(N) heapify

2022-03-03 Thread Vigya Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501101#comment-17501101
 ] 

Vigya Sharma commented on LUCENE-10302:
---

[~gsmiller] Is the link you pasted working? I get a 404 when I try to open it.

> PriorityQueue: optimize where we collect then iterate by using O(N) heapify
> ---
>
> Key: LUCENE-10302
> URL: https://issues.apache.org/jira/browse/LUCENE-10302
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Major
> Attachments: LUCENE_PriorityQueue_Builder_with_heapify.patch
>
>
> Looking at LUCENE-8875 (LargeNumHitsTopDocsCollector.java ) I got to 
> wondering if there was faster-than O(N*log(N)) way of loading a PriorityQueue 
> when we provide a bulk array to initialize the heap/PriorityQueue.  It turns 
> out there is: the JDK's PriorityQueue supports this in its constructors, 
> referring to "This classic algorithm due to Floyd (1964) is known to be 
> O(size)" -- heapify() method.  There's 
> [another|https://www.geeksforgeeks.org/building-heap-from-array/]  that may 
> or may not be the same; I didn't look too closely yet.  I see a number of 
> uses of Lucene's PriorityQueue that first collects values and only after 
> collecting want to do something with the results (typical / unsurprising).  
> This lends itself to a builder pattern that can look similar to 
> LargeNumHitsTopDocsCollector in terms of first having an array used like a 
> list and then move over to the PriorityQueue if/when it gets full (it may 
> not).
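
Floyd's bottom-up construction sifts down only the non-leaf nodes, which is 
what makes it O(N) overall; a minimal min-heap sketch in plain Java 
(independent of Lucene's PriorityQueue):

```java
// Sketch of Floyd's O(N) heapify over an int[] treated as a binary min-heap.
final class Heapify {
  static void heapify(int[] a) {
    // Leaves are already valid one-element heaps; start at the last parent.
    for (int i = a.length / 2 - 1; i >= 0; i--) {
      siftDown(a, i, a.length);
    }
  }

  private static void siftDown(int[] a, int i, int n) {
    while (true) {
      int l = 2 * i + 1, r = l + 1, smallest = i;
      if (l < n && a[l] < a[smallest]) smallest = l;
      if (r < n && a[r] < a[smallest]) smallest = r;
      if (smallest == i) return;
      int tmp = a[i]; a[i] = a[smallest]; a[smallest] = tmp;
      i = smallest; // continue sifting the swapped element down
    }
  }
}
```

Because half the elements are leaves and deeper levels are exponentially 
smaller, the total sift-down work sums to O(N), versus O(N log N) for N 
individual insertions.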






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-03 Thread Vigya Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500982#comment-17500982
 ] 

Vigya Sharma edited comment on LUCENE-10448 at 3/4/22, 2:06 AM:


The only API which can lead to unexpected big write bursts seems to be the 
{{writeBytes(byte[] b, int offset, int length)}} API in  
RateLimitedIndexOutput. We could potentially add an upper bound on the bytes 
that writeBytes attempts to write in one shot, in RateLimitedIndexOutput - 
break the byte array in chunks and check for rate limiting between each chunk. 
Would that be desirable in the wider Lucene context?

All other APIs check for rate before every write, so the instant burst rate is 
really determined by the configured {{mbPerSec}} and {{MIN_PAUSE_CHECK_MSEC}} 
values. I think this is what makes all the burst writes in this JIRA log ~0.28 
MB.
{quote}According to my statistics, the frequency of no-pause bytes is [2%-20%],
{quote}
What is the high instant burst rate you see during these no-pause writes? From 
the logs above, it should still be less than 11.2 MB/s. Maybe we should look at 
the burst write rate (in addition to/ instead of) the no-pause-write frequency?
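
The chunking idea could look roughly like this (hypothetical limiter interface 
and chunk size; not the actual RateLimitedIndexOutput code):

```java
// Sketch: bound how many bytes one writeBytes call can push before
// re-checking the rate limiter, so a single large array cannot slip
// through between pause checks.
interface Limiter {
  void maybePause(long bytes); // may sleep; hypothetical signature
}

final class ChunkedWriter {
  static final int CHUNK_SIZE = 8192; // illustrative upper bound per check

  static void writeBytes(Limiter limiter, byte[] b, int offset, int length,
                         java.io.OutputStream out) throws java.io.IOException {
    while (length > 0) {
      int chunk = Math.min(length, CHUNK_SIZE);
      limiter.maybePause(chunk); // rate check before each chunk
      out.write(b, offset, chunk);
      offset += chunk;
      length -= chunk;
    }
  }
}
```

A 20000-byte write would then trigger three rate checks (8192 + 8192 + 3616) 
instead of one, at the cost of a few extra method calls per large write.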


was (Author: vigyas):
The only API which can lead to unexpected big write bursts seems to be the 
{{writeBytes(byte[] b, int offset, int length)}} API in  
RateLimitedIndexOutput. We could potentially add an upper bound on the bytes 
that writeBytes attempts to write in one shot, in RateLimitedIndexOutput - 
break the byte array in chunks and check for rate limiting between each chunk. 
Would that be desirable in the wider Lucene context?

All other APIs check for rate before every write, so the instant burst rate is 
really determined by the configured {{mbPerSec}} and {{MIN_PAUSE_CHECK_MSEC}} 
values. I think this is what makes all the burst writes in this JIRA log ~0.28 
MB.

> According to my statistics, the frequency of no-pause bytes is [2%-20%],
What is the high instant burst rate you see during these no-pause writes? From 
the logs above, it should still be less than 11.2 MB/s. Maybe we should look at 
the burst write rate (in addition to/ instead of) the no-pause-write frequency?

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (1000000000 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a segment is being merged and *maybePause* is called at 7:00 (lastNS=7:00), 
> and *maybePause* is then called again at 7:05, the value of 
> *targetNS = lastNS + (long) (1000000000 * secondsToPause)* must be smaller 
> than *curNS* no matter how large the byte count is, so we return -1 and skip 
> the pause. 
> I counted the total number of calls to *maybePause* (callTimes), the number 
> of skipped pauses (ignorePauseTimes), and the byte counts of the skipped 
> pauses (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, of which 25 calls skipped the pause; the 
> byte counts of the skipped pauses (such as 0.28125 MB) are not small.
> Whenever the interval between two *maybePause* calls is long enough, a pause 
> that should have been executed is silently skipped.
>  




[jira] [Commented] (LUCENE-10162) Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values

2022-03-03 Thread Lu Xugang (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501122#comment-17501122
 ] 

Lu Xugang commented on LUCENE-10162:


Move the conversation from LUCENE-10446 about current issue to one place:

{quote}I think that one way we could make the situation better would be by 
implementing LUCENE-10162 to create fields that index both points and doc 
values. Then factory methods on these fields would know exactly how the field 
is indexed and they could make the best decision without having to hurt the API 
by merging what PointRangeQuery, IndexOrDocValuesQuery and 
IndexSortSortedNumericDocValuesRangeQuery do:
 - If the points index tells us that all docs match, then return 
DocIdSetIterator#range(0,maxDoc).
 - If the field is the primary index sort, then use the index to figure out the 
min and max values and return the appropriate range.
 - Otherwise do what IndexOrDocValuesQuery is doing today.

One thought I had in mind when opening LUCENE-10162 was that we could return 
queries that can more easily do the right thing because they know both{quote}
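
The three-way decision described in that quote can be sketched as a standalone 
decision function (hypothetical names; a sketch of the idea, not a proposed 
Lucene API):

```java
// Sketch of how a factory method on a combined points + doc-values field
// could pick a range-query strategy, per the quoted comment.
enum RangeStrategy { MATCH_ALL_RANGE, INDEX_SORT_BOUNDS, INDEX_OR_DOC_VALUES }

final class RangePlanner {
  static RangeStrategy plan(boolean allDocsMatch, boolean primaryIndexSort) {
    if (allDocsMatch) {
      // Points index proves every doc matches: DocIdSetIterator#range(0, maxDoc).
      return RangeStrategy.MATCH_ALL_RANGE;
    }
    if (primaryIndexSort) {
      // Field is the primary index sort: derive the matching doc-id range.
      return RangeStrategy.INDEX_SORT_BOUNDS;
    }
    // Otherwise fall back to what IndexOrDocValuesQuery does today.
    return RangeStrategy.INDEX_OR_DOC_VALUES;
  }
}
```

The point of the field classes is that only the field knows how it was indexed, 
so only it can make this choice reliably.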

> Add IntField, LongField, FloatField and DoubleField classes to index both 
> points and doc values
> ---
>
> Key: LUCENE-10162
> URL: https://issues.apache.org/jira/browse/LUCENE-10162
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> Currently we have IntPoint, LongPoint, FloatPoint and DoublePoint on the one 
> hand, and NumericDocValuesField and SortedNumericDocValuesField on the other 
> hand.
> When we introduced these classes, this distinction made sense: use the 
> XXXPoint classes if you want your numeric fields to be searchable and the 
> XXXDocValuesField classes if you want your numeric fields to be 
> sortable/aggregatable.
> However since then, we introduced logic to take advantage of doc values for 
> filtering (IndexOrDocValuesQuery) and enhanced sorting to take advantage of 
> the Points index to skip non-competitive documents. So even if you only need 
> searching, or if you only need sorting, it's likely a good idea to index both 
> with points *and* doc values.
> Could we make this easier on users by having XXXField classes that 
> automatically do it as opposed to requiring users to add both an XXXPoint and 
> an XXXDocValuesField for every numeric field to their index? This could also 
> make consuming these fields easier, e.g. factory methods for range queries 
> could automatically use IndexOrDocValuesQuery.






[jira] [Updated] (LUCENE-10162) Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values

2022-03-03 Thread Lu Xugang (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lu Xugang updated LUCENE-10162:
---
Attachment: LUCENE-10162-1.patch

> Add IntField, LongField, FloatField and DoubleField classes to index both 
> points and doc values
> ---
>
> Key: LUCENE-10162
> URL: https://issues.apache.org/jira/browse/LUCENE-10162
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-10162-1.patch
>
>
> Currently we have IntPoint, LongPoint, FloatPoint and DoublePoint on the one 
> hand, and NumericDocValuesField and SortedNumericDocValuesField on the other 
> hand.
> When we introduced these classes, this distinction made sense: use the 
> XXXPoint classes if you want your numeric fields to be searchable and the 
> XXXDocValuesField classes if you want your numeric fields to be 
> sortable/aggregatable.
> However since then, we introduced logic to take advantage of doc values for 
> filtering (IndexOrDocValuesQuery) and enhanced sorting to take advantage of 
> the Points index to skip non-competitive documents. So even if you only need 
> searching, or if you only need sorting, it's likely a good idea to index both 
> with points *and* doc values.
> Could we make this easier on users by having XXXField classes that 
> automatically do it as opposed to requiring users to add both an XXXPoint and 
> an XXXDocValuesField for every numeric field to their index? This could also 
> make consuming these fields easier, e.g. factory methods for range queries 
> could automatically use IndexOrDocValuesQuery.






[jira] [Commented] (LUCENE-10162) Add IntField, LongField, FloatField and DoubleField classes to index both points and doc values

2022-03-03 Thread Lu Xugang (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501124#comment-17501124
 ] 

Lu Xugang commented on LUCENE-10162:


The quick patch adds a new query named NumericRangeQuery. Its implementation is simply a merge of how PointRangeQuery, IndexOrDocValuesQuery and IndexSortSortedNumericDocValuesRangeQuery each supply a Scorer.

Just want to confirm whether this is close to what LUCENE-10162 has in mind.

[^LUCENE-10162-1.patch]

> Add IntField, LongField, FloatField and DoubleField classes to index both 
> points and doc values
> ---
>
> Key: LUCENE-10162
> URL: https://issues.apache.org/jira/browse/LUCENE-10162
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: LUCENE-10162-1.patch
>
>
> Currently we have IntPoint, LongPoint, FloatPoint and DoublePoint on the one 
> hand, and NumericDocValuesField and SortedNumericDocValuesField on the other 
> hand.
> When we introduced these classes, this distinction made sense: use the 
> XXXPoint classes if you want your numeric fields to be searchable and the 
> XXXDocValuesField classes if you want your numeric fields to be 
> sortable/aggregatable.
> However since then, we introduced logic to take advantage of doc values for 
> filtering (IndexOrDocValuesQuery) and enhanced sorting to take advantage of 
> the Points index to skip non-competitive documents. So even if you only need 
> searching, or if you only need sorting, it's likely a good idea to index both 
> with points *and* doc values.
> Could we make this easier on users by having XXXField classes that 
> automatically do it as opposed to requiring users to add both an XXXPoint and 
> an XXXDocValuesField for every numeric field to their index? This could also 
> make consuming these fields easier, e.g. factory methods for range queries 
> could automatically use IndexOrDocValuesQuery.






[GitHub] [lucene] rmuir commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


rmuir commented on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058815901


   Sorry I see it differently.
   
   I'm not a fan of IndexWriter handling the temporary files/encoding/decoding 
data, this seems to be in the wrong place.
   
   If IndexWriter shouldn't buffer vectors, then can it simply stream vectors 
to the codec api? This would be similar to how StoredFields and TermVectors 
work today (see e.g. StoredFieldsConsumer). 
   
   The problem is, today we have two cases of IndexWriter behavior
   1. Stuff that indexwriter buffers in memory and flushes in batches to the 
codec (terms, postings, docvalues, etc)
   2. Stuff that indexwriter streams directly to the codec (stored fields, term 
vectors)
   
   For our own sanity, let's avoid adding a third case :)





[GitHub] [lucene] rmuir commented on pull request #728: LUCENE-10194 Buffer KNN vectors on disk

2022-03-03 Thread GitBox


rmuir commented on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1058821263


   I'm suspicious of the reported performance improvement based on looking at 
your benchmark output; I don't think it's realistic. It looks like nothing else 
was indexed in any other way (doc values/postings/etc.) and nobody ever called 
reopen() to force any flushes, so with the benchmark you ran, IW just wrote one 
big segment, avoiding all merging. So everything looks fantastic on paper, but 
it isn't realistic.
   
   It is easy to run into the same trap when benchmarking e.g. stored fields 
and other things. But it isn't really a performance improvement.
   
   It is easy to run into the same trap when benchmarking e.g. stored fields 
and other things. But it isn't really a performance improvement.





[jira] [Created] (LUCENE-10455) IndexSortSortedNumericDocValuesRangeQuery should implement Weight#scorerSupplier(LeafReaderContext)

2022-03-03 Thread Lu Xugang (Jira)
Lu Xugang created LUCENE-10455:
--

 Summary: IndexSortSortedNumericDocValuesRangeQuery should 
implement Weight#scorerSupplier(LeafReaderContext)
 Key: LUCENE-10455
 URL: https://issues.apache.org/jira/browse/LUCENE-10455
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Lu Xugang


IndexOrDocValuesQuery is used as the fallback query of 
IndexSortSortedNumericDocValuesRangeQuery in Elasticsearch, but when 
IndexSortSortedNumericDocValuesRangeQuery cannot take advantage of the index 
sort, the fallback IndexOrDocValuesQuery always supplies a Scorer from its 
index query only, because IndexSortSortedNumericDocValuesRangeQuery does not 
implement Weight#scorerSupplier(LeafReaderContext).
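
A minimal sketch of what such an implementation could look like inside the query's Weight (a fragment only; `fallbackWeight` and `indexSortApplies` are hypothetical names introduced here for illustration, not existing Lucene members):

```java
@Override
public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException {
  if (indexSortApplies(context)) { // hypothetical: can the index sort be used on this segment?
    final Scorer scorer = scorer(context); // cheap: binary search over the sorted values
    if (scorer == null) {
      return null;
    }
    return new ScorerSupplier() {
      @Override
      public Scorer get(long leadCost) {
        return scorer;
      }

      @Override
      public long cost() {
        return scorer.iterator().cost();
      }
    };
  }
  // Delegate to the fallback query's Weight, so that a wrapping
  // IndexOrDocValuesQuery still gets a real ScorerSupplier to base
  // its per-segment cost decision on.
  return fallbackWeight.scorerSupplier(context); // hypothetical field
}
```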






[jira] [Commented] (LUCENE-10078) Enable merge-on-refresh by default?

2022-03-03 Thread Anand Kotriwal (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501193#comment-17501193
 ] 

Anand Kotriwal commented on LUCENE-10078:
-

I like the idea of adding this feature to LogMergePolicy and TieredMergePolicy. 
If we agree on making it the default, I can work on a PR.  

> Enable merge-on-refresh by default?
> ---
>
> Key: LUCENE-10078
> URL: https://issues.apache.org/jira/browse/LUCENE-10078
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> This is a spinoff from the discussion in LUCENE-10073.
> The newish merge-on-refresh ([crazy origin 
> story|https://blog.mikemccandless.com/2021/03/open-source-collaboration-or-how-we.html])
>  feature is a powerful way to reduce searched segment counts, especially 
> helpful for applications using many indexing threads.  Such usage will write 
> many tiny segments on each refresh, which could quickly be merged up during 
> the {{refresh}} operation.
> We would have to implement a default for {{findFullFlushMerges}} 
> (LUCENE-10064 is open for this), and then we would need 
> {{IndexWriterConfig.getMaxFullFlushMergeWaitMillis}} a non-zero value (this 
> issue).
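
For context, applications can already opt in today via configuration (a hedged sketch, not the proposed default; the 500 ms value is arbitrary, `analyzer` is assumed to exist, and the setting only has an effect if the merge policy actually returns merges from `findFullFlushMerges`):

```java
IndexWriterConfig iwc = new IndexWriterConfig(analyzer); // fragment; analyzer assumed
// Allow merges triggered by a full flush (refresh/commit) up to 500 ms to finish
// before the flush returns.
iwc.setMaxFullFlushMergeWaitMillis(500);
// This is a no-op unless the merge policy overrides
// findFullFlushMerges(MergeTrigger, SegmentInfos, MergeContext); the base
// MergePolicy implementation returns no merges, which is what LUCENE-10064
// proposes to change.
```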






[jira] [Comment Edited] (LUCENE-10078) Enable merge-on-refresh by default?

2022-03-03 Thread Anand Kotriwal (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501193#comment-17501193
 ] 

Anand Kotriwal edited comment on LUCENE-10078 at 3/4/22, 7:19 AM:
--

I like the idea of adding this feature to LogMergePolicy and TieredMergePolicy. 
If we agree on making it the default, I can work on a PR that does this.  


was (Author: anakot):
I like the idea of adding this feature to LogMergePolicy and TieredMergePolicy. 
If we agree on making it default I can work on a PR.  

> Enable merge-on-refresh by default?
> ---
>
> Key: LUCENE-10078
> URL: https://issues.apache.org/jira/browse/LUCENE-10078
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Priority: Major
>
> This is a spinoff from the discussion in LUCENE-10073.
> The newish merge-on-refresh ([crazy origin 
> story|https://blog.mikemccandless.com/2021/03/open-source-collaboration-or-how-we.html])
>  feature is a powerful way to reduce searched segment counts, especially 
> helpful for applications using many indexing threads.  Such usage will write 
> many tiny segments on each refresh, which could quickly be merged up during 
> the {{refresh}} operation.
> We would have to implement a default for {{findFullFlushMerges}} 
> (LUCENE-10064 is open for this), and then we would need 
> {{IndexWriterConfig.getMaxFullFlushMergeWaitMillis}} a non-zero value (this 
> issue).






[GitHub] [lucene] LuXugang opened a new pull request #729: LUCENE-10455: IndexSortSortedNumericDocValuesRangeQuery should implement Weight#scorerSupplier(LeafReaderContext)

2022-03-03 Thread GitBox


LuXugang opened a new pull request #729:
URL: https://github.com/apache/lucene/pull/729


   IndexOrDocValuesQuery is used as the fallback query of 
IndexSortSortedNumericDocValuesRangeQuery in Elasticsearch, but when 
IndexSortSortedNumericDocValuesRangeQuery cannot take advantage of the index 
sort, the fallback IndexOrDocValuesQuery always supplies a Scorer from its 
index query only, because IndexSortSortedNumericDocValuesRangeQuery does not 
implement Weight#scorerSupplier(LeafReaderContext).

