[jira] [Commented] (LUCENE-10425) count aggregation optimization inside one segment in log scenario
[ https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501913#comment-17501913 ]

jianping weng commented on LUCENE-10425:
----------------------------------------

> I'm not sure #687 actually helps compared to what we are already doing.

[~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] OK now? When ascending index sort is enabled, it speeds up finding the min/max doc ID compared with a binary search over doc values.

> count aggregation optimization inside one segment in log scenario
> -----------------------------------------------------------------
>
>                 Key: LUCENE-10425
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10425
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>            Reporter: jianping weng
>            Priority: Major
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In log scenarios, we usually want to know the number of documents in each
> time interval. One possible optimization is to sort the documents within a
> segment in ascending order of the @timestamp field. We can then use this PR
> [https://github.com/apache/lucene/pull/687] to find the min/max docId in one
> time interval.
> If there is no other filter query, the doc count of one time interval is
> (max docId - min docId + 1).
> If there is exactly one additional term filter query, we can use this PR
> [https://github.com/apache/lucene/pull/688] to get the difference between the
> index positions reached when we call advance(minId) and advance(maxId); that
> difference is also the doc count of the time interval.

--
This message was sent by Atlassian Jira (v8.20.1#820001)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
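The counting idea in the issue description can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from PR #687 or #688: the class `SortedSegmentCount`, its methods, and the array-based model (a `long[]` of timestamps indexed by docId, and an `int[]` of sorted docIds matching a term) are hypothetical stand-ins for a segment sorted by @timestamp and for a postings list.

```java
/** Hypothetical helper, not a Lucene API: models one segment sorted by timestamp. */
final class SortedSegmentCount {

  /** Returns the first index i with values[i] >= target, or values.length if none. */
  static int lowerBound(long[] values, long target) {
    int lo = 0, hi = values.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (values[mid] < target) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  /** Same as above for a sorted array of doc IDs. */
  static int lowerBound(int[] values, int target) {
    int lo = 0, hi = values.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (values[mid] < target) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  /**
   * No other filter: timestamps[docId] is ascending, so the docs with
   * from <= timestamp <= to form one contiguous docId range and the count
   * is maxDoc - minDoc + 1.
   */
  static int countInRange(long[] timestamps, long from, long to) {
    if (from > to) {
      return 0;
    }
    int minDoc = lowerBound(timestamps, from);
    // First doc whose timestamp is strictly greater than 'to' (guarding overflow).
    int endDoc = (to == Long.MAX_VALUE) ? timestamps.length : lowerBound(timestamps, to + 1);
    int maxDoc = endDoc - 1;
    return maxDoc < minDoc ? 0 : maxDoc - minDoc + 1;
  }

  /**
   * One term filter: 'postings' holds the sorted doc IDs matching the term.
   * The count is the difference between the positions reached by advancing
   * to minDoc and to just past maxDoc.
   */
  static int countWithFilter(int[] postings, int minDoc, int maxDoc) {
    if (maxDoc < minDoc) {
      return 0;
    }
    int start = lowerBound(postings, minDoc);
    int end = (maxDoc == Integer.MAX_VALUE) ? postings.length : lowerBound(postings, maxDoc + 1);
    return end - start;
  }
}
```

In this model both counts cost two binary searches per interval instead of a visit to every matching document, which is the saving the issue is after.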
[jira] [Comment Edited] (LUCENE-10425) count aggregation optimization inside one segment in log scenario
[ https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501913#comment-17501913 ] jianping weng edited comment on LUCENE-10425 at 3/6/22, 9:35 AM: - > I'm not sure #687 actually helps compared to what we are already doing. [~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] ok now? it can speed up getting min/max doc Id when index sort ascend enabled instead of use docValue binary search was (Author: JIRAUSER285389): > I'm not sure #687 actually helps compared to what we are already doing. [~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] ok now, it can speed up getting min/max doc Id when index sort ascend enabled instead of use docValue binary search
[jira] [Comment Edited] (LUCENE-10425) count aggregation optimization inside one segment in log scenario
[ https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501913#comment-17501913 ] jianping weng edited comment on LUCENE-10425 at 3/6/22, 9:38 AM: - > I'm not sure #687 actually helps compared to what we are already doing. [~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] ok now? it can speed up getting min/max doc Id when index sort ascend enabled instead of using docValue binary search was (Author: JIRAUSER285389): > I'm not sure #687 actually helps compared to what we are already doing. [~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] ok now? it can speed up getting min/max doc Id when index sort ascend enabled instead of use docValue binary search
[GitHub] [lucene] Yuti-G closed pull request #732: Fix typo in the documentation of TaxonomyReader
Yuti-G closed pull request #732: URL: https://github.com/apache/lucene/pull/732 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] msokolov commented on pull request #732: Fix typo in the documentation of TaxonomyReader
msokolov commented on pull request #732: URL: https://github.com/apache/lucene/pull/732#issuecomment-1060071053 not sure why you closed, looks good to me, I'll commit
[GitHub] [lucene] msokolov merged pull request #732: Fix typo in the documentation of TaxonomyReader
msokolov merged pull request #732: URL: https://github.com/apache/lucene/pull/732
[GitHub] [lucene] Yuti-G closed pull request #733: fix a typo in the Javadoc of TaxonomyReader
Yuti-G closed pull request #733: URL: https://github.com/apache/lucene/pull/733
[GitHub] [lucene] Yuti-G commented on pull request #732: Fix typo in the documentation of TaxonomyReader
Yuti-G commented on pull request #732: URL: https://github.com/apache/lucene/pull/732#issuecomment-1060104593 Thanks @msokolov! I forgot to change account in GitHub Desktop and accidentally pushed this pr using my two GitHub accounts. I thought it would be better to use one account, and pushed #733 (closed now).
[GitHub] [lucene] Yuti-G removed a comment on pull request #732: Fix typo in the documentation of TaxonomyReader
Yuti-G removed a comment on pull request #732: URL: https://github.com/apache/lucene/pull/732#issuecomment-1060104593 Thanks @msokolov! I forgot to change account in GitHub Desktop and accidentally pushed this pr using my two GitHub accounts. I thought it would be better to use one account, and pushed #733 (closed now).
[GitHub] [lucene] Yuti-G commented on pull request #732: Fix typo in the documentation of TaxonomyReader
Yuti-G commented on pull request #732: URL: https://github.com/apache/lucene/pull/732#issuecomment-1060105020 > not sure why you closed, looks good to me, I'll commit Thanks @msokolov! I forgot to change account in GitHub Desktop and accidentally pushed this pr using my two GitHub accounts. I thought it would be better to use one account, and pushed https://github.com/apache/lucene/pull/733 (closed now).
[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei commented on LUCENE-10448: -- [~vigyas] I count the burst write rate of no-pause bytes with high pressure of writing: {code:java} [2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49], [detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]] {code} *callTimes=852* means that MergeRateLimiter.pause is called 852 times. *ignorePauseTimes=49* means that there are 49 no-pause times in 852 times. 
*detailBytes(mb)* lists the bytes written in each of the 49 no-pause calls.
*detailRate(mb/s)* lists the instant rate of each of those 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit (29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, almost 16 times the limit.

Regarding the burst write rate (in addition to/instead of the no-pause-write frequency): the no-pause-write frequency is about 0-10%, depending on the write pressure. In my test the write thread was relatively busy: 49 of 852 calls (about 5.8%) skipped the pause, and 4 of those 49 exceeded the limit.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // The timing window starts at the first write after a pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}

> MergeRateLimiter doesn't always limit instant rate.
> ---------------------------------------------------
>
>                 Key: LUCENE-10448
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10448
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 8.11.1
>            Reporter: kkewwei
>            Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is next called at 7:05, then
> *targetNS = lastNS + (long) (1000000000 * secondsToPause)* is smaller than
> *curNS* for any realistic number of bytes, so we return -1 and skip the pause.
> I counted the total number of calls to *maybePause* (callTimes), the number of
> skipped pauses (ignorePauseTimes), and the bytes written in each skipped pause
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs],
[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei edited comment on LUCENE-10448 at 3/7/22, 4:59 AM: --- [~vigyas] I count the burst write rate of no-pause bytes with high pressure of writing: {code:java} [2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49], [detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]] {code} *callTimes=852* means that MergeRateLimiter.pause is called 852 times. 
*ignorePauseTimes=49* means that 49 of those 852 calls skipped the pause.
*detailBytes(mb)* lists the bytes written in each of the 49 no-pause calls.
*detailRate(mb/s)* lists the instant rate of each of those 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit (29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, almost 16 times the limit.

Regarding the burst write rate (in addition to/instead of the no-pause-write frequency): the no-pause-write frequency is about 0-10%, depending on the write pressure. In my test the write thread was relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // The timing window starts at the first write after a pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.
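As a minimal sketch of that last step, the instant rate can be derived from a byte count and an elapsed time in nanoseconds as below. The class `InstantRate` and its method names are hypothetical and not part of Lucene; the 1024 * 1024 divisor is an assumption chosen to match the MB convention used in the quoted *maybePause* code.

```java
/** Hypothetical helper for illustration only; not part of Lucene. */
final class InstantRate {

  /** Converts a byte count and an elapsed time in nanoseconds to MB/s. */
  static double mbPerSec(long bytes, long elapsedNanos) {
    if (elapsedNanos <= 0) {
      throw new IllegalArgumentException("elapsedNanos must be positive");
    }
    double megabytes = bytes / 1024.0 / 1024.0;
    double seconds = elapsedNanos / 1_000_000_000.0;
    return megabytes / seconds;
  }

  /** True when the instant rate of this write window exceeds the configured limit. */
  static boolean exceedsLimit(long bytes, long elapsedNanos, double limitMbPerSec) {
    return mbPerSec(bytes, elapsedNanos) > limitMbPerSec;
  }
}
```

For example, a window of 770048 bytes (0.734375 MB) written in 2 ms works out to roughly 367 MB/s, well above a 29.2 MB/s limit, while the same bytes spread over a full second stay far below it.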
[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei edited comment on LUCENE-10448 at 3/7/22, 4:59 AM: --- [~vigyas] I count the burst write rate of no-pause bytes with high pressure of writing: {code:java} [2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49], [detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]] {code} *callTimes=852* means that MergeRateLimiter.pause is called 852 times. 
*ignorePauseTimes=49* means that 49 of those 852 calls skipped the pause.
*detailBytes(mb)* lists the bytes written in each of the 49 no-pause calls.
*detailRate(mb/s)* lists the instant rate of each of those 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit (29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, almost 16 times the limit.

Regarding the burst write rate (in addition to/instead of the no-pause-write frequency): the no-pause-write frequency is about 0-10%, depending on the write pressure. In my test the write thread was relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // The timing window starts at the first write after a pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.
[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076 ] kkewwei edited comment on LUCENE-10448 at 3/7/22, 6:26 AM: --- [~vigyas] I count the burst write rate of no-pause bytes with high pressure of writing: {code:java} [2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] [index1][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49], [detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]] {code} *callTimes=852* means that MergeRateLimiter.pause is called 852 times. *ignorePauseTimes=49* means that there are 49 no-pause times in 852 times. 
*detailBytes(mb)* lists the bytes written in each of the 49 no-pause calls.
*detailRate(mb/s)* lists the instant rate of each of those 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit (29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, almost 16 times the limit.

Regarding the burst write rate (in addition to/instead of the no-pause-write frequency): the no-pause-write frequency is about 0-10%, depending on the write pressure. In my test the write thread was relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // The timing window starts at the first write after a pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.
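The skipped pauses reported in this comment follow from the pause decision described in the issue. The sketch below isolates that decision so it can be tried with concrete numbers. It is a simplified, hypothetical model rather than the Lucene class: `PauseDecision` and `pauseNanos` are invented names, the 2 ms threshold is taken from the comment in the quoted code, and the sketch assumes timestamps are in nanoseconds (hence the 1,000,000,000 conversion factor).

```java
/** Hypothetical, simplified model of the pause decision; not the Lucene class. */
final class PauseDecision {

  // Assumed threshold: pauses of 2 ms or less are skipped.
  static final long MIN_PAUSE_NS = 2_000_000L;

  /**
   * Returns the number of nanoseconds to pause, or -1 when the pause is skipped.
   * lastNS is the time of the previous check and curNS is the current time.
   */
  static long pauseNanos(long bytes, double mbPerSec, long lastNS, long curNS) {
    // Time the written bytes are allowed to take at the limited rate.
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long targetNS = lastNS + (long) (1_000_000_000 * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }
}
```

With 1 MB written at a 1 MB/s limit, a check 0.1 s after the previous one asks for a 0.9 s pause, but a check 5 s after the previous one returns -1. The decision only compares against the time of the previous check, so a burst written just before a late check is never throttled.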