date:20220306

[jira] [Commented] (LUCENE-10425) count aggregation optimization inside one segment in log scenario

2022-03-06 Thread jianping weng (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501913#comment-17501913
 ] 

jianping weng commented on LUCENE-10425:


> I'm not sure #687 actually helps compared to what we are already doing.

[~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] OK now? It 
can speed up finding the min/max doc ID when an ascending index sort is enabled, 
instead of using a binary search over doc values.

> count aggregation optimization inside one segment in log scenario
> -
>
> Key: LUCENE-10425
> URL: https://issues.apache.org/jira/browse/LUCENE-10425
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/search
>Reporter: jianping weng
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In log scenarios, we usually want to know the number of documents in each 
> time interval. One possible optimization is to sort the documents within a 
> segment in ascending order of the @timestamp field. Then we can use 
> this PR [https://github.com/apache/lucene/pull/687] to find the min/max 
> docId in one time interval.
> If there is no other filter query, the doc count of one time interval is (max 
> docId - min docId + 1).
> If there is exactly one additional term filter query, we can use this PR 
> [https://github.com/apache/lucene/pull/688] to get the index value at each 
> position: when we call advance(minId) and advance(maxId), the difference 
> between the two index values is also the doc count of that time interval.
>  
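The counting idea in the issue description can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code from PR #687 or #688: a long[] stands in for the per-document @timestamp values of one segment sorted in ascending order, an int[] stands in for the sorted doc IDs matching a term filter, and the names SortedSegmentCount, countInRange and countWithFilter are hypothetical.

```java
/** Hypothetical sketch; none of these names are Lucene APIs. */
final class SortedSegmentCount {

  /** Returns the first index i with values[i] >= target, or values.length if there is none. */
  static int lowerBound(long[] values, long target) {
    int lo = 0, hi = values.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (values[mid] < target) lo = mid + 1; else hi = mid;
    }
    return lo;
  }

  /** Same as above for sorted doc IDs; the target is a long so maxDoc + 1 cannot overflow. */
  static int lowerBound(int[] docIds, long target) {
    int lo = 0, hi = docIds.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (docIds[mid] < target) lo = mid + 1; else hi = mid;
    }
    return lo;
  }

  /** Counts docs with from <= timestamp <= to, assuming timestamps[docId] is ascending. */
  static int countInRange(long[] timestamps, long from, long to) {
    if (from > to) return 0;
    int minDoc = lowerBound(timestamps, from);
    // Index of the first doc past the interval; guard against to + 1 overflowing.
    int end = (to == Long.MAX_VALUE) ? timestamps.length : lowerBound(timestamps, to + 1);
    int maxDoc = end - 1;
    return Math.max(0, maxDoc - minDoc + 1);
  }

  /** Counts filter matches whose doc ID lies in [minDoc, maxDoc]. */
  static int countWithFilter(int[] matchingDocIds, int minDoc, int maxDoc) {
    if (minDoc > maxDoc) return 0;
    int start = lowerBound(matchingDocIds, minDoc);
    int end = lowerBound(matchingDocIds, (long) maxDoc + 1);
    return end - start;
  }

  public static void main(String[] args) {
    long[] timestamps = {100, 100, 105, 110, 120, 120, 130}; // index == docId
    int[] termMatches = {0, 2, 3, 6};                        // docs matching the term filter
    System.out.println(countInRange(timestamps, 105, 120));  // docs 2..5
    System.out.println(countWithFilter(termMatches, 2, 5));  // docs 2 and 3
  }
}
```

In this sketch the filtered count uses the first matching doc past maxDoc as its upper bound, so a match on maxDoc itself is included in the result.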



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Comment Edited] (LUCENE-10425) count aggregation optimization inside one segment in log scenario

2022-03-06 Thread jianping weng (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501913#comment-17501913
 ] 

jianping weng edited comment on LUCENE-10425 at 3/6/22, 9:35 AM:
-

> I'm not sure #687 actually helps compared to what we are already doing.

[~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] OK now? It 
can speed up finding the min/max doc ID when an ascending index sort is enabled, 
instead of using a binary search over doc values.



[jira] [Comment Edited] (LUCENE-10425) count aggregation optimization inside one segment in log scenario

2022-03-06 Thread jianping weng (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501913#comment-17501913
 ] 

jianping weng edited comment on LUCENE-10425 at 3/6/22, 9:38 AM:
-

> I'm not sure #687 actually helps compared to what we are already doing.

[~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] OK now? It 
can speed up finding the min/max doc ID when an ascending index sort is enabled, 
instead of using a binary search over doc values.



[GitHub] [lucene] Yuti-G closed pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-06 Thread GitBox



Yuti-G closed pull request #732:
URL: https://github.com/apache/lucene/pull/732


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [lucene] msokolov commented on pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-06 Thread GitBox



msokolov commented on pull request #732:
URL: https://github.com/apache/lucene/pull/732#issuecomment-1060071053


   not sure why you closed, looks good to me, I'll commit



[GitHub] [lucene] msokolov merged pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-06 Thread GitBox



msokolov merged pull request #732:
URL: https://github.com/apache/lucene/pull/732


   



[GitHub] [lucene] Yuti-G closed pull request #733: fix a typo in the Javadoc of TaxonomyReader

2022-03-06 Thread GitBox



Yuti-G closed pull request #733:
URL: https://github.com/apache/lucene/pull/733


   



[GitHub] [lucene] Yuti-G commented on pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-06 Thread GitBox



Yuti-G commented on pull request #732:
URL: https://github.com/apache/lucene/pull/732#issuecomment-1060104593


   Thanks @msokolov! I forgot to switch accounts in GitHub Desktop and 
accidentally pushed this PR from both of my GitHub accounts. I thought it would be 
better to use one account, so I pushed #733 (closed now).



[GitHub] [lucene] Yuti-G removed a comment on pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-06 Thread GitBox



Yuti-G removed a comment on pull request #732:
URL: https://github.com/apache/lucene/pull/732#issuecomment-1060104593


   Thanks @msokolov! I forgot to switch accounts in GitHub Desktop and 
accidentally pushed this PR from both of my GitHub accounts. I thought it would be 
better to use one account, so I pushed #733 (closed now).



[GitHub] [lucene] Yuti-G commented on pull request #732: Fix typo in the documentation of TaxonomyReader

2022-03-06 Thread GitBox



Yuti-G commented on pull request #732:
URL: https://github.com/apache/lucene/pull/732#issuecomment-1060105020


   > not sure why you closed, looks good to me, I'll commit
   
   Thanks @msokolov! I forgot to switch accounts in GitHub Desktop and 
accidentally pushed this PR from both of my GitHub accounts. I thought it would be 
better to use one account, so I pushed https://github.com/apache/lucene/pull/733 
(closed now).



[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-06 Thread kkewwei (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076
 ] 

kkewwei commented on LUCENE-10448:
--

[~vigyas] I measured the burst write rate of the bytes written without a pause, 
under heavy write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 
290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes written before each no-pause call (49 entries).
*detailRate(mb/s)* lists the instant rate for each of the 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit 
(29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, which is 
nearly 16 times the limit.

The frequency of no-pause writes (reported here in addition to the burst write 
rate) is about 0-10%. It depends on the write pressure; in my test, the write 
thread was relatively busy.
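
To see how a call that skips the pause can still carry a large burst, the check quoted in this issue's description can be modelled in isolation. This is a simplified sketch, not the Lucene class: PauseModel is a hypothetical name, times are assumed to be in nanoseconds, the 2 ms threshold is taken from the comment in the quoted code, and the model only reports the pause instead of sleeping.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified model of the pause decision; it never sleeps. */
final class PauseModel {
  // Pauses shorter than this are skipped (2 msec, per the quoted comment).
  static final long MIN_PAUSE_NS = TimeUnit.MILLISECONDS.toNanos(2);

  private final double mbPerSec;
  private long lastNS;

  PauseModel(double mbPerSec, long startNS) {
    this.mbPerSec = mbPerSec;
    this.lastNS = startNS;
  }

  /** Returns the pause in nanoseconds, or -1 when the pause is skipped. */
  long maybePause(long bytes, long curNS) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = targetNS - curNS;
    if (curPauseNS <= MIN_PAUSE_NS) {
      // The reference point jumps to "now", so the elapsed idle time is forgotten.
      lastNS = curNS;
      return -1;
    }
    return curPauseNS;
  }

  public static void main(String[] args) {
    PauseModel limiter = new PauseModel(29.2, 0L);
    long fiveMinutes = TimeUnit.MINUTES.toNanos(5);
    // 100 MB reported five minutes after the last reference point: no pause at all.
    System.out.println(limiter.maybePause(100L * 1024 * 1024, fiveMinutes));
    // About 0.73 MB reported one millisecond later: now a pause is requested.
    System.out.println(limiter.maybePause(765_460L, fiveMinutes + 1_000_000L));
  }
}
```

In this model the first call returns -1 however quickly the 100 MB were actually written, because only the time since lastNS enters the comparison.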

This is how I count the statistics.
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the last pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
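
Given a byte count and an elapsed time such as the ones passed to pause above, the instant rate could be derived roughly as follows. This is a sketch under assumptions: InstantRate and its methods are hypothetical helpers, not part of the instrumented code, and the sample numbers are illustrative rather than taken from the log.

```java
/** Hypothetical helper for turning (bytes, elapsed nanoseconds) into MB/s. */
final class InstantRate {

  /** Instant rate in MB/s, using 1 MB = 1024 * 1024 bytes as in the quoted limiter code. */
  static double mbPerSec(long bytes, long elapsedNS) {
    if (elapsedNS <= 0) {
      throw new IllegalArgumentException("elapsedNS must be positive");
    }
    double megabytes = bytes / 1024.0 / 1024.0;
    double seconds = elapsedNS / 1_000_000_000.0;
    return megabytes / seconds;
  }

  /** True when the instant rate is above the configured limit. */
  static boolean exceedsLimit(long bytes, long elapsedNS, double limitMbPerSec) {
    return mbPerSec(bytes, elapsedNS) > limitMbPerSec;
  }

  public static void main(String[] args) {
    long bytes = 770_048L;        // 0.734375 MB, an illustrative chunk size
    long elapsedNS = 1_600_000L;  // 1.6 ms, an illustrative write duration
    System.out.println(mbPerSec(bytes, elapsedNS));
    System.out.println(exceedsLimit(bytes, elapsedNS, 29.2));
  }
}
```

Rates computed this way can be compared directly with the throttle value reported in the merge log line.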






> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>..
> }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so 
> lastNS=7:00. If *maybePause* is called again at 7:05, the value of 
> *targetNS=lastNS + (long) (1000000000 * secondsToPause)* will be smaller than 
> *curNS* for any realistic byte count (one that takes less than five minutes 
> to write at the limited rate), so we return -1 and skip the pause. 
> I counted the total number of calls to *maybePause* (callTimes), the number of 
> skipped pauses (ignorePauseTimes), and the bytes written before each skipped 
> pause (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs],

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-06 Thread kkewwei (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/7/22, 4:59 AM:
---

[~vigyas] I measured the burst write rate of the bytes written without a pause, 
under heavy write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 
290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes written before each no-pause call (49 entries).
*detailRate(mb/s)* lists the instant rate for each of the 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit 
(29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, which is 
nearly 16 times the limit.

The frequency of no-pause writes (reported here in addition to the burst write 
rate) is about 0-10%. It depends on the write pressure; in my test, the write 
thread was relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the last pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-06 Thread kkewwei (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/7/22, 6:26 AM:
---

[~vigyas] I measured the burst write rate of the bytes written without a pause, 
under heavy write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 
docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec 
throttle], [callTimes=852],[ignorePauseTimes=49],  [detailBytes(mb) = 
[0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 
0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 
0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 
0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 
0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 
0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 
0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 
0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 
2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 
3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 
0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 
2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 
8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 
16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 
460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], 
[biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes written before each no-pause call (49 entries).
*detailRate(mb/s)* lists the instant rate for each of the 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit 
(29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, which is 
nearly 16 times the limit.

The frequency of no-pause writes (reported here in addition to the burst write 
rate) is about 0-10%. It depends on the write pressure; in my test, the write 
thread was relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the last pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.





