Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-20 Thread via GitHub
Jackie-Jiang merged PR #11943: URL: https://github.com/apache/pinot/pull/11943 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-19 Thread via GitHub
jadami10 commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1398503477 ## pinot-spi/src/main/java/org/apache/pinot/spi/stream/StreamMetadataProvider.java: ## @@ -43,6 +45,18 @@ public interface StreamMetadataProvider extends Closeable {

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-17 Thread via GitHub
Jackie-Jiang commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1397830314 ## pinot-spi/src/main/java/org/apache/pinot/spi/stream/StreamMetadataProvider.java: ## @@ -43,6 +45,18 @@ public interface StreamMetadataProvider extends Closeable

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-17 Thread via GitHub
jadami10 commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1397778891 ## pinot-spi/src/main/java/org/apache/pinot/spi/stream/StreamMetadataProvider.java: ## @@ -43,6 +45,18 @@ public interface StreamMetadataProvider extends Closeable {

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-17 Thread via GitHub
Jackie-Jiang commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1397652658 ## pinot-spi/src/main/java/org/apache/pinot/spi/stream/StreamMetadataProvider.java: ## @@ -43,6 +45,18 @@ public interface StreamMetadataProvider extends Closeable

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-17 Thread via GitHub
jadami10 commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1397594029 ## pinot-spi/src/main/java/org/apache/pinot/spi/stream/StreamMetadataProvider.java: ## @@ -43,6 +45,18 @@ public interface StreamMetadataProvider extends Closeable {

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-15 Thread via GitHub
Jackie-Jiang commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1394997095 ## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java: ## @@ -518,40 +523,56 @@ private void commit

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-09 Thread via GitHub
jadami10 commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1387995741 ## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java: ## @@ -518,40 +523,56 @@ private void commitSegm

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-08 Thread via GitHub
Jackie-Jiang commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1387514958 ## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java: ## @@ -518,40 +523,56 @@ private void commit

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-08 Thread via GitHub
jadami10 commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1387424953 ## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java: ## @@ -518,40 +523,56 @@ private void commitSegm

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-07 Thread via GitHub
Jackie-Jiang commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1386096273 ## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java: ## @@ -518,40 +523,56 @@ private void commit

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-07 Thread via GitHub
jadami10 commented on code in PR #11943: URL: https://github.com/apache/pinot/pull/11943#discussion_r1385821102 ## pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java: ## @@ -518,40 +523,56 @@ private void commitSegm

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-06 Thread via GitHub
Jackie-Jiang commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1795755694 @mcvsubbu Thanks for taking the time to write this program! > According to this, it takes slightly more iterations to stabilize to the right segment size if we apply the a

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-06 Thread via GitHub
mcvsubbu commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1795513639 I wrote this program to test a hypothesis. According to this, it takes slightly more iterations to stabilize to the right segment size if we apply the algorithm for all

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-03 Thread via GitHub
Jackie-Jiang commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1793362417 > No, you want to count only one segment in the group that finishes together. That is because of the formula we use where we pay most attention to the past segments and less attentio
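
The quoted remark above describes a formula that weights past segments more heavily than the newest one. A minimal sketch of that idea follows, assuming an exponentially weighted rows-per-byte ratio with weights of 0.9 (past) and 0.1 (current). The class name, method names, and weights are hypothetical illustrations and are not taken from the PR or from Pinot's code. The sketch only shows the arithmetic of applying the update once per group of segments versus once per partition; it makes no claim about which choice converges better in practice.

```java
// Hypothetical sketch: not Pinot's implementation.
final class RatioUpdateSketch {
    // Assumed weights: most weight on history, little on the newest segment.
    static final double PAST_WEIGHT = 0.9;
    static final double CURRENT_WEIGHT = 0.1;

    // One update of the learned rows-per-byte ratio from a single committed segment.
    static double update(double pastRatio, long rows, long sizeBytes) {
        double currentRatio = (double) rows / sizeBytes;
        return PAST_WEIGHT * pastRatio + CURRENT_WEIGHT * currentRatio;
    }

    // Applies the same update `times` times, as if every partition in a group
    // that finishes together reported an identical segment.
    static double applyTimes(double pastRatio, long rows, long sizeBytes, int times) {
        double ratio = pastRatio;
        for (int i = 0; i < times; i++) {
            ratio = update(ratio, rows, sizeBytes);
        }
        return ratio;
    }

    public static void main(String[] args) {
        double learned = 0.001;       // ratio learned from earlier segments
        long rows = 2000;             // newest segment: 2000 rows
        long sizeBytes = 1_000_000;   // newest segment: 1,000,000 bytes (ratio 0.002)

        // Counting one segment for the whole group.
        System.out.println("one update:    " + update(learned, rows, sizeBytes));
        // Counting all eight partitions of the group.
        System.out.println("eight updates: " + applyTimes(learned, rows, sizeBytes, 8));
    }
}
```

With these assumed numbers, a single update moves the ratio from 0.001 to about 0.0011, while eight identical updates move it to about 0.00157, so a group of partitions committing together shifts the estimate much further than one segment would.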

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-03 Thread via GitHub
mcvsubbu commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1793263916 > > [Without looking at the code changes] Using the smallest partitionID is because of the algorithm that optimizes the segment size. All partition IDs commit roughly at the same time, so

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-03 Thread via GitHub
Jackie-Jiang commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1793246547 > [Without looking at the code changes] Using the smallest partitionID is because of the algorithm that optimizes the segment size. All partition IDs commit roughly at the same time,

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-03 Thread via GitHub
mcvsubbu commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1793234655 [Without looking at the code changes] Using the smallest partitionID is because of the algorithm that optimizes the segment size. All partition IDs commit roughly at the same time, so w

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-02 Thread via GitHub
codecov-commenter commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1791771062 ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/11943?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) R

Re: [PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-02 Thread via GitHub
Jackie-Jiang commented on PR #11943: URL: https://github.com/apache/pinot/pull/11943#issuecomment-1791750445 cc @jadami10

[PR] Optimize segment commit to not read partition group metadata [pinot]

2023-11-02 Thread via GitHub
Jackie-Jiang opened a new pull request, #11943: URL: https://github.com/apache/pinot/pull/11943 Currently, when committing a real-time segment, the controller needs to read partition group metadata for all partitions from upstream, which can be very slow for streams with many partitions. Th