[GitHub] [pinot] walterddr commented on a diff in pull request #10154: [multistage] make mailboxID JSON serialized

2023-01-20 Thread GitBox
walterddr commented on code in PR #10154: URL: https://github.com/apache/pinot/pull/10154#discussion_r1082874473 ## pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/JsonMailboxIdentifier.java: ## @@ -0,0 +1,132 @@ +/** + * Licensed to the Apache Software Foundati

[GitHub] [pinot] walterddr merged pull request #10154: [multistage] make mailboxID JSON serialized

2023-01-20 Thread GitBox
walterddr merged PR #10154: URL: https://github.com/apache/pinot/pull/10154 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.ap

[GitHub] [pinot] agavra commented on a diff in pull request #10154: [multistage] make mailboxID JSON serialized

2023-01-20 Thread GitBox
agavra commented on code in PR #10154: URL: https://github.com/apache/pinot/pull/10154#discussion_r1082866431 ## pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/JsonMailboxIdentifier.java: ## @@ -0,0 +1,132 @@ +/** + * Licensed to the Apache Software Foundation

[GitHub] [pinot] agavra commented on a diff in pull request #10154: [multistage] make mailboxID JSON serialized

2023-01-20 Thread GitBox
agavra commented on code in PR #10154: URL: https://github.com/apache/pinot/pull/10154#discussion_r1082859240 ## pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/JsonMailboxIdentifier.java: ## @@ -0,0 +1,132 @@ +/** + * Licensed to the Apache Software Foundation

[GitHub] [pinot] walterddr commented on a diff in pull request #10154: [multistage] make mailboxID JSON serialized

2023-01-20 Thread GitBox
walterddr commented on code in PR #10154: URL: https://github.com/apache/pinot/pull/10154#discussion_r1082852866 ## pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/JsonMailboxIdentifier.java: ## @@ -0,0 +1,132 @@ +/** + * Licensed to the Apache Software Foundati

[GitHub] [pinot] GSharayu commented on a diff in pull request #10132: add API to get progress of subtasks with given state

2023-01-20 Thread GitBox
GSharayu commented on code in PR #10132: URL: https://github.com/apache/pinot/pull/10132#discussion_r1082845704 ## pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTaskRestletResource.java: ## @@ -419,7 +423,7 @@ public Map getSubtaskConfigs( @GET

[GitHub] [pinot] klsince commented on a diff in pull request #10089: Reload conditional skip

2023-01-20 Thread GitBox
klsince commented on code in PR #10089: URL: https://github.com/apache/pinot/pull/10089#discussion_r1082844023 ## pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java: ## @@ -403,6 +417,16 @@ public void reloadSegment(String segmentName, IndexLo

[GitHub] [pinot] klsince commented on a diff in pull request #10089: Reload conditional skip

2023-01-20 Thread GitBox
klsince commented on code in PR #10089: URL: https://github.com/apache/pinot/pull/10089#discussion_r1082834046 ## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/loader/SegmentDirectoryLoader.java: ## @@ -42,4 +41,13 @@ SegmentDirectory load(URI indexDir, SegmentDi

[GitHub] [pinot] klsince commented on a diff in pull request #10089: Reload conditional skip

2023-01-20 Thread GitBox
klsince commented on code in PR #10089: URL: https://github.com/apache/pinot/pull/10089#discussion_r1082833141 ## pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java: ## @@ -403,6 +417,16 @@ public void reloadSegment(String segmentName, IndexLo

[GitHub] [pinot] KKcorps commented on pull request #10155: Make a system property to call system.exit() for LaunchDataIngestionJobCommand

2023-01-20 Thread GitBox
KKcorps commented on PR #10155: URL: https://github.com/apache/pinot/pull/10155#issuecomment-1398468246 This System.exit call was added in the first place to fail the spark job. Otherwise it just returned success even in the case of failure -- This is an automated message from the Apache

[GitHub] [pinot] ankitsultana commented on issue #10129: [multistage] Projection Not Pushed Down for Agg Queries with No Grouping Sets

2023-01-20 Thread GitBox
ankitsultana commented on issue #10129: URL: https://github.com/apache/pinot/issues/10129#issuecomment-1398417863 @walterddr : Can we consider using `RelFieldTrimmer`? I raised this: https://github.com/apache/pinot/pull/10156 It seems to work and I am testing it out on our clust

[GitHub] [pinot] ankitsultana opened a new pull request, #10156: [multistage] Trim Unused Fields Before Optimizer Runs

2023-01-20 Thread GitBox
ankitsultana opened a new pull request, #10156: URL: https://github.com/apache/pinot/pull/10156 This seems to fix projection pushdown issues. PR is not really final since I am also reading more about this. There's a method to enable this via `SqlRelConverterConfig.config().withTrimUnusedFie

[GitHub] [pinot] Ferix9288 commented on pull request #10139: JMX to Prom Exporter - More Inclusive Table Regex

2023-01-20 Thread GitBox
Ferix9288 commented on PR #10139: URL: https://github.com/apache/pinot/pull/10139#issuecomment-1398240019 @Jackie-Jiang changed as you described and confirmed working with a custom image in our `dev` environment 🚀 -- This is an automated message from the Apache Git Service. To respond t

[GitHub] [pinot] saurabhd336 commented on a diff in pull request #10089: Reload conditional skip

2023-01-20 Thread GitBox
saurabhd336 commented on code in PR #10089: URL: https://github.com/apache/pinot/pull/10089#discussion_r1082367099 ## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/SegmentPreProcessor.java: ## @@ -207,11 +207,7 @@ private boolean needProce

[GitHub] [pinot] navina commented on pull request #10136: Decouple dependency on using peer segment download scheme for doing metadata pushes to controller

2023-01-19 Thread GitBox
navina commented on PR #10136: URL: https://github.com/apache/pinot/pull/10136#issuecomment-1398019400 > Please check the test failure. I think it is related to the current change Yes will fix it. Ty -- This is an automated message from the Apache Git Service. To respond to the mess

[GitHub] [pinot] codecov-commenter commented on pull request #10155: Make a system property to call system.exit() for LaunchDataIngestionJobCommand

2023-01-19 Thread GitBox
codecov-commenter commented on PR #10155: URL: https://github.com/apache/pinot/pull/10155#issuecomment-1398016275 # [Codecov](https://codecov.io/gh/apache/pinot/pull/10155?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [pinot] xiangfu0 opened a new pull request, #10155: Make a system property to call system.exit() for LaunchDataIngestionJobCommand

2023-01-19 Thread GitBox
xiangfu0 opened a new pull request, #10155: URL: https://github.com/apache/pinot/pull/10155 Use a system property to call system.exit() for LaunchDataIngestionJobCommand. The reason is SparkJob may fail due to this system.exit() call. -- This is an automated message from the Ap

[GitHub] [pinot] codecov-commenter commented on pull request #10154: [multistage] make mailboxID JSON serialized

2023-01-19 Thread GitBox
codecov-commenter commented on PR #10154: URL: https://github.com/apache/pinot/pull/10154#issuecomment-1397915490 # [Codecov](https://codecov.io/gh/apache/pinot/pull/10154?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [pinot] agavra opened a new pull request, #10154: [multistage] make mailboxID JSON serialized

2023-01-19 Thread GitBox
agavra opened a new pull request, #10154: URL: https://github.com/apache/pinot/pull/10154 Minor PR to change the serialization of mailbox identifier to a JSON format. For example: ``` {"jobId":"-7324288832996730535_2","from":"localhost:52683","to":"localhost:52683"} ``` This

[GitHub] [pinot] codecov-commenter commented on pull request #10153: allow editing pools in instance details ui

2023-01-19 Thread GitBox
codecov-commenter commented on PR #10153: URL: https://github.com/apache/pinot/pull/10153#issuecomment-139776 # [Codecov](https://codecov.io/gh/apache/pinot/pull/10153?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [pinot] jadami10 opened a new pull request, #10153: allow editing pools in instance details ui

2023-01-19 Thread GitBox
jadami10 opened a new pull request, #10153: URL: https://github.com/apache/pinot/pull/10153 This is a simple ui feature + bugfix. Today there's two issues: 1) there's no way from the UI to easily create or update pools from the instance details config page 2) if you are using pools, an

[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #10151: Simplify UTF-8 comparison on null character

2023-01-19 Thread GitBox
Jackie-Jiang commented on code in PR #10151: URL: https://github.com/apache/pinot/pull/10151#discussion_r1082047825 ## pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/readerwriter/ValueReaderComparisonTest.java: ## @@ -156,22 +156,27 @@ void testSu

[GitHub] [pinot] jadami10 commented on pull request #9994: [feature] Add a Tracker class to support aggregate-worst case consumption delay…

2023-01-19 Thread GitBox
jadami10 commented on PR #9994: URL: https://github.com/apache/pinot/pull/9994#issuecomment-1397833536 I tried deploying this change to one of our clusters with #10101 or #10121. We currently track ingestion lag via our custom stream ingestion plugin. For the most part these match up. But I

[GitHub] [pinot] Jackie-Jiang commented on pull request #10151: Simplify UTF-8 comparison on null character

2023-01-19 Thread GitBox
Jackie-Jiang commented on PR #10151: URL: https://github.com/apache/pinot/pull/10151#issuecomment-139782 With the current code, `length == bytes.length && length != 0 && bytes[length - 1] == 0` case is not handled (the new added `ValueReaderComparisonTest` is flaky right now). To keep t

[GitHub] [pinot] sajjad-moradi commented on a diff in pull request #10121: Add a tracker for end-to-end consumption delay of events.

2023-01-19 Thread GitBox
sajjad-moradi commented on code in PR #10121: URL: https://github.com/apache/pinot/pull/10121#discussion_r1082035842 ## pinot-spi/src/main/java/org/apache/pinot/spi/stream/RowMetadata.java: ## @@ -49,6 +49,17 @@ public interface RowMetadata { */ long getRecordIngestionTim

[GitHub] [pinot] sajjad-moradi commented on a diff in pull request #10121: Add a tracker for end-to-end consumption delay of events.

2023-01-19 Thread GitBox
sajjad-moradi commented on code in PR #10121: URL: https://github.com/apache/pinot/pull/10121#discussion_r1082029067 ## pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/IngestionDelayTracker.java: ## @@ -76,6 +76,15 @@ public class IngestionDelayTracker {

[GitHub] [pinot] Jackie-Jiang closed pull request #10144: Adding duration unit in trivy timeout

2023-01-19 Thread GitBox
Jackie-Jiang closed pull request #10144: Adding duration unit in trivy timeout URL: https://github.com/apache/pinot/pull/10144 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. T

[GitHub] [pinot] Jackie-Jiang commented on pull request #10144: Adding duration unit in trivy timeout

2023-01-19 Thread GitBox
Jackie-Jiang commented on PR #10144: URL: https://github.com/apache/pinot/pull/10144#issuecomment-1397822649 This change is included in #10150 which also fixes the trivy failure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [pinot] Jackie-Jiang merged pull request #10150: Fix json-smart version

2023-01-19 Thread GitBox
Jackie-Jiang merged PR #10150: URL: https://github.com/apache/pinot/pull/10150 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot

[GitHub] [pinot] mcvsubbu commented on issue #10147: Pauseless Consumption

2023-01-19 Thread GitBox
mcvsubbu commented on issue #10147: URL: https://github.com/apache/pinot/issues/10147#issuecomment-1397819480 How many rows would the new consuming segment be setup to consume? (Currently that is decided when the prev segment is committed). -- This is an automated message from the Apache

[GitHub] [pinot] Jackie-Jiang commented on pull request #10139: JMX to Prom Exporter - More Inclusive Table Regex

2023-01-19 Thread GitBox
Jackie-Jiang commented on PR #10139: URL: https://github.com/apache/pinot/pull/10139#issuecomment-1397807868 Whitespaces in table name is definitely not recommended, but I don't think we reject it right now. Actually it is also not recommended to have '-' in table name or column name becaus

[GitHub] [pinot] Jackie-Jiang commented on issue #10147: Pauseless Consumption

2023-01-19 Thread GitBox
Jackie-Jiang commented on issue #10147: URL: https://github.com/apache/pinot/issues/10147#issuecomment-1397794076 Have you considered using the regular state transition to achieve this? Essentially we can create the new consuming segment in IdealState when the commit starts. Nothing needs t

[GitHub] [pinot] Jackie-Jiang commented on pull request #10151: Simplify UTF-8 comparison on null character

2023-01-19 Thread GitBox
Jackie-Jiang commented on PR #10151: URL: https://github.com/apache/pinot/pull/10151#issuecomment-1397787608 > Does `LoaderTest.testPadding` pass with this change? I'm all for simplification (assuming whatever the motivation for adding `LoaderTest.testPadding` was no longer matters) but thi

[GitHub] [pinot] navina opened a new issue, #10152: Support consumer de-aggregation in Kinesis

2023-01-19 Thread GitBox
navina opened a new issue, #10152: URL: https://github.com/apache/pinot/issues/10152 Kinesis supports producing aggregated (aka batched) record. Thus, kinesis consumer also has support for de-aggregating the records. Refer - https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-consume

[GitHub] [pinot] richardstartin commented on pull request #10151: Simplify UTF-8 comparison on null character

2023-01-19 Thread GitBox
richardstartin commented on PR #10151: URL: https://github.com/apache/pinot/pull/10151#issuecomment-1397748773 Does `LoaderTest.testPadding` pass with this change? I'm all for simplification (assuming the whatever the motivation for adding `LoaderTest.testPadding` was no longer matters) but

[GitHub] [pinot] walterddr commented on a diff in pull request #10151: Simplify UTF-8 comparison on null character

2023-01-19 Thread GitBox
walterddr commented on code in PR #10151: URL: https://github.com/apache/pinot/pull/10151#discussion_r1081975613 ## pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/readerwriter/ValueReaderComparisonTest.java: ## @@ -156,22 +156,27 @@ void testSuper

[GitHub] [pinot] navina commented on a diff in pull request #10136: Decouple dependency on using peer segment download scheme for doing metadata pushes to controller

2023-01-19 Thread GitBox
navina commented on code in PR #10136: URL: https://github.com/apache/pinot/pull/10136#discussion_r1081975469 ## pinot-spi/src/main/java/org/apache/pinot/spi/config/table/SegmentsValidationAndRetentionConfig.java: ## @@ -49,9 +49,13 @@ public class SegmentsValidationAndRetention

[GitHub] [pinot] codecov-commenter commented on pull request #10151: Simplify UTF-8 comparison on null character

2023-01-19 Thread GitBox
codecov-commenter commented on PR #10151: URL: https://github.com/apache/pinot/pull/10151#issuecomment-1397743731 # [Codecov](https://codecov.io/gh/apache/pinot/pull/10151?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [pinot] Jackie-Jiang merged pull request #9957: [multistage] [feature] Add a query option to pass some v1 limit

2023-01-19 Thread GitBox
Jackie-Jiang merged PR #9957: URL: https://github.com/apache/pinot/pull/9957 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.a

[GitHub] [pinot] Jackie-Jiang opened a new pull request, #10151: Simplify UTF-8 comparison on null character

2023-01-19 Thread GitBox
Jackie-Jiang opened a new pull request, #10151: URL: https://github.com/apache/pinot/pull/10151 Null character (byte zero) is not allowed in string values ingested into Pinot (all the string values ingested will be sanitized and all the characters after the first null character will be trun

[GitHub] [pinot] agavra commented on a diff in pull request #10120: [multistage] support aggregations that require intermediate representations

2023-01-19 Thread GitBox
agavra commented on code in PR #10120: URL: https://github.com/apache/pinot/pull/10120#discussion_r1081932230 ## pinot-query-planner/src/main/java/org/apache/calcite/rel/rules/PinotReduceAggregateFunctionsRule.java: ## @@ -0,0 +1,201 @@ +/** + * Licensed to the Apache Software F

[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #10136: Decouple dependency on using peer segment download scheme for doing metadata pushes to controller

2023-01-19 Thread GitBox
Jackie-Jiang commented on code in PR #10136: URL: https://github.com/apache/pinot/pull/10136#discussion_r1081822321 ## pinot-spi/src/main/java/org/apache/pinot/spi/config/table/SegmentsValidationAndRetentionConfig.java: ## @@ -49,9 +49,13 @@ public class SegmentsValidationAndRet

[GitHub] [pinot] agavra commented on a diff in pull request #10120: [multistage] support aggregations that require intermediate representations

2023-01-19 Thread GitBox
agavra commented on code in PR #10120: URL: https://github.com/apache/pinot/pull/10120#discussion_r1081825016 ## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java: ## @@ -65,6 +65,7 @@ public enum AggregationFunctionType { STDDEVSAMP("s

[GitHub] [pinot] navina commented on pull request #10144: Adding duration unit in trivy timeout

2023-01-19 Thread GitBox
navina commented on PR #10144: URL: https://github.com/apache/pinot/pull/10144#issuecomment-1397475658 > I understand that the initial idea was to increase the timeout but we are using this PR to change the way trivy is triggered. Therefore I would suggest to change the title of the PR

[GitHub] [pinot] saurabhd336 commented on a diff in pull request #10089: Reload conditional skip

2023-01-19 Thread GitBox
saurabhd336 commented on code in PR #10089: URL: https://github.com/apache/pinot/pull/10089#discussion_r1081698660 ## pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java: ## @@ -403,6 +417,16 @@ public void reloadSegment(String segmentName, Ind

[GitHub] [pinot] saurabhd336 commented on a diff in pull request #10089: Reload conditional skip

2023-01-19 Thread GitBox
saurabhd336 commented on code in PR #10089: URL: https://github.com/apache/pinot/pull/10089#discussion_r1081688733 ## pinot-core/src/test/java/org/apache/pinot/core/data/manager/BaseTableDataManagerTest.java: ## @@ -191,7 +191,6 @@ public void testReloadSegmentUseLocalCopy()

[GitHub] [pinot] walterddr commented on issue #10129: [multistage] Projection Not Pushed Down for Agg Queries with No Grouping Sets

2023-01-19 Thread GitBox
walterddr commented on issue #10129: URL: https://github.com/apache/pinot/issues/10129#issuecomment-1397438501 did a bit of research on this topic: the problem only occurs on `COUNT(*)` or `COUNT(1)`, `COUNT(col)` this is b/c calcite will optimize out the argument since all columns in

[GitHub] [pinot] 61yao commented on a diff in pull request #9957: [multistage] [feature] Add a query option to pass some v1 limit

2023-01-19 Thread GitBox
61yao commented on code in PR #9957: URL: https://github.com/apache/pinot/pull/9957#discussion_r1081642038 ## pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/MultiStageEngineIntegrationTest.java: ## @@ -99,6 +99,14 @@ public void testGeneratedQueries()

[GitHub] [pinot] snleee merged pull request #10106: clean up the logic of reading guage metrics

2023-01-19 Thread GitBox
snleee merged PR #10106: URL: https://github.com/apache/pinot/pull/10106 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apach

[GitHub] [pinot] npawar commented on pull request #10144: Adding duration unit in trivy timeout

2023-01-19 Thread GitBox
npawar commented on PR #10144: URL: https://github.com/apache/pinot/pull/10144#issuecomment-1397373177 Not sure if periodic task is the best way to go forward here. I'm okay if we merge the flagged PR, and then separately fix the vulnerability detected (either the same PR author does it, or

[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #10106: clean up the logic of reading guage metrics

2023-01-19 Thread GitBox
zhtaoxiang commented on code in PR #10106: URL: https://github.com/apache/pinot/pull/10106#discussion_r1081609829 ## pinot-common/src/main/java/org/apache/pinot/common/metrics/AbstractMetrics.java: ## @@ -481,20 +461,6 @@ public long getValueOfTableGauge(final String tableName,

[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #10106: clean up the logic of reading guage metrics

2023-01-19 Thread GitBox
zhtaoxiang commented on code in PR #10106: URL: https://github.com/apache/pinot/pull/10106#discussion_r1081607958 ## pinot-common/src/test/java/org/apache/pinot/common/metrics/MetricValueUtils.java: ## @@ -0,0 +1,128 @@ +/** + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] [pinot] klsince commented on a diff in pull request #10089: Reload conditional skip

2023-01-19 Thread GitBox
klsince commented on code in PR #10089: URL: https://github.com/apache/pinot/pull/10089#discussion_r1081607056 ## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/SegmentPreProcessor.java: ## @@ -207,11 +207,7 @@ private boolean needProcessSt

[GitHub] [pinot] walterddr commented on pull request #10148: [multistage] flag all columns as nullable

2023-01-19 Thread GitBox
walterddr commented on PR #10148: URL: https://github.com/apache/pinot/pull/10148#issuecomment-1397353721 this have introduce new set of issues. so we really need a proper null detection on columns. will shelf this PR for now -- This is an automated message from the Apache Git Service. To

[GitHub] [pinot] jugomezv commented on a diff in pull request #10106: clean up the logic of reading guage metrics

2023-01-19 Thread GitBox
jugomezv commented on code in PR #10106: URL: https://github.com/apache/pinot/pull/10106#discussion_r1081597788 ## pinot-common/src/test/java/org/apache/pinot/common/metrics/MetricValueUtils.java: ## @@ -0,0 +1,128 @@ +/** + * Licensed to the Apache Software Foundation (ASF) und

[GitHub] [pinot] jugomezv commented on a diff in pull request #10106: clean up the logic of reading guage metrics

2023-01-19 Thread GitBox
jugomezv commented on code in PR #10106: URL: https://github.com/apache/pinot/pull/10106#discussion_r1081595797 ## pinot-common/src/main/java/org/apache/pinot/common/metrics/AbstractMetrics.java: ## @@ -481,20 +461,6 @@ public long getValueOfTableGauge(final String tableName, f

[GitHub] [pinot] walterddr merged pull request #10094: [multistage] [debuggability] OpChain and operator stats

2023-01-19 Thread GitBox
walterddr merged PR #10094: URL: https://github.com/apache/pinot/pull/10094 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.ap

[GitHub] [pinot] walterddr commented on a diff in pull request #10144: Adding duration unit in trivy timeout

2023-01-19 Thread GitBox
walterddr commented on code in PR #10144: URL: https://github.com/apache/pinot/pull/10144#discussion_r1081538383 ## .github/workflows/pinot_vuln_check.yml: ## @@ -19,18 +19,8 @@ name: Pinot Dependencies on: - push: -branches: - - master - pull_request: -branch

[GitHub] [pinot] ankitsultana commented on a diff in pull request #10148: [multistage] flag all columns as nullable

2023-01-19 Thread GitBox
ankitsultana commented on code in PR #10148: URL: https://github.com/apache/pinot/pull/10148#discussion_r1081312765 ## pinot-query-planner/src/main/java/org/apache/pinot/query/type/TypeFactory.java: ## @@ -54,23 +54,23 @@ public RelDataType createRelDataTypeFromSchema(Schema sc

[GitHub] [pinot] ankitsultana commented on issue #10140: [multistage] Expensive RoundRobinScheduler::hasNext Calls

2023-01-19 Thread GitBox
ankitsultana commented on issue #10140: URL: https://github.com/apache/pinot/issues/10140#issuecomment-1396973491 Side-note: I think entries in seenMail may get leaked if `_releaseTs` is set to a finite value. The reason is that the OpChain may hang up and the sender may send the data after

[GitHub] [pinot] Ferix9288 commented on pull request #10139: JMX to Prom Exporter - More Inclusive Table Regex

2023-01-19 Thread GitBox
Ferix9288 commented on PR #10139: URL: https://github.com/apache/pinot/pull/10139#issuecomment-1396918251 Thanks @richardstartin for the input! @Jackie-Jiang I added `\\S+` regex to all other instances in which there was a `table` label; good catch on that. Also are we sure we should

[GitHub] [pinot] xiangfu0 merged pull request #10149: Fixing the metadata tar gz file path

2023-01-19 Thread GitBox
xiangfu0 merged PR #10149: URL: https://github.com/apache/pinot/pull/10149 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apa

[GitHub] [pinot] richardstartin commented on pull request #10139: JMX to Prom Exporter - More Inclusive Table Regex

2023-01-19 Thread GitBox
richardstartin commented on PR #10139: URL: https://github.com/apache/pinot/pull/10139#issuecomment-1396760434 I don't have much of an opinion on what these regular expressions should be, I just took what was there before (which was working for us) and split it into different files so we do

[GitHub] [pinot] Ferix9288 commented on pull request #10139: JMX to Prom Exporter - More Inclusive Table Regex

2023-01-19 Thread GitBox
Ferix9288 commented on PR #10139: URL: https://github.com/apache/pinot/pull/10139#issuecomment-1396699401 Hey @Jackie-Jiang , much appreciated for the prompt review! Ah very nice. So possibly a non-greedy search like (.*?)_(\w+)? And np, I will fix the other regexes that also contain

[GitHub] [pinot] gortiz commented on pull request #10144: Adding duration unit in trivy timeout

2023-01-18 Thread GitBox
gortiz commented on PR #10144: URL: https://github.com/apache/pinot/pull/10144#issuecomment-1396567329 I understand that the initial idea was to increase the timeout but we are using this PR to change the way trivy is triggered. Therefore I would suggest to change the title of the PR --

[GitHub] [pinot] gortiz commented on a diff in pull request #10144: Adding duration unit in trivy timeout

2023-01-18 Thread GitBox
gortiz commented on code in PR #10144: URL: https://github.com/apache/pinot/pull/10144#discussion_r1080906416 ## .github/workflows/pinot_vuln_check.yml: ## @@ -19,18 +19,8 @@ name: Pinot Dependencies on: - push: -branches: - - master - pull_request: -branches:

[GitHub] [pinot] codecov-commenter commented on pull request #10150: Fix json-smart version

2023-01-18 Thread GitBox
codecov-commenter commented on PR #10150: URL: https://github.com/apache/pinot/pull/10150#issuecomment-1396548970 # [Codecov](https://codecov.io/gh/apache/pinot/pull/10150?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [pinot] codecov-commenter commented on pull request #10149: Fixing the metadata tar gz file path

2023-01-18 Thread GitBox
codecov-commenter commented on PR #10149: URL: https://github.com/apache/pinot/pull/10149#issuecomment-1396526998 # [Codecov](https://codecov.io/gh/apache/pinot/pull/10149?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #10106: clean up the logic of reading guage metrics

2023-01-18 Thread GitBox
zhtaoxiang commented on code in PR #10106: URL: https://github.com/apache/pinot/pull/10106#discussion_r1080869461 ## pinot-common/src/test/java/org/apache/pinot/common/metrics/MetricValueUtils.java: ## @@ -0,0 +1,128 @@ +/** + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] [pinot] xiangfu0 opened a new pull request, #10150: Fix json-smart version

2023-01-18 Thread GitBox
xiangfu0 opened a new pull request, #10150: URL: https://github.com/apache/pinot/pull/10150 Instructions: 1. The PR has to be tagged with at least one of the following labels (*): 1. `feature` 2. `bugfix` 3. `performance` 4. `ui` 5. `backward-incompat` 6

[GitHub] [pinot] xiangfu0 commented on pull request #10144: Adding duration unit in trivy timeout

2023-01-18 Thread GitBox
xiangfu0 commented on PR #10144: URL: https://github.com/apache/pinot/pull/10144#issuecomment-1396496351 My only concern is that who will be responsible for the trivy check results and version bump. -- This is an automated message from the Apache Git Service. To respond to the message, pl

[GitHub] [pinot] codecov-commenter commented on pull request #10148: [multistage] flag all columns as nullable

2023-01-18 Thread GitBox
codecov-commenter commented on PR #10148: URL: https://github.com/apache/pinot/pull/10148#issuecomment-1396491709 # [Codecov](https://codecov.io/gh/apache/pinot/pull/10148?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [pinot] navina commented on a diff in pull request #10144: Adding duration unit in trivy timeout

2023-01-18 Thread GitBox
navina commented on code in PR #10144: URL: https://github.com/apache/pinot/pull/10144#discussion_r1080845287 ## .github/workflows/pinot_vuln_check.yml: ## @@ -19,18 +19,8 @@ name: Pinot Dependencies on: - push: -branches: - - master - pull_request: -branches:

[GitHub] [pinot] navina commented on pull request #10136: Decouple dependency on using peer segment download scheme for doing metadata pushes to controller

2023-01-18 Thread GitBox
navina commented on PR #10136: URL: https://github.com/apache/pinot/pull/10136#issuecomment-1396486769 @Jackie-Jiang I don't think we need 2 new configs, as we discussed. We can make do with 1 new config. Please review! -- This is an automated message from the Apache Git Service. To resp

[GitHub] [pinot] xiangfu0 opened a new pull request, #10149: Fixing the metadata tar gz file path

2023-01-18 Thread GitBox
xiangfu0 opened a new pull request, #10149: URL: https://github.com/apache/pinot/pull/10149 Fixing https://apache-pinot.slack.com/archives/C011C9JHN7R/p1674101384733239 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [pinot] walterddr opened a new pull request, #10148: [multistage] flag all columns as nullable

2023-01-18 Thread GitBox
walterddr opened a new pull request, #10148: URL: https://github.com/apache/pinot/pull/10148 currently, there's no way to tell whether a table column is nullable setting the nullability for all columns on v2 engine as default true (previously default false), b/c - having nullabili

[GitHub] [pinot] shwin commented on a diff in pull request #10034: Create metadata only tarball for metadata push job

2023-01-18 Thread GitBox
shwin commented on code in PR #10034: URL: https://github.com/apache/pinot/pull/10034#discussion_r1080812390 ## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/SegmentPushUtils.java: ## @@ -284,7 +284,22 @@ public static void sendSegmentUriAndMetadata(Seg

[GitHub] [pinot] shwin commented on a diff in pull request #10034: Create metadata only tarball for metadata push job

2023-01-18 Thread GitBox
shwin commented on code in PR #10034: URL: https://github.com/apache/pinot/pull/10034#discussion_r1080811688 ## pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/SegmentPushUtils.java: ## @@ -284,7 +284,22 @@ public static void sendSegmentUriAndMetadata(Seg

[GitHub] [pinot] jasperjiaguo commented on issue #9277: [Feature] Support for Correlation Function

2023-01-18 Thread GitBox
jasperjiaguo commented on issue #9277: URL: https://github.com/apache/pinot/issues/9277#issuecomment-1396440724 NaN could be due to Math.sqrt(negative_number) or 0.0/0.0 We have recently discovered this impl of covariance/correlation has numerical stability issue when E[x^2] ~ E[x]^2 >>

[GitHub] [pinot] sajjad-moradi commented on issue #10147: Pauseless Consumption

2023-01-18 Thread GitBox
sajjad-moradi commented on issue #10147: URL: https://github.com/apache/pinot/issues/10147#issuecomment-1396417093 CC: @mcvsubbu @npawar @Jackie-Jiang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [pinot] sajjad-moradi opened a new issue, #10147: Pauseless Consumption

2023-01-18 Thread GitBox
sajjad-moradi opened a new issue, #10147: URL: https://github.com/apache/pinot/issues/10147 # Background Currently when commit protocol starts for a partition, the realtime server stops consumption and till the new consuming->online helix message for the next consuming segment arrives, m

[GitHub] [pinot] snleee opened a new issue, #10146: Add the support for `STDDEV_POP/SAMP, VAR_POP/DEV` for star tree

2023-01-18 Thread GitBox
snleee opened a new issue, #10146: URL: https://github.com/apache/pinot/issues/10146 We recently added the `STDDEV_POP/SAMP, VAR_POP/DEV` aggregation function. It would be great if we can support this in the star tree index. We need to implement `ValueAggregator` interface for stddev

[GitHub] [pinot] mcvsubbu commented on issue #10137: Determine Number of Rows for a Realtime Segment Based on a Config

2023-01-18 Thread GitBox
mcvsubbu commented on issue #10137: URL: https://github.com/apache/pinot/issues/10137#issuecomment-1396336852 We should really think about configs that add no value. IMO this config adds no value. I think the non-expert users will come back asking us to tune this number because they rebalan

[GitHub] [pinot] siddharthteotia commented on issue #9277: [Feature] Support for Correlation Function

2023-01-18 Thread GitBox
siddharthteotia commented on issue #9277: URL: https://github.com/apache/pinot/issues/9277#issuecomment-1396328556 May be you are running into divide by 0 problem ? Did you try step into the code to understand where it is turning into NaN ? cc @jasperjiaguo / @SabrinaZhaozyf -- T

[GitHub] [pinot] xiangfu0 merged pull request #10145: Fix docker build script tagging

2023-01-18 Thread GitBox
xiangfu0 merged PR #10145: URL: https://github.com/apache/pinot/pull/10145 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apa

[GitHub] [pinot] Jackie-Jiang commented on issue #10137: Determine Number of Rows for a Realtime Segment Based on a Config

2023-01-18 Thread GitBox
Jackie-Jiang commented on issue #10137: URL: https://github.com/apache/pinot/issues/10137#issuecomment-1396323369 Most users are not using the recommendation engine though. What I'm trying to convey is that currently there is not a straight forward way to configure the rows per segment. Fix

[GitHub] [pinot] xiangfu0 opened a new pull request, #10145: Fix docker build script tagging

2023-01-18 Thread GitBox
xiangfu0 opened a new pull request, #10145: URL: https://github.com/apache/pinot/pull/10145 Fix the tagging variables generation from the build code. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [pinot] siddharthteotia commented on a diff in pull request #10120: [multistage] support aggregations that require intermediate representations

2023-01-18 Thread GitBox
siddharthteotia commented on code in PR #10120: URL: https://github.com/apache/pinot/pull/10120#discussion_r1080718243 ## pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/AggregationFunctionType.java: ## @@ -65,6 +65,7 @@ public enum AggregationFunctionType { STDD

[GitHub] [pinot] siddharthteotia merged pull request #10120: [multistage] support aggregations that require intermediate representations

2023-01-18 Thread GitBox
siddharthteotia merged PR #10120: URL: https://github.com/apache/pinot/pull/10120 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pi

[GitHub] [pinot] siddharthteotia closed issue #10109: Support DistinctSum and DistinctAvg aggregation functions for MV columns

2023-01-18 Thread GitBox
siddharthteotia closed issue #10109: Support DistinctSum and DistinctAvg aggregation functions for MV columns URL: https://github.com/apache/pinot/issues/10109 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [pinot] siddharthteotia merged pull request #10128: Support for DistinctSumMV and DistinctAvgMV aggregation functions

2023-01-18 Thread GitBox
siddharthteotia merged PR #10128: URL: https://github.com/apache/pinot/pull/10128 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pi

[GitHub] [pinot] siddharthteotia commented on pull request #10128: Support for DistinctSumMV and DistinctAvgMV aggregation functions

2023-01-18 Thread GitBox
siddharthteotia commented on PR #10128: URL: https://github.com/apache/pinot/pull/10128#issuecomment-1396312237 only text / javadoc changed in the latest commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [pinot] mcvsubbu commented on issue #10137: Determine Number of Rows for a Realtime Segment Based on a Config

2023-01-18 Thread GitBox
mcvsubbu commented on issue #10137: URL: https://github.com/apache/pinot/issues/10137#issuecomment-1396307559 > @mcvsubbu The proposed way is very useful when we want to limit the CPU usage for all the consumers on a given server, but for certain use cases, CPU might not be the bottleneck (

[GitHub] [pinot] 61yao commented on a diff in pull request #10094: [multistage] [debuggability] OpChain and operator stats

2023-01-18 Thread GitBox
61yao commented on code in PR #10094: URL: https://github.com/apache/pinot/pull/10094#discussion_r1080713806 ## pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/OperatorStats.java: ## @@ -0,0 +1,77 @@ +/** + * Licensed to the Apache Software Foundation (

[GitHub] [pinot] 61yao commented on a diff in pull request #10094: [multistage] [debuggability] OpChain and operator stats

2023-01-18 Thread GitBox
61yao commented on code in PR #10094: URL: https://github.com/apache/pinot/pull/10094#discussion_r1080712321 ## pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/SortOperator.java: ## @@ -134,8 +150,10 @@ private void consumeInputBlocks() { for (

[GitHub] [pinot] vvivekiyer commented on pull request #10128: Support for DistinctSumMV and DistinctAvgMV aggregation functions

2023-01-18 Thread GitBox
vvivekiyer commented on PR #10128: URL: https://github.com/apache/pinot/pull/10128#issuecomment-1396304548 Addressed review comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [pinot] walterddr commented on a diff in pull request #10144: Adding duration unit in trivy timeout

2023-01-18 Thread GitBox
walterddr commented on code in PR #10144: URL: https://github.com/apache/pinot/pull/10144#discussion_r1080695552 ## .github/workflows/pinot_vuln_check.yml: ## @@ -19,18 +19,8 @@ name: Pinot Dependencies on: - push: -branches: - - master - pull_request: -branch

[GitHub] [pinot] snleee commented on pull request #10144: Adding duration unit in trivy timeout

2023-01-18 Thread GitBox
snleee commented on PR #10144: URL: https://github.com/apache/pinot/pull/10144#issuecomment-1396273909 I'm fine with the change but we will need to make one time effort to clean up the test. I think that may involve in bumping up multiple libraries. -- This is an automated message from th

[GitHub] [pinot] Jackie-Jiang merged pull request #10087: [Clean up] Remove the support for non-zero padding

2023-01-18 Thread GitBox
Jackie-Jiang merged PR #10087: URL: https://github.com/apache/pinot/pull/10087 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot

[GitHub] [pinot] jtao15 commented on a diff in pull request #10136: Decouple dependency on using peer segment download scheme for doing metadata pushes to controller

2023-01-18 Thread GitBox
jtao15 commented on code in PR #10136: URL: https://github.com/apache/pinot/pull/10136#discussion_r1080680161 ## pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/SegmentCommitterFactory.java: ## @@ -46,25 +46,34 @@ public SegmentCommitterFactory(Logger segmen

  1   2   3   4   5   6   7   8   9   10   >