INNOCENT-BOY opened a new pull request, #12138: URL: https://github.com/apache/pinot/pull/12138
We are the Cisco WAP Storage Team, and we found a step that needs optimization. One of our Pinot clusters was unstable for a period of time. We first looked for errors in the Pinot server log, but it was full of unnecessary transformer error logs. During troubleshooting, we found the root cause:

1. Some tables share the same Kafka topic. Each table uses a filter config to select the data it ingests, for example: `"filterConfig": { "filterFunction": "Groovy({featureName != \"wap_unified_monitor\"},featureName)" },`
2. CompositeTransformer contains a list of transformers. Some records are already marked as skipped records, but they are still passed through the remaining transformers. I think this produces misleading error logs and unnecessary computation. The transformer list is built as: `Stream.of(new ExpressionTransformer(tableConfig, schema), new FilterTransformer(tableConfig), new SchemaConformingTransformer(tableConfig, schema), new DataTypeTransformer(tableConfig, schema), new TimeValidationTransformer(tableConfig, schema), new SpecialValueTransformer(schema), new NullValueTransformer(tableConfig, schema), new SanitizationTransformer(schema)).filter(t -> !t.isNoOp()).collect(Collectors.toList())`

So we added a patch to CompositeTransformer to address this. Pinot maintainers, please help us review this PR. Thanks in advance!
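
To illustrate the problem described above, here is a minimal, self-contained sketch of a composite transformer that stops as soon as a record is marked as skipped. It is not the code in this PR and does not use Pinot's classes: the `Transformer` interface, the `SKIP_KEY` marker, the `longParser` helper, the `latencyMs` field, and the use of a plain `Map` as the record are all hypothetical stand-ins chosen for illustration. Only the filter condition is taken from the `filterConfig` shown in the description.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT Pinot's actual classes.
class SkipAwareCompositeSketch {
  // Hypothetical marker key used to flag a record as skipped.
  static final String SKIP_KEY = "$SKIP_RECORD$";

  /** Minimal stand-in for a record transformer: returns the (possibly mutated) row. */
  interface Transformer {
    Map<String, Object> transform(Map<String, Object> row);
  }

  /** Mirrors the filterConfig above: rows whose featureName is not "wap_unified_monitor" are skipped. */
  static Transformer filterTransformer() {
    return row -> {
      if (!"wap_unified_monitor".equals(row.get("featureName"))) {
        row.put(SKIP_KEY, true);
      }
      return row;
    };
  }

  /** Hypothetical later step that fails loudly on malformed input, like a data type conversion. */
  static Transformer longParser(String field) {
    return row -> {
      row.put(field, Long.parseLong(String.valueOf(row.get(field))));
      return row;
    };
  }

  /** Runs the transformers in order and returns early once a row is null or marked as skipped. */
  static Map<String, Object> transform(List<Transformer> transformers, Map<String, Object> row) {
    for (Transformer t : transformers) {
      row = t.transform(row);
      if (row == null || Boolean.TRUE.equals(row.get(SKIP_KEY))) {
        return row; // remaining transformers never see this row
      }
    }
    return row;
  }

  public static void main(String[] args) {
    List<Transformer> chain = List.of(filterTransformer(), longParser("latencyMs"));

    // Belongs to another table sharing the topic; its malformed field is never parsed.
    Map<String, Object> other = new HashMap<>();
    other.put("featureName", "some_other_feature");
    other.put("latencyMs", "not-a-number");
    System.out.println(transform(chain, other).containsKey(SKIP_KEY)); // true

    // Passes the filter, so the remaining transformer runs as usual.
    Map<String, Object> kept = new HashMap<>();
    kept.put("featureName", "wap_unified_monitor");
    kept.put("latencyMs", "42");
    System.out.println(transform(chain, kept).get("latencyMs")); // 42
  }
}
```

Without the early return, the first row in `main` would reach `longParser` and throw a `NumberFormatException`, which is the kind of misleading error a filtered-out record can trigger in later transformers.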