saurabhd336 commented on PR #12243: URL: https://github.com/apache/pinot/pull/12243#issuecomment-1968307666
@Jackie-Jiang The two enrichers that have been added:

1) CLP enricher: Generates and adds the 3 CLP-specific columns for a given string field in the record. This differs from transforms in that a transform function can only compute one column value at a time, so we would end up running the CLP transformation 3 times, once per column. The enricher just makes this simpler. Right now CLP encoding only works for realtime tables consuming JSON-formatted stream messages. The enricher makes it possible to use CLP with offline tables, protobuf-formatted stream messages, etc. (https://docs.pinot.apache.org/basics/data-import/clp)

2) generateColumn: Enriches the record using an existing transform function / Groovy script. This is similar to transformations in many ways, but one use case that existing transforms can't solve is generating an array of records with a Groovy transform and then unnesting that array to explode it into multiple rows. Right now, unnesting of an array field precedes record transformation, so the array does not exist yet when unnesting runs. Without this enricher, users have to use an external system to generate the array field and enrich the record before ingesting into Pinot.

Other enrichers we may add in the future:

1) Enrich using one or more dimension table lookups for realtime tables. Again, this can potentially be achieved using transformations, but we would need a separate transform config for each column being enriched.
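To illustrate the ordering argument for generateColumn, here is a minimal sketch of an enrich-then-unnest flow. It is not Pinot code: the class `EnrichThenUnnestSketch`, the methods `generateColumn`, `unnest` and `ingest`, and the `tags` / `tag` column names are all hypothetical, and records are modelled as plain maps. It only shows why the array has to be generated before the unnest step runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not the Pinot API: records are plain maps.
public class EnrichThenUnnestSketch {

  // Enrichment step (assumed shape): compute a new column from the record and add it.
  static Map<String, Object> generateColumn(Map<String, Object> record, String column,
      Function<Map<String, Object>, Object> fn) {
    Map<String, Object> enriched = new HashMap<>(record);
    enriched.put(column, fn.apply(record));
    return enriched;
  }

  // Unnest step: emit one row per element of the list-valued column.
  // A record whose column is not a list passes through unchanged.
  static List<Map<String, Object>> unnest(Map<String, Object> record, String column) {
    List<Map<String, Object>> rows = new ArrayList<>();
    Object value = record.get(column);
    if (!(value instanceof List)) {
      rows.add(record);
      return rows;
    }
    for (Object element : (List<?>) value) {
      Map<String, Object> row = new HashMap<>(record);
      row.put(column, element);
      rows.add(row);
    }
    return rows;
  }

  // Enrich first, then unnest: the generated array exists by the time unnest runs.
  static List<Map<String, Object>> ingest(Map<String, Object> raw) {
    Map<String, Object> enriched = generateColumn(raw, "tag",
        r -> Arrays.asList(((String) r.get("tags")).split(",")));
    return unnest(enriched, "tag");
  }

  public static void main(String[] args) {
    Map<String, Object> raw = new HashMap<>();
    raw.put("id", 1);
    raw.put("tags", "a,b,c");
    for (Map<String, Object> row : ingest(raw)) {
      System.out.println(row);
    }
  }
}
```

If the two steps in `ingest` were swapped, `unnest` would find no list under `tag` and pass the record through as a single row, which is the limitation described above for transforms that run after unnesting.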