kirkrodrigues opened a new pull request, #9942: URL: https://github.com/apache/pinot/pull/9942
At a high level, the plugin takes two inputs: a JSON record and a list of fields (unstructured log messages) to encode with [CLP](https://github.com/y-scope/clp). The plugin extracts the user-specified fields, encodes them into CLP's three-column format, and stores the output in a Pinot `GenericRow` object.

This is part of the change requested in #9819 and described in this [design doc](https://docs.google.com/document/d/10H1j5Ev3KoT_TScafOMO0BKU_h3NijuGmb13x7Cy8Ag).

# Release notes

* New plugin added: `pinot-clp-log`, which encodes user-specified fields with CLP during ingestion.
* Users can enable the plugin by specifying these configuration options in their `tableIndexConfig.streamConfigs`:

  ```json
  "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.clplog.CLPLogMessageDecoder",
  "stream.kafka.decoder.prop.fieldsForClpEncoding": "<field-names>"
  ```

  where `<field-names>` is a comma-separated list of the fields to encode with CLP.

# Testing performed

* Validated that fields are encoded correctly using the added unit test.
* Created a table with the following settings:

  ```json
  "tableIndexConfig": {
    ...,
    "streamConfigs": {
      ...,
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.clplog.CLPLogMessageDecoder",
      "stream.kafka.decoder.prop.fieldsForClpEncoding": "message"
    }
  }
  ```

* Ingested logs through Kafka and verified that log events containing the `message` field were transformed so that the `message` field was replaced with CLP's three fields: `message_logtype`, `message_dictionaryVars`, and `message_encodedVars`.
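To make the shape of the transformation concrete, here is a minimal Python sketch of what happens to a record, written only for illustration. It is not the plugin's Java implementation and it does not use CLP's real encoding: `toy_encode`, `encode_fields`, and the `<dict>`/`<int>` placeholders are hypothetical stand-ins. Passing non-string values through unchanged is also an assumption. Only the three output field suffixes and the comma-separated field list come from the description above.

```python
from typing import Any, Dict, List, Tuple

# Placeholder markers are arbitrary choices for this sketch, not CLP's real ones.
DICT_VAR_PLACEHOLDER = "<dict>"
ENCODED_VAR_PLACEHOLDER = "<int>"


def toy_encode(message: str) -> Tuple[str, List[str], List[int]]:
    """Hypothetical stand-in for CLP encoding (NOT the real algorithm).

    Splits a message into a logtype (static text with placeholders),
    dictionary variables (tokens that contain digits but are not plain
    integers), and encoded variables (plain integer tokens).
    """
    logtype_tokens: List[str] = []
    dictionary_vars: List[str] = []
    encoded_vars: List[int] = []
    for token in message.split(" "):
        if token.isascii() and token.isdigit():
            encoded_vars.append(int(token))
            logtype_tokens.append(ENCODED_VAR_PLACEHOLDER)
        elif any(ch.isdigit() for ch in token):
            dictionary_vars.append(token)
            logtype_tokens.append(DICT_VAR_PLACEHOLDER)
        else:
            logtype_tokens.append(token)
    return " ".join(logtype_tokens), dictionary_vars, encoded_vars


def encode_fields(record: Dict[str, Any], fields_for_clp_encoding: str) -> Dict[str, Any]:
    """Replace each listed string field with its three derived columns.

    `fields_for_clp_encoding` mirrors the comma-separated value of
    `stream.kafka.decoder.prop.fieldsForClpEncoding`. Fields that are
    absent from the record, or not strings, are left untouched
    (an assumption of this sketch).
    """
    fields = {name.strip() for name in fields_for_clp_encoding.split(",") if name.strip()}
    row: Dict[str, Any] = {}
    for key, value in record.items():
        if key in fields and isinstance(value, str):
            logtype, dictionary_vars, encoded_vars = toy_encode(value)
            row[key + "_logtype"] = logtype
            row[key + "_dictionaryVars"] = dictionary_vars
            row[key + "_encodedVars"] = encoded_vars
        else:
            row[key] = value
    return row


record = {"timestamp": 1670000000, "message": "Task task_12 took 345 ms"}
row = encode_fields(record, "message")
```

With `fieldsForClpEncoding` set to `message`, the resulting row no longer has a `message` key. It carries `message_logtype`, `message_dictionaryVars`, and `message_encodedVars` instead, while `timestamp` passes through unchanged.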