kirkrodrigues opened a new pull request, #9942:
URL: https://github.com/apache/pinot/pull/9942

   At a high level, the plugin takes two inputs: a JSON record and a list of 
fields (unstructured log messages) to encode with 
[CLP](https://github.com/y-scope/clp). The plugin will extract and encode the 
user-specified fields into CLP's three-column format and store the output in a 
Pinot `GenericRow` object.
   
   This is part of the change requested in #9819 and described in this [design 
doc](https://docs.google.com/document/d/10H1j5Ev3KoT_TScafOMO0BKU_h3NijuGmb13x7Cy8Ag).
   
   # Release notes
   * New plugin added: `pinot-clp-log` to encode user-specified fields with CLP 
during ingestion.
   * Users can use the plugin by specifying these configuration options in 
their `tableIndexConfig.streamConfigs`:
   
     ```json
     "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.clplog.CLPLogMessageDecoder",
     "stream.kafka.decoder.prop.fieldsForClpEncoding": "<field-names>"
     ```
      
      where `<field-names>` is a comma-separated list of fields you wish to 
encode with CLP.
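
To illustrate the shape of the `fieldsForClpEncoding` value, here is a minimal Python sketch that splits a comma-separated property into field names. It is not the plugin's code: the helper `parse_clp_fields`, the whitespace trimming, the skipping of empty entries, and the second field name `exception` are all assumptions made for illustration.

```python
def parse_clp_fields(prop_value):
    """Split a comma-separated property value into a list of field names.

    Trimming whitespace and dropping empty entries are assumptions of this
    sketch; the plugin's actual parsing rules may differ.
    """
    return [name.strip() for name in prop_value.split(",") if name.strip()]


# Config keys are taken from the release notes above; "exception" is a
# hypothetical second field added only to show the comma-separated form.
stream_configs = {
    "stream.kafka.decoder.class.name":
        "org.apache.pinot.plugin.inputformat.clplog.CLPLogMessageDecoder",
    "stream.kafka.decoder.prop.fieldsForClpEncoding": "message,exception",
}

fields = parse_clp_fields(
    stream_configs["stream.kafka.decoder.prop.fieldsForClpEncoding"]
)
```

With this input, `fields` holds the two names `message` and `exception`.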
   
   # Testing performed
   * Validated that fields are encoded correctly using the added unit test.
   * Created a table with the following settings:
   
   ```json
   "tableIndexConfig": {
       ...,
       "streamConfigs": {
           ...,
           "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.clplog.CLPLogMessageDecoder",
           "stream.kafka.decoder.prop.fieldsForClpEncoding": "message"
       }
   }
   ```
   
   * Ingested logs through Kafka and verified that, in log events containing 
the `message` field, that field was replaced with CLP's three fields: 
`message_logtype`, `message_dictionaryVars`, and `message_encodedVars`.
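
The field replacement described above can be sketched roughly as follows. This is an illustration only: `toy_encode` is a deliberately simplified stand-in and is NOT CLP's encoding algorithm, the `<int>` and `<dict>` placeholders are invented for readability, and the sample record is made up. Only the three output column suffixes come from this PR.

```python
import re


def toy_encode(message):
    """Split a message into (logtype, dictionary_vars, encoded_vars).

    Toy scheme, not CLP's real algorithm: pure-integer tokens become encoded
    variables, other tokens containing a digit become dictionary variables,
    and everything else stays in the logtype as static text.
    """
    logtype_tokens, dictionary_vars, encoded_vars = [], [], []
    for token in message.split(" "):
        if re.fullmatch(r"-?\d+", token):
            encoded_vars.append(int(token))
            logtype_tokens.append("<int>")   # hypothetical placeholder
        elif re.search(r"\d", token):
            dictionary_vars.append(token)
            logtype_tokens.append("<dict>")  # hypothetical placeholder
        else:
            logtype_tokens.append(token)
    return " ".join(logtype_tokens), dictionary_vars, encoded_vars


def encode_fields(record, fields_for_clp_encoding):
    """Replace each selected string field with its three derived columns."""
    out = {}
    for key, value in record.items():
        if key in fields_for_clp_encoding and isinstance(value, str):
            logtype, dict_vars, enc_vars = toy_encode(value)
            out[key + "_logtype"] = logtype
            out[key + "_dictionaryVars"] = dict_vars
            out[key + "_encodedVars"] = enc_vars
        else:
            out[key] = value  # fields not selected pass through unchanged
    return out


# Made-up sample record; only "message" is selected for encoding.
record = {"timestamp": 1672531200000, "message": "Task task_12 finished in 35 ms"}
row = encode_fields(record, ["message"])
```

Here `row` no longer has a `message` key; it carries `message_logtype`, `message_dictionaryVars`, and `message_encodedVars` alongside the untouched `timestamp`.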
   

