kkrugler opened a new pull request #7222:
URL: https://github.com/apache/pinot/pull/7222


   ## Description
   First cut at resolving #7090, to support generating segment names based on 
the input file path and a specified pattern/template.
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade? (Assume upgrade order: 
Controller, Broker, Server, Minion)
   No
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   No
   
   Does this PR otherwise need attention when creating release notes?
   
   * [X] Yes (Please label this PR as **<code>release-notes</code>** and 
complete the section on Release Notes)
   ## Release Notes
   A new `inputFile` type is supported for the `segmentNameGeneratorSpec` 
section of a batch job file. This type supports naming the resulting segment 
file based on the input file name & path.
   ## Documentation
   To be added:
   
   The `inputFile` type supports naming the resulting segment file based on the 
input file name & path. Two parameters must be specified in the `configs` 
section of the `segmentNameGeneratorSpec`:
   
    * `file.path.pattern`: A Java regular expression used to match against the 
input file URI.
    * `segment.name.template`: A string template that supports 
`${filePathPattern:\<match group>}` substitution.
   
   For example, to set the segment name to be the same as the input file name 
(without the trailing `.gz`), use:
   ``` yaml
   segmentNameGeneratorSpec:
     type: inputFile
     configs:
       file.path.pattern: '.+/(.+)\.gz'
       segment.name.template: '\${filePathPattern:\1}'
   ```
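
To illustrate how the two settings interact, here is a minimal Java sketch of the substitution. It is not the code in this PR: the class name `InputFileSegmentNameSketch`, the method `segmentName`, and the sample URI are hypothetical. It assumes that the whole input file URI must match `file.path.pattern`, and that the template reaches the generator as `${filePathPattern:\1}`, i.e. that the leading backslash in the YAML above is only an escape consumed when the job file is processed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the implementation in this PR.
class InputFileSegmentNameSketch {
    // Matches placeholders of the form ${filePathPattern:\N}, where N is a group number.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{filePathPattern:\\\\(\\d+)\\}");

    static String segmentName(String filePathPattern, String template, String inputFileUri) {
        // Assumption: the entire URI has to match the configured pattern.
        Matcher fileMatcher = Pattern.compile(filePathPattern).matcher(inputFileUri);
        if (!fileMatcher.matches()) {
            throw new IllegalArgumentException(
                "Input file URI does not match pattern: " + inputFileUri);
        }

        // Replace each placeholder with the corresponding capture group.
        Matcher placeholder = PLACEHOLDER.matcher(template);
        StringBuffer result = new StringBuffer();
        while (placeholder.find()) {
            int group = Integer.parseInt(placeholder.group(1));
            // quoteReplacement keeps '$' and '\' in file names from being
            // interpreted as replacement syntax.
            placeholder.appendReplacement(
                result, Matcher.quoteReplacement(fileMatcher.group(group)));
        }
        placeholder.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String name = segmentName(
            ".+/(.+)\\.gz",
            "${filePathPattern:\\1}",
            "file:/data/input/events.csv.gz");
        System.out.println(name); // prints "events.csv"
    }
}
```

With the configuration above, an input file at `file:/data/input/events.csv.gz` would therefore yield the segment name `events.csv` under these assumptions.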
   
   




