sajjad-moradi opened a new pull request #6546:
URL: https://github.com/apache/incubator-pinot/pull/6546


   ## Description
   Currently, the Realtime Provisioning Helper tool takes a completed segment as 
input. With the changes in this PR, a user can provide data characteristics 
instead of an actual segment. With this option, the tool runs a preprocessing 
step that generates a segment from the provided characteristics. It then uses 
the generated segment to estimate the memory footprint as usual.
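
A minimal sketch of the preprocessing idea, assuming the data characteristics boil down to a row count plus a cardinality per column. The layout of the real `data-characteristics.json` is not shown in this PR, so the keys `name` and `cardinality` and the helpers `generate_rows` and `write_csv` are hypothetical, and this is not the tool's actual generator code:

```python
import csv
import io
import random


def generate_rows(num_rows, columns, seed=0):
    """Yield synthetic rows; each column draws from a fixed pool of distinct values.

    `columns` is a list of dicts with hypothetical keys "name" and "cardinality".
    """
    rng = random.Random(seed)
    # One pool of distinct values per column, sized by the requested cardinality.
    pools = {
        col["name"]: [f'{col["name"]}_{i}' for i in range(col["cardinality"])]
        for col in columns
    }
    for _ in range(num_rows):
        yield [rng.choice(pools[col["name"]]) for col in columns]


def write_csv(num_rows, columns, out):
    """Write a header line followed by `num_rows` synthetic rows to `out`."""
    writer = csv.writer(out)
    writer.writerow([col["name"] for col in columns])
    writer.writerows(generate_rows(num_rows, columns))


if __name__ == "__main__":
    # Hypothetical characteristics: two columns with different cardinalities.
    characteristics = [
        {"name": "country", "cardinality": 5},
        {"name": "userId", "cardinality": 100},
    ]
    buffer = io.StringIO()
    write_csv(1000, characteristics, buffer)
    print(buffer.getvalue().splitlines()[0])
```

A file produced this way would then be turned into a segment, which is the step the `MemoryEstimator$SegmentGenerator` log lines in the test run report.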
   
   The main changes in the code:
   - refactored a few existing `Generator`s in the `data/generator` package and 
added a couple of new ones
   - added `SegmentGenerator` to `MemoryEstimator`
   - modified `RealtimeProvisioningHelperCommand`
   ## Testing Done
   - Unit tests
   - Ran `pinot-admin RealtimeProvisioningHelper` locally with the same files 
provided in the unit tests (1M rows):
   ```bash
   2021/02/04 15:13:39.243 INFO [RealtimeProvisioningHelperCommand] [main] Executing command: RealtimeProvisioningHelper -tableConfigFile table-config.json -numPartitions 10 -pushFrequency null -numHosts 2,4,6,8,10,12,14,16 -numHours 2,3,4,5,6,7,8,9,10,11,12 -schemaFile schema.json -dataCharacteristicsFile data-characteristics.json -ingestionRate 150 -maxUsableHostMemory 48G -retentionHours 48
   2021/02/04 15:13:41.549 INFO [MemoryEstimator$SegmentGenerator] [main] Successfully generated data file: /var/folders/sd/fgc60hhj2994pk9vm1xw235h000xqy/T/2021-02-04_15:13:39-csv/output_0.csv
   2021/02/04 15:13:41.549 INFO [MemoryEstimator$SegmentGenerator] [main] Started creating segment from file: /var/folders/sd/fgc60hhj2994pk9vm1xw235h000xqy/T/2021-02-04_15:13:39-csv/output_0.csv
   2021/02/04 15:13:49.084 INFO [MemoryEstimator$SegmentGenerator] [main] Successfully created segment: testTable_18667_18766_0 at directory: /var/folders/sd/fgc60hhj2994pk9vm1xw235h000xqy/T/2021-02-04_15:13:39-segment/testTable_18667_18766_0
   2021/02/04 15:13:49.085 INFO [MemoryEstimator$SegmentGenerator] [main] Verifying the segment by loading it
   2021/02/04 15:13:49.161 INFO [MemoryEstimator$SegmentGenerator] [main] Successfully loaded segment: testTable_18667_18766_0 of size: 18766286 bytes
   
   ============================================================
   RealtimeProvisioningHelper -tableConfigFile table-config.json -numPartitions 10 -pushFrequency null -numHosts 2,4,6,8,10,12,14,16 -numHours 2,3,4,5,6,7,8,9,10,11,12 -schemaFile schema.json -dataCharacteristicsFile data-characteristics.json -ingestionRate 150 -maxUsableHostMemory 48G -retentionHours 48
   
   Note:
   
   * Table retention and push frequency ignored for determining retentionHours since it is specified in command
   * See https://docs.pinot.apache.org/operators/operating-pinot/tuning/realtime
   2021/02/04 15:13:53.141 INFO [RealtimeProvisioningHelperCommand] [main] 
   Memory used per host (Active/Mapped)
   
   numHosts --> 2               |4               |6               |8               |10              |12              |14              |16              |
   numHours
    2 --------> 7.67G/17.86G    |4.09G/9.53G     |2.56G/5.95G     |2.05G/4.76G     |1.53G/3.57G     |1.53G/3.57G     |1.53G/3.57G     |1.02G/2.38G     |
    3 --------> 8.03G/18.22G    |4.28G/9.72G     |2.68G/6.07G     |2.14G/4.86G     |1.61G/3.64G     |1.61G/3.64G     |1.61G/3.64G     |1.07G/2.43G     |
    4 --------> 8.38G/18.58G    |4.47G/9.91G     |2.79G/6.19G     |2.24G/4.95G     |1.68G/3.72G     |1.68G/3.72G     |1.68G/3.72G     |1.12G/2.48G     |
    5 --------> 9.02G/18.93G    |4.81G/10.1G     |3.01G/6.31G     |2.41G/5.05G     |1.8G/3.79G      |1.8G/3.79G      |1.8G/3.79G      |1.2G/2.52G      |
    6 --------> 9.1G/19.29G     |4.85G/10.29G    |3.03G/6.43G     |2.43G/5.14G     |1.82G/3.86G     |1.82G/3.86G     |1.82G/3.86G     |1.21G/2.57G     |
    7 --------> 9.59G/20.5G     |5.12G/10.93G    |3.2G/6.83G      |2.56G/5.47G     |1.92G/4.1G      |1.92G/4.1G      |1.92G/4.1G      |1.28G/2.73G     |
    8 --------> 9.81G/20G       |5.23G/10.67G    |3.27G/6.67G     |2.62G/5.33G     |1.96G/4G        |1.96G/4G        |1.96G/4G        |1.31G/2.67G     |
    9 --------> 11.01G/21.21G   |5.87G/11.31G    |3.67G/7.07G     |2.94G/5.66G     |2.2G/4.24G      |2.2G/4.24G      |2.2G/4.24G      |1.47G/2.83G     |
   10 --------> 10.8G/20.71G    |5.76G/11.05G    |3.6G/6.9G       |2.88G/5.52G     |2.16G/4.14G     |2.16G/4.14G     |2.16G/4.14G     |1.44G/2.76G     |
   11 --------> 11.87G/21.21G   |6.33G/11.31G    |3.96G/7.07G     |3.16G/5.66G     |2.37G/4.24G     |2.37G/4.24G     |2.37G/4.24G     |1.58G/2.83G     |
   12 --------> 11.23G/21.43G   |5.99G/11.43G    |3.74G/7.14G     |3G/5.71G        |2.25G/4.29G     |2.25G/4.29G     |2.25G/4.29G     |1.5G/2.86G      |
   2021/02/04 15:13:53.142 INFO [RealtimeProvisioningHelperCommand] [main] 
   Optimal segment size
   
   numHosts --> 2               |4               |6               |8               |10              |12              |14              |16              |
   numHours
    2 --------> 19.33M          |19.33M          |19.33M          |19.33M          |19.33M          |19.33M          |19.33M          |19.33M          |
    3 --------> 29M             |29M             |29M             |29M             |29M             |29M             |29M             |29M             |
    4 --------> 38.66M          |38.66M          |38.66M          |38.66M          |38.66M          |38.66M          |38.66M          |38.66M          |
    5 --------> 48.33M          |48.33M          |48.33M          |48.33M          |48.33M          |48.33M          |48.33M          |48.33M          |
    6 --------> 57.99M          |57.99M          |57.99M          |57.99M          |57.99M          |57.99M          |57.99M          |57.99M          |
    7 --------> 67.66M          |67.66M          |67.66M          |67.66M          |67.66M          |67.66M          |67.66M          |67.66M          |
    8 --------> 77.32M          |77.32M          |77.32M          |77.32M          |77.32M          |77.32M          |77.32M          |77.32M          |
    9 --------> 86.99M          |86.99M          |86.99M          |86.99M          |86.99M          |86.99M          |86.99M          |86.99M          |
   10 --------> 96.65M          |96.65M          |96.65M          |96.65M          |96.65M          |96.65M          |96.65M          |96.65M          |
   11 --------> 106.32M         |106.32M         |106.32M         |106.32M         |106.32M         |106.32M         |106.32M         |106.32M         |
   12 --------> 115.98M         |115.98M         |115.98M         |115.98M         |115.98M         |115.98M         |115.98M         |115.98M         |
   2021/02/04 15:13:53.144 INFO [RealtimeProvisioningHelperCommand] [main] 
   Consuming memory
   
   numHosts --> 2               |4               |6               |8               |10              |12              |14              |16              |
   numHours
    2 --------> 1.16G           |632.16M         |395.1M          |316.08M         |237.06M         |237.06M         |237.06M         |158.04M         |
    3 --------> 1.66G           |904.1M          |565.06M         |452.05M         |339.04M         |339.04M         |339.04M         |226.02M         |
    4 --------> 2.15G           |1.15G           |735.02M         |588.02M         |441.01M         |441.01M         |441.01M         |294.01M         |
    5 --------> 2.65G           |1.41G           |904.98M         |723.99M         |542.99M         |542.99M         |542.99M         |361.99M         |
    6 --------> 3.15G           |1.68G           |1.05G           |859.96M         |644.97M         |644.97M         |644.97M         |429.98M         |
    7 --------> 3.65G           |1.95G           |1.22G           |995.93M         |746.94M         |746.94M         |746.94M         |497.96M         |
    8 --------> 4.15G           |2.21G           |1.38G           |1.11G           |848.92M         |848.92M         |848.92M         |565.95M         |
    9 --------> 4.64G           |2.48G           |1.55G           |1.24G           |950.9M          |950.9M          |950.9M          |633.93M         |
   10 --------> 5.14G           |2.74G           |1.71G           |1.37G           |1.03G           |1.03G           |1.03G           |701.92M         |
   11 --------> 5.64G           |3.01G           |1.88G           |1.5G            |1.13G           |1.13G           |1.13G           |769.9M          |
   12 --------> 6.14G           |3.27G           |2.05G           |1.64G           |1.23G           |1.23G           |1.23G           |837.89M         |
   2021/02/04 15:13:53.145 INFO [RealtimeProvisioningHelperCommand] [main] 
   Total number of segments queried per host (for all partitions)
   
   numHosts --> 2               |4               |6               |8               |10              |12              |14              |16              |
   numHours
    2 --------> 360             |192             |120             |96              |72              |72              |72              |48              |
    3 --------> 240             |128             |80              |64              |48              |48              |48              |32              |
    4 --------> 180             |96              |60              |48              |36              |36              |36              |24              |
    5 --------> 150             |80              |50              |40              |30              |30              |30              |20              |
    6 --------> 120             |64              |40              |32              |24              |24              |24              |16              |
    7 --------> 105             |56              |35              |28              |21              |21              |21              |14              |
    8 --------> 90              |48              |30              |24              |18              |18              |18              |12              |
    9 --------> 90              |48              |30              |24              |18              |18              |18              |12              |
   10 --------> 75              |40              |25              |20              |15              |15              |15              |10              |
   11 --------> 75              |40              |25              |20              |15              |15              |15              |10              |
   12 --------> 60              |32              |20              |16              |12              |12              |12              |8               |
   
   ```
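
As a sanity check on the last table ("Total number of segments queried per host"), the values appear consistent with a simple ceiling calculation over the command-line arguments. The sketch below is a reconstruction from the printed numbers, not code taken from the tool. It assumes a replication factor of 3, which is inferred from the output because `table-config.json` is not shown here, and `segments_per_host` is a hypothetical helper name:

```python
import math


def segments_per_host(num_partitions, replication, num_hosts,
                      retention_hours, num_hours):
    """Estimate how many segments a single host serves.

    Each host carries ceil(partitions * replication / hosts) partitions, and
    each partition retains ceil(retention_hours / num_hours) segments.
    """
    partitions_per_host = math.ceil(num_partitions * replication / num_hosts)
    segments_per_partition = math.ceil(retention_hours / num_hours)
    return partitions_per_host * segments_per_partition


NUM_PARTITIONS = 10    # -numPartitions 10
RETENTION_HOURS = 48   # -retentionHours 48
REPLICATION = 3        # assumption: inferred from the table, not shown in the PR
HOSTS = (2, 4, 6, 8, 10, 12, 14, 16)

for num_hours in (2, 5, 12):
    row = [
        segments_per_host(NUM_PARTITIONS, REPLICATION, hosts,
                          RETENTION_HOURS, num_hours)
        for hosts in HOSTS
    ]
    print(num_hours, row)
# 2 [360, 192, 120, 96, 72, 72, 72, 48]
# 5 [150, 80, 50, 40, 30, 30, 30, 20]
# 12 [60, 32, 20, 16, 12, 12, 12, 8]
```

The ceiling on partitions per host would explain why the columns for 10, 12, and 14 hosts are identical: under the assumed replication, each of those configurations places 3 partitions on the busiest host.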
   

