Manis99803 opened a new issue #7843: URL: https://github.com/apache/pinot/issues/7843
Should a dimension table honor replication for the storage quota check? At the moment it does not seem to.

For example, say I have a Pinot cluster with 4 servers and create a dimension table with a storage quota of 200 MB. Because this is a dimension table, it is replicated on all the servers. We then populate the table with segments of 100k records each, where the uncompressed size of a segment is ~23 MB (232 bytes per record). Based on that, I expected the table to hold around 900k records (200 MB / ~23 MB per segment ≈ 8.7 segments). However, that is not the case: the table does not accept more than 2 segments. When I try to push a 3rd segment, Pinot throws an error. It looks like the following calculation is happening:

```
23 MB (size of each segment) * 2 (number of segments already present) * 4 (we have 4 nodes and the dimension table is copied to all of them)
  = 23 * 2 * 4 = 184 MB
Size of the new incoming uncompressed segment = 23 MB
Newly estimated size = 184 + 23 = 207 MB
207 MB is greater than 200 MB, which is why Pinot does not accept any more segments of 100k records.
```

Looking at the code ([Code Link](https://github.com/apache/pinot/blob/master/pinot-controller/src/main/java/org/apache/pinot/controller/validation/StorageQuotaChecker.java#L83)), `numReplicas` is fetched from the table configuration and used to calculate `allowedSize`. If a table is replicated on 4 servers, the replication factor used to compute `allowedSize` should be 4 rather than the value from the table configuration, because the user might specify a replication factor of 1 even though the setup has more than one server.
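The arithmetic above can be reproduced with a minimal Python sketch. This is only a model of the check inferred from the observed behavior, not Pinot's actual implementation: the function name `segments_accepted` and all of its parameters are hypothetical. It assumes that existing segments are counted once per server copy, that the incoming segment is counted once, and that the allowed size is the configured quota multiplied by a replication factor.

```python
def segments_accepted(quota_mb, segment_mb, copies_on_servers, replication_for_quota):
    """Count how many equally sized segments pass the assumed quota check.

    Assumed check (inferred from the behavior described in this issue):
        existing * segment_mb * copies_on_servers + segment_mb
            <= quota_mb * replication_for_quota
    """
    allowed_mb = quota_mb * replication_for_quota
    accepted = 0
    while accepted * segment_mb * copies_on_servers + segment_mb <= allowed_mb:
        accepted += 1
    return accepted


# Observed: 4 server copies, but replication taken from the table config (1).
observed = segments_accepted(200, 23, copies_on_servers=4, replication_for_quota=1)

# Proposed: use the actual number of copies (4) when computing the allowed size.
proposed = segments_accepted(200, 23, copies_on_servers=4, replication_for_quota=4)

print(observed, proposed)
```

Under these assumptions the first call stops at 2 segments, matching the error on the 3rd push, while the second call accepts 9 segments, which is in line with the expected ~900k records.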