[ 
https://issues.apache.org/jira/browse/IMPALA-11810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-11810:
-----------------------------------
    Component/s: Frontend
                     (was: fe)

> The mem estimate is incorrect in HdfsScanNode 
> ----------------------------------------------
>
>                 Key: IMPALA-11810
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11810
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 4.2.0, Impala 4.1.1
>            Reporter: jhkcool
>            Priority: Major
>         Attachments: after_used_materialized.png, current_totalBytes.png
>
>
> This issue concerns how perInstanceMemEstimate is calculated in the following 
> method of the HdfsScanNode class:
> {code:java}
> @Override  
> public void computeNodeResourceProfile(TQueryOptions queryOptions) {
> ...
>     long avgScanRangeBytes =
>         (long) Math.ceil(sumValues(totalBytesPerFs_) / (double) scanRangeSize);
> ...
> }{code}
> The calculation uses the total size of all the table's data files to estimate 
> the memory consumed by the HDFS scan, which is unreasonable. Not all of the 
> table's data is read into memory; only the materialized fields are read, so 
> only the file size occupied by the materialized fields should be counted.
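
To make the difference concrete, here is a minimal standalone sketch, not Impala's actual code. It assumes scanRangeSize in the snippet above is the number of scan ranges. The class ScanEstimateSketch, its methods, and the materializedFraction parameter (the share of file bytes belonging to materialized columns) are hypothetical names introduced only for illustration. Such a fraction would be most meaningful for columnar formats, where unmaterialized columns can be skipped.

```java
// Hypothetical illustration only; none of these names exist in Impala.
final class ScanEstimateSketch {
    private ScanEstimateSketch() {}

    /** Mirrors the quoted expression: ceil(totalBytes / scanRangeCount). */
    static long avgScanRangeBytes(long totalBytes, long scanRangeCount) {
        if (scanRangeCount <= 0) {
            return 0L; // nothing to scan, so no per-range bytes
        }
        return (long) Math.ceil(totalBytes / (double) scanRangeCount);
    }

    /**
     * Proposed variant: count only the bytes assumed to belong to
     * materialized columns. materializedFraction is an assumed input
     * in [0, 1]; how it would be derived is out of scope here.
     */
    static long materializedAvgScanRangeBytes(
            long totalBytes, long scanRangeCount, double materializedFraction) {
        if (materializedFraction < 0.0 || materializedFraction > 1.0) {
            throw new IllegalArgumentException("materializedFraction must be in [0, 1]");
        }
        long materializedBytes = (long) Math.ceil(totalBytes * materializedFraction);
        return avgScanRangeBytes(materializedBytes, scanRangeCount);
    }

    public static void main(String[] args) {
        long totalBytes = 1000L;
        long scanRanges = 4L;
        System.out.println("all bytes:          "
                + avgScanRangeBytes(totalBytes, scanRanges));
        System.out.println("materialized bytes: "
                + materializedAvgScanRangeBytes(totalBytes, scanRanges, 0.25));
    }
}
```

With these illustrative inputs, the per-range estimate falls in proportion to the assumed materialized fraction, which is the effect the report asks for.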


