unigof opened a new issue, #17407: URL: https://github.com/apache/dolphinscheduler/issues/17407
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

After setting up the Grafana dashboard for the workers, I found that almost all task instances are dispatched to the same worker node. There is an error in how `HostWeight` is calculated when using `LowerWeightHostManager`. Details below:

```
// HostWeight.java
private double calculateWeight(double cpuUsage, double memoryUsage, double diskUsage, double threadPoolUsage,
                               long startTime) {
    double calculatedWeight = 100 - (cpuUsage * CPU_USAGE_FACTOR + memoryUsage * MEMORY_USAGE_FACTOR
            + diskUsage * DISK_USAGE_FACTOR + threadPoolUsage * THREAD_USAGE_FACTOR);
    long uptime = System.currentTimeMillis() - startTime;
    if (uptime > 0 && uptime < Constants.WARM_UP_TIME) {
        // If the warm-up is not over, add the weight
        return calculatedWeight * Constants.WARM_UP_TIME / uptime;
    }
    return calculatedWeight;
}
```

`calculatedWeight` is computed as 100 minus the weighted usage, so the smallest value corresponds to the highest load, that is, the busiest worker. However, the worker node is selected by the smallest value:

```
// LowerWeightRoundRobin.java
public HostWeight doSelect(Collection<HostWeight> sources) {
    double totalWeight = 0;
    double lowWeight = 0;
    HostWeight lowerNode = null;
    for (HostWeight hostWeight : sources) {
        totalWeight += hostWeight.getWeight();
        hostWeight.setCurrentWeight(hostWeight.getCurrentWeight() + hostWeight.getWeight());
        if (lowerNode == null || lowWeight > hostWeight.getCurrentWeight()) {
            lowerNode = hostWeight;
            lowWeight = hostWeight.getCurrentWeight();
        }
    }
    if (lowerNode != null) {
        lowerNode.setCurrentWeight(lowerNode.getCurrentWeight() + totalWeight);
    }
    return lowerNode;
}
```

### What you expected to happen

The weight calculation and the node selection should agree with each other, so that tasks are spread across the worker nodes instead of piling up on one.

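The mismatch described under "What happened" can be reproduced in isolation. The following is only a minimal sketch, not the DolphinScheduler code: `DemoHost`, `simulate`, the two usage figures (80 and 10), and the single aggregated usage value are all hypothetical, and the weights are held constant instead of being refreshed from heartbeats. Only the body of `doSelect` follows the method quoted above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class WeightSelectionDemo {

    // Hypothetical stand-in for HostWeight: only the fields the selector touches.
    static final class DemoHost {
        final String name;
        final double weight;
        double currentWeight;
        int selected;

        DemoHost(String name, double usagePercent) {
            this.name = name;
            // Same shape as calculateWeight in the issue, using one aggregated
            // usage figure (an assumption) and ignoring warm-up.
            this.weight = 100 - usagePercent;
        }
    }

    // Selection loop following the doSelect quoted in the issue.
    static DemoHost doSelect(Collection<DemoHost> sources) {
        double totalWeight = 0;
        double lowWeight = 0;
        DemoHost lowerNode = null;
        for (DemoHost host : sources) {
            totalWeight += host.weight;
            host.currentWeight += host.weight;
            if (lowerNode == null || lowWeight > host.currentWeight) {
                lowerNode = host;
                lowWeight = host.currentWeight;
            }
        }
        if (lowerNode != null) {
            lowerNode.currentWeight += totalWeight;
        }
        return lowerNode;
    }

    // Runs the selector `rounds` times and returns {busyCount, idleCount}.
    static int[] simulate(int rounds) {
        DemoHost busy = new DemoHost("busy", 80); // weight 20
        DemoHost idle = new DemoHost("idle", 10); // weight 90
        List<DemoHost> hosts = Arrays.asList(busy, idle);
        for (int i = 0; i < rounds; i++) {
            doSelect(hosts).selected++;
        }
        return new int[] {busy.selected, idle.selected};
    }

    public static void main(String[] args) {
        int[] counts = simulate(110);
        System.out.println("busy=" + counts[0] + ", idle=" + counts[1]);
    }
}
```

Under these assumptions the busy host is selected far more often than the idle one, which matches the skew seen in Grafana.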
### How to reproduce

Run some tasks and check Grafana. The panel query is:

```
sum(increase(ds_task_execution_count_by_type_total{application="worker-server", namespace=~"$namespace", app_name=~"$app_name", cluster=~"$cluster", instance=~"$instance", task_type="SHELL"}[5m])) by (instance)
```

The host weight and worker group settings are left at their defaults.

### Anything else

We run DolphinScheduler 3.2.2 in our production environment, so we need this fixed.

### Version

3.2.x

### Are you willing to submit PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
