csp33 opened a new issue, #36326:
URL: https://github.com/apache/airflow/issues/36326

   ### Apache Airflow version
   
   2.8.0
   
   ### If "Other Airflow 2 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   When a task has a long name (or is nested inside one or more task groups with 
long names), the name of the pod created for the task differs from the task 
hostname stored in the Airflow metadata DB.
   
   ### What you think should happen instead?
   
   The task hostname should be the same as the pod name.
   
   ### How to reproduce
   
   1. Run the following DAG
     ```python
   import time
   from airflow.decorators import task, task_group
   from airflow.models import DAG
   from datetime import datetime, timedelta
   
   with DAG(
       dag_id="test_long_task_name",
       schedule=None,
       default_args={
           "depends_on_past": False,
           "wait_for_downstream": False,
           "start_date": datetime(2023, 12, 20),
           "retry_delay": timedelta(seconds=10),
           "retries": 1,
       },
       catchup=False,
       max_active_runs=1,
       render_template_as_native_obj=True,
   ) as dag:
   
       @task_group
       def task_group_with_long_name():
           @task_group
           def inner_task_group_with_long_name():
               @task
               def task1():
                   pass
   
               @task
               def task2():
                   pass
   
               task1()
               task2()
   
           inner_task_group_with_long_name()
   
       task_group_with_long_name()
     ```
   
![image](https://github.com/apache/airflow/assets/28935464/c5888712-2d4f-4c79-9d13-3f24ea412664)
   
   2. Check the hostname property for both tasks:
   
![image](https://github.com/apache/airflow/assets/28935464/60950e2f-e3db-4243-8921-6b4826d7d795)
   
![image](https://github.com/apache/airflow/assets/28935464/95e49871-d91f-4df3-a9e3-aab700cb90bc)
   
   3. Check the pod name for both tasks (scheduler logs):
   ```
   [2023-12-20T09:45:05.497+0000] {kubernetes_executor_utils.py:396} INFO - 
Creating kubernetes pod for job is 
TaskInstanceKey(dag_id='test_long_task_name', 
task_id='task_group_with_long_name.inner_task_group_with_long_name.task2', 
run_id='manual__2023-12-20T09:45:00+00:00', try_number=1, map_index=-1), with 
pod name 
test-long-task-name-task-group-with-long-name-inner-task-group-with-lon-efj7zr6z,
 annotations: <omitted>
   ```
   
   ```
   [2023-12-20T09:45:05.776+0000] {kubernetes_executor_utils.py:396} INFO - 
Creating kubernetes pod for job is 
TaskInstanceKey(dag_id='test_long_task_name', 
task_id='task_group_with_long_name.inner_task_group_with_long_name.task1', 
run_id='manual__2023-12-20T09:45:00+00:00', try_number=1, map_index=-1), with 
pod name 
test-long-task-name-task-group-with-long-name-inner-task-group-with-lon-diwm37yp,
 annotations: <omitted>
   ```
   
   
   ## Task1
   - hostname: `test-long-task-name-task-group-with-long-name-inner-task-group`
   - pod name: 
`test-long-task-name-task-group-with-long-name-inner-task-group-with-lon-diwm37yp`
   ## Task2
   - hostname: `test-long-task-name-task-group-with-long-name-inner-task-group`
   - pod name: 
`test-long-task-name-task-group-with-long-name-inner-task-group-with-lon-efj7zr6z`
   
   ### Operating System
   
   Airflow in Kubernetes
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-cncf-kubernetes==7.11.0
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   This issue also happens in previous Airflow versions.
   
   After some research, I found the following:
   - The function used to determine the hostname is set by the 
[hostname_callable](https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#hostname-callable)
 parameter.
   - Hostnames have a maximum length 
[[source](https://stackoverflow.com/questions/8724954/what-is-the-maximum-number-of-characters-for-a-host-name-in-unix)].
 In the Airflow image, this limit is **64**.
   - In the k8s provider, the maximum pod name length is 80 
[[source](https://github.com/apache/airflow/blob/2.8.0/airflow/providers/cncf/kubernetes/kubernetes_helper_functions.py#L62)].
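
To illustrate the findings above, here is a minimal sketch of how an 80-character pod name can collapse into the reported hostname. The helper `truncate_to_hostname` and its default cut-off are assumptions of mine, not Airflow or Kubernetes code: I assume the name is cut to 63 characters and that a trailing `-` or `.` left by the cut is dropped, because that reproduces the 62-character hostname shown for both tasks. The exact mechanism is not confirmed here.

```python
# Pod names copied from the scheduler logs in this issue (80 characters each).
POD_TASK1 = (
    "test-long-task-name-task-group-with-long-name-inner-task-group-with-lon-diwm37yp"
)
POD_TASK2 = (
    "test-long-task-name-task-group-with-long-name-inner-task-group-with-lon-efj7zr6z"
)


def truncate_to_hostname(pod_name: str, max_len: int = 63) -> str:
    """Hypothetical helper: cut a pod name down to a hostname-sized string.

    Assumption: the name is cut to `max_len` characters and any trailing
    '-' or '.' left behind by the cut is stripped.
    """
    if len(pod_name) <= max_len:
        return pod_name
    return pod_name[:max_len].rstrip("-.")


if __name__ == "__main__":
    for pod in (POD_TASK1, POD_TASK2):
        host = truncate_to_hostname(pod)
        print(f"{len(pod)} chars -> {len(host)} chars: {host}")
```

Under these assumptions both pods end up with the same hostname, because the random suffix that makes each pod name unique sits past the cut-off.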
   
   
   I see two possible approaches to solve this issue:
   1. Reduce the maximum pod name length in the k8s provider to 64, matching 
`HOST_NAME_MAX`.
   2. Increase the `HOST_NAME_MAX` limit in the Airflow image.
   
   I believe the first one is the safest.
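
A rough sketch of what the first approach could look like: build the pod name so that the base is shortened and the unique suffix is kept within the limit. This is not the provider's implementation; `make_pod_name`, `SUFFIX_LEN` and the default `max_length=63` are hypothetical names and values I picked for illustration (63 so the name also fits a single DNS label; pass 64 to match `HOST_NAME_MAX` exactly).

```python
import re
import secrets
import string

SUFFIX_LEN = 8  # assumption: same suffix length as the pod names in the logs


def make_pod_name(dag_id: str, task_id: str, max_length: int = 63) -> str:
    """Hypothetical helper: build a unique pod name no longer than max_length.

    The base (dag_id + task_id) is shortened first, so the random suffix
    that makes the name unique always survives.
    """
    # Lowercase and replace anything that is not a-z / 0-9 with '-'.
    base = re.sub(r"[^a-z0-9]+", "-", f"{dag_id}-{task_id}".lower()).strip("-")
    # Reserve room for '-' plus the suffix, then drop a dangling '-'.
    base = base[: max_length - SUFFIX_LEN - 1].rstrip("-")
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(SUFFIX_LEN))
    return f"{base}-{suffix}" if base else suffix


if __name__ == "__main__":
    group = "task_group_with_long_name.inner_task_group_with_long_name"
    for task in ("task1", "task2"):
        name = make_pod_name("test_long_task_name", f"{group}.{task}")
        print(len(name), name)
```

With a name built this way, a hostname limited to the same length would still contain the unique suffix, so the hostname and the pod name would match.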
   
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

