Elliott Shugerman created AIRFLOW-3973:
------------------------------------------
Summary: `airflow initdb` logs errors when `Variable` is used in
DAGs and Postgres is used for the internal database
Key: AIRFLOW-3973
URL: https://issues.apache.org/jira/browse/AIRFLOW-3973
Project: Apache Airflow
Issue Type: Bug
Reporter: Elliott Shugerman
Assignee: Elliott Shugerman
h2. Example
{{ERROR [airflow.models.DagBag] Failed to import: /home/elliott/clean-airflow/dags/dag.py
Traceback (most recent call last):
  File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
    cursor, statement, parameters, context
  File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
    cursor.execute(statement, parameters)
psycopg2.ProgrammingError: relation "variable" does not exist
LINE 2: FROM variable}}
h2. Explanation
The first thing {{airflow initdb}} does is run the Alembic migrations. All
migrations are run in one transaction. Most tables, including the {{variable}}
table, are defined in the initial migration. A [later
migration|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/cc1e65623dc7_add_max_tries_column_to_task_instance.py]
imports and initializes {{models.DagBag}}. Upon initialization, {{DagBag}}
calls its {{collect_dags}} method, which scans the DAGs directory and attempts
to load all DAGs it finds. When it loads a DAG that uses a {{Variable}}, it
will query the database to see if that {{Variable}} is defined in the
{{variable}} table. I am not sure exactly how the connection for that
query is created, but it seems fair to assume that it does _not_ use
the transaction in which the migrations run. Because all migrations run in
one transaction and have not yet finished, the migration that creates the
{{variable}} table has not been committed, so the table is not visible to
any other connection or transaction. The query therefore raises
{{ProgrammingError}}, which {{collect_dags}} catches and logs.
NOTE: This does not occur with the default SQLite database.
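
The visibility rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in {{sqlite3}} module with an explicit {{BEGIN}} as a stand-in so that it runs without a Postgres server, and the helper names {{table_visible}} and {{demo_uncommitted_ddl}} are hypothetical. It says nothing about how Airflow's own migrations behave on SQLite; it only shows that a table created inside an open transaction is not visible to a second connection until the commit.

```python
import os
import sqlite3
import tempfile


def table_visible(db_path, table):
    """Return True if a fresh, separate connection can query `table`."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f"SELECT 1 FROM {table} LIMIT 1")
        return True
    except sqlite3.OperationalError:
        # Raised when the table is not (yet) visible to this connection.
        return False
    finally:
        conn.close()


def demo_uncommitted_ddl():
    """Create a table inside an open transaction and probe it from outside."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        # isolation_level=None gives manual control over BEGIN/COMMIT.
        migrator = sqlite3.connect(db_path, isolation_level=None)
        try:
            migrator.execute("BEGIN")
            migrator.execute(
                "CREATE TABLE variable (id INTEGER PRIMARY KEY, val TEXT)"
            )
            # Probe while the creating transaction is still open.
            before_commit = table_visible(db_path, "variable")
            migrator.execute("COMMIT")
            # Probe again once the DDL has been committed.
            after_commit = table_visible(db_path, "variable")
        finally:
            migrator.close()
    return before_commit, after_commit
```

Calling {{demo_uncommitted_ddl()}} returns a pair of booleans: whether the second connection could see the table before the commit, and whether it could see it afterwards.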
h2. Proposed Solution
Run each Alembic migration in its own transaction. I will be opening a pull
request which accomplishes this shortly.
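
To illustrate why per-migration transactions should help, here is a minimal sketch, not the actual patch. It assumes Python's built-in {{sqlite3}} module with explicit {{BEGIN}}/{{COMMIT}} as a stand-in for Postgres, and the names {{create_variable_table}}, {{probe_from_other_connection}} and {{run_migrations}} are hypothetical. The second "migration" plays the role of the {{DagBag}} import by querying {{variable}} over a separate connection. For reference, Alembic's {{context.configure()}} accepts a {{transaction_per_migration}} option, which may be the relevant setting.

```python
import os
import sqlite3
import tempfile


def create_variable_table(conn, db_path, log):
    # Stand-in for the initial migration that defines most tables.
    conn.execute("CREATE TABLE variable (id INTEGER PRIMARY KEY, val TEXT)")


def probe_from_other_connection(conn, db_path, log):
    # Stand-in for the later migration whose DagBag import queries
    # `variable` over a connection outside the migration transaction.
    other = sqlite3.connect(db_path)
    try:
        other.execute("SELECT 1 FROM variable LIMIT 1")
        log.append("variable visible")
    except sqlite3.OperationalError as exc:
        log.append(f"error: {exc}")
    finally:
        other.close()


def run_migrations(migrations, per_migration):
    """Apply migrations in one transaction, or in one transaction each."""
    log = []
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        # isolation_level=None gives manual control over BEGIN/COMMIT.
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            if not per_migration:
                conn.execute("BEGIN")
            for migrate in migrations:
                if per_migration:
                    conn.execute("BEGIN")
                migrate(conn, db_path, log)
                if per_migration:
                    conn.execute("COMMIT")
            if not per_migration:
                conn.execute("COMMIT")
        finally:
            conn.close()
    return log


MIGRATIONS = [create_variable_table, probe_from_other_connection]
single_transaction_log = run_migrations(MIGRATIONS, per_migration=False)
per_migration_log = run_migrations(MIGRATIONS, per_migration=True)
```

In this sketch, {{single_transaction_log}} records the probe failing because the table is still uncommitted, while {{per_migration_log}} records the probe succeeding because the first migration was committed before the second one started.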