Under which category would you file this issue?
Airflow Core
Apache Airflow version
3.2.0.dev0
What happened and how to reproduce it?
In Airflow 3, logical_date is nullable for many run types. The scheduler's auto-pausing logic in dagrun.py relies on .order_by(DagRun.logical_date.desc()), which is non-deterministic for NULLs and fails to isolate Manual vs. Scheduled runs.
Steps to Reproduce:
Set max_consecutive_failed_dag_runs=3.
A healthy scheduled run succeeds (Run A).
3 Manual test runs fail with logical_date=None (Runs B, C, D).
On Postgres/MySQL, the query order_by(logical_date.desc()) handles NULLs inconsistently. Often, the old success (Run A) is returned in the "top 3," preventing the auto-pause.
In other cases, manual test failures "pollute" the count and pause the production schedule unnecessarily.
What you think should happen instead?
Deterministic Ordering: The scheduler should use a stable ordering mechanism that accounts for nullable logical_date by using order_by(DagRun.logical_date.desc().nulls_last(), DagRun.id.desc()) or prioritizing the run_after column.
RunType Isolation: Evaluation of consecutive failures should be isolated by run_type (e.g., only scheduled runs should trigger a production auto-pause) to prevent manual test failures from impacting automated schedules.
The DagRun model in Airflow 3 was updated to make logical_date optional, but the logic in _check_last_n_dagruns_failed in airflow/models/dagrun.py was not updated to handle this change. Specifically, the query at line 868: .order_by(DagRun.logical_date.desc()) is a regression that was missed when similar fixes were applied in PR #47301. It incorrectly treats all run types as a single chronological sequence and uses an unstable sort on a nullable column.
Operating System
macOS
Deployment
Virtualenv installation
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
N/A
Official Helm Chart version
Not Applicable
Kubernetes Version
N/A
Helm Chart configuration
N/A
Docker Image customizations
N/A
Anything else?
This issue was identified while researching the impact of nullable logical_date on core scheduler stability in Airflow 3. It appears to be a regression that was missed when similar fixes for nullable dates were applied elsewhere (such as in PR #47301 for get_previous_scheduled_dagrun). This problem occurs every time manual and scheduled runs are mixed in the history of a DAG with auto-pausing enabled.
Are you willing to submit PR?
Code of Conduct
Under which category would you file this issue?
Airflow Core
Apache Airflow version
3.2.0.dev0
What happened and how to reproduce it?
In Airflow 3, logical_date is nullable for many run types. The scheduler's auto-pausing logic in dagrun.py relies on .order_by(DagRun.logical_date.desc()), which is non-deterministic for NULLs and fails to isolate Manual vs. Scheduled runs.
Steps to Reproduce:
Set max_consecutive_failed_dag_runs=3.
A healthy scheduled run succeeds (Run A).
3 Manual test runs fail with logical_date=None (Runs B, C, D).
On Postgres/MySQL, the query order_by(logical_date.desc()) handles NULLs inconsistently. Often, the old success (Run A) is returned in the "top 3," preventing the auto-pause.
In other cases, manual test failures "pollute" the count and pause the production schedule unnecessarily.
What you think should happen instead?
Deterministic Ordering: The scheduler should use a stable ordering mechanism that accounts for nullable logical_date by using order_by(DagRun.logical_date.desc().nulls_last(), DagRun.id.desc()) or prioritizing the run_after column.
RunType Isolation: Evaluation of consecutive failures should be isolated by run_type (e.g., only scheduled runs should trigger a production auto-pause) to prevent manual test failures from impacting automated schedules.
The DagRun model in Airflow 3 was updated to make logical_date optional, but the logic in _check_last_n_dagruns_failed in airflow/models/dagrun.py was not updated to handle this change. Specifically, the query at line 868: .order_by(DagRun.logical_date.desc()) is a regression that was missed when similar fixes were applied in PR #47301. It incorrectly treats all run types as a single chronological sequence and uses an unstable sort on a nullable column.
Operating System
macOS
Deployment
Virtualenv installation
Apache Airflow Provider(s)
No response
Versions of Apache Airflow Providers
N/A
Official Helm Chart version
Not Applicable
Kubernetes Version
N/A
Helm Chart configuration
N/A
Docker Image customizations
N/A
Anything else?
This issue was identified while researching the impact of nullable logical_date on core scheduler stability in Airflow 3. It appears to be a regression that was missed when similar fixes for nullable dates were applied elsewhere (such as in PR #47301 for get_previous_scheduled_dagrun). This problem occurs every time manual and scheduled runs are mixed in the history of a DAG with auto-pausing enabled.
Are you willing to submit PR?
Code of Conduct