Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAG-level permissions set in Web UI disappear from roles on DAG sync #32839

Closed
2 tasks done
TruthIsNear opened this issue Jul 25, 2023 · 6 comments · Fixed by #33632
Closed
2 tasks done

DAG-level permissions set in Web UI disappear from roles on DAG sync #32839

TruthIsNear opened this issue Jul 25, 2023 · 6 comments · Fixed by #33632
Assignees
Labels
affected_version:2.6 Issues Reported for 2.6 area:core kind:bug This is a clearly a bug
Milestone

Comments

@TruthIsNear
Copy link

TruthIsNear commented Jul 25, 2023

Apache Airflow version

2.6.3

What happened

Versions: 2.6.2, 2.6.3, main

PR #30340 introduced a bug that happens whenever a DAG gets updated or a new DAG is added

Potential fix: Adding the code that was removed in PR #30340 back to airflow/models/dagbag.py fixes the issue. I've tried it on the current main branch using Breeze.

What you think should happen instead

Permissions set in Web UI stay whenever a DAG sync happens

How to reproduce

  1. Download docker-compose.yaml:
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.6.2/docker-compose.yaml'
  1. Create dirs and set the right Airflow user:
mkdir -p ./dags ./logs ./plugins ./config && \
echo -e "AIRFLOW_UID=$(id -u)" > .env
  1. Add test_dag.py to ./dags:
import datetime

import pendulum

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    catchup=False,
    dagrun_timeout=datetime.timedelta(minutes=60),
) as dag:
    test = BashOperator(
        task_id="test",
        bash_command="echo 1",
    )
if __name__ == "__main__":
    dag.test()
  1. Run docker compose: docker compose up
  2. Create role in Web UI: Security > List Roles > Add a new record:
    Name: test
    Permissions: can read on DAG:test
  3. Update test_dag.py: change bash_command="echo 1" to bash_command="echo 2"
  4. Check test role's permissions: can read on DAG:test will be removed

Another option is to add a new dag instead of changing the existing one:
6. Add another dag to ./dags, code doesn't matter
7. Restart scheduler: docker restart [scheduler container name]
9. Check test role's permissions: can read on DAG:test will be removed

Operating System

Ubuntu 22.04.1 LTS

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

Docker 24.0.2

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@TruthIsNear TruthIsNear added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jul 25, 2023
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 25, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@potiuk potiuk modified the milestones: Airflow 2.7.0, Airflow 2.6.4 Jul 25, 2023
@potiuk potiuk removed the needs-triage label for new issues that we didn't triage yet label Jul 25, 2023
@potiuk potiuk added the affected_version:2.6 Issues Reported for 2.6 label Jul 25, 2023
@potiuk
Copy link
Member

potiuk commented Jul 25, 2023

Same problem as in #32178 #32837. Feel free to bring it back.

@SamWheating
Copy link
Contributor

Just catching up on this issue and figured I'd write out some thoughts here -

We have two places to define permissions (manually via the UI/CLI and in code via DAG.access_control) and there's no stance on which one is authoritative which leads to all sorts of unexpected behaviour (mostly around not cleaning up roles enough, or cleaning up roles too aggressively)

I would propose making the access control for a given DAG authoritative as long as it is explicitly set. This way, if a DAG was updated, removing access_control from a DAG would not trigger a cleanup, but setting access_control={} would.

So we have three cases. A DAG defined like this:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    access_control={'role1': {'can_read'}, 'role2': {'can_read', 'can_edit', 'can_delete'}}
) as dag:

Would overwrite any existing dag-level permissions for this DAG, ensuring that the roles listed here are the only ones with permissions specific to this DAG.

While a DAG with explicitly empty access_contol, like this:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    access_control={}
) as dag:

Would explicitly clear out all DAG-level permissions for this role.

While a DAG without any access_control defined:

with DAG(
    dag_id="test",
    schedule="0 0 * * *",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
) as dag:

Would just defer to the existing DAG-level roles in the database.

This presents its own hazards around dangling and stray permissions, but I think that any approach here is going to have some sort of downside. The important part is that the approach to DAG-level access is clearly-defined and well documented (neither of which is true at the moment).

Thoughts? I likely missed some consideration here, but I am working on an implementation PR which I will share shortly.

@biebiep
Copy link

biebiep commented Aug 23, 2023

@SamWheating

What happens to blanket roles in Scenario 2: empty access control if it is going to be authoritative and empty? Do all default rights/roles also get a block on them?

Or do things like the Admin role still supercede these specific accesses?

@SamWheating
Copy link
Contributor

What happens to blanket roles in Scenario 2: empty access control if it is going to be authoritative and empty? Do all default rights/roles also get a block on them?

Good question - currently the can (read|edit|delete) on DAG roles are treated as all-encompassing (something like can read on DAG:*), Here's the relevant portion of the auth code where DAG-level access is only checked if Airflow-wide access isn't permitted:

for perm in perms:
if perm in (
(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
(permissions.ACTION_CAN_EDIT, permissions.RESOURCE_DAG),
(permissions.ACTION_CAN_DELETE, permissions.RESOURCE_DAG),
):
can_access_all_dags = self.has_access(*perm)
if can_access_all_dags:
continue
action = perm[0]
if self.can_access_some_dags(action, dag_id):
continue
return False

I think that this is a good policy, and access-controls at defined in DAG code should only pertain to DAG-level access, not airflow-wide access.

So in the case of access_control={}, all of the dag-specific rules would be wiped, but the Airflow-wise permissions would still apply.

@potiuk
Copy link
Member

potiuk commented Aug 24, 2023

I think this is good and simple way of solving the problem. Thanks for looking into it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affected_version:2.6 Issues Reported for 2.6 area:core kind:bug This is a clearly a bug
Projects
None yet
5 participants