Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat dag-defined access_control as authoritative if defined #33632

Merged
merged 6 commits into from
Aug 24, 2023
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 66 additions & 0 deletions airflow/example_dags/example_dag_level_access_control.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Example DAG demonstrating the usage of the BashOperator."""
from __future__ import annotations

import datetime

import pendulum

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.empty import EmptyOperator

# [START howto_fine_grained_access_control]
with DAG(
dag_id="example_fine_grained_access",
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
access_control={
"Viewer": {"can_edit", "can_create", "can_delete"},
}
) as fine_grained_access:
# [END howto_fine_grained_access_control]

task = EmptyOperator(
task_id="run_this",
)


# [START howto_no_fine_grained_access_control]
with DAG(
dag_id="example_no_fine_grained_access",
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
access_control={},
) as no_fine_grained_access:
# [END howto_no_fine_grained_access_control]

task = EmptyOperator(
task_id="run_this",
)

# [START howto_indifferent_fine_grained_access_control]
with DAG(
dag_id="example_indifferent_to_fine_grained_access",
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
access_control={},
) as indifferent_to_fine_grained_access:
# [END howto_indifferent_fine_grained_access_control]

task = EmptyOperator(
task_id="run_this",
)
22 changes: 4 additions & 18 deletions airflow/www/security.py
Original file line number Diff line number Diff line change
Expand Up @@ -681,26 +681,12 @@ def sync_perm_for_dag(
dag_resource_name = permissions.resource_name_for_dag(dag_id)
for dag_action_name in self.DAG_ACTIONS:
self.create_permission(dag_action_name, dag_resource_name)

def _revoke_all_stale_permissions(resource: Resource):
existing_dag_perms = self.get_resource_permissions(resource)
for perm in existing_dag_perms:
non_admin_roles = [role for role in perm.role if role.name != "Admin"]
for role in non_admin_roles:
self.log.info(
"Revoking '%s' on DAG '%s' for role '%s'",
perm.action,
dag_resource_name,
role.name,
)
self.remove_permission_from_role(role, perm)

if access_control:

if access_control is not None:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most important change in this PR - if access control will proceed into the following block if and only if access_control is a non-empty dict, whereas if access control is not None will proceed given any dict, including an empty one.

self.log.info("Syncing DAG-level permissions for DAG '%s'", dag_resource_name)
self._sync_dag_view_permissions(dag_resource_name, access_control)
else:
resource = self.get_resource(dag_resource_name)
if resource:
_revoke_all_stale_permissions(resource)
self.log.info("Not syncing DAG-level permissions for DAG '%s' as access control is unset.", dag_resource_name)

def _sync_dag_view_permissions(self, dag_id: str, access_control: dict[str, Collection[str]]) -> None:
"""
Expand Down
36 changes: 36 additions & 0 deletions docs/apache-airflow/security/access-control.rst
Original file line number Diff line number Diff line change
Expand Up @@ -229,3 +229,39 @@ List Plugins Plugins.can_read
List Task Reschedules Task Reschedules.can_read Admin
List Triggers Triggers.can_read Admin
====================================== ======================================================================= ============

These DAG-level controls can be set directly through the UI / CLI, or encoded in the dags themselves through the access_control arg.

Order of precedence for DAG-level permissions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Since DAG-level access control can be configured in multiple places, conflicts are inevitable and a clear resolution strategy is required.

As a result, Airflow considers the `access_control` argument supplied on a DAG itself to be completely authoritative if present, which has a few effects:

- `access_control` will overwrite any previously existing DAG-level permissions if it is any value other than `None`.

.. exampleinclude:: /../../airflow/example_dags/example_dag_level_access_control.py
:language: python
:start-after: [START howto_fine_grained_access_control]
:end-before: [END howto_fine_grained_access_control]

- Setting `access_control={}` will wipe any existing DAG-level permissions for a given DAG from the DB.

.. exampleinclude:: /../../airflow/example_dags/example_dag_level_access_control.py
:language: python
:start-after: [START howto_no_fine_grained_access_control]
:end-before: [END howto_no_fine_grained_access_control]

- Removing the access_control block from a DAG altogether (or setting it to `None`) can leave dangling permissions.
Copy link
Member

@potiuk potiuk Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. This is a side effect that we can accept.

One small comment. Maybe a good idea would be to also accompany it with a new CLI command airflow dags clean-permissions and a new API call (eventually) to do it. Those can be separate PRs, but having them explicitly spelled out as way to clean permissions on a DAG that previously had permissions is a way to workaround the problem with dangling permissions.

I see that as an edge case, with this change there is already a way to clean permissions by setting access_control to {} - but I can see a situation that somoene would like to recover from the historical changes.


.. exampleinclude:: /../../airflow/example_dags/example_dag_level_access_control.py
:language: python
:start-after: [START howto_indifferent_fine_grained_access_control]
:end-before: [END howto_indifferent_fine_grained_access_control]

In the case that there is no `access_control` defined on the DAG itself, Airflow will defer to existing permissions defined in the DB, which
may have been set through the UI, CLI or by previous access_control args on the DAG in question.

In all cases, system-wide roles such as `Can edit on DAG` take precedence over dag-level access controls, such that they can be considered `Can edit on DAG: *`