Add MwaaTriggerDagRunOperator and MwaaHook to Amazon Provider Package #46579

Open
wants to merge 11 commits into
base: main
60 changes: 60 additions & 0 deletions docs/apache-airflow-providers-amazon/operators/mwaa.rst
@@ -0,0 +1,60 @@
.. Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements. See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership. The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License. You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied. See the License for the
    specific language governing permissions and limitations
    under the License.

==================================================
Amazon Managed Workflows for Apache Airflow (MWAA)
==================================================

`Amazon Managed Workflows for Apache Airflow (MWAA) <https://aws.amazon.com/managed-workflows-for-apache-airflow/>`__
is a managed service for Apache Airflow that lets you use your current, familiar Apache Airflow platform to orchestrate
your workflows. You gain improved scalability, availability, and security without the operational burden of managing
underlying infrastructure.

Prerequisite Tasks
------------------

.. include:: ../_partials/prerequisite_tasks.rst

Generic Parameters
------------------

.. include:: ../_partials/generic_parameters.rst

Operators
---------

.. _howto/operator:MwaaTriggerDagRunOperator:

Trigger a DAG run in an Amazon MWAA environment
===============================================

To trigger a DAG run in an Amazon MWAA environment, you can use the
:class:`~airflow.providers.amazon.aws.operators.mwaa.MwaaTriggerDagRunOperator`.

In the following example, the task ``trigger_dag_run`` triggers a DAG run for the DAG with the ID ``hello_world`` in
the environment ``MyAirflowEnvironment``.

.. exampleinclude:: /../../providers/tests/system/amazon/aws/example_mwaa.py
    :language: python
    :dedent: 4
    :start-after: [START howto_operator_mwaa_trigger_dag_run]
    :end-before: [END howto_operator_mwaa_trigger_dag_run]

References
----------

* `AWS boto3 library documentation for MWAA <https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/mwaa.html>`__
1 change: 1 addition & 0 deletions docs/spelling_wordlist.txt
@@ -1099,6 +1099,7 @@ muldelete
Multinamespace
mutex
mv
mwaa
mypy
Mysql
mysql
81 changes: 81 additions & 0 deletions providers/src/airflow/providers/amazon/aws/hooks/mwaa.py
@@ -0,0 +1,81 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""This module contains AWS MWAA hook."""

from __future__ import annotations

from botocore.exceptions import ClientError

from airflow.providers.amazon.aws.hooks.base_aws import AwsBaseHook


class MwaaHook(AwsBaseHook):
    """
    Interact with Amazon Managed Workflows for Apache Airflow (MWAA).

    Provide thin wrapper around :external+boto3:py:class:`boto3.client("mwaa") <MWAA.Client>`.

    Additional arguments (such as ``aws_conn_id``) may be specified and
    are passed down to the underlying AwsBaseHook.

    .. seealso::
        - :class:`airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
    """

    def __init__(self, *args, **kwargs) -> None:
        kwargs["client_type"] = "mwaa"
        super().__init__(*args, **kwargs)

    def invoke_rest_api(
        self,
        env_name: str,
        path: str,
        method: str,
        body: dict | None = None,
        query_params: dict | None = None,
    ) -> dict:
        """
        Invoke the REST API on the Airflow webserver with the specified inputs.

        .. seealso::
            - :external+boto3:py:meth:`MWAA.Client.invoke_rest_api`

        :param env_name: name of the MWAA environment
        :param path: Apache Airflow REST API endpoint path to be called
        :param method: HTTP method used for making Airflow REST API calls
        :param body: Request body for the Apache Airflow REST API call
        :param query_params: Query parameters to be included in the Apache Airflow REST API call
        """
        body = body or {}
        api_kwargs = {
            "Name": env_name,
            "Path": path,
            "Method": method,
            # Filter out keys with None values because Airflow REST API doesn't accept requests otherwise
            "Body": {k: v for k, v in body.items() if v is not None},
            "QueryParameters": query_params if query_params else {},
        }
        try:
            result = self.get_conn().invoke_rest_api(**api_kwargs)
            result.pop("ResponseMetadata", None)
            return result
        except ClientError as e:
            # Log a shallow copy so the re-raised exception keeps its full response
            to_log = dict(e.response)
            to_log.pop("ResponseMetadata", None)
            to_log.pop("Error", None)
            self.log.error(to_log)
            raise
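As a reading aid for the hunk above, here is a minimal standalone sketch of how `invoke_rest_api` assembles the arguments it passes to boto3. The function name `build_invoke_rest_api_kwargs` is hypothetical and not part of this PR; it only mirrors the dictionary built inside the hook, so it runs without Airflow or AWS credentials.

```python
from __future__ import annotations

from typing import Any


def build_invoke_rest_api_kwargs(
    env_name: str,
    path: str,
    method: str,
    body: dict[str, Any] | None = None,
    query_params: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Hypothetical helper mirroring the api_kwargs dict built in MwaaHook.invoke_rest_api."""
    body = body or {}
    return {
        "Name": env_name,
        "Path": path,
        "Method": method,
        # Only keys whose value is None are dropped; empty values such as {} are kept.
        "Body": {k: v for k, v in body.items() if v is not None},
        "QueryParameters": query_params if query_params else {},
    }


example_kwargs = build_invoke_rest_api_kwargs(
    env_name="MyAirflowEnvironment",
    path="/dags/hello_world/dagRuns",
    method="POST",
    body={"dag_run_id": None, "conf": {}, "note": None},
)
print(example_kwargs["Body"])
```

Note that the filter tests `is not None` rather than truthiness, so an empty `conf` dict still reaches the request body while unset optional fields do not.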
111 changes: 111 additions & 0 deletions providers/src/airflow/providers/amazon/aws/operators/mwaa.py
@@ -0,0 +1,111 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""This module contains AWS MWAA operators."""

from __future__ import annotations

from collections.abc import Sequence
from typing import TYPE_CHECKING

from airflow.providers.amazon.aws.hooks.mwaa import MwaaHook
from airflow.providers.amazon.aws.operators.base_aws import AwsBaseOperator
from airflow.providers.amazon.aws.utils.mixins import aws_template_fields

if TYPE_CHECKING:
    from airflow.utils.context import Context


class MwaaTriggerDagRunOperator(AwsBaseOperator[MwaaHook]):
    """
    Trigger a Dag Run for a Dag in an Amazon MWAA environment.

    .. seealso::
        For more information on how to use this operator, take a look at the guide:
        :ref:`howto/operator:MwaaTriggerDagRunOperator`

    :param env_name: The MWAA environment name (templated)
    :param trigger_dag_id: The ID of the DAG to be triggered (templated)
    :param trigger_run_id: The Run ID. The value of this field can be set only when creating the object.
        Together with ``trigger_dag_id``, this forms a unique key. (templated)
    :param logical_date: The logical date (previously called execution date). This is the time or interval
        covered by this DAG run, according to the DAG definition. The value of this field can be set only when
        creating the object. Together with ``trigger_dag_id``, this forms a unique key. (templated)
    :param data_interval_start: The beginning of the interval the DAG run covers (templated)
    :param data_interval_end: The end of the interval the DAG run covers (templated)
    :param conf: Additional configuration parameters. The value of this field can be set only when creating
        the object. (templated)
    :param note: Contains manually entered notes by the user about the DagRun. (templated)
    """

    aws_hook_class = MwaaHook
    template_fields: Sequence[str] = aws_template_fields(
        "env_name",
        "trigger_dag_id",
        "trigger_run_id",
        "logical_date",
        "data_interval_start",
        "data_interval_end",
        "conf",
        "note",
    )
    template_fields_renderers = {"conf": "json"}
    ui_color = "#6ad3fa"

    def __init__(
        self,
        *,
        env_name: str,
        trigger_dag_id: str,
        trigger_run_id: str | None = None,
        logical_date: str | None = None,
        data_interval_start: str | None = None,
        data_interval_end: str | None = None,
        conf: dict | None = None,
        note: str | None = None,
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.env_name = env_name
        self.trigger_dag_id = trigger_dag_id
        self.trigger_run_id = trigger_run_id
        self.logical_date = logical_date
        self.data_interval_start = data_interval_start
        self.data_interval_end = data_interval_end
        self.conf = conf if conf else {}
        self.note = note

    def execute(self, context: Context) -> dict:
        """
        Trigger a Dag Run for the Dag in the Amazon MWAA environment.

        :param context: the Context object
        :return: dict with information about the Dag run
            For details of the returned dict, see :py:meth:`botocore.client.MWAA.invoke_rest_api`
        """
        return self.hook.invoke_rest_api(
            env_name=self.env_name,
            path=f"/dags/{self.trigger_dag_id}/dagRuns",
            method="POST",
            body={
                "dag_run_id": self.trigger_run_id,
                "logical_date": self.logical_date,
                "data_interval_start": self.data_interval_start,
                "data_interval_end": self.data_interval_end,
                "conf": self.conf,
                "note": self.note,
            },
        )
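To make the request produced by `execute()` easier to see, the sketch below replays the same call shape against a stub. `RecordingHook` and `trigger_dag_run` are hypothetical names introduced only for illustration; they are not part of this PR, and the stub records the call instead of contacting AWS. The stub's None filtering is an assumption copied from the hook code in this diff.

```python
from __future__ import annotations

from typing import Any


class RecordingHook:
    """Hypothetical stand-in for MwaaHook that records requests instead of calling AWS."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    def invoke_rest_api(self, env_name, path, method, body=None, query_params=None) -> dict:
        body = body or {}
        call = {
            "Name": env_name,
            "Path": path,
            "Method": method,
            # Same None filtering as the hook in this PR
            "Body": {k: v for k, v in body.items() if v is not None},
            "QueryParameters": query_params or {},
        }
        self.calls.append(call)
        return call


def trigger_dag_run(hook, env_name, trigger_dag_id, trigger_run_id=None, conf=None, note=None) -> dict:
    """Hypothetical function that follows the call made in the operator's execute()."""
    return hook.invoke_rest_api(
        env_name=env_name,
        path=f"/dags/{trigger_dag_id}/dagRuns",
        method="POST",
        body={
            "dag_run_id": trigger_run_id,
            "logical_date": None,
            "data_interval_start": None,
            "data_interval_end": None,
            "conf": conf if conf else {},
            "note": note,
        },
    )


hook = RecordingHook()
request = trigger_dag_run(hook, "MyAirflowEnvironment", "hello_world", conf={"key": "value"})
print(request["Path"], request["Body"])
```

With only `conf` supplied, every optional field left at `None` is dropped before the request is sent, so the body carries just the `conf` entry.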
6 changes: 6 additions & 0 deletions providers/src/airflow/providers/amazon/provider.yaml
@@ -450,6 +450,9 @@ operators:
  - integration-name: Amazon Managed Service for Apache Flink
    python-modules:
      - airflow.providers.amazon.aws.operators.kinesis_analytics
  - integration-name: Amazon Managed Workflows for Apache Airflow (MWAA)
    python-modules:
      - airflow.providers.amazon.aws.operators.mwaa
  - integration-name: Amazon Simple Storage Service (S3)
    python-modules:
      - airflow.providers.amazon.aws.operators.s3
@@ -658,6 +661,9 @@ hooks:
  - integration-name: Amazon CloudWatch Logs
    python-modules:
      - airflow.providers.amazon.aws.hooks.logs
  - integration-name: Amazon Managed Workflows for Apache Airflow (MWAA)
    python-modules:
      - airflow.providers.amazon.aws.hooks.mwaa
  - integration-name: Amazon OpenSearch Serverless
    python-modules:
      - airflow.providers.amazon.aws.hooks.opensearch_serverless