Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

soft_fail | operator is skipped in all cases and not only "data" related fail #40787

Closed
1 of 2 tasks
raphaelauv opened this issue Jul 15, 2024 · 13 comments · Fixed by #40915
Closed
1 of 2 tasks

soft_fail | operator is skipped in all cases and not only "data" related fail #40787

raphaelauv opened this issue Jul 15, 2024 · 13 comments · Fixed by #40915
Assignees
Labels
area:core good first issue kind:bug This is a clearly a bug pending-response provider:amazon AWS/Amazon - related issues stale Stale PRs per the .github/workflows/stale.yml policy file

Comments

@raphaelauv
Copy link
Contributor

raphaelauv commented Jul 15, 2024

Apache Airflow version

2.9.2

What happened?

S3KeySensor with soft_fail is not failing on configuration errors (missing credentials or airflow connection )

What you think should happen instead?

S3KeySensor with soft_fail should fail on missing configuration and only skip on missing S3 data

How to reproduce

from datetime import timedelta
from airflow import DAG
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor

with DAG(
        dag_id="xx",
        schedule_interval=None,
):

    S3KeySensor(
        task_id="s3_sensor",
        bucket_key=f"a/*",
        bucket_name="toto",
        wildcard_match=True,
        soft_fail=True,
        timeout=10,
        poke_interval=15)
[2024-07-15T10:02:01.886+0000] {baseoperator.py:400} WARNING - S3KeySensor.execute cannot be called outside TaskInstance!
[2024-07-15T10:02:01.886+0000] {s3.py:117} INFO - Poking for key : s3://toto/a/*
[2024-07-15T10:02:01.905+0000] {base_aws.py:587} WARNING - Unable to find AWS Connection ID 'aws_default', switching to empty.
[2024-07-15T10:02:01.906+0000] {base_aws.py:164} INFO - No connection ID provided. Fallback on boto3 credential strategy (region_name=None). See: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
[2024-07-15T10:02:03.675+0000] {taskinstance.py:441} INFO - ::group::Post task execution logs
[2024-07-15T10:02:03.675+0000] {taskinstance.py:2506} INFO - Skipping due to soft_fail is set to True.
[2024-07-15T10:02:03.685+0000] {taskinstance.py:1206} INFO - Marking task as SKIPPED.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@raphaelauv raphaelauv added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jul 15, 2024
@dosubot dosubot bot added the provider:amazon AWS/Amazon - related issues label Jul 15, 2024
@eladkal eladkal added good first issue and removed needs-triage label for new issues that we didn't triage yet labels Jul 15, 2024
@ellisms
Copy link
Contributor

ellisms commented Jul 15, 2024

I'll take a look.

@raphaelauv
Copy link
Contributor Author

thanks @ellisms

also the connector do not fail if creds are incorrect or there is missing conf like endpoint_url ( example when using minio )

@Seoji
Copy link

Seoji commented Jul 20, 2024

I think it looks fine
other Sensor, something like databricks, mongo, redis work same as S3.
they made their own hooks in poke method.
if it is wrong, we should change whold sensor :)

@raphaelauv
Copy link
Contributor Author

It's S3keySensor not S3CredsSensor.

It's about checking if data is present or not in the bucket . soft_fail means and only means in case of timeout and no data then the operator is skipped.

@Seoji
Copy link

Seoji commented Jul 20, 2024

I understood, thankyou for your explanation.
I miss understood purpose of soft_fail.
#33401
I think this PR make this side effect
thankyou again raphaelauv

@raphaelauv raphaelauv changed the title S3KeySensor - soft_fail | operator is skipped if missing connection or creds soft_fail | operator is skipped in all cases and not only "data" related fail Jul 21, 2024
@raphaelauv
Copy link
Contributor Author

@Seoji yes the problem come from this PR , using the parameter soft_fail to skip any exception in the BaseSensor class was not a good idea.

Every operator should manage itself the soft_fail parameter OR we must introduce another parameter for the case were an user want to skip no matter the exception

wdyt @Lee-W ?

@Lee-W
Copy link
Member

Lee-W commented Jul 21, 2024

As soft_fail is part of the BaseSensor and can be handled as a whole, I think it would still be better if we could handle it there. The current description of it is soft_fail – Set to true to mark the task as SKIPPED on failure

We should probably introduce a new parameter for special cases. If most of the sensors aren't supposed to work the current soft_fail way, then maybe we should start the discussion on whether to keep soft_fail in BaseSensor.

@raphaelauv
Copy link
Contributor Author

@potiuk could you please re-open the issue , since PR was reverted , thanks

@potiuk potiuk reopened this Jul 30, 2024
@potiuk
Copy link
Member

potiuk commented Jul 30, 2024

Did.

Copy link

This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Aug 15, 2024
@raphaelauv
Copy link
Contributor Author

No stale

@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Aug 16, 2024
Copy link

This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Aug 30, 2024
Copy link

github-actions bot commented Sep 6, 2024

This issue has been closed because it has not received response from the issue author.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core good first issue kind:bug This is a clearly a bug pending-response provider:amazon AWS/Amazon - related issues stale Stale PRs per the .github/workflows/stale.yml policy file
Projects
None yet
6 participants