Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FTPFileTransmitOperator #26974

Merged
merged 13 commits into from
Dec 6, 2022
Merged

Add FTPFileTransmitOperator #26974

merged 13 commits into from
Dec 6, 2022

Conversation

RachitSharma2001
Copy link
Contributor

The corresponding issue: #26531

I have added the FTPOperator, which allows for writing to files in FTP and downloading them to the local directory.

I have also added tests in the tests/providers/ftp/operators directory. In addition, I modified the scripts/docker/entrypoint_ci.sh file to pull up a local FTP server in ci in order to perform these tests. I have made sure to follow the suggested format for adding new airflow operators from here.

After this PR is approved and merged, I will add the FTPSOperator as well.

@boring-cyborg
Copy link

boring-cyborg bot commented Oct 10, 2022

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

Connection(
conn_id="ftp_default",
conn_type="ftp",
host="localhost",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an ftp_default connection, as it previously did not exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though the unit tests no longer use a local FTP Server, I still think that having this FTP Default connection would be useful, as without this, defining the following piece of code will throw an error indicating that the "connection with id 'ftp_default' does not exist": hook = FTPHook()

Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation would be awesome too :)

@RachitSharma2001 RachitSharma2001 force-pushed the main branch 2 times, most recently from 8f9f3fb to da4a16c Compare October 14, 2022 21:17
@RachitSharma2001
Copy link
Contributor Author

@vincbeck Thank you for your suggestions. I have integrated the suggested changes. As for documentation, do you mean that I should add some operator file to docs/apache-airflow-providers-ftp that outlines how to use the FTPOperator, in a similar way to the Tableau Operator from here?

@vincbeck
Copy link
Contributor

Exactly :)

@RachitSharma2001 RachitSharma2001 force-pushed the main branch 6 times, most recently from 38ed8db to bdd1d6b Compare October 15, 2022 17:24
@RachitSharma2001
Copy link
Contributor Author

@vincbeck I have added documentation to docs/apache-airflow-providers-ftp/operators (here) that specifies how to use the FTPOperator. Let me know if these changes look good and if there is anything else you think should be added.

@RachitSharma2001 RachitSharma2001 force-pushed the main branch 4 times, most recently from 493cca5 to 11a191e Compare October 16, 2022 18:12
@eladkal eladkal changed the title Add FTP Operator Add FTPFileTransmitOperator Nov 17, 2022
@potiuk
Copy link
Member

potiuk commented Nov 27, 2022

@RachitSharma2001 - How about changing to AIP-47 and rebasing that one ?

@RachitSharma2001
Copy link
Contributor Author

@potiuk Sorry for late reply, got back from vacation today. That sounds good, I will change to AIP-47 and rebase.

create_intermediate_dirs=True,
)
# [END howto_operator_ftp_get]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you don't have any teardown tasks here, but I'd prefer that you still add the watcher task and a task chain, just to match other AIP-47 spec tests and so that future additions are more straightforward

@o-nikolas
Copy link
Contributor

Some checks also failed, do you mind re-running them @RachitSharma2001 (rebasing your branch will do it)

@potiuk
Copy link
Member

potiuk commented Dec 5, 2022

Yep. I woudl have rebased myself, but there are conflicts (so you need to rebase abway @RachitSharma2001

@RachitSharma2001
Copy link
Contributor Author

@potiuk @o-nikolas I have added the requested changes and updated my branch. Let me know if the current changes look good and if there is anything else I should add.

Copy link
Contributor

@o-nikolas o-nikolas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good!

Though your build is failing due to the infamous:

The hosted runner: GitHub Actions 209 lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

I was hitting this in some of my PRs lately too. I did a bunch of deep dive but was not able to come up with a RCA. Try rebasing form main a few times to see if you can get a pass.

@RachitSharma2001
Copy link
Contributor Author

@o-nikolas Thank you! Although, it seems that the tests were reran and now all the tests are passing.

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NICE!. I know @eladkal had previously a doubt whether those shoudl not be local_to or _to_local set of transfer operators, but I personaly do not see the need for consistency (and I consider the local_* a bit strange now when I think of it.

WDYT Everyone.?

@o-nikolas
Copy link
Contributor

NICE!. I know @eladkal had previously a doubt whether those shoudl not be local_to or _to_local set of transfer operators, but I personaly do not see the need for consistency (and I consider the local_* a bit strange now when I think of it.

WDYT Everyone.?

I don't feel too strongly either way, I'm personally happy with the current name 👍

@eladkal
Copy link
Contributor

eladkal commented Dec 6, 2022

but I personaly do not see the need for consistency (and I consider the local_* a bit strange now when I think of it.

Consistency is important. Many users are using multiple providers lack of consistency affects development velocity as you always need to look up how specific provider/service name things. If all use same convention then you don't need to look it up.
If we are not happy with current convention then we can start the process of slowly changing. I have no strong feelings for either way I just want to keep things crystal clear.

Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
thank you @RachitSharma2001

@eladkal eladkal merged commit 39f501d into apache:main Dec 6, 2022
@boring-cyborg
Copy link

boring-cyborg bot commented Dec 6, 2022

Awesome work, congrats on your first merged pull request!

potiuk added a commit to potiuk/airflow that referenced this pull request Dec 6, 2022
jedcunningham pushed a commit that referenced this pull request Dec 6, 2022
@pierrejeambrun pierrejeambrun added this to the Airflow 2.6.0 milestone Jan 9, 2023
@pierrejeambrun pierrejeambrun added the type:new-feature Changelog: New Features label Jan 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants