Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bandit to pre-commit to detect common security issues #34247

Merged
merged 6 commits into from
Nov 10, 2023

Conversation

hussein-awala
Copy link
Member

closes: #34241

This PR adds bandit to pre-commit hooks and static checks, to detect common security issues.

I'm opening it as draft to get the report, then try to fix/ignore the issues in separate PRs.

setup.py Outdated
"ruff>=0.0.219",
"yamllint",
]
_devel_only_static_checks = ["pre-commit", "black", "ruff>=0.0.219", "yamllint", "bandit"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are tools that are installed in the image. You likely need to add bandit as dependency in the .pre-commit-config.yml as "dependencies" (see other examples).

@@ -1030,3 +1030,12 @@ repos:
files: ^airflow/migrations/versions/.*\.py$|^docs/apache-airflow/migrations-ref\.rst$
additional_dependencies: ['rich>=12.4.4']
## ONLY ADD PRE-COMMITS HERE THAT REQUIRE CI IMAGE
- id: bandit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment above - this pre-commit does not require Breeze CI image so it should be added before those that need it.

Copy link
Member

@potiuk potiuk Sep 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See:

## ADD MOST PRE-COMMITS ABOVE THAT LINE

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pre-commit fails with this error:

Executable `bandit` not found

I tried both CI image pre-commits and normal pre-commits, and it failed in both cases, knowing that I added bandit to _devel_only_static_checks list, do I need to open the PR from a branch in Airflow repo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - you need to use additional_dependencies . The way how regular pre-commits work is that pre-commit knows the language python and knows how to create an environment for it automatically. This is what additional_dependencies is for. In case of python precommit, this is really a specification of virtualenv it should create (so you need to add bandid as requirement THERE not in airfllow dependencies.

It understands other languages - for example for node you can specify npm dependencies in additional_dependencies field.

Pre-commit is actually even smart enough to re-use virtualenvs it creates between different checks if they have the same requirements or even between different projects that use pre-commit - because the venvs for pre-commit are created in ~/.cache/pre-commit/ in your home directory.

You can see exactly what happens when you remove ~/.cache/pre-commit folder. The first thing pre-commit will do it will find all the distinct venvs and node env and other language envs and will create all of them. It will take few minutes but you will see what's happening and you will understand what's going on.

Copy link
Member

@potiuk potiuk Nov 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So just to add - basically with pre-commit you have 20 or so pre-commit venvs that are automatically updated and maintained by pre-commit (distinct set of requirements). They have nothing to do with either airflow requirements nor breeze image requirements :). It's just using (well) the approach that for every tool (every check) you should have separate venv that you run it in. And pre-commit does all the heavy lifting to:

a) maintain them (upgrade when you change the requirements
b) share them between checks if several checks use the same set of requirements
c) make sure each check uses the right venv..

Copy link
Member

@potiuk potiuk Nov 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to add even more: system pre-commits are special - they will just run whatever you put there as a script. This is how we are adding the pre-commits that are using breeze CI image to run stuff - those scripts of ours will make sure the image is locally built and they will run docker run to run whatever they need to run as part of the check. There the environment managerment responsibility is passed from pre-commit to breeze essentially.

Copy link
Member

@potiuk potiuk Nov 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or we rather used to do it) -> I recalled that I converted all such "scripts" to python to make it easier. So those are now python ones with very small set of requirements ['rich>=12.4.4', 'inputimeout', 'pyyaml'] for examplte that run bootstrapping python script (./scripts/ci/pre_commit/pre_commit_mypy.py for example), but essentially what the script does is it runs some docker command:

  cmd_result = run_command(
        [
            "docker",
            "run",
            "-t",
....

So the bootstrapping script uses the small ['rich>=12.4.4', 'inputimeout', 'pyyaml'] venv - to launch docker command in airflow's CI image....

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Oct 25, 2023
@github-actions github-actions bot closed this Oct 30, 2023
@potiuk potiuk reopened this Oct 30, 2023
@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Oct 31, 2023
@potiuk
Copy link
Member

potiuk commented Nov 9, 2023

I guess conflicts/rebase needed now :)

@Taragolis
Copy link
Contributor

Just wondering, any reason why we can't use ruff implementation https://docs.astral.sh/ruff/rules/#flake8-bandit-s ?

@potiuk
Copy link
Member

potiuk commented Nov 9, 2023

Just wondering, any reason why we can't use ruff implementation https://docs.astral.sh/ruff/rules/#flake8-bandit-s ?

Sounds like good idea :)

@hussein-awala
Copy link
Member Author

Just wondering, any reason why we can't use ruff implementation https://docs.astral.sh/ruff/rules/#flake8-bandit-s ?

I checked it when I started working on this PR, and I found that the implementation is still in progress, and there are a few bugs (astral-sh/ruff#1646), it could be a good future optimization or an alternative solution when we decide what are the rules we want to check if they are all supported by ruff plugin.

@potiuk
Copy link
Member

potiuk commented Nov 10, 2023

image

@potiuk
Copy link
Member

potiuk commented Nov 10, 2023

looking good

@hussein-awala
Copy link
Member Author

Yes, I just restarted some CI jobs that failed because of resources/network issues, I will mark it ready for review once all CI check are green.

@hussein-awala hussein-awala marked this pull request as ready for review November 10, 2023 17:00
@potiuk potiuk merged commit 4b1e494 into apache:main Nov 10, 2023
71 checks passed
@ephraimbuddy ephraimbuddy added the type:misc/internal Changelog: Misc changes that should appear in change log label Nov 20, 2023
@ephraimbuddy ephraimbuddy added this to the Airflow 2.8.0 milestone Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dev-tools type:misc/internal Changelog: Misc changes that should appear in change log
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Embed tools for analysing security issues in Python in our workflow
4 participants