Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move plyvel to google provider extra #15812

Merged
merged 2 commits into from
May 19, 2021
Merged

Conversation

dstandish
Copy link
Contributor

@dstandish dstandish commented May 12, 2021

Not sure if I did this correctly, or how to test installation of providers but reading up now.

Hmmm...

Tried this:

./breeze prepare-provider-packages google

Got a failure with no log:

Preparing official version of provider with no suffixes

-----------------------------------------------------------------------------------
########## Generate setup files for 'google' ##########
The tag providers-google/3.0.0 exists. Not preparing the package.
===================================================================================

Summary of prepared packages:

    Errors:
google

===================================================================================

There were errors when preparing packages. Exiting!
########################################################################################################################
 [IN CONTAINER]   EXITING /opt/airflow/scripts/in_container/run_prepare_provider_packages.sh WITH EXIT CODE 1
########################################################################################################################

ERROR: The previous step completed with error. Please take a look at output above

@dstandish dstandish requested a review from potiuk May 12, 2021 23:47
@boring-cyborg boring-cyborg bot added area:providers kind:documentation provider:google Google (including GCP) related issues labels May 12, 2021
@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@potiuk
Copy link
Member

potiuk commented May 13, 2021

I think we need a little more :)

The providers are prepared and tested during CI build (We have a special job "Build and test povider packages").

And In fact the job did fail: https://github.com/apache/airflow/pull/15812/checks?check_run_id=2571762641

The job failed because our CI does not install plyvel any more since this is an optional dependency now.

But for CI image we run all the tests and check if all providers are fully importable, so we should also add plyvel as depenedncy installed during CI image build. This can be done simply as yet another extra for "airflow" itself - similarly as we have virtualenv, statsd, rabbitmq etc. Simply adding it as one more extra in setup.py and adding it to the "CORE_EXTRAS_REQUIREMENTS" in setup.py should do the job.

I believe this is also the reason why level_db tests failed as well and doc build failed (because neither can import plyvel).

@potiuk
Copy link
Member

potiuk commented May 13, 2021

Just make sure that you run pre-commits as well. After you add 'plyvel' extra, you will also have to update "extra documentation" and pre-commit will fail if you did not :)

@dstandish
Copy link
Contributor Author

i will look into this thank you :)

@dstandish
Copy link
Contributor Author

dstandish commented May 18, 2021

@potiuk it seems that it may be more elegant to create a google.leveldb provider and pull leveld out of google, similar to what we see with apache. then there's no need to modify core extra requirements. (since leveldb isn't very "core")

what do you think?

doing so also means we don't have an airflow level db extra and a google provider leveldb extra, and avoids the decision of whether the airflow leveldb extra should also install the google provider etc.

@dstandish dstandish requested a review from kaxil as a code owner May 18, 2021 18:39
@potiuk
Copy link
Member

potiuk commented May 18, 2021

@potiuk it seems that it may be more elegant to create a google.leveldb provider and pull leveld out of google, similar to what we see with apache. then there's no need to modify core extra requirements. (since leveldb isn't very "core")

what do you think?

I think this would not play very well with the current hierarchy. I think We have not foreseen that we can have booth "providerX" and "providerX.something". I am afraid it might lead to some problems. I think - looking at the current "google" provider structure, splittiing it similarly to "microsoft" could have been better.

Maybe (but that's somethig that likely needs a bit more thought) we should split it to "google.cloud" , "google.firebase", etc. eventually.

For now I think the 'extra' is quick and pretty much backwards-compatible solution (with the right dependencies installed). Splitting it to independent providers will be much more laborious and can have some unforeseen cosequences (like the need to extract a "common" part to even separate provider.

@dstandish
Copy link
Contributor Author

For now I think the 'extra' is quick and pretty much backwards-compatible solution (with the right dependencies installed). Splitting it to independent providers will be much more laborious and can have some unforeseen cosequences (like the need to extract a "common" part to even separate provider.

For leveldb, there isn't any common part, i think. It does not use GCP base hooks, because it actually has nothing to do with GCP.

So it might actually be a simple job to create a distinct provider for this one. I'm not sure we really even need google in the name. Why couldn't it just be a leveldb provider? It came out of google but it's an open source project.

Just exploring options here. Already pushed the change with the approach you suggested. Will let the build go before rebasing with those conflicts (cus it seems there's something in master broken)

@potiuk
Copy link
Member

potiuk commented May 18, 2021

Yep, leveldb is indeed very separate from the rest. I think master is OK, but we have another major outage of Github Actions in a full swing: https://www.githubstatus.com/incidents/m16jzl31gnqt

BTW. you need to rebase anyway.

@dstandish
Copy link
Contributor Author

dstandish commented May 18, 2021

Yep, leveldb is indeed very separate from the rest. I think master is OK, but we have another major outage of Github Actions in a full swing: https://www.githubstatus.com/incidents/m16jzl31gnqt

BTW. you need to rebase anyway.

Yeah i know i need to rebase but when i did locally at pre-commit i found this error:

Run mypy.................................................................................Failed
- hook id: mypy
- exit code: 2

+ export PYTHONPATH=/opt/airflow
+ PYTHONPATH=/opt/airflow
+ mypy_args=()
+ for filename in "$@"
+ [[ setup.py == docs/* ]]
+ mypy_args+=("-m" "$(filename_to_python_module "$filename")")
++ filename_to_python_module setup.py
++ file=setup.py
++ no_leading_dotslash=setup.py
++ no_py=setup
++ no_init=setup
++ echo setup
+ mypy --namespace-packages -m setup
Unable to load the config, contains a configuration error.
setup.cfg:193: error: Error importing plugin 'airflow.mypy.plugin.decorators':
Unable to configure handler 'task': Cannot resolve
'airflow.utils.log.file_task_handler.FileTaskHandler': No module named 'httpx'
    plugins =
    ^
Found 1 error in 1 file (checked 1 source file)
+ in_container_script_end
+ EXIT_CODE=2
+ [[ 2 != 0 ]]
+ [[ false == \t\r\u\e ]]
+ [[ false == \t\r\u\e ]]
+ in_container_clear_tmp
+ rm -rf /tmp/core-js-banners /tmp/v8-compile-cache-0 /tmp/yarn--1620688528816-0.4993168394804959 /tmp/yarn--1620688528816-0.8367507637222573 /tmp/yarn--1620688607348-0.5560835416794161
+ in_container_fix_ownership
+ [[ Darwin == \L\i\n\u\x ]]

Plyvel does not build on macOS without levelDB installed in system.  Its better to make it an optional install.
@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@potiuk
Copy link
Member

potiuk commented May 19, 2021

Unable to configure handler 'task': Cannot resolve
'airflow.utils.log.file_task_handler.FileTaskHandler': No module named 'httpx'
    plugins =

Yeah. You need to rebuild the image when asked :)

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label May 19, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk
Copy link
Member

potiuk commented May 19, 2021

I think this looks cool for now. I've opened #15933 to discuss what we do with Google Provider.

@potiuk potiuk merged commit 76a80bb into apache:master May 19, 2021
potiuk pushed a commit to potiuk/airflow that referenced this pull request Jun 22, 2021
Plyvel does not build on macOS without levelDB installed in system.  Its better to make it an optional install.

(cherry picked from commit 76a80bb)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers full tests needed We need to run full set of tests for this PR to merge kind:documentation provider:google Google (including GCP) related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants