Welcome to the MetricFlow developer community. We're thrilled to have you aboard!
- Familiarize yourself with our Code of Conduct. In summary - be kind to each other. We're all here trying to make the data world a better place to work.
- Make sure you can sign our Contributor License Agreement. Unfortunately, we cannot accept PRs unless you have signed. If you are not able to sign the agreement you may still participate in our Slack community or interact with Issues. To sign the agreement simply put up a PR, and you will receive instructions there.
- Ensure you have Python `3.8` or `3.9`.
- Install the following required system dependencies:
  - MySqlClient:
    - Follow the instructions from MySQL
    - Mac users might prefer to use Homebrew: `brew install mysql`
  - Postgres:
    - Postgres provides pre-built packages for download and installation
    - Mac users might prefer to use Homebrew: `brew install postgresql`
  - Docker:
    - This is only required if you are developing with Postgres.
    - Follow the instructions from Docker
- Create a fork of the MetricFlow repo and clone it locally.
- Activate a Python virtual environment. While this is not required, it is strongly encouraged.
  - We provide `make venv` and `make remove_venv` helpers for creating/deleting standard Python virtual envs. You may pass `VENV_NAME=your_custom_name` to override the default `venv` location.
  - conda users may prefer conda's environment management instead.
- Install Poetry via `pip install poetry`. This is the tool we use to manage our build dependencies. Note: due to an issue with Poetry configurations that set `virtualenvs.create` to `false`, some environments (e.g., Ubuntu) may experience dependency resolution problems with Poetry 1.2.0, in which case you should make sure you have a virtual env set up prior to installation. If you really must use the global machine env, you may be able to work around the problem by using `pip install poetry==1.1.15`.
- Run `make install` to get all of your dependencies loaded and ready for development.
  - This includes useful dev tools, including pre-commit for linting.
  - You may run `pre-commit install` if you would like the linters to run prior to all local git commits.
- OPTIONAL: install dbt dependencies. Developers working on dbt integrations will need to install these in order to work on those integrations and run the relevant tests. Any of the following commands should be sufficient for development purposes:
  - `poetry install -E "dbt-postgres dbt-cloud"`
  - `poetry install -E "dbt-redshift dbt-cloud"`
  - `poetry install -E "dbt-snowflake dbt-cloud"`
  - `poetry install -E "dbt-bigquery dbt-cloud"`
You're ready to start! Note that all `make` and `poetry` commands should be run from your repository root unless otherwise indicated.
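
As a quick sanity check on the setup steps above, the following minimal sketch verifies the two environment expectations this guide states: Python 3.8 or 3.9, and an active virtual environment. The function names are hypothetical helpers and are not part of MetricFlow. The virtual environment check assumes a `venv`/`virtualenv` style environment, where `sys.prefix` differs from `sys.base_prefix`; it may not detect a conda environment.

```python
import sys

# Versions this guide asks for; adjust if the project's requirements change.
SUPPORTED_VERSIONS = {(3, 8), (3, 9)}


def python_version_supported(version_info=None):
    """Return True if the interpreter's major.minor version is 3.8 or 3.9."""
    version_info = sys.version_info if version_info is None else version_info
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when running inside a venv/virtualenv style environment.

    Assumption: inside such an environment sys.prefix points at the
    environment, while sys.base_prefix points at the base installation.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Python 3.8 or 3.9:", python_version_supported())
    print("Inside a virtual environment:", in_virtualenv())
```

Running the script with the interpreter you plan to develop with reports both checks; the optional arguments exist only so the helpers can be exercised without changing interpreters.
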
- Run some tests to make sure everything works:
  - Run the full test suite: `make test`
  - Run a subset of tests based on path: `poetry run pytest metricflow/test/plan_conversion`
  - Run a subset of tests based on test name substring: `poetry run pytest -k "query" metricflow/test`
- Now you may wish to break some tests. Make some local changes, run the relevant tests again, and see if you broke them!
  - Working with integration tests
    - These tests are driven by a set of test configs in `metricflow/test/integration/test_cases`. They compare the output of a MetricFlow query against the output of a similar SQL query.
    - These tests all run on consistent input data, which is created in the target warehouse via setup fixtures.
      - Modify those setup fixtures if you are looking to test boundary cases involving things like repeated rows of data.
    - Let's break a test!
      - Change a SQL query inside of `metricflow/test/integration/test_cases/itest_simple.yaml`
      - Run the test case: `poetry run pytest -k "itest_simple.yaml" metricflow/test/integration`. Did it fail?
  - Working with module and component tests
    - These are generally laid out in a similar hierarchy to the main package.
    - Let's try them out:
      - Run the dataflow plan to sql plan conversion tests: `poetry run pytest metricflow/test/plan_conversion/test_dataflow_to_sql_plan.py`.
      - Modify something in the dataflow to sql plan converter logic. I like to throw exceptions just to make sure things blow up.
      - Run the test again. Did anything break?
  - Remember to clean up when you're done playing with the tests!
- Make changes to the codebase and verify them through further testing, including test runs against other warehouse engines.
  - To run tests against other engines you MUST have read and write access to an instance of the execution engine and database.
  - Run the following commands in your shell, replacing the tags with the appropriate values:
    - `export MF_SQL_ENGINE_URL=<YOUR_WAREHOUSE_CONNECTION_URL>`
    - `export MF_SQL_ENGINE_PASSWORD=<YOUR_WAREHOUSE_PASSWORD>`
  - Run `make test` to execute the entire test suite against the target engine.
  - By default, without `MF_SQL_ENGINE_URL` and `MF_SQL_ENGINE_PASSWORD` set, your tests will run against SQLite.
- Run the linters with `make lint` at any time, but especially before submitting a PR. We use:
  - `Black` for formatting
  - `Flake8` for general Python linting
  - `MyPy` for typechecking
- To see how your changes work with more interactive queries, use your repo-local CLI.
  - Run `poetry run mf --help`
  - Follow the CLI help from there. Just remember that your local CLI is always `poetry run mf <COMMAND>`!
- Merge your changes into your fork of the MetricFlow repository
- Make a well-formed Pull Request (PR) from your fork into the main MetricFlow repository. If you're not clear on what a well-formed PR looks like, fear not! We will help you here and throughout the review process.
  - Well-formed PRs are composed of one or more well-formed commits, and include clear indications of how they were tested and verified prior to submission.
  - Well-formed commits are focused (loosely speaking they do one conceptual thing) and well-described.
  - A good commit message - like a good PR message - will have three components:
    - A succinct title explaining what the commit does
    - A separate body describing WHY the change is being made
    - Additional detail on what the commit does, if needed
  - We want this because we believe the hardest part of a collaborative software project is not getting the computer to do what it's supposed to do. It's communicating to a human reader what you meant for the computer to do (and why!), and also getting the computer to do that thing.
  - This helps you too - well-formed PRs get reviewed a lot faster and a lot more productively. We want your contribution experience to be as smooth as possible and this helps immensely!
- One of our core contributors will review your PR and either approve it or send it back with requests for updates.
- Once the PR has been approved, our core contributors will merge it into the main project.
- You will get a shoutout in our changelog/release notes. Thank you for your contribution!
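
The commit message shape described above (a succinct title, then a separate body explaining why) can be checked mechanically before you push. The sketch below is a hypothetical helper and is not part of MetricFlow's tooling. The 72-character title limit is an assumed convention, not something this guide specifies, and the sample message is made up for illustration.

```python
def check_commit_message(message, max_title_length=72):
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.strip().splitlines()
    if not lines:
        return ["message is empty"]

    problems = []
    title = lines[0].strip()
    # Assumed convention: keep the title short enough to scan in a log view.
    if len(title) > max_title_length:
        problems.append("title is longer than %d characters" % max_title_length)
    # The body should be separate from the title, i.e. preceded by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("title and body are not separated by a blank line")
    # Any non-blank line after the title counts as body text.
    body = [line for line in lines[1:] if line.strip()]
    if not body:
        problems.append("body explaining WHY the change is being made is missing")
    return problems


if __name__ == "__main__":
    # Made-up example message, for illustration only.
    sample = (
        "Retry warehouse connection during test setup\n"
        "\n"
        "Transient network errors made the suite flaky, so setup now\n"
        "retries the connection before failing.\n"
    )
    print(check_commit_message(sample))
```

A check like this only covers the structure of the message. Whether the title is actually succinct and the body actually explains the reasoning is still up to you and your reviewer.
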