Skip to content

Project used by the Fivetran dbt package team to help automate package development and maintenance efforts.

Notifications You must be signed in to change notification settings

fivetran/dbt_package_automations

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

55 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Fivetran dbt Package Automations

πŸ“£ What does this dbt package do?

This package is a combination of macros, bash scripts, and python scripts that are used to help expedite and automate the package development process. See the Contents below for the automations available within this package.

πŸ€” Who is the target user of this dbt package?

  • You use dbt
  • You are a member of the Fivetran dbt package team, or would like to expedite the package development process

🎯 How do I use the dbt package?

Step 1: Installing the Package

Include the following dbt_package_automations package version in your packages.yml

Check dbt Hub for the latest installation instructions, or read the dbt docs for more information on installing packages.

packages:
  - git: https://github.com/fivetran/dbt_package_automations.git
    revision: main
    warn-unpinned: false

Step 2: Using the Automations

See the specific details for each macros within the contents below.

πŸ“‹ Contents

Automation Macros

Bash Scripts

Automation Macros

These macros provide the scripts to automate parts of the model creation.

staging_models_automation (source)

This macro is intended to be used as a run-operation when generating Fivetran dbt package staging models and all around package framework creation. This macro will receive user input to create the necessary bash commands so they may all be ran at once. The output of this macro within the CLI will then be copied and pasted as a command to generate the staging models/macros.

Additionally, you can rerun this macro as it will create or replace what currently exists in the macro & model folders.

Things to note:

  • This macro will only work if you have already included your src.yml file.
  • Please double check your outputs as there may be timestamps & notes that are not relevant to the file.

Usage:

dbt run-operation staging_models_automation --args '{package: intercom, display_name: Intercom, source_schema: intercom_schema, source_database: intercom_database, tables: ["admin","company_history"]}'

CLI Output:

source dbt_packages/dbt_package_automations/kick_off_generator.sh  && 
source dbt_packages/dbt_package_automations/generate_source.sh '../dbt_intercom' intercom intercom_database intercom_schema '['admin', 'company_history']'  && 
source dbt_packages/dbt_package_automations/generate_columns.sh '../dbt_intercom' intercom intercom_database intercom_schema admin && 
source dbt_packages/dbt_package_automations/generate_columns.sh '../dbt_intercom' intercom intercom_database intercom_schema company_history && 
source dbt_packages/dbt_package_automations/generate_models.sh '../dbt_intercom' intercom intercom_database intercom_schema admin && 
source dbt_packages/dbt_package_automations/generate_models.sh '../dbt_intercom' intercom intercom_database intercom_schema company_history && 
source dbt_packages/dbt_package_automations/generate_docs_md.sh '../dbt_intercom' intercom intercom_database intercom_schema 'admin","company_history'  && 
source dbt_packages/dbt_package_automations/generate_files.sh intercom 'Intercom'  && 
source dbt_packages/dbt_package_automations/edit_dbt_project_yml.sh intercom 'admin,company_history'  && 
source dbt_packages/dbt_package_automations/edit_integrations_project_yml.sh intercom 'admin,company_history' 

Args:

  • package (required): Name of the package for which you are creating staging models/macros.
  • display_name (required): Display name of the package for which you are creating that will be used mainly in the README generation.
  • source_schema (required): Name of the source_schema from which the bash command will query.
  • source_database (required): Name of the source_database from which the bash command will query.
  • tables (required): List of the tables for which you want to create staging models/macros.

generate_columns_macro (source)

This macro is used to generate the macro used as an argument within the fill_staging_columns macro which will list all the expected columns within a respective table. The macro output will contain name and datatype; however, you may add an optional argument for alias if you wish to rename the column within the macro.

The macro should be run using dbt's run-operation functionality, as used below. It will print out the macro text, which can be copied and pasted into the relevant macro directory file within the package.

Usage:

dbt run-operation dbt_package_automations.generate_columns_macro --args '{"table_name": "promoted_tweet_report", "schema_name": "twitter_ads", "database_name": "intercom_database"}'

Output:

{% macro get_admin_columns() %}

{% set columns = [
    {"name": "email", "datatype": dbt_utils.type_string()},
    {"name": "id", "datatype": dbt_utils.type_string(), "alias": "admin_id"},
    {"name": "job_title", "datatype": dbt_utils.type_string()},
    {"name": "name", "datatype": dbt_utils.type_string()},
    {"name": "_fivetran_deleted", "datatype": "boolean"},
    {"name": "_fivetran_synced", "datatype": dbt_utils.type_timestamp()}
] %}

{{ return(columns) }}

{% endmacro %}

Args:

  • table_name (required): Name of the schema which the table you are running the macro for resides in.
  • schema_name (required): Name of the schema which the table you are running the macro for resides in.
  • database_name (optional): Name of the database which the table you are running the macro for resides in. If empty, the macro will default this value to target.database.

generate_docs (source)

This macro will generate a source command that leverages generate_docs.sh to do the following:

  • seeds, runs and creates documentation for integration tests models
  • moves catalog.json, index.html, manifest.json and run_results.json into a <project_name>/docs folder. When the source script is ran, this feature will remove existing files in the <project_name>/docs if any exists.

Requirements:

  • This script assumes that you are running in a directory that is adjacent to your project. For example, say you are working on dbt_apple_search_ads. You may run the macro & the source command from the CLI Output within your dev directory below.
β”œβ”€β”€ apple_search_ads
β”‚   β”œβ”€β”€ dbt_apple_search_ads
β”‚   β”œβ”€β”€ dbt_apple_search_ads_source
β”‚   └── dev
  • Make sure your integration_test profiles in ~/.dbt/profiles.yml is set for the appropriate project name.

Usage:

dbt run-operation generate_docs --args '{package: apple_search_ads_source}'

CLI Output:

source dbt_packages/dbt_package_automations/generate_docs.sh '../dbt_apple_search_ads_source'

Args:

  • package (required): Name of the package; include whether package is source or not


get_column_names_only (source)

This macro is used in the generate_models.sh script to further the staging_models_automation macro. This macro outputs all columns from the specified table, allowing generate_models.sh to prefill column fields in the final select statement.

Note this will retain the timestamp from the built-in formatting update from dbt 1.0.0. Therefore in the staging model resulting from generate_models.sh, you will need to manually delete the timestamp.

Usage:

dbt run-operation get_column_names_only --args '{table_name: log, schema_name: fivetran_log, database_name: database-name' 

CLI Output:

14:41:40      _fivetran_synced,
    connector_id,
    event,
    id,
    message_data,
    message_event,
    process_id,
    sequence_number,
    sync_id,
    time_stamp,
    transformation_id

Args:

  • table_name (required): Name of the table you are wanting to return column names and datatypes.
  • schema_name (required): Name of the schema where the above mentioned table resides.
  • database_name (optional): Name of the database where the above mentioned schema and table reside. By default this will be your target.database.

get_columns_for_macro (source)

This macro returns all column names and datatypes for a specified table within a database and is used as part of the generate_columns_macro.

Usage:

{{ fivetran_utils.get_columns_for_macro(table_name="team", schema_name="my_teams", database_name="my_database") }}

Args:

  • table_name (required): Name of the table you are wanting to return column names and datatypes.
  • schema_name (required): Name of the schema where the above mentioned table resides.
  • database_name (optional): Name of the database where the above mentioned schema and table reside. By default this will be your target.database.

Bash Scripts

generate_columns.sh (source)

This bash file can be used to setup or update packages to use the fill_staging_columns macro above. The bash script does the following:

  • Creates a .sql file in the macros directory for a source table and fills it with all the columns from the table.
    • Be sure your dbt_project.yml file does not contain any Warnings or Errors. If warnings or errors are present, the messages from the terminal will be printed above the macro within the .sql file in the macros directory.

The usage is as follows, assuming you are executing via a zsh terminal and in a dbt project directory that has already imported this repo as a dependency:

source dbt_packages/dbt_package_automations/generate_columns.sh "path/to/directory" file_prefix database_name schema_name table_name

As an example, assuming we are in a dbt project in an adjacent folder to dbt_apple_search_ads_source:

source dbt_packages/dbt_package_automations/generate_columns.sh '../dbt_apple_search_ads_source' stg_apple_search_ads intercom_database apple_search_ads campaign_history

In that example, it will:

  • Create a get_campaign_history_columns.sql file in the macros directory, with the necessary macro within it.

generate_docs.sh(source)

This bash file can be used to create or replace package documentation (<project_name>/docs).

Requirements:

  • This script assumes that you are running in a directory that is adjacent to your project. For example, say you are working on dbt_apple_search_ads. You may run the the source command within your dev directory below.
β”œβ”€β”€ apple_search_ads
β”‚   β”œβ”€β”€ dbt_apple_search_ads
β”‚   β”œβ”€β”€ dbt_apple_search_ads_source
β”‚   └── dev 
  • Make sure your integration_test profiles in ~/.dbt/profiles.yml is set for the appropriate project name.

Usage:

source dbt_packages/dbt_package_automations/generate_docs.sh '../dbt_apple_search_ads_source'

The bash script does the following:

  • seeds, runs and creates documentation for integration tests models
  • moves catalog.json, index.html, manifest.json and run_results.json into a <project_name>/docs folder.

generate_models.sh (source)

This bash file can be used to setup or update packages to use the generate_models macro above. The bash script assumes that there already exists a macro directory with all relevant get_<table_name>_columns.sql files created. The bash script does the following:

  • Creates a ..._tmp.sql file in the models/tmp directory and fills it with a select * from {{ var('table_name') }} where table_name is the name of the source table.
  • Creates or updates a .sql file in the models directory and fills it with the filled out version of the fill_staging_columns macro as shown above. You can then write whatever SQL you want around the macro to finishing off the staging file.
source dbt_packages/dbt_package_automations/generate_models.sh "path/to/directory" file_prefix database_name schema_name table_name

As an example, assuming we are in a dbt project in an adjacent folder to dbt_apple_search_ads_source:

source dbt_packages/dbt_package_automations/generate_models.sh '../dbt_apple_search_ads_source' stg_apple_search_ads intercom_database apple_search_ads campaign_history

With the above example, the script will:

  • Create a stg_apple_search_ads__campaign_history_tmp.sql file in the models/tmp directory, with select * from {{ var('campaign_history') }} in it.
  • Create or update a stg_apple_search_ads__campaign_history.sql file in the models directory with the pre-filled out fill_staging_columns macro.

πŸ™Œ How is this package maintained and can I contribute?

Package Maintenance

The Fivetran team maintaining this package only maintains the latest version of the package. We highly recommend you stay consistent with the latest version of the package and refer to the CHANGELOG and release notes for more information on changes across versions.

Contributions

These dbt packages are developed by a small team of analytics engineers at Fivetran. However, the packages are made better by community contributions!

We highly encourage and welcome contributions to this package. Check out this post on the best workflow for contributing to a package!

πŸͺ Are there any resources available?

  • If you encounter any questions or want to reach out for help, please refer to the GitHub Issue section to find the right avenue of support for you.
  • If you would like to provide feedback to the dbt package team at Fivetran, or would like to request a future dbt package to be developed, then feel free to fill out our Feedback Form.
  • Have questions or want to just say hi? Book a time during our office hours here or send us an email at solutions@fivetran.com.

About

Project used by the Fivetran dbt package team to help automate package development and maintenance efforts.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages