DuneSQL Migration: new macros, new dunesql CI, and translated + legacy evms_blocks model #3570

Merged
merged 41 commits into from
Jun 27, 2023
494d9cb
create schema macro for trino
Jun 19, 2023
b3230c5
trino create table
Jun 19, 2023
e173b0c
Adds a translated spell to test trino selector
Jun 19, 2023
ba3c595
changes tag name
Jun 19, 2023
1f30c96
Adds exclude tag names from CI workflow definition
Jun 19, 2023
872258e
Adds missing comma
Jun 19, 2023
e1507f8
Run alter table on every run
Jun 19, 2023
ad3fd1f
Adds delta_prod as database for source
Jun 19, 2023
7f44c46
DBT macro to override source database
Jun 19, 2023
388db3d
Only add delta_prod to source if we are running on trino
Jun 19, 2023
e37da90
Updates expose spell and mark as spell macros
Jun 19, 2023
acc540f
Jonas' comments
Jun 20, 2023
5e3183a
revert uniswap changes
Jun 20, 2023
225649b
Adds dbt trino to pipfile
Jun 20, 2023
72c20fc
alter view fix
Jun 20, 2023
f91e643
adds dunesql tag to evms_blocks model
Jun 21, 2023
c50a924
use tags instead of env var in macros to detect dunesql models
Jun 21, 2023
1a78e1f
Adds legacy evms_block file which contains the original spark model
Jun 21, 2023
4341046
Add dbt matrix strategy for spark and dunesql
a-monteiro Jun 21, 2023
2872398
Add alias macro
Jun 21, 2023
4d1885d
Use alias macro in evms blocks
Jun 21, 2023
d9380b0
Fix select/excludes
a-monteiro Jun 22, 2023
69ea2cb
Run sequentially
a-monteiro Jun 22, 2023
003ae1c
Fix exclude tag
a-monteiro Jun 22, 2023
80974a5
Test fix for broken spark selector
Jun 22, 2023
74494df
fixing bug where compile on spark had an empty select flag
Jun 22, 2023
7a73995
Refactor dbt compile
a-monteiro Jun 22, 2023
04e48fb
Updates commit manifest action to generate both dunsql and spark mani…
Jun 22, 2023
bc58a9c
Download manifest from correct location for dunesql runs
Jun 22, 2023
f15cc22
simplifies copy manifest logic
Jun 22, 2023
32f1e64
only run dunesql
Jun 22, 2023
06f8ab7
Fix commit manifest target location
Jun 22, 2023
52ff246
dunesql_check CI fix
Jun 22, 2023
d35ca4e
Only select state:new,tag:dunesql
jkylling Jun 22, 2023
3ffd65b
Move comma inside of $TAG. state:modified != state:modified,
jkylling Jun 22, 2023
67aac02
Removes 'or True' from expose_spell and mark_as_spell macros prior to…
Jun 23, 2023
fb89cb5
Test CI run on different s3 location
Jun 23, 2023
5dee4a7
Revert "Test CI run on different s3 location"
Jun 23, 2023
02c439a
prod s3 bucket
Jun 27, 2023
1b5d890
trailing /
Jun 27, 2023
b92a621
Prod location again
Jun 27, 2023
39 changes: 30 additions & 9 deletions .github/workflows/commit_manifest.yml
@@ -1,3 +1,5 @@
name: Commit Manifest

on:
workflow_dispatch:
push:
@@ -10,7 +12,11 @@ concurrency:

jobs:
commit_manifest:
runs-on: [ self-hosted, linux, spellbook ]
runs-on: [ self-hosted, linux, spellbook-trino ]
strategy:
matrix:
engine: [ 'dunesql', 'spark' ]
max-parallel: 1

steps:
- uses: actions/setup-python@v3
@@ -21,25 +27,40 @@ jobs:

- name: Add git_sha to schema
run: "/runner/change_schema.sh wizard"


- name: Setup variables
run: |
if [[ "${{ matrix.engine }}" == "dunesql" ]]; then
printf "Using dunesql engine\n"
echo "PROFILE=--profile dunesql" >> $GITHUB_ENV
echo "COMPILE_TAG=--select tag:dunesql" >> $GITHUB_ENV
echo "S3_LOCATION=manifest-spellbook-dunesql" >> $GITHUB_ENV
elif [[ "${{ matrix.engine }}" == "spark" ]]; then
printf "Using spark engine\n"
echo "PROFILE=--profile spark" >> $GITHUB_ENV
echo "COMPILE_TAG=--exclude tag:dunesql" >> $GITHUB_ENV
echo "S3_LOCATION=manifest-spellbook" >> $GITHUB_ENV
echo
else
echo "Unknown engine: ${{ matrix.engine }}"
exit 1
fi

- name: dbt dependencies
run: "dbt deps"

- name: dbt compile to create prod manifest from main
run: "dbt compile --target-path ."
run: "dbt compile --target-path . $PROFILE $COMPILE_TAG"

- name: copy old manifest locally
run: "aws s3 cp s3://manifest-spellbook/manifest.json previous_manifest.json"

- name: copy old manifest to s3
run: "aws s3 cp previous_manifest.json s3://manifest-spellbook/previous_manifest.json"
run: "aws s3 cp s3://$S3_LOCATION/manifest.json s3://$S3_LOCATION/previous_manifest.json"

- name: Set environment variables
run: |
echo "GIT_SHA=$(echo ${{ github.sha }} | tr - _ | cut -c1-7)" >> $GITHUB_ENV

- name: upload git_sha versioned previous manifest (for manual catchup if jobs fail)
run: "aws s3 cp previous_manifest.json s3://manifest-spellbook/manifest_$GIT_SHA.json"
run: "aws s3 cp s3://$S3_LOCATION/manifest.json s3://$S3_LOCATION/manifest_$GIT_SHA.json"

- name: upload manifest
run: "aws s3 cp manifest.json s3://manifest-spellbook/manifest.json"
run: "aws s3 cp manifest.json s3://$S3_LOCATION/manifest.json"
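
The `Setup variables` step above maps the matrix engine to a dbt profile flag, a compile selector, and an S3 bucket. A minimal sketch of that mapping as a standalone shell function, for trying it outside GitHub Actions; the function name `engine_settings` and its plain `KEY=value` output are assumptions for illustration and are not part of the PR:

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): prints the same KEY=value pairs that the
# workflow's "Setup variables" step appends to $GITHUB_ENV for a given engine.
engine_settings() {
    case "$1" in
        dunesql)
            echo "PROFILE=--profile dunesql"
            echo "COMPILE_TAG=--select tag:dunesql"
            echo "S3_LOCATION=manifest-spellbook-dunesql"
            ;;
        spark)
            echo "PROFILE=--profile spark"
            echo "COMPILE_TAG=--exclude tag:dunesql"
            echo "S3_LOCATION=manifest-spellbook"
            ;;
        *)
            # Mirror the workflow: fail on an engine the matrix does not define.
            echo "Unknown engine: $1" >&2
            return 1
            ;;
    esac
}

# Usage: show the settings for each engine in the matrix.
for engine in dunesql spark; do
    echo "# $engine"
    engine_settings "$engine"
done
```

In the workflow itself these lines are written to `$GITHUB_ENV`, which is what makes `$PROFILE`, `$COMPILE_TAG`, and `$S3_LOCATION` visible to the later steps of the same job.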
47 changes: 35 additions & 12 deletions .github/workflows/dbt_slim_ci.yml
@@ -16,49 +16,72 @@ concurrency:

jobs:
dbt-test:
runs-on: [ self-hosted, linux, spellbook ]
runs-on: [ self-hosted, linux, spellbook-trino ]
strategy:
matrix:
engine: [ 'dunesql', 'spark' ]
max-parallel: 1
timeout-minutes: 90

steps:
- name: Check out repository code
uses: actions/checkout@v2
uses: actions/checkout@v3

- name: Set environment variables
- name: Setup variables
run: |
if [[ "${{ matrix.engine }}" == "dunesql" ]]; then
printf "Using dunesql engine\n"
echo "PROFILE=--profile dunesql" >> $GITHUB_ENV
echo "TAG=,tag:dunesql" >> $GITHUB_ENV
echo "COMPILE_TAG=--select tag:dunesql" >> $GITHUB_ENV
echo "S3_LOCATION=manifest-spellbook-dunesql" >> $GITHUB_ENV
elif [[ "${{ matrix.engine }}" == "spark" ]]; then
printf "Using spark engine\n"
echo "PROFILE=--profile spark" >> $GITHUB_ENV
echo "EXCLUDE=tag:dunesql" >> $GITHUB_ENV
echo "SINGLE_EXCLUDE=--exclude tag:dunesql" >> $GITHUB_ENV
echo "COMPILE_TAG=--exclude tag:dunesql" >> $GITHUB_ENV
echo "S3_LOCATION=manifest-spellbook" >> $GITHUB_ENV
echo
else
echo "Unknown engine: ${{ matrix.engine }}"
exit 1
fi
echo "GIT_SHA=$(echo ${{ github.sha }} | tr - _ | cut -c1-8)" >> $GITHUB_ENV

- name: Add git_sha to schema
run: "/runner/change_schema.sh git_$GIT_SHA"
run: "/runner/change_schema.sh git_${{ matrix.engine }}_$GIT_SHA"

- name: Get latest manifest
run: "aws s3 cp s3://manifest-spellbook/manifest.json manifest.json"
run: "aws s3 cp s3://$S3_LOCATION/manifest.json manifest.json"

- name: dbt dependencies
run: "dbt deps"

- name: dbt compile to create manifest to compare to
run: "dbt compile"
run: "dbt compile $PROFILE $COMPILE_TAG"

- name: dbt seed
run: "dbt seed --select state:modified --state ."
run: "dbt seed $PROFILE --select state:modified$TAG --state ."

- name: dbt run initial model(s)
run: "dbt -x run --select state:modified --exclude tag:prod_exclude --defer --state ."
run: "dbt -x run $PROFILE --select state:modified$TAG --exclude $EXCLUDE tag:prod_exclude --defer --state ."

- name: dbt test initial model(s)
run: "dbt test --select state:new state:modified --exclude tag:prod_exclude --defer --state ."
run: "dbt test $PROFILE --select state:new$TAG state:modified$TAG --exclude $EXCLUDE tag:prod_exclude --defer --state ."

- name: Set environment variable for incremental model count
run: |
echo "INC_MODEL_COUNT=$(echo dbt ls --select state:modified,config.materialized:incremental --state . --resource-type model | wc -l)" >> $GITHUB_ENV
echo "INC_MODEL_COUNT=$(echo dbt ls $PROFILE --select state:modified,config.materialized:incremental$TAG $SINGLE_EXCLUDE --state . --resource-type model | wc -l)" >> $GITHUB_ENV

- name: dbt run incremental model(s) if applicable
if: env.INC_MODEL_COUNT > 0
run: "dbt run --select state:modified,config.materialized:incremental --exclude tag:prod_exclude --defer --state ."
run: "dbt run $PROFILE --select state:modified,config.materialized:incremental$TAG --exclude $EXCLUDE tag:prod_exclude --defer --state ."

- name: dbt test incremental model(s) if applicable
if: env.INC_MODEL_COUNT > 0
run: "dbt test --select state:modified,config.materialized:incremental --exclude tag:prod_exclude --defer --state ."
run: "dbt test $PROFILE --select state:modified,config.materialized:incremental$TAG --exclude $EXCLUDE tag:prod_exclude --defer --state ."

- name: Run DuneSQL Check
if: matrix.engine != 'dunesql'
run: "/runner/dunesql_check.py --schema test_schema --pr_schema git_$GIT_SHA"
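
The `$TAG` and `$EXCLUDE` variables are spliced directly into dbt's selector arguments. In dbt's selection syntax a comma means intersection and a space means union, so `state:modified,tag:dunesql` selects only models that are both modified and tagged `dunesql`. Keeping the comma inside `$TAG` means an empty `$TAG` leaves a plain `state:modified` rather than a dangling `state:modified,`. A minimal sketch of the `dbt run initial model(s)` command line after substitution; the function `build_run_cmd` is hypothetical and only echoes the command, it does not invoke dbt:

```shell
#!/bin/sh
# Hypothetical helper (not in the PR): echoes the "dbt run initial model(s)"
# command as it looks after variable substitution, without running dbt.
build_run_cmd() {
    case "$1" in
        dunesql) PROFILE="--profile dunesql"; TAG=",tag:dunesql"; EXCLUDE="" ;;
        spark)   PROFILE="--profile spark";   TAG="";             EXCLUDE="tag:dunesql" ;;
        *)       echo "Unknown engine: $1" >&2; return 1 ;;
    esac
    # $PROFILE and $EXCLUDE are left unquoted on purpose, as in the workflow:
    # an empty $EXCLUDE simply disappears from the argument list.
    echo dbt -x run $PROFILE --select "state:modified$TAG" --exclude $EXCLUDE tag:prod_exclude --defer --state .
}

build_run_cmd dunesql
build_run_cmd spark
```

For `dunesql` the selector becomes `state:modified,tag:dunesql` and only `tag:prod_exclude` is excluded; for `spark` the selector stays `state:modified` and both `tag:dunesql` and `tag:prod_exclude` are excluded.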
2 changes: 1 addition & 1 deletion Pipfile
@@ -8,7 +8,7 @@ dbt-databricks = "===1.4.1"
numpy = "2.0.12"
pre-commit = "2.20.0"
pytest = "7.1.3"

dbt-trino = "1.4.2"

[requires]
python_version = "3.9"