DuneSQL Migration: new macros, new dunesql CI, and translated + legacy evms_blocks model #3570

couralex6 · 2023-06-19T12:46:28Z

Trino macro prior to trino migration

.github/workflows/dbt_slim_ci.yml

macros/dune/adapters.sql

macros/dune/source.sql

models/uniswap/ethereum/uniswap_ethereum_sources.yml

models/uniswap/ethereum/uniswap_v1_ethereum_trades.sql

jkylling · 2023-06-22T11:01:52Z

.github/workflows/dbt_slim_ci.yml


      - name: dbt test incremental model(s) if applicable
        if: env.INC_MODEL_COUNT > 0
-        run: "dbt test --select state:modified,config.materialized:incremental --exclude tag:prod_exclude --defer --state ."
+        run: "dbt test $PROFILE --select state:modified,config.materialized:incremental,$TAG --exclude $EXCLUDEtag:prod_exclude --defer --state ."


Suggested change

-        run: "dbt test $PROFILE --select state:modified,config.materialized:incremental,$TAG --exclude $EXCLUDEtag:prod_exclude --defer --state ."
+        run: "dbt test $PROFILE --select state:modified,config.materialized:incremental,$TAG --exclude ${EXCLUDE}tag:prod_exclude --defer --state ."
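The two lines differ only in how the shell delimits the variable name. A minimal sketch of the effect, assuming a hypothetical value for EXCLUDE (the workflow defines the real one elsewhere):

```shell
#!/usr/bin/env bash
# Hypothetical value: the real workflow defines EXCLUDE elsewhere.
EXCLUDE="tag:dunesql "
# Make sure no variable named EXCLUDEtag exists, as is presumably the case in CI.
unset EXCLUDEtag

# Without braces the shell reads the name as "EXCLUDEtag" (unset, so empty).
without_braces="--exclude $EXCLUDEtag:prod_exclude"
# With braces the name ends at "}", so "tag:prod_exclude" stays literal text.
with_braces="--exclude ${EXCLUDE}tag:prod_exclude"

printf '%s\n' "$without_braces"   # --exclude :prod_exclude
printf '%s\n' "$with_braces"      # --exclude tag:dunesql tag:prod_exclude
```

Without the braces both the EXCLUDE value and the literal "tag" prefix are lost, so the exclude selector no longer names prod_exclude as a tag.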

macros/expose_spells.sql

macros/mark_as_spell.sql

.github/workflows/commit_manifest.yml

macros/dune/schema.sql

jeff-dude · 2023-06-26T20:08:39Z

taking a few notes here to prep for merge, mostly for myself to follow:

- modifying existing macros, especially mark_as_spell and to a lesser extent expose_spells, typically makes dbt think a full refresh is needed, since they impact all spells. i'd therefore recommend we disable the state:modified step in our orchestration for this merge, to avoid an unnecessary full refresh of all 1k+ spells
- we should monitor PRs closely after merge to ensure the commit_manifest workflow runs as expected with the engine profiles and env variables added. no real action item here other than watching PR activity afterwards, since this impacts all PR CI tests (manifest comparisons decide which models run per PR)
- ensure DBT_ENV_CUSTOM_ENV_S3_BUCKET is set as expected in dbt cloud for the new trino project
- ensure we educate users on how to leverage the new alias config property across engines
is this our bucket for both dev PRs and prod?

jeff-dude · 2023-06-27T18:07:01Z

notes post-merge

critical path

~~commit manifest action failed due to permissions bug, access denied on dunesql run~~
- ~~it also fails on spark runs~~
~~dbt cloud orchestration is adding prod_ prefix to schema names when it should not~~
- this has been fixed, the prod value for target in dbt cloud hourly incremental job was missing, it is now added and schemas write as expected

not critical path, but maybe nice-to-have:

~~for the matrix engine, have both jobs (spark and dunesql) run simultaneously rather than sequential?~~
- should we actually only be running one or the other? if tag:dunesql is applied, only run dunesql ci? if no tag, only run spark? otherwise, won't we see one or the other fail each time on syntax errors?
- ~~example PR here. it's editing existing spark spells, but runs dunesql ci test as it's first in the matrix engine~~
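One way to read the question above is that each matrix engine should only pick up its own models through the selectors. A minimal sketch, assuming hypothetical values for the TAG and EXCLUDE variables used in the workflow (the real definitions live in dbt_slim_ci.yml and may differ):

```shell
#!/usr/bin/env bash
# Build the dbt selector arguments for one CI engine.
# Engine names mirror the matrix in this PR; the variable values are assumptions.
build_test_args() {
  local engine="$1" tag exclude
  case "$engine" in
    dunesql)
      tag=",tag:dunesql"      # comma = intersection in dbt selection syntax
      exclude=""
      ;;
    spark)
      tag=""                  # empty, so no trailing comma is left behind
      exclude="tag:dunesql "  # space = union in dbt selection syntax
      ;;
    *)
      echo "unknown engine: $engine" >&2
      return 1
      ;;
  esac
  printf '%s\n' "--select state:modified${tag} --exclude ${exclude}tag:prod_exclude"
}

build_test_args dunesql   # --select state:modified,tag:dunesql --exclude tag:prod_exclude
build_test_args spark     # --select state:modified --exclude tag:dunesql tag:prod_exclude
```

Keeping the comma inside the tag value means the spark selector never ends in a dangling comma, which seems to be the issue the "Move comma inside of $TAG" commit addresses.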

potential clean-up:

~~drop the prod_evms.blocks view created in new hive metastore, since it was incorrectly built and added the prod_ prefix? i believe it will persist without manual drop~~

dot2dotseurat · 2023-06-27T20:27:01Z

regarding the permissions bug: this is the PR for the s3 permissions I initially set up on the original manifest bucket. have we made a similar PR for the new bucket manifest-spellbook-dunesql?

hildobby · 2023-06-28T14:31:37Z

Hey @couralex6, I'm looking into adding Celo to the tables in the evms sector. What are the reasons for the creation of evms_blocks_legacy.sql as a separate file from evms_blocks.sql? Seems like the added 'dunesql' tag and the altered alias are the only two differences. What is the purpose of this change and why this specific model? Which file should I build on top of now?

jeff-dude · 2023-06-28T15:22:56Z

we are still working on an updated contribution guide that will include most of these details for yourself and others. if you're looking to get ahead, here are some answers:

- _legacy.sql files keep the spark logic and continue to run on the spark engine, so the data stays in sync until we deprecate spark completely. the legacy file also gets an updated alias property in its config, to match the new alias macro
- the files translated to dunesql do not carry the _legacy suffix and add a few things to help orchestration and table naming. anything written in dunesql syntax needs the dunesql tag for all orchestration needs. the alias macro (i.e. what the property you're referring to calls) has been edited to accept more params, including a legacy flag, to help name tables with or without the legacy suffix
- this specific model was chosen since it's a view, quick to run in CI tests, and was top of mind for me :)
- build on both files for the time being. hopefully having to edit both will be a short-lived process
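The naming rule described above can be sketched roughly as follows. This is a hypothetical shell helper, not the repository's alias macro (which is a dbt Jinja macro whose exact parameters are not shown in this thread); it only mirrors the "with or without the legacy suffix" behaviour:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the repository's Jinja macro: it only mirrors the
# naming rule described in this thread (legacy models get a _legacy suffix).
spell_alias() {
  local name="$1" legacy="${2:-false}"
  if [ "$legacy" = "true" ]; then
    printf '%s_legacy\n' "$name"
  else
    printf '%s\n' "$name"
  fi
}

spell_alias blocks        # blocks
spell_alias blocks true   # blocks_legacy
```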

we will make sure to look closely when reviewing new PRs, then use the findings to improve the contribution guide

hildobby · 2023-06-29T17:15:27Z

Ah alright makes sense! I made a PR to add Celo, but it looks like it passed tests without me needing to touch anything new. lmk if there's something new that CI tests don't account for! #3640

jeff-dude · 2023-06-29T21:25:27Z

a PR to add Celo but looks like it passed tests without needing me to touc

will follow up on that PR when i can 🙏

…y evms_blocks model (duneanalytics#3570)

* create schema macro for trino
* trino create table
* Adds a translated spell to test trino selector
* changes tag name
* Adds exclude tag names from CI workflow definition
* Adds missing comma
* Run alter table on every run
* Adds delta_prod as database for source
* DBT macro to override source database
* Only add delta_prod to source if we are running on trino
* Updates expose spell and mark as spell macros
* Jonas' comments
* revert uniswap changes
* Adds dbt trino to pipfile
* alter view fix
* adds dunesql tag to evms_blocks model
* use tags instead of env var in macros to detect dunesql models
* Adds legacy evms_block file which contains the original spark model
* Add dbt matrix strategy for spark and dunesql
* Add alias macro
* Use alias macro in evms blocks
* Fix select/excludes
* Run sequentially
* Fix exclude tag
* Test fix for broken spark selector
* fixing bug where compile on spark had an empty select flag
* Refactor dbt compile
* Updates commit manifest action to generate both dunsql and spark manifests

  The existing logic for spark manifest is unchanged, we still upload the spark manifest to the same location. We run the same logic to upload the dunesql manifest to a different s3 manifest.
* Download manifest from correct location for dunesql runs
* simplifies copy manifest logic
* only run dunesql
* Fix commit manifest target location
* dunesql_check CI fix
* Only select state:new,tag:dunesql
* Move comma inside of $TAG. state:modified != state:modified,
* Removes 'or True' from expose_spell and mark_as_spell macros prior to merging
* Test CI run on different s3 location
* Revert "Test CI run on different s3 location"

  This reverts commit 17dab72.
* prod s3 bucket
* trailing /
* Prod location again

---------

Co-authored-by: André Monteiro <andre@dune.com>
Co-authored-by: Jonas Irgens Kylling <jonas@dune.xyz>

jkylling reviewed Jun 20, 2023

jeff-dude added the WIP (work in progress) and dune team (created by dune team) labels Jun 21, 2023

couralex6 changed the title ~~[WIP] Trino macros~~ Translating evms_blocks to test dual ci spark-trino Jun 21, 2023

a-monteiro force-pushed the trino-macros branch 2 times, most recently from dec3bc2 to 13a8682 June 22, 2023 09:33

jkylling reviewed Jun 22, 2023

a-monteiro force-pushed the trino-macros branch from a7f54c5 to a888b31 June 22, 2023 14:01

couralex6 force-pushed the trino-macros branch from 54131b6 to 7d1c562 June 23, 2023 09:49

jkylling reviewed Jun 23, 2023

macros/expose_spells.sql (outdated, resolved)

jkylling reviewed Jun 23, 2023

macros/mark_as_spell.sql (outdated, resolved)

.github/workflows/commit_manifest.yml (resolved)

jkylling approved these changes Jun 23, 2023

macros/dune/schema.sql (outdated, resolved)

jeff-dude added the ready-for-merging label and removed the WIP (work in progress) label Jun 23, 2023

couralex6 changed the title ~~Translating evms_blocks to test dual ci spark-trino~~ DuneSQL Migration: new macros, new dunesql CI, and translated + legacy evms_blocks model Jun 23, 2023

Alex Courouble added 14 commits June 27, 2023 10:29

create schema macro for trino

494d9cb

trino create table

b3230c5

Adds a translated spell to test trino selector

e173b0c

changes tag name

ba3c595

Adds exclude tag names from CI workflow definition

1f30c96

Adds missing comma

872258e

Run alter table on every run

e1507f8

Adds delta_prod as database for source

ad3fd1f

DBT macro to override source database

7f44c46

Only add delta_prod to source if we are running on trino

388db3d

Updates expose spell and mark as spell macros

e37da90

Jonas' comments

acc540f

revert uniswap changes

5e3183a

Adds dbt trino to pipfile

225649b

a-monteiro and others added 20 commits June 27, 2023 10:29

Fix select/excludes

d9380b0

Run sequentially

69ea2cb

Fix exclude tag

003ae1c

Test fix for broken spark selector

80974a5

fixing bug where compile on spark had an empty select flag

74494df

Refactor dbt compile

7a73995

Updates commit manifest action to generate both dunsql and spark mani…

04e48fb

…fests The existing logic for spark manifest is unchanged, we still upload the spark manifest to the same location. We run the same logic to upload the dunesql manifest to a different s3 manifest.

Download manifest from correct location for dunesql runs

bc58a9c

simplifies copy manifest logic

f15cc22

only run dunesql

32f1e64

Fix commit manifest target location

06f8ab7

dunesql_check CI fix

52ff246

Only select state:new,tag:dunesql

d35ca4e

Move comma inside of $TAG. state:modified != state:modified,

3ffd65b

Removes 'or True' from expose_spell and mark_as_spell macros prior to…

67aac02

… merging

Test CI run on different s3 location

fb89cb5

Revert "Test CI run on different s3 location"

5dee4a7

This reverts commit 17dab72.

prod s3 bucket

02c439a

trailing /

1b5d890

Prod location again

b92a621

couralex6 force-pushed the trino-macros branch from efbc398 to b92a621 June 27, 2023 17:29

couralex6 merged commit e5fc627 into main Jun 27, 2023

couralex6 deleted the trino-macros branch June 27, 2023 17:57
