Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPE-126 Migrate zerion_{chain}.trades tables #4437

Merged
merged 9 commits into from
Oct 4, 2023

Conversation

hildobby
Copy link
Collaborator

Thank you for contributing to Spellbook!

Please refer to the top of the readme in the root of Spellbook to learn how to contribute to Spellbook on DuneSQL.

@hildobby hildobby added dbt: dex covers the DEX dbt subproject DuneSQL migration labels Sep 27, 2023

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@hildobby hildobby added the ready-for-review this PR development is complete, please review label Sep 27, 2023
alias = alias('trades')
tags=['dunesql']
,schema = 'zerion_arbitrum'
,alias = alias('trades')
,partition_by = ['block_date']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can actually drop partitioning on all these tables because they are too small to benefit from it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sg, I will remove the partitioning now.

Btw, any colour on best practices for partitioning? It would be cool to have some ballpark numbers for:

  • minimum amount of entries where a dataset would benefit from partitioning
  • some range or sweetspot numbers for partition sizes

All I know is that block_month has proven to be a lot better for partitioning than block_date in some places apparently, but the size of those partitions will vary greatly based on the dataset. If it's a gas-level spell (lots of entries), maybe weekly is better for example 🤷‍♂️

Without much knowledge on how it looks in the backend, it's hard to assess when it's useful/less

Same thing applies for incremental_predicates if you have any tips there on when it should/shoudn't be used

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a great question. block_month is generally preferable to block_date but in most cases you could probably start without it all together. Some examples:

  • Big tables:fees_traces or solana tables -> block_date (these tend to be > 1TB)
  • Medium tables: dex.trades -> block_month (>50GB)
  • Small tables: this one -> no need for partitioning

We should use incremental predicates for big and some medium tables but smaller ones shouldn't really need it.

It can also be a bit difficult for you to know how big a table is since we don't really have this data exposed anywhere but trying to give some intuition about this above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ok that's the rough idea I had in mind, still very much a guessing game though.

Isn't there a way to have some hardcoded rules for those to be handled automatically by spellbook based on the number of rows in a table (or whatever other criteria)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not at the moment, I don't think it's easy to dynamically decide on partitioning since sometimes we need to add an extra column for it to work for example.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@@ -1,6 +1,7 @@
{{ config(
alias = alias('trades')
,partition_by = ['block_date']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove partitioning here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm just saw Alan's comment

@couralex6 couralex6 changed the title Migrate zerion_{chain}.trades tables SPE-126 Migrate zerion_{chain}.trades tables Sep 29, 2023
@couralex6 couralex6 added ready-for-merging and removed ready-for-review this PR development is complete, please review labels Sep 29, 2023
@jeff-dude jeff-dude self-assigned this Oct 4, 2023
@jeff-dude jeff-dude merged commit 2b847b5 into duneanalytics:main Oct 4, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Oct 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
dbt: dex covers the DEX dbt subproject ready-for-merging
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants