Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add migration for adding default db and schema to model #19425

Closed
wants to merge 5 commits into from

Conversation

hughhhh
Copy link
Member

@hughhhh hughhhh commented Mar 30, 2022

SUMMARY

Adding 2 new columns to sip-68 new dataset model to allow users to easily query based upon the default_schema + default_database

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Benchmarking:

Results:

Current: 0.54 s
10+: 0.51 s
100+: 0.59 s
1000+: 2.25 s

@hughhhh hughhhh requested a review from a team as a code owner March 30, 2022 05:08
@hughhhh hughhhh force-pushed the update-dataset-schema branch 2 times, most recently from 9066bcd to 6dd7947 Compare March 30, 2022 05:17
@hughhhh hughhhh force-pushed the update-dataset-schema branch from 6dd7947 to 03bbb0d Compare March 30, 2022 05:19
@github-actions
Copy link
Contributor

⚠️ @hughhhh Your base branch master has just also updated superset/migrations.

Please consider rebasing your branch to avoid db migration conflicts.

@codecov
Copy link

codecov bot commented Mar 30, 2022

Codecov Report

Merging #19425 (50765d0) into master (0968f86) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 50765d0 differs from pull request most recent head 482479d. Consider uploading reports for the commit 482479d to get more accurate results

@@           Coverage Diff           @@
##           master   #19425   +/-   ##
=======================================
  Coverage   66.55%   66.55%           
=======================================
  Files        1670     1670           
  Lines       63824    63824           
  Branches     6510     6510           
=======================================
  Hits        42481    42481           
  Misses      19654    19654           
  Partials     1689     1689           
Flag Coverage Δ
javascript 51.33% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/datasets/models.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0968f86...482479d. Read the comment docs.

@jinghua-qa jinghua-qa added the need:qa-review Requires QA review label Mar 30, 2022
@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 31, 2022
@github-actions
Copy link
Contributor

⚠️ @hughhhh Your base branch master has just also updated superset/migrations.

Please consider rebasing your branch to avoid db migration conflicts.

@hughhhh hughhhh force-pushed the update-dataset-schema branch from b2a5c3c to f664548 Compare March 31, 2022 00:29
@pull-request-size pull-request-size bot added size/M and removed size/L labels Mar 31, 2022
@hughhhh hughhhh force-pushed the update-dataset-schema branch from f664548 to 4dfbd20 Compare March 31, 2022 02:52
@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 31, 2022
@hughhhh hughhhh force-pushed the update-dataset-schema branch 6 times, most recently from e1fcb45 to 02eb0b6 Compare March 31, 2022 16:39
database_id = sa.Column(sa.Integer, sa.ForeignKey("dbs.id"), nullable=False)
default_database: Database = relationship(
"Database",
foreign_keys=[database_id],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't we pull this information from the tables? cc @betodealmeida

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initially we would on create, but the main issue is being able to filter/query via the ModelRestApi

the class currently doesn't allow us to join across models, which is making it difficult to for me to get the filtering to work properly for these columns. So the idea was to add default column for both schema and database to allow us to be able to power the CRUD filtering and we'd allow the user to update this field through settings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eschutho I talked with Hugh about this, we think we need the concept of a 'default database'. Right now there's an implicit assumption that all tables belong to the same database, but having a default database would allow us to relax that restriction in the future.

It's similar with schema, tables can belong to different schemas today, but there's still a concept of a default schema which is needed if the query references only the table name.

@hughhhh hughhhh force-pushed the update-dataset-schema branch from 02eb0b6 to 4f23a9c Compare March 31, 2022 17:05
@hughhhh hughhhh force-pushed the update-dataset-schema branch from 4f23a9c to 482479d Compare March 31, 2022 17:53
Copy link
Member

@betodealmeida betodealmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a small bug, but I'm also concerned about performance. I think AirBnB would have a hard time running this migration in their system, and we need to spend some time optimizing it.

Copy link
Member

@ktmud ktmud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please hold on making more changes to this PR until #19421 is approved.

…a_dataset_model.py

Co-authored-by: Beto Dealmeida <roberto@dealmeida.net>
@hughhhh hughhhh closed this Apr 1, 2022
@mistercrunch mistercrunch deleted the update-dataset-schema branch March 26, 2024 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need:qa-review Requires QA review size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants