add proposal for dataset schema versions #2696
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #2696 +/- ##
=========================================
Coverage 84.45% 84.45%
Complexity 1416 1416
=========================================
Files 251 251
Lines 6447 6447
Branches 291 291
=========================================
Hits 5445 5445
Misses 850 850
Partials 152 152
☔ View full report in Codecov by Sentry.
@wslulciuc @pawel-big-lebowski I just pushed some updates to this based on our discussion yesterday:
Superb proposal with A1 diagram 👍 The proposal contains an example of a job run every 10 mins resulting in 864,000 rows in
For migration of existing data, the simple thing to do would be to create a schema version for every existing dataset version, even though that would cause duplication. I would definitely prefer a smarter script that creates only distinct schema versions. In our example above this would result in just 1 schema version. I think my next move should be to experiment with whether this can be scripted in just SQL, and how slow it is for e.g. millions of records.
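To make the "distinct schema versions" idea concrete, here is a minimal Python sketch of the deduplication step. It is only an illustration, not the proposed migration: the `DatasetVersion` shape, the `schema_key` hashing scheme, and the example dataset name are all assumptions, and the real script may well be plain SQL as discussed above.

```python
import hashlib
from typing import NamedTuple


class DatasetVersion(NamedTuple):
    # Hypothetical record shape; not Marquez's actual table layout.
    uuid: str
    dataset: str
    fields: tuple  # (field name, field type) pairs


def schema_key(dataset, fields):
    """Derive a deterministic key from the dataset name and its sorted fields."""
    canonical = dataset + "|" + ",".join(
        f"{name}:{type_}" for name, type_ in sorted(fields)
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def distinct_schema_versions(versions):
    """Link every dataset version to a schema key, creating each key only once."""
    schemas = {}  # schema key -> sorted field tuple
    links = {}    # dataset version uuid -> schema key
    for version in versions:
        key = schema_key(version.dataset, version.fields)
        schemas.setdefault(key, tuple(sorted(version.fields)))
        links[version.uuid] = key
    return schemas, links


if __name__ == "__main__":
    fields = (("id", "INTEGER"), ("name", "VARCHAR"))
    versions = [DatasetVersion(f"v{i}", "public.orders", fields) for i in range(1000)]
    schemas, links = distinct_schema_versions(versions)
    print(len(schemas), len(links))  # prints "1 1000"
```

Because the key is computed over sorted fields, dataset versions that list the same fields in a different order collapse into one schema version. Whether field order should count as a schema change is a design choice this sketch assumes away.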
Added a section regarding the behaviour with input datasets, where the current dataset version is updated, and how we will handle this differently for dataset schema versions.
Signed-off-by: David Goss <david.goss@matillion.com>
🚀 🚀 💯
Official proposal for #2676.
Checklist
CHANGELOG.md (Depending on the change, this may not be necessary).
.sql database schema migration according to Flyway's naming convention (if relevant)