feat: a native SQLAlchemy dialect for Superset #14225
Sounds interesting! Could you provide some possible use cases? Is this a step towards paving the path to a LookML-like modeling layer?
Yeah, that's my vision. Improving the semantic layer in Superset so we can do things like:
Thanks for the explanation. BTW, is this the right place for such a discussion, or would Slack or the mailing list be more appropriate?
It would not, but that's an interesting use case. Presto has a similar use case: because we don't support multiple catalogs in a single DB, you need to create a DB per catalog.
OK, I think this is ready for its annual review! 😆
This might interest you: https://preset.io/blog/accessing-apis-with-superset/
@michael-s-molina do we plan to include this in 3.0?
Given the experimental nature of this feature, I think it's a good idea to give it some time to mature. It will likely be included in a minor release.
@betodealmeida I left one small comment, but otherwise this LGTM.
@@ -67,6 +67,10 @@ def get_git_sha() -> str:
    "sqlalchemy.dialects": [
        "postgres.psycopg2 = sqlalchemy.dialects.postgresql:dialect",
        "postgres = sqlalchemy.dialects.postgresql:dialect",
        "superset = superset.extensions.metadb:SupersetAPSWDialect",
Would `dialects` be a more appropriate name than `metadb`?
Hmmm, good point. I'm not sure if we're ever going to have multiple SQLAlchemy dialects defined in Superset, which is why I named it this way.
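For readers unfamiliar with the entry point line in the diff above: SQLAlchemy discovers third-party dialects through the `sqlalchemy.dialects` entry point group, where the name on the left becomes the URL scheme and the value on the right names the module and class to load. A minimal sketch of how that one line breaks down, using only the standard library (it parses the entry point and does not import Superset):

```python
from importlib.metadata import EntryPoint

# The entry point exactly as it appears in the diff under review.
ep = EntryPoint(
    name="superset",
    value="superset.extensions.metadb:SupersetAPSWDialect",
    group="sqlalchemy.dialects",
)

# The name is the URL scheme; the value splits into module and attribute.
scheme = f"{ep.name}://"
print(scheme)     # URL scheme users type when creating the database
print(ep.module)  # module that would be imported when the scheme is used
print(ep.attr)    # dialect class looked up inside that module
```

Renaming the module from `metadb` to `dialects`, as suggested in the review, would change only the module part of the value; the `superset://` scheme would stay the same.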
Thanks a lot, man! Edit: "merged". Wait, what? This is fantastic, lol. Congratulations!
It appears that the latest release, v3.0.1 (which was released two weeks ago), still doesn't include this feature. However, this feature has been mentioned in https://superset.apache.org/docs/databases/meta-database/. When will we have access to this feature? If I can't wait, how can I build the application with the latest code? Could you please provide instructions?
SUMMARY
This PR introduces a new SQLAlchemy dialect, `superset://`, together with a corresponding DB engine spec. With this, users can create a new database using the `superset://` SQLAlchemy URI, and use it to write queries like this:
Queries can also join data from multiple databases, or even move data from one to another:
The database can even query itself (not that that's useful):
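As an illustration of the kinds of query described above, here is a minimal sketch that builds the SQL text for a single-table select, a cross-database join, and a copy from one database to another. It assumes tables are addressed as one double-quoted identifier of the form `"database.[[catalog.]schema.]table"`, as described in the linked meta database documentation; the helper `meta_table`, the `warehouse` database, and the table and column names are hypothetical.

```python
def meta_table(database, table, schema=None, catalog=None):
    """Build a quoted reference, assumed form: "database.[[catalog.]schema.]table"."""
    if catalog is not None and schema is None:
        raise ValueError("a catalog can only be given together with a schema")
    parts = [p for p in (database, catalog, schema, table) if p is not None]
    # Standard SQL escaping: double any embedded double quote.
    return '"' + ".".join(parts).replace('"', '""') + '"'


# Hypothetical databases and tables, used only for illustration.
births = meta_table("examples", "birth_names")
totals = meta_table("warehouse", "name_totals", schema="public")
backup = meta_table("warehouse", "birth_names_copy", schema="public")

# Query a single table through the superset:// database.
select_sql = f"SELECT * FROM {births} LIMIT 10"

# Join data that lives in two different databases.
join_sql = (
    f"SELECT a.name, b.total FROM {births} AS a "
    f"JOIN {totals} AS b ON a.name = b.name"
)

# Move data from one database to another (requires DML to be enabled).
copy_sql = f"INSERT INTO {backup} SELECT * FROM {births}"

for sql in (select_sql, join_sql, copy_sql):
    print(sql)
```

The strings would be run from SQL Lab against the `superset://` database; the sketch only shows how the table references are shaped.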
The dialect can only be used if the `ENABLE_SUPERSET_META_DB` feature flag is enabled; otherwise it will be blocked and won't even show up in the list of available databases.
While the dialect can be used to join across databases, users should be careful with big joins and expensive queries. Filtering, sorting, limiting and offsetting are pushed to the corresponding databases, but aggregations and joins happen in memory. For this reason it's recommended to enable asynchronous queries in the `superset://` database, so that computations are executed in Celery workers instead of the web workers.
It's also possible to limit how much data is read from each database via the `SUPERSET_META_DB_LIMIT` configuration value, set initially to 1000.
The dialect uses Superset's security manager to prevent users from accessing unauthorized databases. DML is supported as long as DML is enabled in the `superset://` database and all the related databases. It's also possible to limit the databases that the dialect has access to, by setting `allowed_dbs` in the engine parameters. E.g.:
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
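The settings named in the summary could be wired up roughly as follows. This is a sketch, not the PR's own example: the flag and limit names come from the summary and are written in the usual `superset_config.py` style, while the database names listed under `allowed_dbs` are hypothetical and the exact place where the engine parameters JSON is entered is assumed to be the database's advanced settings.

```python
import json

# superset_config.py-style settings; names are taken from the PR summary.
FEATURE_FLAGS = {
    # Without this flag the superset:// dialect is blocked and hidden.
    "ENABLE_SUPERSET_META_DB": True,
}

# Cap on how much data is read from each underlying database.
# 1000 is the initial value stated in the summary.
SUPERSET_META_DB_LIMIT = 1000

# Engine parameters for the superset:// database, restricting which
# databases the dialect may reach. Database names are hypothetical.
engine_params = {"allowed_dbs": ["examples", "warehouse"]}

# JSON text to paste into the engine parameters field.
engine_params_json = json.dumps(engine_params)
print(engine_params_json)
```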
TEST PLAN
Added unit tests.
ADDITIONAL INFORMATION
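To picture the trade-off described in the summary (filters, ordering and limits pushed to each source database; joins and aggregations done in memory), here is a toy sketch. It is not the dialect's implementation: two in-memory SQLite databases stand in for two Superset databases, `ROW_LIMIT` stands in for `SUPERSET_META_DB_LIMIT`, and every function, table and column name is hypothetical.

```python
import sqlite3

ROW_LIMIT = 1000  # stand-in for SUPERSET_META_DB_LIMIT


def make_db(table, rows):
    """Create an in-memory SQLite database standing in for one source database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} (name TEXT, value INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return conn


def fetch_pushed_down(conn, table, min_value, limit=ROW_LIMIT):
    """Filter, sort and limit inside the source database, then pull the rows out."""
    sql = f"SELECT name, value FROM {table} WHERE value >= ? ORDER BY name LIMIT ?"
    return conn.execute(sql, (min_value, limit)).fetchall()


def join_in_memory(left, right):
    """Hash join on name, performed in the calling process rather than a database."""
    index = {}
    for name, value in right:
        index.setdefault(name, []).append(value)
    return [(name, lv, rv) for name, lv in left for rv in index.get(name, [])]


db_a = make_db("births", [("ann", 5), ("bob", 1), ("cat", 9)])
db_b = make_db("totals", [("ann", 50), ("cat", 90), ("dan", 10)])

# Each source only returns rows that survive its own filter and limit.
left = fetch_pushed_down(db_a, "births", min_value=2)
right = fetch_pushed_down(db_b, "totals", min_value=0)

# The join and the aggregation both run in this process's memory.
joined = join_in_memory(left, right)
total = sum(left_value for _, left_value, _ in joined)
print(joined, total)
```

Everything returned by `fetch_pushed_down` has to fit in the worker's memory before the join runs, which is why the summary recommends asynchronous queries and a per-database row limit.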