-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add functionality to create and delete catalogs, tables and schemas to Unity catalog client #20956
Conversation
e7df116
to
dadad0d
Compare
ff8d546
to
38bf0cb
Compare
@@ -46,7 +53,7 @@ def __init__( | |||
* "databricks-sdk": Use the Databricks SDK to retrieve and use the | |||
bearer token from the environment. | |||
""" | |||
from polars.polars import PyCatalogClient | |||
issue_unstable_warning("`Catalog` functionality is considered unstable.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed calling issue_unstable_warning()
before, currently on 1.21.0 the unstable warning is only in the docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's fine. I don't think we need to warn per-se.
Catalog
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20956 +/- ##
==========================================
- Coverage 79.28% 79.08% -0.20%
==========================================
Files 1578 1580 +2
Lines 224195 224938 +743
Branches 2576 2576
==========================================
+ Hits 177756 177898 +142
- Misses 45848 46452 +604
+ Partials 591 588 -3 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only comment about the "schema" name. I'd like to propose "namespace"
for that.
Ok(()) | ||
} | ||
|
||
pub async fn create_schema( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that our schema conflicts with the catalog schema definition. Shall we name it create_namespace
and mention in the docstrings that we mean catalog schema's for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can update it. Do you also want it changed on the Python API, or just on the Rust-side functions?
The "schema" word is also used in multiple places (e.g. struct SchemaInfo
, or schema_info: &str
in function arguments) - if we want to be consistent we would need to rename all of those places.
Adds functionality to create / delete catalogs, tables and schemas, along with other drive-by improvements. Note, this does not contain functionality to actually write data to a table.
Changes
type_json
field instead oftype_text
, astype_text
was observed to have format inconsistencies across different data sources.Breaking change from
TypedDict
todataclass
in function returnsThis PR currently also updates some Python-side function returns to return dataclasses instead of (typed) dictionaries. This was mainly done so that I could add a
get_polars_schema()
toTableInfo
as it wasn't possible to add methods to aTypedDict
. It will be a breaking change compared to the API in the latest release (1.21.0) (i.e. you would have to change existing code fromcatalog_info['name']
tocatalog_info.name
etc.), but as it is unstable functionality it should be acceptable. I also think this is better to do now as it means we will be able to add functions later if we need to without having to break it then.Other notes
It is a fairly large PR, if it helps for reviewing, this is a rough description of the sections of code when scrolling from top to bottom:
polars-io/src/catalog/schema.rs
type_json
parsingpolars-io/src/catalog/unity/client.rs
create_catalog
etc.)polars-io/src/catalog/unity/models.rs
polars-python/src/catalog/mod.rs
create_catalog
etc.)py-polars/polars/catalog.py