Release v0.1.0 (#213)
# Version changelog

## 0.1.0

Features

* Added interactive installation wizard
([#184](#184),
[#117](#117)).
* Added schedule of jobs as part of `install.sh` flow and created some
documentation ([#187](#187)).
* Added debug notebook companion to troubleshoot the installation
([#191](#191)).
* Added support for Hive Metastore Table ACLs inventory from all
databases ([#78](#78),
[#122](#122),
[#151](#151)).
* Created `$inventory.tables` from Scala notebook
([#207](#207)).
* Added local group migration support for ML-related objects
([#56](#56)).
* Added local group migration support for SQL warehouses
([#57](#57)).
* Added local group migration support for all compute-related resources
([#53](#53)).
* Added local group migration support for security-related objects
([#58](#58)).
* Added local group migration support for workflows
([#54](#54)).
* Added local group migration support for workspace-level objects
([#59](#59)).
* Added local group migration support for dashboards, queries, and
alerts ([#144](#144)).

Stability

* Added `codecov.io` publishing
([#204](#204)).
* Added more tests to group.py
([#148](#148)).
* Added tests for group state
([#133](#133)).
* Added tests for inventorizer and typed
([#125](#125)).
* Added tests WorkspaceListing
([#110](#110)).
* Added `make_*_permissions` fixtures
([#159](#159)).
* Added reusable fixtures module
([#119](#119)).
* Added testing for permissions
([#126](#126)).
* Added inventory table manager tests
([#153](#153)).
* Added `product_info` to track as SDK integration
([#76](#76)).
* Added failsafe permission get operations
([#65](#65)).
* Always install the latest `pip` version in `./install.sh`
([#201](#201)).
* Always store inventory in `hive_metastore` and make only
`inventory_database` configurable
([#178](#178)).
* Changed default logging level from `TRACE` to `DEBUG` log level
([#124](#124)).
* Consistently use `WorkspaceClient` from `databricks.sdk`
([#120](#120)).
* Convert pipeline code to use fixtures.
([#166](#166)).
* Exclude mixins from coverage
([#130](#130)).
* Fixed codecov.io reporting
([#212](#212)).
* Fixed configuration path in job task install code
([#210](#210)).
* Fixed a bug with dependency definitions
([#70](#70)).
* Fixed failing `test_jobs`
([#140](#140)).
* Fixed the issues with experiment listing
([#64](#64)).
* Fixed integration testing configuration
([#77](#77)).
* Make project runnable on nightly testing infrastructure
([#75](#75)).
* Migrated cluster policies to new fixtures
([#174](#174)).
* Migrated clusters to the new fixture framework
([#162](#162)).
* Migrated instance pool to the new fixture framework
([#161](#161)).
* Migrated to `databricks.labs.ucx` package
([#90](#90)).
* Migrated token authorization to new fixtures
([#175](#175)).
* Migrated experiment fixture to standard one
([#168](#168)).
* Migrated jobs test to fixture based one.
([#167](#167)).
* Migrated model fixture to the standard fixtures
([#169](#169)).
* Migrated warehouse fixture to standard one
([#170](#170)).
* Organise modules by domain
([#197](#197)).
* Prefetch all account-level and workspace-level groups
([#192](#192)).
* Programmatically create a dashboard
([#121](#121)).
* Properly integrate Python `logging` facility
([#118](#118)).
* Refactored code to use Databricks SDK for Python
([#27](#27)).
* Refactored configuration and remove global provider state
([#71](#71)).
* Removed `pydantic` dependency
([#138](#138)).
* Removed redundant `pyspark`, `databricks-connect`, `delta-spark`, and
`pandas` dependencies
([#193](#193)).
* Removed redundant `typer[all]` dependency and its usages
([#194](#194)).
* Renamed `MigrationGroupsProvider` to `GroupMigrationState`
([#81](#81)).
* Replaced `ratelimit` and `tenacity` dependencies with simpler
implementations ([#195](#195)).
* Reorganised integration tests to align more with unit tests
([#206](#206)).
* Run `build` workflow also on `main` branch
([#211](#211)).
* Run integration test with a single group
([#152](#152)).
* Simplify `SqlBackend` and table creation logic
([#203](#203)).
* Updated `migration_config.yml`
([#179](#179)).
* Updated legal information
([#196](#196)).
* Use `make_secret_scope` fixture
([#163](#163)).
* Use fixture factory for `make_table`, `make_schema`, and
`make_catalog` ([#189](#189)).
* Use new fixtures for notebooks and folders
([#176](#176)).
* Validate toolkit notebook test
([#183](#183)).

Contributing

* Added a note on external dependencies
([#139](#139)).
* Added ability to run SQL queries on Spark when in Databricks Runtime
([#108](#108)).
* Added some ground rules for contributing
([#82](#82)).
* Added contributing instructions link from main readme
([#109](#109)).
* Added info about environment refreshes
([#155](#155)).
* Clarified documentation
([#137](#137)).
* Enabled merge queue
([#146](#146)).
* Improved `CONTRIBUTING.md` guide
([#135](#135),
[#145](#145)).
nfx authored Sep 18, 2023
1 parent 1c427b3 commit c6019ad
Showing 5 changed files with 116 additions and 73 deletions.
86 changes: 86 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,86 @@
# Version changelog

## 0.1.0

Features

* Added interactive installation wizard ([#184](https://github.com/databricks/ucx/pull/184), [#117](https://github.com/databricks/ucx/pull/117)).
* Added schedule of jobs as part of `install.sh` flow and created some documentation ([#187](https://github.com/databricks/ucx/pull/187)).
* Added debug notebook companion to troubleshoot the installation ([#191](https://github.com/databricks/ucx/pull/191)).
* Added support for Hive Metastore Table ACLs inventory from all databases ([#78](https://github.com/databricks/ucx/pull/78), [#122](https://github.com/databricks/ucx/pull/122), [#151](https://github.com/databricks/ucx/pull/151)).
* Created `$inventory.tables` from Scala notebook ([#207](https://github.com/databricks/ucx/pull/207)).
* Added local group migration support for ML-related objects ([#56](https://github.com/databricks/ucx/pull/56)).
* Added local group migration support for SQL warehouses ([#57](https://github.com/databricks/ucx/pull/57)).
* Added local group migration support for all compute-related resources ([#53](https://github.com/databricks/ucx/pull/53)).
* Added local group migration support for security-related objects ([#58](https://github.com/databricks/ucx/pull/58)).
* Added local group migration support for workflows ([#54](https://github.com/databricks/ucx/pull/54)).
* Added local group migration support for workspace-level objects ([#59](https://github.com/databricks/ucx/pull/59)).
* Added local group migration support for dashboards, queries, and alerts ([#144](https://github.com/databricks/ucx/pull/144)).

Stability

* Added `codecov.io` publishing ([#204](https://github.com/databricks/ucx/pull/204)).
* Added more tests to group.py ([#148](https://github.com/databricks/ucx/pull/148)).
* Added tests for group state ([#133](https://github.com/databricks/ucx/pull/133)).
* Added tests for inventorizer and typed ([#125](https://github.com/databricks/ucx/pull/125)).
* Added tests WorkspaceListing ([#110](https://github.com/databricks/ucx/pull/110)).
* Added `make_*_permissions` fixtures ([#159](https://github.com/databricks/ucx/pull/159)).
* Added reusable fixtures module ([#119](https://github.com/databricks/ucx/pull/119)).
* Added testing for permissions ([#126](https://github.com/databricks/ucx/pull/126)).
* Added inventory table manager tests ([#153](https://github.com/databricks/ucx/pull/153)).
* Added `product_info` to track as SDK integration ([#76](https://github.com/databricks/ucx/pull/76)).
* Added failsafe permission get operations ([#65](https://github.com/databricks/ucx/pull/65)).
* Always install the latest `pip` version in `./install.sh` ([#201](https://github.com/databricks/ucx/pull/201)).
* Always store inventory in `hive_metastore` and make only `inventory_database` configurable ([#178](https://github.com/databricks/ucx/pull/178)).
* Changed default logging level from `TRACE` to `DEBUG` log level ([#124](https://github.com/databricks/ucx/pull/124)).
* Consistently use `WorkspaceClient` from `databricks.sdk` ([#120](https://github.com/databricks/ucx/pull/120)).
* Convert pipeline code to use fixtures. ([#166](https://github.com/databricks/ucx/pull/166)).
* Exclude mixins from coverage ([#130](https://github.com/databricks/ucx/pull/130)).
* Fixed codecov.io reporting ([#212](https://github.com/databricks/ucx/pull/212)).
* Fixed configuration path in job task install code ([#210](https://github.com/databricks/ucx/pull/210)).
* Fixed a bug with dependency definitions ([#70](https://github.com/databricks/ucx/pull/70)).
* Fixed failing `test_jobs` ([#140](https://github.com/databricks/ucx/pull/140)).
* Fixed the issues with experiment listing ([#64](https://github.com/databricks/ucx/pull/64)).
* Fixed integration testing configuration ([#77](https://github.com/databricks/ucx/pull/77)).
* Make project runnable on nightly testing infrastructure ([#75](https://github.com/databricks/ucx/pull/75)).
* Migrated cluster policies to new fixtures ([#174](https://github.com/databricks/ucx/pull/174)).
* Migrated clusters to the new fixture framework ([#162](https://github.com/databricks/ucx/pull/162)).
* Migrated instance pool to the new fixture framework ([#161](https://github.com/databricks/ucx/pull/161)).
* Migrated to `databricks.labs.ucx` package ([#90](https://github.com/databricks/ucx/pull/90)).
* Migrated token authorization to new fixtures ([#175](https://github.com/databricks/ucx/pull/175)).
* Migrated experiment fixture to standard one ([#168](https://github.com/databricks/ucx/pull/168)).
* Migrated jobs test to fixture based one. ([#167](https://github.com/databricks/ucx/pull/167)).
* Migrated model fixture to the standard fixtures ([#169](https://github.com/databricks/ucx/pull/169)).
* Migrated warehouse fixture to standard one ([#170](https://github.com/databricks/ucx/pull/170)).
* Organise modules by domain ([#197](https://github.com/databricks/ucx/pull/197)).
* Prefetch all account-level and workspace-level groups ([#192](https://github.com/databricks/ucx/pull/192)).
* Programmatically create a dashboard ([#121](https://github.com/databricks/ucx/pull/121)).
* Properly integrate Python `logging` facility ([#118](https://github.com/databricks/ucx/pull/118)).
* Refactored code to use Databricks SDK for Python ([#27](https://github.com/databricks/ucx/pull/27)).
* Refactored configuration and remove global provider state ([#71](https://github.com/databricks/ucx/pull/71)).
* Removed `pydantic` dependency ([#138](https://github.com/databricks/ucx/pull/138)).
* Removed redundant `pyspark`, `databricks-connect`, `delta-spark`, and `pandas` dependencies ([#193](https://github.com/databricks/ucx/pull/193)).
* Removed redundant `typer[all]` dependency and its usages ([#194](https://github.com/databricks/ucx/pull/194)).
* Renamed `MigrationGroupsProvider` to `GroupMigrationState` ([#81](https://github.com/databricks/ucx/pull/81)).
* Replaced `ratelimit` and `tenacity` dependencies with simpler implementations ([#195](https://github.com/databricks/ucx/pull/195)).
* Reorganised integration tests to align more with unit tests ([#206](https://github.com/databricks/ucx/pull/206)).
* Run `build` workflow also on `main` branch ([#211](https://github.com/databricks/ucx/pull/211)).
* Run integration test with a single group ([#152](https://github.com/databricks/ucx/pull/152)).
* Simplify `SqlBackend` and table creation logic ([#203](https://github.com/databricks/ucx/pull/203)).
* Updated `migration_config.yml` ([#179](https://github.com/databricks/ucx/pull/179)).
* Updated legal information ([#196](https://github.com/databricks/ucx/pull/196)).
* Use `make_secret_scope` fixture ([#163](https://github.com/databricks/ucx/pull/163)).
* Use fixture factory for `make_table`, `make_schema`, and `make_catalog` ([#189](https://github.com/databricks/ucx/pull/189)).
* Use new fixtures for notebooks and folders ([#176](https://github.com/databricks/ucx/pull/176)).
* Validate toolkit notebook test ([#183](https://github.com/databricks/ucx/pull/183)).

Contributing

* Added a note on external dependencies ([#139](https://github.com/databricks/ucx/pull/139)).
* Added ability to run SQL queries on Spark when in Databricks Runtime ([#108](https://github.com/databricks/ucx/pull/108)).
* Added some ground rules for contributing ([#82](https://github.com/databricks/ucx/pull/82)).
* Added contributing instructions link from main readme ([#109](https://github.com/databricks/ucx/pull/109)).
* Added info about environment refreshes ([#155](https://github.com/databricks/ucx/pull/155)).
* Clarified documentation ([#137](https://github.com/databricks/ucx/pull/137)).
* Enabled merge queue ([#146](https://github.com/databricks/ucx/pull/146)).
* Improved `CONTRIBUTING.md` guide ([#135](https://github.com/databricks/ucx/pull/135), [#145](https://github.com/databricks/ucx/pull/145)).
17 changes: 12 additions & 5 deletions README.md
@@ -1,6 +1,13 @@
# UCX - Unity Catalog Migration Toolkit

Your best companion for enabling the Unity Catalog.
[![build](https://github.com/databrickslabs/ucx/actions/workflows/push.yml/badge.svg)](https://github.com/databrickslabs/ucx/actions/workflows/push.yml) [![codecov](https://codecov.io/github/databrickslabs/ucx/graph/badge.svg?token=p0WKAfW5HQ)](https://codecov.io/github/databrickslabs/ucx)

Your best companion for enabling the Unity Catalog. It helps you to migrate all Databricks workspace assets:
Entitlements, AWS instance profiles, Clusters, Cluster policies, Instance Pools, Databricks SQL warehouses, Delta Live
Tables, Jobs, MLflow experiments, MLflow registry, SQL Dashboards & Queries, SQL Alerts, Token and Password usage
permissions that are set on the workspace level, Secret scopes, Notebooks, Directories, Repos, Files.

See [contributing instructions](CONTRIBUTING.md) to help improve this project.

## Installation

@@ -18,19 +25,19 @@ export DATABRICKS_CONFIG_PROFILE=ABC
```

You can also specify environment variables in a more direct way, like in this example for installing
on a Azure Databricks Workspace using the Azure CLI authentication:
on an Azure Databricks Workspace using the Azure CLI authentication:

```shell
az login
export DATABRICKS_HOST=https://adb-123....azuredatabricks.net/
./install.sh
```

## Latest working version and how-to
Please follow the instructions in `./install.sh`, which will open a notebook with the description of all jobs to trigger. The journey starts with assessment.

Please note that current project status is 🏗️ **WIP**, but we have a minimal set of already working utilities.
## Star History

See [contributing instructions](CONTRIBUTING.md).
[![Star History Chart](https://api.star-history.com/svg?repos=databrickslabs/ucx&type=Date)](https://star-history.com/#databrickslabs/ucx)

## Project Support
Please note that all projects in the /databrickslabs github account are provided for your exploration only, and are not formally supported by Databricks with Service Level Agreements (SLAs). They are provided AS-IS and we do not make any guarantees of any kind. Please do not submit a support ticket relating to any issues arising from the use of these projects.
67 changes: 0 additions & 67 deletions USAGE.md

This file was deleted.

17 changes: 17 additions & 0 deletions docs/logic.md → docs/local-group-migration.md
@@ -1,5 +1,22 @@
# Permissions migration logic and data structures

During the UC adoption, it's critical to move the groups from the workspace to account level.

To deliver this migration, the following steps are performed:

| Step description | Relevant API method |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------|
| A set of groups to be migrated is identified (either via `groups.selected` config property, or automatically).<br/>Group existence is verified against the account level.<br/>**If there is no group on the account level, an error is thrown.**<br/>Backup groups are created on the workspace level. | `toolkit.prepare_groups_in_environment()` |
| Inventory table is cleaned up. | `toolkit.cleanup_inventory_table()` |
| Workspace local group permissions are inventorized and saved into a Delta Table. | `toolkit.inventorize_permissions()` |
| Backup groups are entitled with permissions from the inventory table. | `toolkit.apply_permissions_to_backup_groups()` |
| Workspace-level groups are deleted. Account-level groups are granted with access to the workspace.<br/>Workspace-level entitlements are synced from backup groups to newly added account-level groups. | `toolkit.replace_workspace_groups_with_account_groups()` |
| Account-level groups are entitled with workspace-level permissions from the inventory table. | `toolkit.apply_permissions_to_account_groups()` |
| Backup groups are deleted | `toolkit.delete_backup_groups()` |
| Inventory table is cleaned up. This step is optional. | `toolkit.cleanup_inventory_table()` |
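
The order of the steps in the table can be sketched as a single driver function. This is a minimal sketch, not the toolkit's actual implementation: only the method names are taken from the table. The `RecordingToolkit` class, the `run_local_group_migration` function, and the `final_cleanup` flag are hypothetical names introduced for illustration, and the stand-in records calls instead of talking to a workspace.

```python
class RecordingToolkit:
    """Hypothetical stand-in: records step names instead of calling Databricks."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that are not defined on the instance,
        # which here means every step method named in the table.
        def record():
            self.calls.append(name)

        return record


def run_local_group_migration(toolkit, final_cleanup=True):
    """Invoke the migration steps in the order listed in the table."""
    toolkit.prepare_groups_in_environment()
    toolkit.cleanup_inventory_table()
    toolkit.inventorize_permissions()
    toolkit.apply_permissions_to_backup_groups()
    toolkit.replace_workspace_groups_with_account_groups()
    toolkit.apply_permissions_to_account_groups()
    toolkit.delete_backup_groups()
    if final_cleanup:
        # The table marks this last cleanup as optional.
        toolkit.cleanup_inventory_table()


toolkit = RecordingToolkit()
run_local_group_migration(toolkit)
print("\n".join(toolkit.calls))
```

Passing any object that exposes the same method names should work in place of the stand-in. Keeping the final cleanup behind a flag mirrors the table, where the first cleanup is part of the flow and the last one is optional.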

> Please note that inherited permissions will not be inventorized / migrated. We only cover direct permissions.
At a high level, the permissions inventorization process is split into two steps:

1. collect all existing permissions into a persistent storage.
2 changes: 1 addition & 1 deletion src/databricks/labs/ucx/__about__.py
@@ -1 +1 @@
__version__ = "0.0.3"
__version__ = "0.1.0"
