Release v0.1.0 (#213)

databrickslabs · Sep 18, 2023 · c6019ad
1 parent 1c427b3
Showing 5 changed files with 116 additions and 73 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,86 @@
+# Version changelog
+
+## 0.1.0
+
+Features
+
+* Added interactive installation wizard ([#184](https://github.com/databricks/ucx/pull/184), [#117](https://github.com/databricks/ucx/pull/117)).
+* Added schedule of jobs as part of `install.sh` flow and created some documentation ([#187](https://github.com/databricks/ucx/pull/187)).
+* Added debug notebook companion to troubleshoot the installation ([#191](https://github.com/databricks/ucx/pull/191)).
+* Added support for Hive Metastore Table ACLs inventory from all databases ([#78](https://github.com/databricks/ucx/pull/78), [#122](https://github.com/databricks/ucx/pull/122), [#151](https://github.com/databricks/ucx/pull/151)).
+* Created `$inventory.tables` from Scala notebook ([#207](https://github.com/databricks/ucx/pull/207)).
+* Added local group migration support for ML-related objects ([#56](https://github.com/databricks/ucx/pull/56)).
+* Added local group migration support for SQL warehouses ([#57](https://github.com/databricks/ucx/pull/57)).
+* Added local group migration support for all compute-related resources ([#53](https://github.com/databricks/ucx/pull/53)).
+* Added local group migration support for security-related objects ([#58](https://github.com/databricks/ucx/pull/58)).
+* Added local group migration support for workflows ([#54](https://github.com/databricks/ucx/pull/54)).
+* Added local group migration support for workspace-level objects ([#59](https://github.com/databricks/ucx/pull/59)).
+* Added local group migration support for dashboards, queries, and alerts ([#144](https://github.com/databricks/ucx/pull/144)).
+
+Stability
+
+* Added `codecov.io` publishing ([#204](https://github.com/databricks/ucx/pull/204)).
+* Added more tests to `group.py` ([#148](https://github.com/databricks/ucx/pull/148)).
+* Added tests for group state ([#133](https://github.com/databricks/ucx/pull/133)).
+* Added tests for inventorizer and typed ([#125](https://github.com/databricks/ucx/pull/125)).
+* Added tests for `WorkspaceListing` ([#110](https://github.com/databricks/ucx/pull/110)).
+* Added `make_*_permissions` fixtures ([#159](https://github.com/databricks/ucx/pull/159)).
+* Added reusable fixtures module ([#119](https://github.com/databricks/ucx/pull/119)).
+* Added testing for permissions ([#126](https://github.com/databricks/ucx/pull/126)).
+* Added inventory table manager tests ([#153](https://github.com/databricks/ucx/pull/153)).
+* Added `product_info` to track as SDK integration ([#76](https://github.com/databricks/ucx/pull/76)).
+* Added failsafe permission get operations ([#65](https://github.com/databricks/ucx/pull/65)).
+* Always install the latest `pip` version in `./install.sh` ([#201](https://github.com/databricks/ucx/pull/201)).
+* Always store inventory in `hive_metastore` and make only `inventory_database` configurable ([#178](https://github.com/databricks/ucx/pull/178)).
+* Changed default logging level from `TRACE` to `DEBUG` ([#124](https://github.com/databricks/ucx/pull/124)).
+* Consistently use `WorkspaceClient` from `databricks.sdk` ([#120](https://github.com/databricks/ucx/pull/120)).
+* Converted pipeline code to use fixtures ([#166](https://github.com/databricks/ucx/pull/166)).
+* Exclude mixins from coverage ([#130](https://github.com/databricks/ucx/pull/130)).
+* Fixed codecov.io reporting ([#212](https://github.com/databricks/ucx/pull/212)).
+* Fixed configuration path in job task install code ([#210](https://github.com/databricks/ucx/pull/210)).
+* Fixed a bug with dependency definitions ([#70](https://github.com/databricks/ucx/pull/70)).
+* Fixed failing `test_jobs` ([#140](https://github.com/databricks/ucx/pull/140)).
+* Fixed the issues with experiment listing ([#64](https://github.com/databricks/ucx/pull/64)).
+* Fixed integration testing configuration ([#77](https://github.com/databricks/ucx/pull/77)).
+* Make project runnable on nightly testing infrastructure ([#75](https://github.com/databricks/ucx/pull/75)).
+* Migrated cluster policies to new fixtures ([#174](https://github.com/databricks/ucx/pull/174)).
+* Migrated clusters to the new fixture framework ([#162](https://github.com/databricks/ucx/pull/162)).
+* Migrated instance pool to the new fixture framework ([#161](https://github.com/databricks/ucx/pull/161)).
+* Migrated to `databricks.labs.ucx` package ([#90](https://github.com/databricks/ucx/pull/90)).
+* Migrated token authorization to new fixtures ([#175](https://github.com/databricks/ucx/pull/175)).
+* Migrated experiment fixture to standard one ([#168](https://github.com/databricks/ucx/pull/168)).
+* Migrated jobs test to a fixture-based one ([#167](https://github.com/databricks/ucx/pull/167)).
+* Migrated model fixture to the standard fixtures ([#169](https://github.com/databricks/ucx/pull/169)).
+* Migrated warehouse fixture to standard one ([#170](https://github.com/databricks/ucx/pull/170)).
+* Organise modules by domain ([#197](https://github.com/databricks/ucx/pull/197)).
+* Prefetch all account-level and workspace-level groups ([#192](https://github.com/databricks/ucx/pull/192)).
+* Programmatically create a dashboard ([#121](https://github.com/databricks/ucx/pull/121)).
+* Properly integrate Python `logging` facility ([#118](https://github.com/databricks/ucx/pull/118)).
+* Refactored code to use Databricks SDK for Python ([#27](https://github.com/databricks/ucx/pull/27)).
+* Refactored configuration and removed global provider state ([#71](https://github.com/databricks/ucx/pull/71)).
+* Removed `pydantic` dependency ([#138](https://github.com/databricks/ucx/pull/138)).
+* Removed redundant `pyspark`, `databricks-connect`, `delta-spark`, and `pandas` dependencies ([#193](https://github.com/databricks/ucx/pull/193)).
+* Removed redundant `typer[all]` dependency and its usages ([#194](https://github.com/databricks/ucx/pull/194)).
+* Renamed `MigrationGroupsProvider` to `GroupMigrationState` ([#81](https://github.com/databricks/ucx/pull/81)).
+* Replaced `ratelimit` and `tenacity` dependencies with simpler implementations ([#195](https://github.com/databricks/ucx/pull/195)).
+* Reorganised integration tests to align more with unit tests ([#206](https://github.com/databricks/ucx/pull/206)).
+* Run `build` workflow also on `main` branch ([#211](https://github.com/databricks/ucx/pull/211)).
+* Run integration test with a single group ([#152](https://github.com/databricks/ucx/pull/152)).
+* Simplify `SqlBackend` and table creation logic ([#203](https://github.com/databricks/ucx/pull/203)).
+* Updated `migration_config.yml` ([#179](https://github.com/databricks/ucx/pull/179)).
+* Updated legal information ([#196](https://github.com/databricks/ucx/pull/196)).
+* Use `make_secret_scope` fixture ([#163](https://github.com/databricks/ucx/pull/163)).
+* Use fixture factory for `make_table`, `make_schema`, and `make_catalog` ([#189](https://github.com/databricks/ucx/pull/189)).
+* Use new fixtures for notebooks and folders ([#176](https://github.com/databricks/ucx/pull/176)).
+* Validate toolkit notebook test ([#183](https://github.com/databricks/ucx/pull/183)).
+
+Contributing
+
+* Added a note on external dependencies ([#139](https://github.com/databricks/ucx/pull/139)).
+* Added ability to run SQL queries on Spark when in Databricks Runtime ([#108](https://github.com/databricks/ucx/pull/108)).
+* Added some ground rules for contributing ([#82](https://github.com/databricks/ucx/pull/82)).
+* Added contributing instructions link from main readme ([#109](https://github.com/databricks/ucx/pull/109)).
+* Added info about environment refreshes ([#155](https://github.com/databricks/ucx/pull/155)).
+* Clarified documentation ([#137](https://github.com/databricks/ucx/pull/137)).
+* Enabled merge queue ([#146](https://github.com/databricks/ucx/pull/146)).
+* Improved `CONTRIBUTING.md` guide ([#135](https://github.com/databricks/ucx/pull/135), [#145](https://github.com/databricks/ucx/pull/145)).
diff --git a/README.md b/README.md
@@ -1,6 +1,13 @@
 # UCX - Unity Catalog Migration Toolkit
 
-Your best companion for enabling the Unity Catalog.
+[![build](https://github.com/databrickslabs/ucx/actions/workflows/push.yml/badge.svg)](https://github.com/databrickslabs/ucx/actions/workflows/push.yml) [![codecov](https://codecov.io/github/databrickslabs/ucx/graph/badge.svg?token=p0WKAfW5HQ)](https://codecov.io/github/databrickslabs/ucx)
+
+Your best companion for enabling the Unity Catalog. It helps you to migrate all Databricks workspace assets:
+Entitlements, AWS instance profiles, Clusters, Cluster policies, Instance Pools, Databricks SQL warehouses, Delta Live 
+Tables, Jobs, MLflow experiments, MLflow registry, SQL Dashboards & Queries, SQL Alerts, Token and Password usage 
+permissions that are set on the workspace level, Secret scopes, Notebooks, Directories, Repos, Files.
+
+See [contributing instructions](CONTRIBUTING.md) to help improve this project.
 
 ## Installation
 
@@ -18,19 +25,19 @@ export DATABRICKS_CONFIG_PROFILE=ABC
 ```
 
 You can also specify environment variables in a more direct way, like in this example for installing 
-on a Azure Databricks Workspace using the Azure CLI authentication:
+on an Azure Databricks Workspace using the Azure CLI authentication:
 
 ```shell
 az login
 export DATABRICKS_HOST=https://adb-123....azuredatabricks.net/
 ./install.sh
 ```
 
-## Latest working version and how-to
+Please follow the instructions in `./install.sh`, which will open a notebook with the description of all jobs to trigger. The journey starts with assessment. 
 
-Please note that current project statis is 🏗️ **WIP**, but we have a minimal set of already working utilities.
+## Star History
 
-See [contributing instructions](CONTRIBUTING.md).
+[![Star History Chart](https://api.star-history.com/svg?repos=databrickslabs/ucx&type=Date)](https://star-history.com/#databrickslabs/ucx)
 
 ## Project Support
 Please note that all projects in the /databrickslabs github account are provided for your exploration only, and are not formally supported by Databricks with Service Level Agreements (SLAs).  They are provided AS-IS and we do not make any guarantees of any kind.  Please do not submit a support ticket relating to any issues arising from the use of these projects.

diff --git a/USAGE.md b/USAGE.md
diff --git a/docs/logic.md → docs/local-group-migration.md b/docs/logic.md → docs/local-group-migration.md
@@ -1,5 +1,22 @@
 # Permissions migration logic and data structures
 
+During Unity Catalog (UC) adoption, it's critical to move groups from the workspace level to the account level.
+
+To deliver this migration, the following steps are performed:
+
+| Step description                                                                                                                                                                                                                                                                                       | Relevant API method                                      |
+|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------|
+| A set of groups to be migrated is identified (either via `groups.selected` config property, or automatically).<br/>Group existence is verified against the account level.<br/>**If there is no group on the account level, an error is thrown.**<br/>Backup groups are created on the workspace level. | `toolkit.prepare_groups_in_environment()`                |
+| Inventory table is cleaned up.                                                                                                                                                                                                                                                                         | `toolkit.cleanup_inventory_table()`                      |
+| Workspace local group permissions are inventorized and saved into a Delta Table.                                                                                                                                                                                                                       | `toolkit.inventorize_permissions()`                      |
+| Backup groups are entitled with permissions from the inventory table.                                                                                                                                                                                                                                  | `toolkit.apply_permissions_to_backup_groups()`           |
+| Workspace-level groups are deleted.  Account-level groups are granted with access to the workspace.<br/>Workspace-level entitlements are synced from backup groups to newly added account-level groups.                                                                                                | `toolkit.replace_workspace_groups_with_account_groups()` |
+| Account-level groups are entitled with workspace-level permissions from the inventory table.                                                                                                                                                                                                           | `toolkit.apply_permissions_to_account_groups()`          |
+| Backup groups are deleted                                                                                                                                                                                                                                                                              | `toolkit.delete_backup_groups()`                         |
+| Inventory table is cleaned up. This step is optional.                                                                                                                                                                                                                                                  | `toolkit.cleanup_inventory_table()`                      |
+
+> Please note that inherited permissions will not be inventorized / migrated. We only cover direct permissions.
+
 On a very high-level, the permissions inventorization process is split into two steps:
 
 1. collect all existing permissions into a persistent storage.

diff --git a/src/databricks/labs/ucx/__about__.py b/src/databricks/labs/ucx/__about__.py
@@ -1 +1 @@
-__version__ = "0.0.3"
+__version__ = "0.1.0"
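
The step table added to `docs/local-group-migration.md` lists eight toolkit calls in a fixed order. A minimal sketch of that sequence might look like the following. Only the method names are taken from the table; `run_local_group_migration`, its `cleanup_after` flag, and the `RecordingToolkit` stand-in are hypothetical names introduced for illustration, and the real toolkit object, its constructor, and its configuration are not shown in this diff.

```python
def run_local_group_migration(toolkit, cleanup_after=False):
    """Call the migration steps in the order given by the docs table.

    `toolkit` is assumed to expose the methods named in the table;
    how it is constructed is outside the scope of this sketch.
    """
    # Identify groups, verify them at the account level, create backup groups.
    toolkit.prepare_groups_in_environment()
    # Start from a clean inventory table.
    toolkit.cleanup_inventory_table()
    # Save workspace-local group permissions into the inventory table.
    toolkit.inventorize_permissions()
    # Copy the inventoried permissions onto the backup groups.
    toolkit.apply_permissions_to_backup_groups()
    # Swap workspace-level groups for account-level groups.
    toolkit.replace_workspace_groups_with_account_groups()
    # Re-apply the inventoried permissions to the account-level groups.
    toolkit.apply_permissions_to_account_groups()
    # Remove the temporary backup groups.
    toolkit.delete_backup_groups()
    # The final cleanup is described as optional in the table.
    if cleanup_after:
        toolkit.cleanup_inventory_table()


class RecordingToolkit:
    """Hypothetical stand-in that records method names instead of calling Databricks."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Invoked only for attributes that are not found normally,
        # so every toolkit method resolves to a recorder.
        def record():
            self.calls.append(name)

        return record


if __name__ == "__main__":
    stub = RecordingToolkit()
    run_local_group_migration(stub, cleanup_after=True)
    for step, name in enumerate(stub.calls, start=1):
        print(step, name)
```

Running the sequence against a recording stub like this is one way to check the step order without touching a workspace. Per the note in the same document, inherited permissions are out of scope for these steps; only direct permissions are covered.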