Crawl dashboards, queries, and alerts #144

Merged
merged 80 commits into from
Sep 11, 2023
Changes from 3 commits
af09334
init
renardeinside Sep 3, 2023
5efc5b7
add basic impl
renardeinside Sep 3, 2023
52de7aa
add initial code for sql applicator
renardeinside Sep 3, 2023
8b351c0
Update src/databricks/labs/ucx/inventory/inventorizer.py
renardeinside Sep 4, 2023
6244fe2
factor-out the listing functions
renardeinside Sep 4, 2023
7d91f88
remove old usages of listing
renardeinside Sep 4, 2023
b218fe1
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 4, 2023
9a82ce2
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 5, 2023
e64e80b
factor-out various inventorizers into separate methods
renardeinside Sep 5, 2023
b8f5959
apply linting
renardeinside Sep 5, 2023
15f7128
add class-based SQL inventorizer
renardeinside Sep 5, 2023
dcc6707
fix generics passing to standard inventorizer
renardeinside Sep 5, 2023
2f7565e
apply formatting
renardeinside Sep 5, 2023
cce9e8e
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 5, 2023
627df05
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 6, 2023
7226eea
refactor relevance identification logic
renardeinside Sep 6, 2023
88f92c3
refactor applicator logic and add sql applicator
renardeinside Sep 6, 2023
6f00c8c
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 6, 2023
a9e6c7c
fix imports
renardeinside Sep 6, 2023
29463ec
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 7, 2023
88b1b8a
introduce func-based applicators
renardeinside Sep 7, 2023
aded05b
backport the secret scope logic into applicator
renardeinside Sep 7, 2023
ce0d755
fix abstract property methods
renardeinside Sep 7, 2023
569cca9
fix first chunk of tests
renardeinside Sep 7, 2023
51e953b
apply replace instead of deepcopy
renardeinside Sep 7, 2023
bd241f8
refactor tests
renardeinside Sep 7, 2023
7b8c95a
improve coverage
renardeinside Sep 7, 2023
5a4b4f4
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 8, 2023
315ce48
add logical objects doc
renardeinside Sep 8, 2023
1b6386b
add logical objects doc
renardeinside Sep 8, 2023
9a039ec
add basic impl
renardeinside Sep 8, 2023
f978d73
add permissions support
renardeinside Sep 8, 2023
3524a99
add sql permissions support'
renardeinside Sep 8, 2023
afe8cb0
add relevance method check
renardeinside Sep 8, 2023
0fa1f6b
add relevance check impls
renardeinside Sep 8, 2023
75ba5d1
add comments
renardeinside Sep 8, 2023
fab8992
add passwords and token support
renardeinside Sep 8, 2023
8a19eb5
finish impls
renardeinside Sep 10, 2023
b6938e5
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 10, 2023
0a5d9ed
split supports into a package
renardeinside Sep 10, 2023
2e78d5b
remove unused functions from permissions manager
renardeinside Sep 11, 2023
e45b3f3
change table schema
renardeinside Sep 11, 2023
02bf874
remove unused tests
renardeinside Sep 11, 2023
cef9804
add test-cov to make
renardeinside Sep 11, 2023
4bf302a
remove unused types
renardeinside Sep 11, 2023
aa6344e
apply fmt
renardeinside Sep 11, 2023
8c181dc
add tests for types
renardeinside Sep 11, 2023
f7b349c
add test for utils
renardeinside Sep 11, 2023
d9a236a
add tests for permissions manager
renardeinside Sep 11, 2023
1a80b64
add explicit naming for supports
renardeinside Sep 11, 2023
13dc63c
fix usages of logical types
renardeinside Sep 11, 2023
f4d7de0
add full coverage for passwords
renardeinside Sep 11, 2023
0fb3bc2
add tests coverage for tokens
renardeinside Sep 11, 2023
73025aa
switch towards listing-based logic
renardeinside Sep 11, 2023
5bed8a6
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 11, 2023
fa8a125
merge with main
renardeinside Sep 11, 2023
75d87b6
improve docs
renardeinside Sep 11, 2023
18c2ca7
rename the package
renardeinside Sep 11, 2023
d616171
remove support from ignore
renardeinside Sep 11, 2023
042431a
refactor supports logic in permissions manager
renardeinside Sep 11, 2023
d472c18
type the verification code
renardeinside Sep 11, 2023
d335751
normalize tests
renardeinside Sep 11, 2023
2a0a15c
add tests for relevance checker
renardeinside Sep 11, 2023
15ca3ad
improve tests
renardeinside Sep 11, 2023
3b1faf1
Merge remote-tracking branch 'origin/main' into feature/sql-object-pe…
renardeinside Sep 11, 2023
ac5540d
fix issues with .gitignore
renardeinside Sep 11, 2023
966609e
address comments
renardeinside Sep 11, 2023
3396dff
add tests for secret scope
renardeinside Sep 11, 2023
f6fc978
full test coverage for secrets
renardeinside Sep 11, 2023
a86e628
full test coverage for permissions
renardeinside Sep 11, 2023
7661562
add basic tests for scim
renardeinside Sep 11, 2023
6beab23
improve tests for scim
renardeinside Sep 11, 2023
e8e95fd
fully cover scim
renardeinside Sep 11, 2023
d3185af
align package name in test
renardeinside Sep 11, 2023
53f396d
add full coverage for sql permissions
renardeinside Sep 11, 2023
3a58e4a
add logging test
renardeinside Sep 11, 2023
5676574
add model listing tests
renardeinside Sep 11, 2023
610a42b
full coverage for support submodule
renardeinside Sep 11, 2023
21f3f7f
fix typing
renardeinside Sep 11, 2023
192e142
improve docs
renardeinside Sep 11, 2023
4 changes: 3 additions & 1 deletion .gitignore
@@ -145,4 +145,6 @@ cython_debug/
# dev files and scratches
dev/cleanup.py

Support
Support

.python-version
74 changes: 74 additions & 0 deletions src/databricks/labs/ucx/inventory/inventorizer.py
@@ -3,12 +3,17 @@
from abc import ABC, abstractmethod
from collections.abc import Callable, Iterator
from functools import partial
from itertools import chain
from typing import Generic, TypeVar

from databricks.sdk import WorkspaceClient
from databricks.sdk.core import DatabricksError
from databricks.sdk.service.iam import AccessControlResponse, Group, ObjectPermissions
from databricks.sdk.service.ml import ModelDatabricks
from databricks.sdk.service.sql import Alert, Dashboard
from databricks.sdk.service.sql import GetResponse as SqlPermissions
from databricks.sdk.service.sql import ObjectTypePlural as SqlRequestObjectType
from databricks.sdk.service.sql import Query
from databricks.sdk.service.workspace import (
    AclItem,
    ObjectInfo,
@@ -371,6 +376,75 @@ def inner() -> Iterator[ModelDatabricks]:
return inner


class DBSQLInventorizer(BaseInventorizer[InventoryObject]):
    def __init__(self, ws: WorkspaceClient):
        self._ws = ws
        self._queries: Iterator[Query] = iter([])
        self._dashboards: Iterator[Dashboard] = iter([])
        self._alerts: Iterator[Alert] = iter([])

    @property
    def logical_object_types(self) -> list[LogicalObjectType]:
        return [LogicalObjectType.ALERT, LogicalObjectType.DASHBOARD, LogicalObjectType.QUERY]

    def preload(self):
        self._queries = self._ws.queries.list()
        self._dashboards = self._ws.dashboards.list()
        self._alerts = self._ws.alerts.list()

    @sleep_and_retry
    @limits(calls=100, period=1)
    def _get_dbsql_permissions(
        self, request_object_type: SqlRequestObjectType, request_object_id: str
    ) -> SqlPermissions:
        return self._ws.dbsql_permissions.get(object_type=request_object_type, object_id=request_object_id)

    def _safe_get_dbsql_permissions(
        self, request_object_type: SqlRequestObjectType, object_id: str
    ) -> SqlPermissions | None:
        try:
            permissions = self._get_dbsql_permissions(request_object_type, object_id)
            return permissions
        except DatabricksError as e:
            if e.error_code in ["RESOURCE_DOES_NOT_EXIST", "RESOURCE_NOT_FOUND", "PERMISSION_DENIED"]:
                logger.warning(f"Could not get permissions for {request_object_type} {object_id} due to {e.error_code}")
                return None
            else:
                raise e

    def _prepare_permission_item(self, _obj: Alert | Dashboard | Query) -> PermissionsInventoryItem | None:
        if isinstance(_obj, Alert):
            logical_type = LogicalObjectType.ALERT
            request_type = SqlRequestObjectType.ALERTS
        elif isinstance(_obj, Dashboard):
            logical_type = LogicalObjectType.DASHBOARD
            request_type = SqlRequestObjectType.DASHBOARDS
        elif isinstance(_obj, Query):
            logical_type = LogicalObjectType.QUERY
            request_type = SqlRequestObjectType.QUERIES
        else:
            logger.warning(f"Unexpected object type {_obj}")
            return None

        _permissions = self._safe_get_dbsql_permissions(request_object_type=request_type, object_id=_obj.id)

        if _permissions:
            _item = PermissionsInventoryItem(
                object_id=_obj.id,
                logical_object_type=logical_type,
                request_object_type=request_type,
                raw_object_permissions=json.dumps(_permissions.as_dict()),
            )
            # Return the item so that inventorize() can collect it.
            return _item
        return None

    def inventorize(self) -> list[PermissionsInventoryItem]:
        chained_objects = chain(self._queries, self._alerts, self._dashboards)
        executables = [partial(self._prepare_permission_item, _object) for _object in chained_objects]
        results = ThreadedExecution[PermissionsInventoryItem | None](executables).run()
        results = [result for result in results if result]  # empty filter
        logger.info(f"Permissions fetched for {len(results)} DBSQL Objects")
        return results

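The fetch-and-filter flow in `DBSQLInventorizer` can be illustrated without a workspace. The sketch below is only an approximation under stated assumptions: `ApiError`, `fake_fetch`, `safe_get`, and `collect` are hypothetical stand-ins and are not part of this PR or of the Databricks SDK. It shows skippable error codes being turned into `None` and the `None` results being dropped afterwards.

```python
class ApiError(Exception):
    """Hypothetical stand-in for an SDK error that carries an error code."""

    def __init__(self, error_code: str):
        super().__init__(error_code)
        self.error_code = error_code


# Error codes treated as "skip this object" rather than as fatal.
SKIPPABLE = {"RESOURCE_DOES_NOT_EXIST", "RESOURCE_NOT_FOUND", "PERMISSION_DENIED"}


def safe_get(fetch, object_id):
    """Return fetch(object_id), or None when the error code is skippable."""
    try:
        return fetch(object_id)
    except ApiError as e:
        if e.error_code in SKIPPABLE:
            return None
        raise


def collect(fetch, object_ids):
    """Fetch every object and drop the ones that were skipped."""
    results = (safe_get(fetch, object_id) for object_id in object_ids)
    return [result for result in results if result is not None]


def fake_fetch(object_id):
    """Hypothetical fetcher used only to exercise the sketch."""
    if object_id == "missing":
        raise ApiError("RESOURCE_DOES_NOT_EXIST")
    if object_id == "broken":
        raise ApiError("INTERNAL_ERROR")
    return {"object_id": object_id, "acl": []}
```

The sketch filters with `is not None` rather than truthiness, so an empty but valid response would still be kept; whether that distinction matters for the real permission objects is not something the diff settles.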

class Inventorizers:
    @staticmethod
    def provide(ws: WorkspaceClient, migration_state: GroupMigrationState, num_threads: int):
38 changes: 37 additions & 1 deletion src/databricks/labs/ucx/inventory/permissions.py
@@ -9,6 +9,8 @@

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.iam import AccessControlRequest, Group, ObjectPermissions
from databricks.sdk.service.sql import GetResponse as SqlPermissions
from databricks.sdk.service.sql import ObjectTypePlural as SqlRequestObjectType
from databricks.sdk.service.workspace import AclItem as SdkAclItem
from ratelimit import limits, sleep_and_retry
from tenacity import retry, stop_after_attempt, wait_fixed, wait_random
@@ -48,7 +50,19 @@ class RolesAndEntitlementsRequestPayload:
    group_id: str


AnyRequestPayload = PermissionRequestPayload | SecretsPermissionRequestPayload | RolesAndEntitlementsRequestPayload
@dataclass
class SqlObjectRequestPayload:
    object_id: str
    request_object_type: SqlRequestObjectType
    access_control_list: list


AnyRequestPayload = (
    PermissionRequestPayload
    | SecretsPermissionRequestPayload
    | RolesAndEntitlementsRequestPayload
    | SqlObjectRequestPayload
)


# TODO: this class has too many @staticmethod and they must not be such. write a unit test for this logic.
@@ -155,6 +169,13 @@ def _prepare_request_for_roles_and_entitlements(
        destination_group: Group = getattr(migration_info, destination)
        return RolesAndEntitlementsRequestPayload(payload=item.typed_object_permissions, group_id=destination_group.id)

    def _prepare_request_for_sql_object(
        self, item: PermissionsInventoryItem, migration_state: GroupMigrationState, destination
    ) -> SqlObjectRequestPayload:
        _permissions: SqlPermissions = item.typed_object_permissions
        # TODO: apply conversion logic
        raise NotImplementedError()

    def _prepare_new_permission_request(
        self,
        item: PermissionsInventoryItem,
@@ -169,6 +190,12 @@ def _prepare_new_permission_request(
            return self._prepare_permission_request_for_secrets_api(item, migration_state, destination)
        elif item.logical_object_type in [LogicalObjectType.ROLES, LogicalObjectType.ENTITLEMENTS]:
            return self._prepare_request_for_roles_and_entitlements(item, migration_state, destination)
        elif item.logical_object_type in [
            LogicalObjectType.ALERT,
            LogicalObjectType.DASHBOARD,
            LogicalObjectType.QUERY,
        ]:
            return self._prepare_request_for_sql_object(item, migration_state, destination)
        else:
            logger.warning(
                f"Unsupported permissions payload for object {item.object_id} "
@@ -219,6 +246,13 @@ def _standard_permissions_applicator(self, request_payload: PermissionRequestPay
            access_control_list=request_payload.access_control_list,
        )

    def _sql_permissions_applicator(self, request_payload: SqlObjectRequestPayload):
        self._ws.dbsql_permissions.set(
            object_type=request_payload.request_object_type,
            object_id=request_payload.object_id,
            access_control_list=request_payload.access_control_list,
        )

    def applicator(self, request_payload: AnyRequestPayload):
        if isinstance(request_payload, RolesAndEntitlementsRequestPayload):
            self._apply_roles_and_entitlements(
@@ -230,6 +264,8 @@ def applicator(self, request_payload: AnyRequestPayload):
            self._standard_permissions_applicator(request_payload)
        elif isinstance(request_payload, SecretsPermissionRequestPayload):
            self._scope_permissions_applicator(request_payload)
        elif isinstance(request_payload, SqlObjectRequestPayload):
            self._sql_permissions_applicator(request_payload)
        else:
            logger.warning(f"Unsupported payload type {type(request_payload)}")

17 changes: 15 additions & 2 deletions src/databricks/labs/ucx/inventory/types.py
@@ -3,6 +3,8 @@

import pandas as pd
from databricks.sdk.service.iam import ObjectPermissions
from databricks.sdk.service.sql import GetResponse as SqlPermissions
from databricks.sdk.service.sql import ObjectTypePlural as SqlRequestObjectType
from databricks.sdk.service.workspace import AclItem as SdkAclItem
from databricks.sdk.service.workspace import AclPermission as SdkAclPermission

@@ -48,6 +50,11 @@ class LogicalObjectType(StrEnum):
    INSTANCE_POOL = "INSTANCE_POOL"
    CLUSTER_POLICY = "CLUSTER_POLICY"

    # DBSQL Objects
    ALERT = "ALERT"
    DASHBOARD = "DASHBOARD"
    QUERY = "QUERY"

    def __repr__(self):
        return self.value

@@ -102,19 +109,25 @@ class RolesAndEntitlements:
class PermissionsInventoryItem:
    object_id: str
    logical_object_type: LogicalObjectType
    request_object_type: RequestObjectType | None
    request_object_type: RequestObjectType | SqlRequestObjectType | None
    raw_object_permissions: str

    @property
    def object_permissions(self) -> dict:
        return json.loads(self.raw_object_permissions)

    @property
    def typed_object_permissions(self) -> ObjectPermissions | AclItemsContainer | RolesAndEntitlements:
    def typed_object_permissions(self) -> ObjectPermissions | AclItemsContainer | RolesAndEntitlements | SqlPermissions:
        if self.logical_object_type == LogicalObjectType.SECRET_SCOPE:
            return AclItemsContainer.from_dict(self.object_permissions)
        elif self.logical_object_type in [LogicalObjectType.ROLES, LogicalObjectType.ENTITLEMENTS]:
            return RolesAndEntitlements(**self.object_permissions)
        elif self.logical_object_type in [
            LogicalObjectType.ALERT,
            LogicalObjectType.DASHBOARD,
            LogicalObjectType.QUERY,
        ]:
            return SqlPermissions.from_dict(self.object_permissions)
        else:
            return ObjectPermissions.from_dict(self.object_permissions)

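`typed_object_permissions` stores permissions as a raw JSON string and picks a deserializer from the logical object type. A rough, self-contained illustration of that idea follows. Every name in it (`Kind`, `ScopeAcls`, `SqlAcls`, `GenericAcls`, `PARSERS`, `Item`) is hypothetical and chosen for the example, and the lookup table is one possible alternative to the `if`/`elif` chain, not what the PR does.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Kind(str, Enum):
    """Hypothetical stand-in for LogicalObjectType."""

    SECRET_SCOPE = "SECRET_SCOPE"
    QUERY = "QUERY"
    CLUSTER = "CLUSTER"


@dataclass
class ScopeAcls:
    acls: list


@dataclass
class SqlAcls:
    object_id: str
    access_control_list: list


@dataclass
class GenericAcls:
    object_id: str
    access_control_list: list


# Kinds that need a dedicated container; everything else uses GenericAcls.
PARSERS = {Kind.SECRET_SCOPE: ScopeAcls, Kind.QUERY: SqlAcls}


@dataclass
class Item:
    kind: Kind
    raw: str  # permissions serialized as a JSON string

    @property
    def typed(self):
        """Deserialize the raw JSON into the container registered for this kind."""
        parser = PARSERS.get(self.kind, GenericAcls)
        return parser(**json.loads(self.raw))
```

With a table like this, supporting another object type would mean adding one dictionary entry; the explicit branches in the diff trade that for readability at the call site.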