Terraform module for Databricks AWS Workspace E2 (Part 1)
Important

- This Terraform module assumes you have access to: https://accounts.cloud.databricks.com
- Databricks account username: `databricks_account_username`
- Databricks account password: `databricks_account_password`
- Databricks account id: `databricks_account_id` (found in the bottom-left corner of the page once you're logged in)

Part 2: Terraform module for Databricks Workspace management
- Module tested with Terraform 1.0.1, `databrickslabs/databricks` provider version 0.4.7, and AWS provider version 3.47.
- `main` branch: provider versions are not pinned, to keep up with Terraform releases.
- `tags` releases: versions are pinned (use tags).
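For example, to pin the module source to a tagged release instead of tracking `main` (the tag below is a placeholder, not a real release):

```hcl
module "databricks_workspace" {
  # "v0.0.1" is illustrative; substitute an actual release tag.
  source = "git::git@github.com:tomarv2/terraform-databricks-aws-workspace.git?ref=v0.0.1"
}
```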
```sh
terraform init
terraform plan -var='teamid=tryme' -var='prjid=project1'
terraform apply -var='teamid=tryme' -var='prjid=project1'
terraform destroy -var='teamid=tryme' -var='prjid=project1'
```

Note: With this option you are responsible for remote state storage.
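If you manage remote state yourself, one option is a standard S3 backend block; the bucket and key below are placeholders, a sketch rather than part of this module:

```hcl
terraform {
  backend "s3" {
    bucket = "my-terraform-state"                      # placeholder bucket name
    key    = "databricks-workspace/terraform.tfstate"  # placeholder state path
    region = "us-west-2"
  }
}
```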
Recommended method (stores remote state in S3, using `prjid` and `teamid` to create the directory structure):

- Create a Python 3.6+ virtual environment: `python3 -m venv <venv name>`
- Install the package: `pip install tfremote --upgrade`
- Set the environment variables below:

```sh
export TF_AWS_BUCKET=<remote state bucket name>
export TF_AWS_BUCKET_REGION=us-west-2
export TF_AWS_PROFILE=<profile from ~/.aws/credentials>
```
or

- Set the environment variables below:

```sh
export TF_AWS_BUCKET=<remote state bucket name>
export TF_AWS_BUCKET_REGION=us-west-2
export AWS_ACCESS_KEY_ID=<aws_access_key_id>
export AWS_SECRET_ACCESS_KEY=<aws_secret_access_key>
```
- Update the `main.tf` file with the required values.
- Run and verify the output before deploying: `tf -c=aws plan -var='teamid=foo' -var='prjid=bar'`
- Run the below to deploy: `tf -c=aws apply -var='teamid=foo' -var='prjid=bar'`
- Run the below to destroy: `tf -c=aws destroy -var='teamid=foo' -var='prjid=bar'`
NOTE: Read more on `tfremote`.
```hcl
module "databricks_workspace" {
  source = "git::git@github.com:tomarv2/terraform-databricks-aws-workspace.git"

  # NOTE: One of the below is required:
  # - 'profile_for_iam' - for IAM creation (if none is provided, 'default' is used)
  # - 'existing_role_name'
  profile_for_iam = "iam-admin"

  databricks_account_username = "example@example.com"
  databricks_account_password = "sample123!"
  databricks_account_id       = "1234567-1234-1234-1234-1234567"

  # -----------------------------------------
  # Do not change the teamid, prjid once set.
  teamid = var.teamid
  prjid  = var.prjid
}
```
```hcl
module "databricks_workspace" {
  source = "git::git@github.com:tomarv2/terraform-databricks-aws-workspace.git"

  # NOTE: One of the below is required:
  # - 'profile_for_iam' - for IAM creation (if none is provided, 'default' is used)
  # - 'existing_role_name'
  existing_role_name = "demo-role"

  databricks_account_username = "example@example.com"
  databricks_account_password = "sample123!"
  databricks_account_id       = "1234567-1234-1234-1234-1234567"

  # -----------------------------------------
  # Do not change the teamid, prjid once set.
  teamid = var.teamid
  prjid  = var.prjid
}
```
Please refer to the examples directory for references.
If you notice the below error:

```
Error: MALFORMED_REQUEST: Failed credentials validation checks: Spot Cancellation, Create Placement Group, Delete Tags, Describe Availability Zones, Describe Instances, Describe Instance Status, Describe Placement Group, Describe Route Tables, Describe Security Groups, Describe Spot Instances, Describe Spot Price History, Describe Subnets, Describe Volumes, Describe Vpcs, Request Spot Instances
```

- Try creating the workspace from the UI.
- Verify that the role and policy exist (the assume-role policy should allow the external ID).
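As a sketch of what the cross-account role's trust relationship should allow (not this module's exact implementation): the principal is Databricks' AWS account, and the external ID must match your `databricks_account_id`.

```hcl
# Sketch of the expected trust policy. 414351767826 is Databricks' AWS
# account ID for E2 cross-account roles; the external-ID condition ties
# the role to your Databricks account.
data "aws_iam_policy_document" "databricks_assume_role" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::414351767826:root"]
    }

    condition {
      test     = "StringEquals"
      variable = "sts:ExternalId"
      values   = [var.databricks_account_id]
    }
  }
}
```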
Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0.1 |
| aws | ~> 3.63 |
| databricks | 0.5.1 |
| random | ~> 3.1 |
| time | ~> 0.7 |
Providers

| Name | Version |
|------|---------|
| aws | ~> 3.63 |
| databricks | 0.5.1 |
| databricks.created_workspace | 0.5.1 |
| databricks.mws | 0.5.1 |
| random | ~> 3.1 |
| time | ~> 0.7 |
Modules

| Name | Source | Version |
|------|--------|---------|
| iam_policies | git::git@github.com:tomarv2/terraform-aws-iam-policies.git | v0.0.4 |
| iam_role | git::git@github.com:tomarv2/terraform-aws-iam-role.git//modules/iam_role_external | v0.0.7 |
| s3 | git::git@github.com:tomarv2/terraform-aws-s3.git | v0.0.8 |
| vpc | git::git@github.com:tomarv2/terraform-aws-vpc.git | v0.0.6 |
Resources

| Name | Type |
|------|------|
| aws_s3_bucket_policy.root_bucket_policy | resource |
| databricks_mws_credentials.this | resource |
| databricks_mws_networks.this | resource |
| databricks_mws_storage_configurations.this | resource |
| databricks_mws_workspaces.this | resource |
| databricks_token.pat | resource |
| random_string.naming | resource |
| time_sleep.wait | resource |
| aws_region.current | data source |
| databricks_aws_assume_role_policy.this | data source |
| databricks_aws_bucket_policy.this | data source |
| databricks_aws_crossaccount_policy.cross_account_iam_policy | data source |
Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| cidr_block | The CIDR block for the VPC | string | "10.4.0.0/16" | no |
| custom_tags | Extra custom tags | any | null | no |
| databricks_account_id | External ID provided by third party. | string | n/a | yes |
| databricks_account_password | Databricks account password | string | n/a | yes |
| databricks_account_username | Databricks account username | string | n/a | yes |
| databricks_hostname | Databricks hostname | string | "https://accounts.cloud.databricks.com" | no |
| existing_role_name | Existing role name to use; if not set, a new role is created | string | null | no |
| prjid | Name of the project/stack, e.g. mystack, nifieks, demoaci. Should not be changed after running 'tf apply' | string | n/a | yes |
| profile | Profile to use for resource creation | string | "default" | no |
| profile_for_iam | Profile to use for IAM | string | null | no |
| region | AWS region to deploy resources | string | "us-east-1" | no |
| teamid | Name of the team/group, e.g. devops, dataengineering. Should not be changed after running 'tf apply' | string | n/a | yes |
Outputs

| Name | Description |
|------|-------------|
| databricks_credentials_id | Databricks credentials id |
| databricks_deployment_name | Databricks deployment name |
| databricks_host | Databricks hostname |
| databricks_mws_credentials_id | Databricks mws credentials id |
| databricks_mws_network_id | Databricks mws network id |
| databricks_mws_storage_bucket_name | Databricks mws storage bucket name |
| databricks_mws_storage_id | Databricks mws storage id |
| databricks_token | Value of the newly created token |
| databricks_token_lifetime_hours | Token validity |
| iam_role_arn | IAM role ARN |
| inline_policy_id | Inline policy id |
| nonsensitive_databricks_token | Value of the newly created token (nonsensitive) |
| s3_bucket_arn | S3 bucket ARN |
| s3_bucket_id | S3 bucket id |
| s3_bucket_name | S3 bucket name |
| storage_configuration_id | Databricks storage configuration id |
| vpc_id | VPC id |
| vpc_route_table_ids | List of VPC route table IDs |
| vpc_security_group_id | VPC security group ID |
| vpc_subnet_ids | List of subnet IDs within the VPC |
| workspace_url | Databricks workspace URL |
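As one way to use these outputs (a sketch; it assumes the module call is named `databricks_workspace` as in the examples above), you can point a second Databricks provider at the newly created workspace:

```hcl
# Sketch: configure the Databricks provider against the new workspace
# using this module's outputs.
provider "databricks" {
  alias = "workspace"
  host  = module.databricks_workspace.workspace_url
  token = module.databricks_workspace.databricks_token
}
```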