This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

[DOCS][SPARK-620] Update docs for binary secrets in DC/OS 1.11 #282

Merged: 5 commits, Mar 6, 2018
17 changes: 9 additions & 8 deletions docs/kerberos.md
@@ -80,7 +80,8 @@ single `krb5.conf` file for all of its drivers.
dcos package install --options=/path/to/options.json spark

1. Make sure your keytab is in the DC/OS Secret Store, under a path that is accessible
by the Spark service. See [Using the Secret Store][../security/#using-the-secret-store]
by the Spark service. Since the keytab is a binary file, you must also base64-encode it on DC/OS 1.10 or lower.
See [Using the Secret Store][../security/#using-the-secret-store]
for details.
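
As a concrete sketch of this step on DC/OS 1.10 or lower (the keytab file name and secret path here are hypothetical), the encode-and-upload flow might look like:

```bash
# Create a stand-in "keytab" so the example is self-contained (hypothetical file name).
head -c 64 /dev/urandom > hdfs.keytab

# GNU base64: -w 0 disables line wrapping, which binary secrets require.
base64 -w 0 hdfs.keytab > hdfs.keytab.base64

# Sanity-check that the encoding round-trips before uploading.
base64 -d hdfs.keytab.base64 | cmp -s - hdfs.keytab && echo "round trip ok"

# Upload under a __dcos_base64__-prefixed name so DC/OS decodes it automatically.
# (Requires the DC/OS Enterprise CLI; shown for illustration only.)
# dcos security secrets create -f hdfs.keytab.base64 spark/__dcos_base64__hdfs_keytab
```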


@@ -99,7 +100,7 @@ single `krb5.conf` file for all of its drivers.
"enabled": true,
"krb5conf": "<base64_encoding>",
"principal": "<Kerberos principal>", # e.g. spark@REALM
"keytab": "<keytab secret path>" # e.g. spark-history/__dcos_base64__hdfs_keytab
"keytab": "<keytab secret path>" # e.g. spark-history/hdfs_keytab
}
}
}
@@ -117,7 +118,7 @@ single `krb5.conf` file for all of its drivers.
},
"realm": "<kdc_realm>",
"principal": "<Kerberos principal>", # e.g. spark@REALM
"keytab": "<keytab secret path>" # e.g. spark-history/__dcos_base64__hdfs_keytab
"keytab": "<keytab secret path>" # e.g. spark-history/hdfs_keytab
}
}
}
@@ -172,7 +173,7 @@ Submit the job with the keytab:

dcos spark run --submit-args="\
--kerberos-principal user@REALM \
--keytab-secret-path /spark/__dcos_base64__hdfs-keytab \
--keytab-secret-path /spark/hdfs-keytab \
--conf spark.mesos.driverEnv.SPARK_USER=<spark user> \
--conf ... --class MySparkJob <url> <args>"

@@ -182,7 +183,7 @@ Submit the job with the ticket:

dcos spark run --submit-args="\
--kerberos-principal user@REALM \
--tgt-secret-path /spark/__dcos_base64__tgt \
--tgt-secret-path /spark/tgt \
--conf spark.mesos.driverEnv.SPARK_USER=<spark user> \
--conf ... --class MySparkJob <url> <args>"

@@ -192,7 +193,7 @@ the secret paths accordingly.

**Note:** You can access external (i.e. non-DC/OS) Kerberos-secured HDFS clusters from Spark on Mesos.

**Note:** These credentials are security-critical. The DC/OS Secret Store requires you to base64 encode binary secrets
**DC/OS 1.10 or lower:** These credentials are security-critical. The DC/OS Secret Store requires you to base64 encode binary secrets
(such as the Kerberos keytab) before adding them. If they are uploaded with the `__dcos_base64__` prefix, they are
automatically decoded when the secret is made available to your Spark job. If the secret name **doesn't** have this
prefix, the keytab will be decoded and written to a file in the sandbox. This leaves the secret exposed and is not
@@ -217,9 +218,9 @@ installation parameters, however does require the Spark Driver _and_ the Spark E
* The `keytab` containing the credentials for accessing the Kafka cluster.

--conf spark.mesos.containerizer=mesos # required for secrets
--conf spark.mesos.driver.secret.names=<base64_encoded_keytab> # e.g. spark/__dcos_base64__kafka_keytab
--conf spark.mesos.driver.secret.names=<keytab> # e.g. spark/kafka_keytab
--conf spark.mesos.driver.secret.filenames=<keytab_file_name> # e.g. kafka.keytab
--conf spark.mesos.executor.secret.names=<base64_encoded_keytab> # e.g. spark/__dcos_base64__kafka_keytab
--conf spark.mesos.executor.secret.names=<keytab> # e.g. spark/kafka_keytab
--conf spark.mesos.executor.secret.filenames=<keytab_file_name> # e.g. kafka.keytab
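
Put together, a submission wiring these secrets in might look like the following sketch on DC/OS 1.11+ (the job class, artifact URL, and secret path are hypothetical; on DC/OS 1.10 or lower the secret name would carry the `__dcos_base64__` prefix):

```
dcos spark run --submit-args="\
--conf spark.mesos.containerizer=mesos \
--conf spark.mesos.driver.secret.names=spark/kafka_keytab \
--conf spark.mesos.driver.secret.filenames=kafka.keytab \
--conf spark.mesos.executor.secret.names=spark/kafka_keytab \
--conf spark.mesos.executor.secret.filenames=kafka.keytab \
--class MyKafkaJob <url> <args>"
```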


58 changes: 36 additions & 22 deletions docs/security.md
@@ -36,24 +36,46 @@ to the Spark Dispatcher instance.

### Binary Secrets

When you need to store binary files into DC/OS secrets store, for example a Kerberos keytab file, your file needs to be base64-encoded as specified in RFC 4648.
You can store binary files, like a Kerberos keytab, in the DC/OS secrets store. In DC/OS 1.11+ you can create
secrets from binary files directly, while in DC/OS 1.10 or lower, files must be base64-encoded as specified in
RFC 4648 prior to being stored as secrets.

#### DC/OS 1.11+

To create a secret called `mysecret` with the binary contents of `krb5.keytab`, run:

```bash
$ dcos security secrets create --file krb5.keytab mysecret
```

#### DC/OS 1.10 or lower

To create a secret called `mysecret` with the binary contents of `krb5.keytab`, first encode it using the
`base64` command line utility. The following example uses BSD `base64` (the default on macOS).

You can use standard `base64` command line utility. Take a look at the following example that is using BSD `base64` command.
```bash
$ base64 -i krb5.keytab -o krb5.keytab.base64-encoded
```

`base64` command line utility in Linux inserts line-feeds in the encoded data by default. Disable line-wrapping via `-w 0` argument. Here is a sample base64 command in Linux.
Alternatively, GNU `base64` (the default on Linux) inserts line-feeds in the encoded data by default.
Disable line-wrapping with the `-w 0` argument.

```bash
$ base64 -w 0 krb5.keytab > krb5.keytab.base64-encoded
```

Give the secret basename prefixed with `__dcos_base64__`. For example, `some/path/__dcos_base64__mysecret` and `__dcos_base64__mysecret` will be base64-decoded automatically.
Now that the file is encoded, it can be stored as a secret.

```bash
$ dcos security secrets create -f krb5.keytab.base64-encoded some/path/__dcos_base64__mysecret
```
When you reference the `__dcos_base64__mysecret` secret in your service, the content of the secret will be first base64-decoded, and then copied and made available to your Spark application. Refer to a binary secret only as a file such that it will be automatically decoded and made available as a temporary in-memory file mounted within your container (file-based secrets).

**Note:** The secret name **must** be prefixed with `__dcos_base64__`.

When the `some/path/__dcos_base64__mysecret` secret is referenced in your `dcos spark run` command, its base64-decoded
> **Contributor:** Question: Is it referenced in the command or in some service specification (as is used here: https://github.com/mesosphere/dcos-commons/blob/6da53edd1ff5392986bef5096a8c7a96470c64eb/docs/pages/_includes/services/overview.md#binary-secrets)? I'm ok leaving it as is though.
>
> **Contributor (author):** In Spark, it's referenced in the Spark "run" command.

contents will be made available as a [temporary file](http://mesos.apache.org/documentation/latest/secrets/#file-based-secrets)
in your Spark application. **Note:** Make sure to only refer to binary secrets as files since holding binary content
in environment variables is discouraged.


# Using Mesos Secrets
@@ -103,7 +125,7 @@ sources (i.e. files and environment variables). For example
```
will place the content of `spark/my-secret-file` into the `PLACEHOLDER` environment variable and the `target-secret-file` file
as well as the content of `spark/my-secret-envvar` into the `SECRET_ENVVAR` and `placeholder-file`. In the case of binary
secrets (tagged with `__dcos_base64__`, for example) the environment variable will still be empty because environment
secrets, the environment variable will still be empty because environment
> **Contributor:** Nit: I don't think this comma is required.
>
> **Contributor:** The wording "environment variables cannot be assigned to binary values" seems in the wrong order (or the "to" should be removed).
>
> **Contributor (author):** I will fix it.
variables cannot be assigned binary values.
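
For reference, the pairing this paragraph describes (the example itself is collapsed in this diff view) would presumably be configured along these lines, with the flag values taken from the paragraph above:

```
--conf spark.mesos.driver.secret.names=spark/my-secret-file,spark/my-secret-envvar \
--conf spark.mesos.driver.secret.filenames=target-secret-file,placeholder-file \
--conf spark.mesos.driver.secret.envkeys=PLACEHOLDER,SECRET_ENVVAR \
```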

# Spark SSL
@@ -138,24 +160,16 @@ The keystore and truststore are created using the [Java keytool][12]. The keysto
signed public key. The truststore is optional and might contain a self-signed root-ca certificate that is explicitly
trusted by Java.

Both stores must be base64 encoded without newlines, for example:

```bash
cat keystore | base64 -w 0 > keystore.base64
cat keystore.base64
/u3+7QAAAAIAAAACAAAAAgA...
```

**Note:** The base64 string of the keystore will probably be much longer than the snippet above, spanning 50 lines or
so.
**DC/OS 1.10 or lower:** Since both stores are binary files, they must be base64-encoded before being placed in the
> **Contributor:** The DC/OS 1.10 note could be placed after the instructions?
>
> **Contributor (author):** Yes, good point.
DC/OS secret store. Follow the instructions above on encoding binary secrets to encode the keystore and truststore.

Add the stores to your secrets in the DC/OS secret store. For example, if your base64-encoded keystores and truststores
are server.jks.base64 and trust.jks.base64, respectively, then use the following commands to add them to the secret
Add the stores to your secrets in the DC/OS secret store. For example, if your keystores and truststores
are `server.jks` and `trust.jks`, respectively, then use the following commands to add them to the secret
store:

```bash
dcos security secrets create /spark/__dcos_base64__keystore --value-file server.jks.base64
dcos security secrets create /spark/__dcos_base64__truststore --value-file trust.jks.base64
dcos security secrets create /spark/keystore --value-file server.jks
dcos security secrets create /spark/truststore --value-file trust.jks
```

You must add the following configurations to your `dcos spark run` command.
@@ -164,10 +178,10 @@ The ones in parentheses are optional:
```bash

dcos spark run --verbose --submit-args="\
--keystore-secret-path=<path/to/keystore, e.g. spark/__dcos_base64__keystore> \
--keystore-secret-path=<path/to/keystore, e.g. spark/keystore> \
--keystore-password=<password to keystore> \
--private-key-password=<password to private key in keystore> \
(--truststore-secret-path=<path/to/truststore, e.g. spark/__dcos_base64__truststore> \)
(--truststore-secret-path=<path/to/truststore, e.g. spark/truststore> \)
(--truststore-password=<password to truststore> \)
(--conf spark.ssl.enabledAlgorithms=<cipher, e.g., TLS_RSA_WITH_AES_128_CBC_SHA256> \)
--class <Spark Main class> <Spark Application JAR> [application args]"
2 changes: 1 addition & 1 deletion docs/usage-examples.md
@@ -45,7 +45,7 @@ Visit the Spark cluster dispatcher at `http://<dcos-url>/service/spark/` to view
dcos spark run --submit-args="\
--conf spark.mesos.containerizer=mesos \ # required for secrets
--conf spark.mesos.uris=<URI_of_jaas.conf> \
--conf spark.mesos.driver.secret.names=spark/__dcos_base64___keytab \ # __dcos_base64__ prefix required for decoding base64 encoded binary secrets
--conf spark.mesos.driver.secret.names=spark/__dcos_base64___keytab \ # base64 encoding of binary secrets required in DC/OS 1.10 or lower
--conf spark.mesos.driver.secret.filenames=kafka-client.keytab \
--conf spark.mesos.executor.secret.names=spark/__dcos_base64___keytab \
--conf spark.mesos.executor.secret.filenames=kafka-client.keytab \
4 changes: 4 additions & 0 deletions docs/walkthroughs/installing-secure-spark.md
@@ -69,6 +69,10 @@ Spark has two communication channels used amongst its components:
directory.

1. Now base64 encode these two artifacts and upload them to the secret store.

**DC/OS 1.11+:** Base64-encoding of binary secrets is not necessary; you may skip the encoding
and update the secret names accordingly in the following example.

```bash
# encoding
base64 -w 0 server.jks > server.jks.base64
4 changes: 3 additions & 1 deletion docs/walkthroughs/running-with-secure-hdfs.md
@@ -14,8 +14,10 @@ running on DC/OS. To start this walkthrough we assume that you have the followin

## Setting up secure HDFS
The Kerberos `keytab` is a binary file and cannot be uploaded to the Secret Store directly. To use binary secrets in
DC/OS 1.10 and 1.11 the binary file must be base64 encoded and the resultant string will be uploaded with the prefix:
DC/OS 1.10 or lower, the binary file must be base64 encoded and the resultant string will be uploaded with the prefix:
`__dcos_base64__<secretname>`; this tells DC/OS to decode the file before placing it in the Sandbox.
In DC/OS 1.11+, base64-encoding of binary secrets is not necessary. You may skip the encoding
and update the secret names accordingly in the following example.

1. Establish correct principals for HDFS (assumes you're using the HDFS from the DC/OS Universe and installed with
service name `hdfs`). **Note** Requires DC/OS EE for file-based secrets. Add the following principals to the KDC
4 changes: 3 additions & 1 deletion docs/walkthroughs/secure-ml-pipeline.md
@@ -16,8 +16,10 @@ Completing this walkthrough requires the following.

## Setting up secure HDFS and Kafka
The Kerberos `keytab` is a binary file and cannot be uploaded to the Secret Store directly. To use binary secrets in
DC/OS 1.10 and 1.11 the binary file must be base64 encoded and the resultant string will be uploaded with the prefix:
DC/OS 1.10 or lower, the binary file must be base64 encoded and the resultant string will be uploaded with the prefix:
`__dcos_base64__<secretname>`; this tells DC/OS to decode the file before placing it in the Sandbox.
In DC/OS 1.11+, base64-encoding of binary secrets is not necessary. You may skip the encoding
and update the secret names accordingly in the following example.

1. Establish correct principals for HDFS and Kafka, assuming you're using the packages from the DC/OS universe.
**Note** Requires DC/OS EE for file-based secrets. Add the following principals to the KDC (Obviously it is