Metacat Helm Chart

Metacat is repository software for preserving data and metadata (documentation about data) that helps scientists find, understand and effectively use data sets they manage or that have been created by others. For more details, see https://github.com/NCEAS/metacat

Before You Start:

  1. This Metacat Helm chart is a beta feature. It has been tested, and we believe it to be working well, but it has not yet been used in production - so we recommend caution with this early release. If you try it, we'd love to hear your feedback! After you have read the details below, this checklist may be helpful in guiding you through the necessary installation steps.

  2. If you are considering migrating an existing Metacat installation to Kubernetes, note that before starting a migration, you must have a fully-functioning installation of Metacat version 2.19, running with PostgreSQL version 14. Migrating from other versions of Metacat and/or PostgreSQL is not supported. See this checklist for the necessary migration steps.

  3. This deployment does not currently work on Apple Silicon machines (e.g. in Rancher Desktop), because the official Docker image for at least one of the dependencies (RabbitMQ) doesn't yet work in that environment.



TL;DR

Starting in the root directory of the metacat repo:

  1. You should not need to edit much in values.yaml, but you can look at the contents of the values overlay files (such as those in the ./examples directory) to see which settings typically need to be overridden. Save your settings in a YAML file, e.g. /your/values-overrides.yaml

  2. Add your credentials to ./admin/secrets.yaml, and add to cluster:

    $ vim helm/admin/secrets.yaml    ## follow the instructions in this file
  3. Deploy

    (Note: Your k8s service account must have the necessary permissions to get information about the resource roles in the API group rbac.authorization.k8s.io).

    $ ./helm-upstall.sh  myreleasename  mynamespace oci://ghcr.io/nceas/charts/metacat  \
                                            --version 2.1.0  -f  /your/values-overrides.yaml

To access Metacat, you'll need to create a mapping between your ingress IP address (found by: kubectl describe ingress | grep "Address:") and your metacat hostname. Do this either by adding a permanent DNS record for everyone to use, or by adding a line to the /etc/hosts file on your local machine, providing temporary local access for your own testing. You should then be able to access the application via http://your-host-name/metacat.
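As a sketch, the temporary /etc/hosts approach looks like this (the IP address and hostname below are placeholders; substitute the address reported by `kubectl describe ingress` and the hostname you configured):

```shell
# Placeholder values -- replace with your ingress address and metacat hostname
INGRESS_IP=192.0.2.10
METACAT_HOST=metacat.example.com

# This is the line to append to /etc/hosts on your local machine:
echo "${INGRESS_IP}   ${METACAT_HOST}"
```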

Read on for more in-depth information about the various installation and configuration options that are available...

Introduction

This chart deploys a Metacat deployment on a Kubernetes cluster, using the Helm package manager.

Prerequisites

  • Kubernetes 1.23.3+
  • Helm 3.16.1+
  • PV provisioner support in the underlying infrastructure

Installing the Chart

To install the chart with the release name my-release:

helm install my-release oci://ghcr.io/nceas/charts/metacat --version 2.1.0

This command deploys Metacat on the Kubernetes cluster in the default configuration that is defined by the parameters in the values.yaml file. The Parameters section, below, lists the parameters that can be configured during installation.

It is likely that you will need to override some of these default parameters. This can be achieved by creating a YAML file that specifies only those values that need to be overridden, and providing that file as part of the helm install command. For example:

helm install my-release  -f myValues.yaml  oci://ghcr.io/nceas/charts/metacat --version 2.1.0

(where myValues.yaml contains only the values you wish to override.)

Parameters may also be provided on the command line to override those in values.yaml; e.g.

helm install my-release oci://ghcr.io/nceas/charts/metacat --version 2.1.0  \
                        --set postgres.auth.existingSecret=my-release-secrets

Note: Some settings need to be edited to include the release name that you choose. See the values.yaml file for settings that include ${RELEASE_NAME}. The instructions at the beginning of values.yaml suggest simple ways to achieve this.
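One simple way to do the substitution is with sed. A minimal sketch, using a hypothetical demo file rather than your real values.yaml:

```shell
RELEASE=my-release

# Demo input containing the placeholder (stand-in for the relevant values.yaml lines)
printf 'existingSecret: ${RELEASE_NAME}-metacat-secrets\n' > /tmp/demo-values.yaml

# Substitute the placeholder with the chosen release name
sed "s/\${RELEASE_NAME}/${RELEASE}/g" /tmp/demo-values.yaml
# -> existingSecret: my-release-metacat-secrets
```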

Uninstalling the Chart

To uninstall/delete the my-release deployment:

helm delete my-release

The helm delete command removes all the Kubernetes components associated with the chart (except for Secrets, PVCs and PVs) and deletes the release.

There are multiple PVCs associated with my-release, for Metacat data files, the PostgreSQL database, and for components of the indexer sub-chart. To delete:

kubectl delete pvc <myPVCName>   ## deletes specific named PVC
or:
kubectl delete pvc -l release=my-release   ## DANGER! deletes all PVCs associated with the release

NOTE: DELETING THE PVCs MAY ALSO DELETE ALL YOUR DATA, depending upon your setup! Please be cautious!

Parameters

Global Properties Shared Across Sub-Charts Within This Deployment

Name Description Value
global.metacatExternalBaseUrl Metacat base url accessible from outside cluster. https://localhost/
global.d1ClientCnUrl The url of the CN; used to populate metacat's 'D1Client.CN_URL' https://cn.dataone.org/cn
global.passwordsSecret The name of the Secret containing application passwords ${RELEASE_NAME}-metacat-secrets
global.metacatAppContext The application context to use metacat
global.storageClass default name of the storageClass to use for PVs local-path
global.ephemeralVolumeStorageClass Optional global storageClass override ""
global.sharedVolumeSubPath The subdirectory of the metacat data volume to mount ""
global.dataone-indexer.enabled Enable the dataone-indexer sub-chart true
global.includeMetacatUi Enable or disable the metacatui sub-chart. true
global.metacatUiThemeName The theme name to use. Required, even if overriding config.js knb
global.metacatUiWebRoot The url root to be appended after the metacatui baseUrl. /

Metacat Application-Specific Properties

Name Description Value
metacat.application.context see global.metacatAppContext metacat
metacat.auth.administrators A semicolon-separated list of admin ORCID iDs ""
metacat.database.connectionURI postgres database URI, or leave blank to use sub-chart ""
metacat.guid.doi.enabled Allow users to publish Digital Object Identifiers at doi.org? true
metacat.server.port The http port exposed externally, if NOT using the ingress ""
metacat.server.name The hostname for the server, as exposed by the ingress localhost
metacat.solr.baseURL The url to access solr, or leave blank to use sub-chart ""
metacat.solr.coreName The solr core (solr standalone) or collection name (solr cloud) ""
metacat.replication.logdir Location for the replication logs /var/metacat/logs
metacat.index.rabbitmq.hostname the hostname of the rabbitmq instance that will be used ""
metacat.index.rabbitmq.username the username for connecting to the RabbitMQ instance metacat-rmq-guest

OPTIONAL DataONE Member Node (MN) Parameters

Name Description Value
metacat.cn.server.publiccert.filename optional cert(s) used to validate jwt auth tokens /var/metacat/pubcerts/DataONEProdIntCA.pem
metacat.dataone.certificate.fromHttpHeader.enabled Enable mutual auth with client certs false
metacat.dataone.autoRegisterMemberNode Automatically push MN updates to CN? (yyyy-MM-dd) 2023-02-28
metacat.dataone.nodeId The unique ID of your DataONE MN - must match client cert subject urn:node:CHANGE_ME_TO_YOUR_VALUE!
metacat.dataone.subject The "subject" string from your DataONE MN client certificate CN=urn:node:CHANGE_ME_TO_YOUR_VALUE!,DC=dataone,DC=org
metacat.dataone.nodeName short name for the node that can be used in user interfaces My Metacat Node
metacat.dataone.nodeDescription What is the node's intended scope and purpose? Describe your Member Node briefly.
metacat.dataone.contactSubject registered contact for this MN http://orcid.org/0000-0002-8888-999X
metacat.dataone.nodeSynchronize Enable Synchronization of Metadata to DataONE false
metacat.dataone.nodeSynchronization.schedule.year sync schedule year *
metacat.dataone.nodeSynchronization.schedule.mon sync schedule month *
metacat.dataone.nodeSynchronization.schedule.mday sync schedule day of month *
metacat.dataone.nodeSynchronization.schedule.wday sync schedule day of week ?
metacat.dataone.nodeSynchronization.schedule.hour sync schedule hour *
metacat.dataone.nodeSynchronization.schedule.min sync schedule minute 0/3
metacat.dataone.nodeSynchronization.schedule.sec sync schedule second 10
metacat.dataone.nodeReplicate Accept and Store Replicas? false
metacat.dataone.replicationpolicy.default.numreplicas # copies to store on other nodes 0
metacat.dataone.replicationpolicy.default.preferredNodeList Preferred replication nodes nil
metacat.dataone.replicationpolicy.default.blockedNodeList Nodes blocked from replication nil

OPTIONAL (but Recommended) Site Map Parameters

Name Description Value
metacat.sitemap.enabled Enable sitemaps to tell search engines which URLs are available false
metacat.sitemap.interval Interval (in milliseconds) between rebuilding the sitemap 86400000
metacat.sitemap.location.base The first part of the URLs listed in sitemap_index.xml /
metacat.sitemap.entry.base base URI of the dataset landing page, listed in the sitemap /view
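For example, an overlay enabling sitemaps might look like this (illustrative only; key names follow the parameter table above, using the flattened dotted-key style that values.yaml uses for metacat properties):

```yaml
metacat:
  sitemap.enabled: true
  sitemap.location.base: /
  sitemap.entry.base: /view
```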

robots.txt file (search engine indexing)

Name Description Value
robots.userAgent "User-agent:" defined in robots.txt file. Defaults to "*" if not set ""
robots.disallow the "Disallow:" value defined in robots.txt file. ""

Metacat Image, Container & Pod Parameters

Name Description Value
image.repository Metacat image repository ghcr.io/nceas/metacat
image.pullPolicy Metacat image pull policy IfNotPresent
image.tag Overrides the image tag. Will default to the chart appVersion if set to "" ""
image.debug Specify if container debugging should be enabled (sets log level to "DEBUG") false
imagePullSecrets Optional list of references to secrets in the same namespace []
container.ports Optional list of additional container ports to expose within the cluster []
serviceAccount.create Should a service account be created to run Metacat? false
serviceAccount.annotations Annotations to add to the service account {}
serviceAccount.name The name to use for the service account. ""
podAnnotations Map of annotations to add to the pods {}
podSecurityContext.enabled Enable security context true
podSecurityContext.runAsUser numerical User ID for the pod 59997
podSecurityContext.runAsGroup numerical Group ID for the pod 59997
podSecurityContext.fsGroup numerical Group ID used to access mounted volumes 59997
podSecurityContext.supplementalGroups additional GIDs used to access vol. mounts []
podSecurityContext.runAsNonRoot ensure all containers run as a non-root user. true
securityContext holds container-level security attributes that override those at pod level {}
resources Resource limits for the deployment {}
tolerations Tolerations for pod assignment []

Metacat Persistence

Name Description Value
persistence.enabled Enable metacat data persistence using Persistent Volume Claims true
persistence.storageClass Storage class of backing PV local-path
persistence.existingClaim Name of an existing Persistent Volume Claim to re-use ""
persistence.volumeName Name of an existing Volume to use for volumeClaimTemplate ""
persistence.subPath The subdirectory of the volume (see persistence.volumeName) to mount ""
persistence.accessModes PVC Access Mode for metacat volume ["ReadWriteMany"]
persistence.size PVC Storage Request for metacat volume 1Gi

Networking & Monitoring

Name Description Value
ingress.enabled Enable or disable the ingress true
ingress.className ClassName of the ingress provider in your cluster nginx
ingress.annotations.nginx.ingress.kubernetes.io/client-body-buffer-size see nginx ingress docs see values.yaml
ingress.annotations.nginx.ingress.kubernetes.io/client_max_body_size see nginx ingress docs see values.yaml
ingress.defaultBackend.enabled enable the optional defaultBackend false
ingress.rewriteRules rewrite rules for the nginx ingress []
ingress.tls The TLS configuration []
ingress.d1CaCertSecretName Name of Secret containing DataONE CA certificate chain d1-ca-chain
service.enabled Enable another optional service in addition to headless svc false
service.type Kubernetes Service type. Defaults to ClusterIP if not set LoadBalancer
service.clusterIP IP address of the service. Auto-generated if not set ""
service.ports The port(s) to be exposed []
livenessProbe.enabled Enable livenessProbe for Metacat container true
livenessProbe.httpGet.path The url path to probe. /metacat/
livenessProbe.httpGet.port The named containerPort to probe metacat-web
livenessProbe.initialDelaySeconds Initial delay seconds for livenessProbe 45
livenessProbe.periodSeconds Period seconds for livenessProbe 15
livenessProbe.timeoutSeconds Timeout seconds for livenessProbe 10
readinessProbe.enabled Enable readinessProbe for Metacat container true
readinessProbe.httpGet.path The url path to probe. /metacat/admin
readinessProbe.httpGet.port The named containerPort to probe metacat-web
readinessProbe.initialDelaySeconds Initial delay seconds for readinessProbe 45
readinessProbe.periodSeconds Period seconds for readinessProbe 5
readinessProbe.timeoutSeconds Timeout seconds for readinessProbe 5

Postgresql Sub-Chart

Name Description Value
postgresql.enabled enable the postgresql sub-chart true
postgresql.auth.username Username for accessing the database used by metacat metacat
postgresql.auth.database The name of the database used by metacat. metacat
postgresql.auth.existingSecret Secrets location for postgres password ${RELEASE_NAME}-metacat-secrets
postgresql.auth.secretKeys.userPasswordKey Identifies metacat db's account password POSTGRES_PASSWORD
postgresql.auth.secretKeys.adminPasswordKey Dummy value - not used (see notes) POSTGRES_PASSWORD
postgresql.primary.pgHbaConfiguration PostgreSQL Primary client authentication see values.yaml
postgresql.primary.containerSecurityContext.enabled enable containerSecurityContext true
postgresql.primary.containerSecurityContext.runAsUser uid for container to run as 59996
postgresql.primary.podSecurityContext.runAsNonRoot pod defaults to run as non-root? true
postgresql.primary.extendedConfiguration Extended configuration, appended to defaults max_connections = 250
postgresql.primary.persistence.enabled Enable data persistence using PVC true
postgresql.primary.persistence.existingClaim Existing PVC to re-use ""
postgresql.primary.persistence.storageClass Storage class of backing PV ""
postgresql.primary.persistence.size PVC Storage Request for postgres volume 1Gi

Tomcat Configuration

Name Description Value
tomcat.heapMemory.min minimum memory heap size for Tomcat (-Xms JVM parameter) ""
tomcat.heapMemory.max maximum memory heap size for Tomcat (-Xmx JVM parameter) ""

dataone-indexer Sub-Chart

Name Description Value
dataone-indexer.podSecurityContext.fsGroup gid used to access mounted volumes 59997
dataone-indexer.podSecurityContext.supplementalGroups additional vol access gids []
dataone-indexer.persistence.subPath The subdirectory of the volume to mount ""
dataone-indexer.rabbitmq.extraConfiguration extra config, to be appended to rmq config consumer_timeout = 144000000
dataone-indexer.rabbitmq.auth.username set the username that rabbitmq will use metacat-rmq-guest
dataone-indexer.rabbitmq.auth.existingPasswordSecret location of rabbitmq password ${RELEASE_NAME}-metacat-secrets
dataone-indexer.solr.javaMem Java memory options to pass to the Solr container -Xms2g -Xmx2g
dataone-indexer.solr.customCollection name of the solr collection to use metacat-index
dataone-indexer.solr.coreNames Solr core names to be created ["metacat-core"]
dataone-indexer.solr.persistence.size solr Persistent Volume size 100Gi
dataone-indexer.solr.extraVolumes[0].name DO NOT EDIT - referenced by sub-chart solr-config
dataone-indexer.solr.extraVolumes[0].configMap.name see notes in values.yaml ${RELEASE_NAME}-indexer-configfiles
dataone-indexer.solr.extraVolumes[0].configMap.defaultMode DO NOT EDIT 777
dataone-indexer.solr.zookeeper.persistence.size Persistent Volume size 100Gi

Configuration and installation details

Metacat Application-Specific Properties

The parameters in the Metacat Application-Specific Properties section, above, map to the values required by Metacat at runtime. For more information please refer to the Metacat Properties section of the Metacat Administrators' Guide.

Secrets

Secret parameters (such as login credentials, auth tokens, private keys etc.) should be installed as kubernetes Secrets in the cluster. The file admin/secrets.yaml provides a template that you can complete and apply using kubectl -- for details, see the instructions in the comments inside that file. Please remember to NEVER ADD UNENCRYPTED SECRETS TO GITHUB!

Important:

  1. The deployed Secrets name includes the release name as a prefix (e.g. my-release-metacat-secrets), so it's important to ensure that the secrets name matches the release name referenced whenever you use helm commands.
  2. The parameter postgresql.auth.existingSecret in values.yaml must be set to match the name of these installed secrets (which will change if the release name is changed).
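For instance, an overlay for a release named my-release (a sketch; adjust the names to your own release) would pin both secret references:

```yaml
global:
  passwordsSecret: my-release-metacat-secrets

postgresql:
  auth:
    existingSecret: my-release-metacat-secrets
```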

Warning:

Setting a password will be ignored on new installations if a previous PostgreSQL release was deleted through the helm command. In that case, the old PVC will still contain the old password, and setting a new one through helm won't take effect. Deleting the persistent volumes (PVs) and redeploying will solve the issue (BUT TAKE CARE not to delete your data, and make sure you have backups first!). Refer to issue 2061 for more details.

User Interface

The Metacat helm chart also installs MetacatUI, which is included as a sub-chart. The MetacatUI sub-chart is highly configurable, and can be used with included themes, or you can provide your own custom theme, mounted on a PVC. At a minimum, you should provide values for the global properties. More information can be found in the MetacatUI README.

If you wish to disable the sub-chart altogether, set global.includeMetacatUi: false and provide your own MetacatUI installation, deployed separately.
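A minimal overlay for that case:

```yaml
global:
  includeMetacatUi: false
```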

Persistence

Persistent Volume Claims are used to keep the data across deployments. See the Parameters section to configure the PVCs or to disable persistence for either application.

The Metacat image stores the Metacat data and configurations on a PVC mounted at the /var/metacat path in the metacat container.

The PostgreSQL image stores the database data at the /bitnami/pgdata path in its own container.

Details of the dataone-indexer sub-chart PV/PVC requirements can be found in the dataone-indexer repository. The DataONE Indexer also needs read access to the same PVC that Metacat uses for file storage.

Details of the MetacatUI sub-chart (optional) PV/PVC requirements can be found in the MetacatUI README.

Networking and Certificates

By default, the chart will install an Ingress (see the ingress.* parameters under Networking & Monitoring), which will expose HTTP and/or HTTPS routes from outside the cluster to the Metacat application within the cluster. Note that your cluster must have an Ingress controller in order for this to work.

Note: We strongly recommend that you use the Kubernetes open source community version of the nginx ingress controller. (Full functionality may not be available if you choose an alternative). You can install it as follows:

$  helm upgrade --install ingress-nginx ingress-nginx \
                --repo https://kubernetes.github.io/ingress-nginx \
                --namespace ingress-nginx --create-namespace

Tip: You can inspect available Ingress classes in your cluster using: $ kubectl get ingressclasses

Note that there are significant differences between the community version of the nginx ingress controller and the one provided by the NGINX company. This helm chart relies on the functionality of the community version.

Setting up a TLS Certificate for HTTPS Traffic

HTTPS traffic is served on port 443 (a.k.a. the "SSL" port), and requires the ingress to have access to a TLS certificate and private key. A certificate signed by a trusted Certificate Authority is needed for public servers, or you can create your own self-signed certificate for development purposes - see Appendix 1 below, for self-signing instructions.

Once you have obtained the server certificate and private key, you can add them to the Kubernetes secrets, as follows (creates a Secret named tls-secret, assuming the server certificate and private key are named server.crt and server.key):

kubectl create secret tls tls-secret --key server.key --cert server.crt

Then simply tell the ingress which secret to use:

ingress:
  tls:
    - hosts:
        # hostname is auto-populated from the value of
        #     metacat:
        #       server.name: &extHostname myHostName.com
        - knb.test.dataone.org
      secretName: tls-secret

Tip: You can save time and reduce complexity by using a certificate manager service. For example, our NCEAS k8s clusters include a cert-manager service that constantly watches for Ingress modifications, and updates letsEncrypt certificates automatically, so this step is as simple as ensuring the ingress includes:

ingress:
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
  className: "nginx"
  tls:
    - hosts:
        - knb-dev.test.dataone.org
      secretName: ingress-nginx-tls-cert

...and a TLS cert will be created and applied automatically, matching the hostname defined in the tls: section. It will be created in a new secret, ingress-nginx-tls-cert, in the ingress' namespace.

Setting up Certificates for DataONE Replication

For full details on becoming part of the DataONE network, see the Metacat Administrator's Guide and Authentication and Authorization in DataONE.

DataONE Replication relies on mutual authentication with x509 client-side certs. As a DataONE Member Node (MN) or Coordinating Node (CN), your metacat instance will act as both a server and as a client, at different times during the replication process. It is therefore necessary to configure certificates and settings for both these roles.

Prerequisites

  1. First make sure you have the Kubernetes version of the nginx ingress installed
  2. Ensure HTTPS access is set up and working correctly. This allows other nodes, acting as "clients", to verify your server's identity during mutual authentication.
  3. Download a copy of the DataONE Certificate Authority (CA) certificate chain. This enables your node (when acting as server) to verify that other nodes' client certificates were signed by the DataONE Certificate Authority.
    1. DataONE Production CA Chain: DataONEProdCAChain.crt
    2. DataONE Test CA Chain: DataONETestCAChain.crt
  4. From the DataONE administrators (support@dataone.org), obtain a Client Certificate (sometimes referred to as a DataONE Node Certificate) that uniquely identifies your Metacat instance. This allows another node (acting as server) to verify your node's identity (acting as "client") during mutual authentication. The client certificate contains sensitive information, and must be kept private and secure.

Install the CA Chain

  • Create the Kubernetes Secret (named d1-ca-chain) to hold the ca chain (e.g. assuming it's in a file named DataONEProdCAChain.crt):

    kubectl create secret generic d1-ca-chain --from-file=ca.crt=DataONEProdCAChain.crt
    # (don't forget to define a non-default namespace if necessary, using `-n myNameSpace`)

Install the Client Certificate

  • Create the Kubernetes Secret (named <yourReleaseName>-d1-client-cert) to hold the Client Certificate, identified by the key d1client.crt (e.g. assuming the cert is in a file named urn_node_TestNAME.pem):

    kubectl create secret generic <yourReleaseName>-d1-client-cert \
                                  --from-file=d1client.crt=urn_node_TestNAME.pem
    # (don't forget to define a non-default namespace if necessary, using `-n myNameSpace`)

Set the correct parameters in values.yaml

  1. set the CA secret name

    ingress:
      className: "nginx"
      d1CaCertSecretName: d1-ca-chain
  2. Enable the shared secret header

    metacat:
      dataone.certificate.fromHttpHeader.enabled: true
  3. Ensure you have already defined a value for the shared secret that will enable metacat to verify the validity of incoming requests. The secret should be defined in metacat Secrets, identified by the key: METACAT_DATAONE_CERT_FROM_HTTP_HEADER_PROXY_KEY.

  4. Finally, re-install or upgrade to apply the changes
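As an illustration of step 3 only (the real layout is defined in helm/admin/secrets.yaml, so follow the instructions there; the manifest below assumes a standard Opaque Secret and a release named my-release):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-release-metacat-secrets
type: Opaque
stringData:
  # Shared secret that metacat checks against the X-Proxy-Key header
  METACAT_DATAONE_CERT_FROM_HTTP_HEADER_PROXY_KEY: "<your-shared-secret>"
```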

See Appendix 3 for help with troubleshooting


Appendices

Appendix 1: Self-Signing TLS Certificates for HTTPS Traffic

NOTE: For development and testing purposes only!

Also see the Kubernetes nginx documentation

You can create your own self-signed certificate as follows:

HOST=myHostName.com \
&&  openssl req -x509 -nodes -days 365  \
    -newkey rsa:2048 -keyout server.key \
    -out server.crt                     \
    -subj "/CN=${HOST}/O=${HOST}"       \
    -addext "subjectAltName = DNS:${HOST}"

The output will be a server certificate file named server.crt, and a private key file named server.key. For the ${HOST}, you can use localhost, or your machine's real hostname.

Alternatively, you can use any other valid hostname, but you'll need to add an entry to your /etc/hosts file to map it to your localhost IP address (127.0.0.1) so that your browser can resolve it; e.g.:

# add entry in /etc/hosts
127.0.0.1       myHostName.com

Whatever hostname you are using, don't forget to set the metacat.server.name accordingly, in values.yaml!

Appendix 2: Self-Signing Certificates for Testing Mutual Authentication

NOTE: For development and testing purposes only!

Also see the Kubernetes nginx documentation

Assuming you already have a server certificate installed (either signed by a trusted CA or self-signed for development & testing), you can create your own self-signed Mutual Auth Client certificate and CA certificate as follows:

  1. Generate the CA Key and Certificate:

    openssl req -x509 -sha256 -newkey rsa:4096 -keyout ca.key -out ca.crt -days 365 -nodes \
            -subj '/CN=My Cert Authority'
  2. Generate the Client Key and Certificate Signing Request:

    openssl req -new -newkey rsa:4096 -keyout client.key -out client.csr -nodes \
            -subj '/CN=My Client'
  3. Sign with the CA Key:

    openssl x509 -req -sha256 -days 365 -in client.csr -CA ca.crt -CAkey ca.key \
            -set_serial 02 -out client.crt
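The three steps above can be run end-to-end and sanity-checked with `openssl verify` (run in a scratch directory so the generated files don't clobber anything):

```shell
tmp=$(mktemp -d) && cd "$tmp"

# 1. Generate the CA key and certificate
openssl req -x509 -sha256 -newkey rsa:4096 -keyout ca.key -out ca.crt -days 365 -nodes \
        -subj '/CN=My Cert Authority'

# 2. Generate the client key and certificate signing request
openssl req -new -newkey rsa:4096 -keyout client.key -out client.csr -nodes \
        -subj '/CN=My Client'

# 3. Sign the CSR with the CA key
openssl x509 -req -sha256 -days 365 -in client.csr -CA ca.crt -CAkey ca.key \
        -set_serial 02 -out client.crt

# Confirm the client cert chains to the CA
openssl verify -CAfile ca.crt client.crt    # prints: client.crt: OK
```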

Appendix 3: Troubleshooting Mutual Authentication

If you're having trouble getting Mutual Authentication working, you can run metacat in debug mode and view the logs (see Appendix 4 for details).

If you see the message: X-Proxy-Key is null or blank, it means the nginx ingress has not been set up correctly (see Setting up Certificates for DataONE Replication).

You can check the configuration as follows:

  1. first check the ingress definition

    kubectl get ingress <yourReleaseName>-metacat -o yaml

    ...and ensure the output contains these lines:

      metadata:
        annotations:
        # NOTE: more lines above, omitted for clarity
          nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
          nginx.ingress.kubernetes.io/auth-tls-secret: default/d1-ca-chain
        ## NOTE: above may differ for you. Format is: <namespace>/<ingress.d1CaCertSecretName>
          nginx.ingress.kubernetes.io/auth-tls-verify-client: optional_no_ca
          nginx.ingress.kubernetes.io/auth-tls-verify-depth: "10"
          nginx.ingress.kubernetes.io/configuration-snippet: |
            more_set_input_headers "X-Proxy-Key: <your-secret-here>";

    NOTE: <your-secret-here> is the plaintext value associated with the key METACAT_DATAONE_CERT_FROM_HTTP_HEADER_PROXY_KEY in your secret <releaseName>-metacat-secrets -- ensure it has been set correctly!

    If you don't see these, or they are incorrect, check values.yaml for:

      metacat:
        dataone.certificate.fromHttpHeader.enabled: true    # must be true for mutual auth to work!
    
      ingress:
        tls:    # needs to have been set up properly [see ref 1]
    
        d1CaCertSecretName:  # needs to match the secret name holding your ca cert chain [see ref 2]
  2. If you have access to the correct namespace, you can also view the nginx ingress logs using:

      NS=ingress-nginx    # this is the ingress controller's namespace. Typically ingress-nginx
      kubectl logs -n ${NS} -f $(kubectl get pods -n ${NS} | grep "nginx" | awk '{print $1}')

Appendix 4: Debugging and Logging

To run Metacat in debug mode

Set the debug flag in values.yaml:

image:
  debug: true

# (or you can do the same thing temporarily, via the `--set image.debug=true` command-line flag)

This has the following effects:

  1. sets the logging level to DEBUG

    Tip: you can also temporarily change logging settings without needing to upgrade or re-install the application, by editing the log4J configuration ConfigMap:

    $ kubectl edit configmaps <releaseName>-metacat-configfiles

    (look for the key log4j2.k8s.properties). The config is automatically reloaded every monitorInterval seconds.

    Note that these edits will be overwritten next time you do a helm install or helm upgrade!

  2. enables remote Java debugging via port 5005. You will need to forward this port, in order to access it on localhost:

    $ kubectl  port-forward  --namespace myNamespace  pod/mypod-0  5005:5005

    Tip:

    For the indexer, you can also set the debug flag in values.yaml (Note that this only sets the logging level to DEBUG; it does not enable remote debugging for the indexer):

    dataone-indexer:
      image:
        debug: true

To view the logs

General syntax:

Application logs for all containers running this application:

  $ kubectl logs -f -l app.kubernetes.io/name=<my-application-name>

  # example: logs from all running index worker containers
  $ kubectl logs -f -l app.kubernetes.io/name=dataone-indexer

Application logs from one specific pod:

  $ kubectl logs -f <specific-pod-name>

  # example: Metacat logs
  $ kubectl logs -f metacatknb-0

  # example: previous Metacat logs from (now exited) pod
  $ kubectl logs -p metacatknb-0

Logs from an initContainer:

  $ kubectl logs -f <specific-pod-name> -c <init-container-name>

  # example: Metacat's `init-solr-metacat-dep` initContainer logs
  $ kubectl logs -f metacatknb-0 -c init-solr-metacat-dep