TLS seems to be broken on applications when using boot #5310

Closed
1 task
ccojocar opened this issue Sep 3, 2019 · 32 comments

Comments

@ccojocar
Contributor

ccojocar commented Sep 3, 2019

Summary

  • Enable TLS when installing with boot via the cert-manager and external DNS
  • Create an application with either create spring or create quickstart and deploy it with Jenkins X
  • The certificate for the application's public endpoint in the staging environment seems to be invalid

The application ingress resource seems to still have the expose controller annotations:

apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      fabric8.io/generated-by: exposecontroller
      kubernetes.io/ingress.class: nginx
      kubernetes.io/tls-acme: "true"
    creationTimestamp: 2019-09-03T08:13:40Z
    generation: 1
    labels:
      provider: fabric8
    name: bdd-spring-1567497978
    namespace: jx-staging
    ownerReferences:
    - apiVersion: v1
      kind: Service
      name: bdd-spring-1567497978
      uid: ba295232-ce22-11e9-bb9b-42010a84003c
    resourceVersion: "8133"
    selfLink: /apis/extensions/v1beta1/namespaces/jx-staging/ingresses/bdd-spring-1567497978
    uid: bcc72597-ce22-11e9-bb9b-42010a84003c
  spec:
    rules:
    - host: bdd-spring-1567497978.jx-staging.boot.bdd.jenkins-x.rocks
      http:
        paths:
        - backend:
            serviceName: bdd-spring-1567497978
            servicePort: 80
    tls:
    - hosts:
      - bdd-spring-1567497978.jx-staging.boot.bdd.jenkins-x.rocks
      secretName: tls-bdd-spring-1567497978
  status:
    loadBalancer:
      ingress:
      - ip: 
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

No cert-manager issuer seems to be installed in the staging namespace. cert-manager fails with the following error when trying to acquire the certificate for the newly deployed application:

I0903 08:13:40.282576       1 base_controller.go:193] cert-manager/controller/ingress-shim "level"=0 "msg"="finished processing work item" "key"="jx-staging/bdd-spring-1567497978"
I0903 08:14:04.218156       1 base_controller.go:187] cert-manager/controller/ingress-shim "level"=0 "msg"="syncing item" "key"="jx-staging/bdd-spring-1567497978"
I0903 08:14:04.218406       1 sync.go:77] cert-manager/controller/ingress-shim "level"=0 "msg"="failed to determine issuer to be used for ingress resource" "resource_kind"="Ingress" "resource_name"="bdd-spring-1567497978" "resource_namespace"="jx-staging"
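
For context, ingress-shim logs this message when an Ingress asks for a certificate (here through kubernetes.io/tls-acme: "true") but cert-manager cannot work out which issuer to use. A minimal sketch of annotations that name the issuer explicitly follows. The issuer name letsencrypt-prod is an assumption and must match an Issuer that actually exists in the Ingress's own namespace; cert-manager releases before v0.11 used the certmanager.k8s.io/ prefix instead of cert-manager.io/.

```yaml
# Sketch only: annotation fragment for the application Ingress.
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    # Assumption: an Issuer called letsencrypt-prod exists in jx-staging.
    # On cert-manager releases before v0.11 the key is certmanager.k8s.io/issuer.
    cert-manager.io/issuer: letsencrypt-prod
```

The kubernetes.io/tls-acme: "true" annotation on its own only works when the cert-manager controller has been started with a default issuer configured.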

Steps to reproduce the behavior

Expected behavior

A valid certificate should be acquired for an application deployed in the staging or production environments.

Actual behavior

Jx version

The output of jx version is:

COPY OUTPUT HERE

Jenkins type

  • [x] Serverless Jenkins X Pipelines (Tekton + Prow)
  • [ ] Classic Jenkins

Kubernetes cluster

Operating system / Environment

@deanesmith
Contributor

@ccojocar is there a workaround?

@ccojocar
Contributor Author

ccojocar commented Sep 5, 2019

There are a few workarounds:

  1. Disable TLS in the environment git repository after the boot installation by setting http: "true" and tlsacme: "false", e.g. for the staging environment: https://github.com/jenkins-x/environment-tekton-weasel-staging/blob/07a534f7d30cb47317c7dcded628f0e28fbbad31/env/values.yaml#L18
  2. Use something like https://github.com/jenkins-x-charts/kubernetes-replicator to copy the wildcard certificate from the jx (dev environment) namespace into the jx-staging and jx-production namespaces, and then modify the ingress annotation to use this secret as the certificate. This approach is not recommended from a security point of view.
  3. Create a cert-manager issuer and certificate in the staging/production git repository after the boot installation, following the examples https://github.com/jenkins-x/jenkins-x-boot-config/blob/master/systems/acme/templates/cert-manager-prod-issuer.yaml and https://github.com/jenkins-x/jenkins-x-boot-config/blob/master/systems/acme/templates/cert-manager-prod-certificate.yaml respectively, and add an ingress-config ConfigMap to the same repository, similar to https://github.com/jenkins-x/environment-tekton-weasel-dev/blob/master/env/templates/ingress-config-configmap.yaml, with proper values for the domain, issuer name, etc.

I would recommend option 1, letting users take responsibility for securing their applications deployed with Jenkins X until we have a fix on our side and everything works automatically.
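
As a rough illustration of option 1, the change in the environment repository's env/values.yaml might look like the fragment below. The nesting under expose.config is an assumption based on how environment repositories are usually laid out; check the linked values.yaml for the exact position of the keys.

```yaml
# Sketch only: env/values.yaml in the staging environment repository.
# Assumption: http and tlsacme live under expose.config.
expose:
  config:
    http: "true"      # serve the application over plain HTTP
    tlsacme: "false"  # stop requesting ACME certificates for the ingress
```

With this setting the application endpoints are no longer served over TLS by Jenkins X, so securing them is left to the user.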

@ccojocar
Contributor Author

ccojocar commented Sep 5, 2019

I'll also extend the environment configuration in the jx-requirements file so that it allows defining a custom ingress configuration per environment; we can then generate the cert-manager resources from templates for each environment.

@daveconde daveconde modified the milestones: Sprint 13, Sprint 14 Sep 5, 2019
@ccojocar ccojocar self-assigned this Sep 5, 2019
@pmuir
Contributor

pmuir commented Sep 5, 2019

See also #4096 - @ccojocar @cagiti can you check if they are dupes and close one out?

@pmuir pmuir closed this as completed Sep 5, 2019
@pmuir pmuir reopened this Sep 5, 2019
ccojocar referenced this issue in jenkins-x/jenkins-x-boot-config Sep 6, 2019
Disable the installation of the cert-manager, external-dns and acme charts when
they are disabled in the jx-requirements.yml file.

Signed-off-by: Cosmin Cojocar <cosmin.cojocar@gmx.ch>
@cagiti
Contributor

cagiti commented Sep 6, 2019

There are a few workarounds:

  1. Disable TLS in the environment git repository after the boot installation by setting http: "true" and tlsacme: "false", e.g. for the staging environment: https://github.com/jenkins-x/environment-tekton-weasel-staging/blob/07a534f7d30cb47317c7dcded628f0e28fbbad31/env/values.yaml#L18
  2. Use something like https://github.com/jenkins-x-charts/kubernetes-replicator to copy the wildcard certificate from the jx (dev environment) namespace into the jx-staging and jx-production namespaces, and then modify the ingress annotation to use this secret as the certificate. This approach is not recommended from a security point of view.
  3. Create a cert-manager issuer and certificate in the staging/production git repository after the boot installation, following the examples https://github.com/jenkins-x/jenkins-x-boot-config/blob/master/systems/acme/templates/cert-manager-prod-issuer.yaml and https://github.com/jenkins-x/jenkins-x-boot-config/blob/master/systems/acme/templates/cert-manager-prod-certificate.yaml respectively, and add an ingress-config ConfigMap to the same repository, similar to https://github.com/jenkins-x/environment-tekton-weasel-dev/blob/master/env/templates/ingress-config-configmap.yaml, with proper values for the domain, issuer name, etc.

I would recommend option 1, letting users take responsibility for securing their applications deployed with Jenkins X until we have a fix on our side and everything works automatically.

I think option 3 is best, but we'd need to progress on operating the staging and production environments to use boot. We'd also need to provide an ingress template for the preview, staging and production environments. For the interim we should go with option 1 and quickly follow up with option 3.
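
To make option 3 a little more concrete, the per-environment resources could look roughly like this. It is a sketch under several assumptions: the domain example.com, the certificate name tls-example-com-p and the issuer name letsencrypt-prod are placeholders, the API version must match the installed cert-manager release, and the ConfigMap keys follow the linked ingress-config template rather than a verified schema. An Issuer with the same name also has to exist in the namespace.

```yaml
# Sketch only: resources for the staging environment repository.
# All names and the domain are placeholders, not values from this thread.
apiVersion: cert-manager.io/v1alpha2   # use the API version your cert-manager release serves
kind: Certificate
metadata:
  name: tls-example-com-p
  namespace: jx-staging
spec:
  secretName: tls-example-com-p        # Secret in which the signed certificate is stored
  issuerRef:
    kind: Issuer
    name: letsencrypt-prod             # must exist in jx-staging
  dnsNames:
  - "*.example.com"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-config
  namespace: jx-staging
data:
  # Assumption: key names mirror the linked ingress-config-configmap.yaml template.
  domain: example.com
  exposer: Ingress
  issuer: letsencrypt-prod
  tls: "true"
```

A wildcard name such as *.example.com can only be validated through a DNS-01 solver, so the Issuer needs one configured.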

@stephenstubbs

stephenstubbs commented Jan 21, 2020

Will that work for previews and devpods?

I'm currently using

DOMAIN="your_domain"

kubectl patch deployment -n kube-system jxing-nginx-ingress-controller --type='json' -p="[{\"op\": \"add\", \"path\": \"/spec/template/spec/containers/0/args/-\", \"value\": \"--default-ssl-certificate=jx/tls-${DOMAIN}-p\"}]"

which seems to work quite well and catch anything with a broken/missing cert. I wasn't able to get it stable and working for previews and devpods when I was trying replicator a while back.

@tdcox
Contributor

tdcox commented Jan 21, 2020

@rawlingsj I tried that approach manually and found that certificates were not being finalised because there was no Issuer in the namespace. I added an Issuer and have now hit the error "no configured challenge solvers can be used for this challenge".
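
For readers hitting the same message: cert-manager reports it when none of the Issuer's solvers applies to the DNS name being validated, for example when spec.acme.solvers is missing or every solver has a selector that does not match. A minimal sketch of a catch-all solver is shown below. The nginx ingress class is an assumption, and HTTP-01 cannot be used for wildcard names.

```yaml
# Sketch only: fragment of an Issuer spec, not a complete manifest.
spec:
  acme:
    solvers:
    # A solver without a selector is used for every DNS name.
    - http01:
        ingress:
          class: nginx   # assumption: the cluster's ingress class is nginx
```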

@tdcox
Contributor

tdcox commented Jan 21, 2020

Tried recreating the certificates manually, but get the same error.

@sdoxsee

sdoxsee commented Jan 21, 2020

@tdcox @rawlingsj same here on my application's Ingress: Could not determine issuer for ingress due to bad annotations: failed to determine issuer name to be used for ingress resource

@haysclark

haysclark commented Jan 22, 2020

Originally from Slack thread

I ended up getting this working, but I had to manually copy the keys over. For some reason, Secret replication is not functioning on my cluster.

I ended up adding my own CM Issuers in both Staging and Production. Currently, these are just based on letsencrypt-prod.

Note: I changed the Issuer's name from letsencrypt-prod to letsencrypt-prod-stg while debugging, but as everything is silo’d in Kubernetes namespaces the rename is likely unnecessary.

apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  annotations:
    jenkins.io/chart: acme
    jenkins.io/chart-app-version: 0.0.12
  labels:
    jenkins.io/chart-release: acme
    jenkins.io/namespace: jx-staging
    jenkins.io/version: "12"
  name: letsencrypt-prod-stg
  namespace: jx-staging
  selfLink: /apis/cert-manager.io/v1alpha2/namespaces/jx-staging/issuers/letsencrypt-prod-stg
spec:
  acme:
    email: TLS_EMAIL
    privateKeySecretRef:
      name: letsencrypt-prod-stg
    server: https://acme-v02.api.letsencrypt.org/directory
    solvers:
    - dns01:
        clouddns:
          project: GCP_PROJECT
          serviceAccountSecretRef:
            key: credentials.json
            name: external-dns-gcp-sa
      selector:
        dnsNames:
        - '*.MYDOMAIN.com'
        - MYDOMAIN.com

Because I am using CloudDNS on GCP, I also needed to manually replicate the external-dns-gcp-sa Secret from jx to jx-staging and jx-production, so that CM could fulfil Orders. Here is the error to look out for:

cert-manager/controller/challenges "msg"="re-queuing item due to error processing" "error"="error getting clouddns service account: secret \"external-dns-gcp-sa\" not found" "key"="jx-staging/tls-PROJECT-NAMESPACE-DOMAIN-p-RAND-RAND-RAND"
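
The replicated Secret would have roughly the shape below. This is a sketch: the data value is a placeholder for the base64-encoded service account key taken from the existing Secret in the jx namespace, and the key name credentials.json matches the serviceAccountSecretRef in the Issuer above.

```yaml
# Sketch only: copy of the DNS service account Secret for one environment namespace.
apiVersion: v1
kind: Secret
metadata:
  name: external-dns-gcp-sa
  namespace: jx-staging            # repeat with jx-production
type: Opaque
data:
  # Placeholder: paste the base64 value from the Secret in the jx namespace.
  credentials.json: BASE64_ENCODED_KEY
```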

Last of all, it’s good to ramp up on Cert Manager. In addition to following the CertManager namespace with kail, this post was super handy, as a lot of important debugging information is only available if you kubectl describe [Object] at the object level.

You can actually find error messages in each of these, like so:
kubectl get certificaterequest
kubectl describe certificaterequest X
kubectl get order
kubectl describe order X
kubectl get challenge
kubectl describe challenge X

@rudolph9

rudolph9 commented Jan 27, 2020

Looks like jx upgrade ingress is being decommissioned soon, which breaks the workaround I had been using #6107

@srehmanproov

After jx boot with TLS enabled in jx-requirements.yml, the certificate has not become ready for days, even on dev:
Status:
  Conditions:
    Last Transition Time:  2020-03-02T15:53:00Z
    Message:               Waiting for CertificateRequest "tls-jenkinsx-dev-xxx-xxx-xx-p-2276452093" to complete
    Reason:                InProgress
    Status:                False
    Type:                  Ready
Events:                    <none>

@srehmanproov

I'll also extend the environment configuration in the jx-requirements file so that it allows defining a custom ingress configuration per environment; we can then generate the cert-manager resources from templates for each environment.

@ccojocar is that done? We need it urgently, as we are currently blocked because our domain and subdomain are with different providers. Jenkins X without security is not a viable option for many companies.

@ccojocar
Contributor Author

ccojocar commented Mar 5, 2020

@srehmanproov You can check with @daveconde or @deanesmith on Slack. I don't think it is done.

@srehmanproov

@srehmanproov You can check with @daveconde or @deanesmith on Slack. I don't think it is done.

thanks @ccojocar

@deanesmith deanesmith added priority/critical and removed priority/critical-urgent labels Mar 23, 2020
@hferentschik hferentschik changed the title from "DISCOVERY - TLS seems to be broken on applications when using boot" to "TLS seems to be broken on applications when using boot" Mar 24, 2020
@deanesmith deanesmith changed the title from "TLS seems to be broken on applications when using boot" to "DISCOVERY - TLS seems to be broken on applications when using boot" Mar 24, 2020
@hferentschik hferentschik changed the title from "DISCOVERY - TLS seems to be broken on applications when using boot" to "TLS seems to be broken on applications when using boot" Mar 24, 2020
@hferentschik hferentschik added kind/discovery and removed kind/bug labels Mar 24, 2020
@pradyuman

Has anyone gotten a chance to confirm that jenkins-x-charts/jxboot-resources#32 fixes this issue?

@jenkins-x-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle stale

@jenkins-x-bot
Contributor

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle rotten

@lovepocky

Will that work for previews and devpods?

I'm currently using

DOMAIN="your_domain"

kubectl patch deployment -n kube-system jxing-nginx-ingress-controller --type='json' -p="[{\"op\": \"add\", \"path\": \"/spec/template/spec/containers/0/args/-\", \"value\": \"--default-ssl-certificate=jx/tls-${DOMAIN}-p\"}]"

which seems to work quite well and catch anything with a broken/missing cert. I wasn't able to get it stable and working for previews and devpods when I was trying replicator a while back.

This configuration could be added to the boot config file systems/jxing/values.tmpl.yaml as below:

nginx-ingress:
  controller:
    extraArgs:
      default-ssl-certificate: jx/tls-${your_domain_name}-p

@jenkins-x-bot
Contributor

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://jenkins-x.io/community.
/close

@jenkins-x-bot
Contributor

@jenkins-x-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://jenkins-x.io/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.
