Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Github alert issue: "could not list commit statuses: context canceled" #406

Closed
artemlive opened this issue Aug 22, 2022 · 4 comments · Fixed by #408
Closed

Github alert issue: "could not list commit statuses: context canceled" #406

artemlive opened this issue Aug 22, 2022 · 4 comments · Fixed by #408
Assignees
Labels
bug Something isn't working

Comments

@artemlive
Copy link

artemlive commented Aug 22, 2022

Hello. I have an issue with the notification controller that is, I believe, related to this #385 PR.
When the controller tries to dispatch the Kustomization I see the error in the notification-controller logs:

{"level":"error","ts":"2022-08-21T15:15:37.267Z","logger":"event-server","msg":"failed to send notification","reconciler kind":"Kustomization","name":"nginx-ingress","namespace":"flux-system","error":"could not list commit statuses: context canceled"}

As far as I see the client's request is not synchronous and it doesn't wait while the request is processed. So it cancels the context because the incoming request from the client (which is the kustomization controller) had been closed before the notification controller could get the result from GitHub API.

Here is the Alert definition:

kind: Alert
metadata:
  creationTimestamp: "2022-08-21T13:09:12Z"
  generation: 2
  labels:
    kustomize.toolkit.fluxcd.io/name: flux-githib-alert
    kustomize.toolkit.fluxcd.io/namespace: flux-system
    manager: notification-controller
    operation: Update
    subresource: status
    time: "2022-08-21T13:09:12Z"
  name: github-status
  namespace: flux-system
  resourceVersion: "1982008"
  uid: f570eb8d-d72e-41ef-8636-c12884bd9751
spec:
  eventSeverity: info
  eventSources:
  - kind: Kustomization
    name: '*'
    namespace: flux-system
  - kind: GitRepository
    name: '*'
    namespace: flux-system
  providerRef:
    name: flux-system
status:
  conditions:
  - lastTransitionTime: "2022-08-21T13:09:12Z"
    message: Initialized
    observedGeneration: 2
    reason: Succeeded
    status: "True"
    type: Ready
  observedGeneration: 2

The provider definition looks like:

kind: Provider
metadata:
  creationTimestamp: "2022-08-21T12:54:00Z"
  generation: 1
  labels:
    kustomize.toolkit.fluxcd.io/name: flux-githib-alert
    kustomize.toolkit.fluxcd.io/namespace: flux-system
    manager: notification-controller
    operation: Update
    subresource: status
    time: "2022-08-21T13:06:47Z"
  name: flux-system
  namespace: flux-system
  resourceVersion: "1965423"
  uid: 572e565c-3255-4334-b811-28f0be59b504
spec:
  address: https://github.com/artemlive/my-repo
  secretRef:
    name: github
  type: github
status:
  conditions:
  - lastTransitionTime: "2022-08-21T13:06:47Z"
    message: Initialized
    observedGeneration: 1
    reason: Succeeded
    status: "True"
    type: Ready
  observedGeneration: 1

The controller creates the context from the incoming parent's request context and passes it to the notifier.Post() method:

ctx, cancel := context.WithTimeout(r.Context(), 15*time.Second)

I've built my own version of the controller with the context.Background (for test purposes) instead of the ctx from the parent method and it works as expected.

	statuses, _, err := g.Client.Repositories.ListStatuses(context.Background(), g.Owner, g.Repo, rev, opts)
	if err != nil {
		return fmt.Errorf("could not list commit statuses: %v", err)
	}
	if duplicateGithubStatus(statuses, status) {
		return nil
	}

	_, _, err = g.Client.Repositories.CreateStatus(context.Background(), g.Owner, g.Repo, rev, status)

Instead of this.

statuses, _, err := g.Client.Repositories.ListStatuses(ctx, g.Owner, g.Repo, rev, opts)
if err != nil {
return fmt.Errorf("could not list commit statuses: %v", err)
}
if duplicateGithubStatus(statuses, status) {
return nil
}
_, _, err = g.Client.Repositories.CreateStatus(ctx, g.Owner, g.Repo, rev, status)

@pjbgf pjbgf added the bug Something isn't working label Aug 24, 2022
@pjbgf
Copy link
Member

pjbgf commented Aug 24, 2022

Other Flux controllers have configurable Timeouts which allows users to set it according to their needs.
I think the same could be beneficial here.

During the Bug Scrub we discussed about the possibility of having a flag to define Timeouts at the handleEvent() level. And potentially a CRD level Spec.Timeout for the providers. But it would be great to get the views of other maintainers on the matter.

@stefanprodan
Copy link
Member

I'm for adding a timeout field to the Provider spec with a default of 15s.

@somtochiama
Copy link
Member

+1 on adding timeouts to the provider spec.

@somtochiama somtochiama self-assigned this Aug 24, 2022
@phillebaba
Copy link
Member

This is my fault. #408 should fix this by moving the defer to inside of the go routine instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants