Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to customize VolumeSnapshotContent workqueue #308

Merged
merged 1 commit into from
Jun 10, 2021

Conversation

humblec
Copy link
Contributor

@humblec humblec commented Apr 28, 2020

Two new timeout values ( retryIntervalStart & retryIntervalMax )
have been added to set the ratelimiter for volumesnapshotcontent queue.

Fixes: #463

Signed-off-by: Humble Chirammal hchiramm@redhat.com

 `retry-interval-start` and `retry-interval-max` arguments are added to csi-snapshotter sidecar which controls retry interval of failed volume snapshot creation and deletion. These values set the ratelimiter for volumesnapshotcontent queue.

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 28, 2020
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 28, 2020
@humblec
Copy link
Contributor Author

humblec commented Apr 28, 2020

@xing-yang @yuxiangqian ptal . thanks

@xing-yang
Copy link
Collaborator

@humblec Thanks for adding this. Requeue currently does not work properly so adding these options won't really do anything. I'm working on a fix here: #230. After that we can get your fix in.

@humblec
Copy link
Contributor Author

humblec commented Apr 29, 2020

@humblec Thanks for adding this. Requeue currently does not work properly so adding these options won't really do anything. I'm working on a fix here: #230. After that we can get your fix in.

@xing-yang sure. Will wait for other PR to go in first.

@xing-yang
Copy link
Collaborator

@humblec can you rebase?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 22, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 22, 2020
@humblec
Copy link
Contributor Author

humblec commented May 22, 2020

@humblec can you rebase?

@xing-yang Sure, it is done. 👍 ptal.

Copy link
Collaborator

@xing-yang xing-yang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a release note as this PR adds 2 user facing config parameters.

cmd/csi-snapshotter/main.go Outdated Show resolved Hide resolved
cmd/csi-snapshotter/main.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels May 22, 2020
@humblec
Copy link
Contributor Author

humblec commented May 22, 2020

@xing-yang comments are addressed and also added release notes. Please let me know if any changes are required in any of it.

README.md Outdated Show resolved Hide resolved
@yuxiangqian
Copy link
Contributor

I am a little concerned for this change as it would change the behavior of the rateLimiter slightly.

  1. how should this be effectively tested? I understand there are no existing tests to cover, but can we consider covering in this PR?
  2. the original implementation uses the default controller rate limiter, which under the hood is just a combination of exponential failure rate limiter and a bucket rate limiter. For the exponential failure rate limiter, it uses 1 ms as the base. In this PR, the default base is 1s.
    What are the main purposes of this change? is it just to allow users to specify the base and max-retry interval?

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
pkg/sidecar-controller/framework_test.go Outdated Show resolved Hide resolved
@humblec
Copy link
Contributor Author

humblec commented May 27, 2020

@yuxiangqian regarding the name of these parameters: While I agree that we could change or use some strings like retry-interval-base ..etc, we are actually deviating from the same parameters available in other sidecars and the parity between those. For ex: external provisioner/attacher (https://github.com/kubernetes-csi/external-attacher) too have these parameters. Regarding the default value, we could change that to 1ms or whatever preferred value , Is 1ms good enough?

@humblec
Copy link
Contributor Author

humblec commented Jun 14, 2020

@yuxiangqian regarding the name of these parameters: While I agree that we could change or use some strings like retry-interval-base ..etc, we are actually deviating from the same parameters available in other sidecars and the parity between those. For ex: external provisioner/attacher (https://github.com/kubernetes-csi/external-attacher) too have these parameters. Regarding the default value, we could change that to 1ms or whatever preferred value , Is 1ms good enough?

@xing-yang can you share your thoughts.. I would like to get some consensus and hopefully finish this PR .

@xing-yang
Copy link
Collaborator

@yuxiangqian regarding the name of these parameters: While I agree that we could change or use some strings like retry-interval-base ..etc, we are actually deviating from the same parameters available in other sidecars and the parity between those. For ex: external provisioner/attacher (https://github.com/kubernetes-csi/external-attacher) too have these parameters. Regarding the default value, we could change that to 1ms or whatever preferred value , Is 1ms good enough?

@xing-yang can you share your thoughts.. I would like to get some consensus and hopefully finish this PR .

I agree that the naming in externa-snapshotter should be consistent with that in other sidecars. @yuxiangqian are you okay with this PR?

@yuxiangqian
Copy link
Contributor

@yuxiangqian regarding the name of these parameters: While I agree that we could change or use some strings like retry-interval-base ..etc, we are actually deviating from the same parameters available in other sidecars and the parity between those. For ex: external provisioner/attacher (https://github.com/kubernetes-csi/external-attacher) too have these parameters. Regarding the default value, we could change that to 1ms or whatever preferred value , Is 1ms good enough?

@humblec dropped the ball a bit, my apologies. If its consistent with others, I am good with the name.
I still have concerns over the default value of 1s as users likely would just rely on the default. Snapshot is latency sensitive and that's different from attacher. I'd suggest add some tests to that and default to the original 1ms starting. and in framework_test.go, should we also use the same ratelimiter as in main.go.

@humblec humblec force-pushed the ratelimiter branch 2 times, most recently from 21edee2 to 0bd41ab Compare June 29, 2020 14:22
@k8s-ci-robot k8s-ci-robot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 29, 2020
@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 3, 2021
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@xing-yang
Copy link
Collaborator

@humblec can you reference this issue #463 in the top section of this PR?

@humblec
Copy link
Contributor Author

humblec commented Jun 3, 2021

@humblec can you reference this issue #463 in the top section of this PR?

Sure thing @xing-yang and thanks for reopening this PR, let me rebase and finish it off 👍

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 7, 2021
@humblec
Copy link
Contributor Author

humblec commented Jun 7, 2021

The default value of the ratelimiter has been set to 1s. Apart from that, one thing to note here is, framework testing does not make use of this new rate limiter yet ie it use the defaultRateLimiter. If required, I can change framework, otherwise we are good with the change .

@xing-yang
Copy link
Collaborator

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 7, 2021
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
cmd/csi-snapshotter/main.go Outdated Show resolved Hide resolved
pkg/sidecar-controller/framework_test.go Outdated Show resolved Hide resolved
@humblec humblec force-pushed the ratelimiter branch 3 times, most recently from 5afda69 to fc851d6 Compare June 10, 2021 06:38
@humblec
Copy link
Contributor Author

humblec commented Jun 10, 2021

@xing-yang ptal.. Thanks !

Two new timeout values ( retryIntervalStart & retryIntervalMax )
have been added to set the ratelimiter for volumesnapshotcontent queue.

Fix# kubernetes-csi#463

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>

```release-note
 `retry-interval-start` and `retry-interval-max` arguments are added to csi-snapshotter sidecar which controls retry interval of failed volume snapshot creation and deletion. These values set the ratelimiter for volumesnapshotcontent queue.
```

Signed-off-by: Humble Chirammal <hchiramm@redhat.com>
@xing-yang
Copy link
Collaborator

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: humblec, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 10, 2021
@k8s-ci-robot k8s-ci-robot merged commit af117aa into kubernetes-csi:master Jun 10, 2021
@xing-yang xing-yang added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sidecar spins on CreateSnapshot() retries instead of backing off
5 participants