CFE-1132: EFS Access Point Tags Update DAY2 #313

Status: Open. anirudhAgniRedhat wants to merge 7 commits into master.

@anirudhAgniRedhat anirudhAgniRedhat commented Nov 4, 2024

This PR introduces a custom EFSAccessPointsTagController that monitors the OpenShift Infrastructure resource for changes in AWS ResourceTags. When tags are updated, the controller automatically fetches all AWS EFS-backed PersistentVolumes (PVs) in the cluster, retrieves their volume IDs, and updates the associated EFS Access Points tags in AWS.

Key Changes:

Monitors Infrastructure resource for AWS ResourceTags updates.
Directly fetches all PVs using the AWS EFS CSI driver (efs.csi.aws.com).
Updates AWS EFS AccessPoint tags by merging new and existing tags using the AWS SDK.
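
The merge step in the last bullet can be sketched as follows. This is a minimal illustration, not the PR's code: the function name `mergeTags` is hypothetical, and it assumes that a tag coming from the Infrastructure resource overrides an existing tag with the same key, which the description does not state explicitly.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeTags returns a new map holding every existing tag, extended or
// overridden by the desired tags. Neither input map is modified.
// Assumption: on a key conflict the desired value wins.
func mergeTags(existing, desired map[string]string) map[string]string {
	merged := make(map[string]string, len(existing)+len(desired))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range desired {
		merged[k] = v
	}
	return merged
}

func main() {
	existing := map[string]string{"team": "storage", "env": "dev"}
	desired := map[string]string{"env": "prod", "cost-center": "42"}

	merged := mergeTags(existing, desired)

	// Print in sorted key order so the output is stable.
	keys := make([]string, 0, len(merged))
	for k := range merged {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, merged[k])
	}
}
```

The merged map would then be converted into whatever tag type the AWS SDK call expects before the Access Point is updated.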

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 4, 2024
@openshift-ci openshift-ci bot requested review from dobsonj and mpatlasov November 4, 2024 14:40
@anirudhAgniRedhat anirudhAgniRedhat changed the title [WIP] EFS Volume Tags Update DAY2 CFE-1132: EFS Volume Tags Update DAY2 Nov 5, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 5, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 5, 2024
openshift-ci-robot commented Nov 5, 2024

@anirudhAgniRedhat: This pull request references CFE-1132 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

In response to this:

This PR introduces a custom EFSVolumeTagController that monitors the OpenShift Infrastructure resource for changes in AWS ResourceTags. When tags are updated, the controller automatically fetches all AWS EFS-backed PersistentVolumes (PVs) in the cluster, retrieves their volume IDs, and updates the associated EFS tags in AWS.

Key Changes:

Monitors Infrastructure resource for AWS ResourceTags updates.
Directly fetches all PVs using the AWS EFS CSI driver (efs.csi.aws.com).
Updates AWS EFS tags by merging new and existing tags using the AWS SDK.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@anirudhAgniRedhat
Author

/retest

@anirudhAgniRedhat
Author

anirudhAgniRedhat commented Nov 5, 2024

@jsafrane this PR adds support for updating tags on EFS resources.
Unlike with EBS volumes, I couldn't find a way to update the tags with batch API calls.

Here I am annotating the volumes with the sorted hash, as discussed in this thread.
That lets us retry tagging the resources in every re-sync period (20 minutes). We also report failures as warning events.

Could you please review these changes as well?

/cc @jsafrane @TrilokGeer
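
A rough sketch of the hash-annotation idea from the comment above, under stated assumptions: the annotation key `example.com/efs-tags-hash`, the helper names, and the choice of SHA-256 are all hypothetical and are not taken from the PR. The point is only that sorting the keys before hashing makes the digest independent of map iteration order, so a PV needs tagging only when its stored hash differs from the current one.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// tagHashAnnotation is a hypothetical annotation key used for illustration.
const tagHashAnnotation = "example.com/efs-tags-hash"

// tagsHash returns a deterministic SHA-256 digest of the tag set.
// Keys are sorted first so that map iteration order does not matter.
func tagsHash(tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		// %q quotes key and value so that ("a", "b=c") and ("a=b", "c")
		// cannot produce the same byte stream.
		fmt.Fprintf(h, "%q=%q;", k, tags[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// needsTagging reports whether the hash stored in the PV annotations
// differs from the hash of the currently desired tags.
func needsTagging(annotations, tags map[string]string) bool {
	return annotations[tagHashAnnotation] != tagsHash(tags)
}

func main() {
	tags := map[string]string{"env": "prod", "team": "storage"}
	annotations := map[string]string{}

	fmt.Println(needsTagging(annotations, tags)) // no hash stored yet

	// After a successful AWS update the controller would store the hash.
	annotations[tagHashAnnotation] = tagsHash(tags)
	fmt.Println(needsTagging(annotations, tags))
}
```

With this shape, a PV whose tagging failed keeps its old (or missing) hash and is picked up again on the next re-sync.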

@openshift-ci openshift-ci bot requested review from jsafrane and TrilokGeer November 5, 2024 08:12
openshift-ci-robot commented Nov 8, 2024

@anirudhAgniRedhat: This pull request references CFE-1132 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.


@anirudhAgniRedhat anirudhAgniRedhat changed the title CFE-1132: EFS Volume Tags Update DAY2 CFE-1132: EFS Access Point Tags Update DAY2 Nov 8, 2024
@anirudhAgniRedhat
Author

/retest

@anirudhAgniRedhat anirudhAgniRedhat force-pushed the EFS_TAGS_UPDATE branch 2 times, most recently from 5c27d53 to 13d121b Compare November 26, 2024 06:38
hook := func(spec *opv1.OperatorSpec, deployment *appsv1.Deployment) error {
infraLister := c.GetInfraInformer().Lister()
infra, err := infraLister.Get(infrastructureName)
if err != nil {
Member

I would move this code to a separate function and perform this computation only if the container in question is csi-driver.

return nil
}

func (c *EFSAccessPointTagsController) updateVolumeWithRetry(ctx context.Context, pv *v1.PersistentVolume) error {
Member

No retries here. I would add the PV to the queue via syncContext.Queue().Add()

Author

I see an issue with using the syncContext queue: anything added to the queue triggers the sync function again.

I have added a new retry worker that runs independently of the sync function logic, which should solve our retry issues.

return err
}

limiter := rate.NewLimiter(rate.Limit(awsTagsRequestRateLimit), awsTagsRequestBurstSize)
Member

Can you explain why we added this limiter here? We are performing an update on a single PV, so I am trying to understand how this helps.

Author

I have improved the implementation of the rate limiter here.

I am using the limiter to avoid exceeding 10 req/sec, the limit listed in AWS Service Quotas.

Contributor

openshift-ci bot commented Dec 16, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: anirudhAgniRedhat
Once this PR has been reviewed and has the lgtm label, please assign mandre for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@anirudhAgniRedhat anirudhAgniRedhat force-pushed the EFS_TAGS_UPDATE branch 3 times, most recently from 7de43ae to 4bb4f56 Compare December 16, 2024 12:55
@anirudhAgniRedhat
Author

/retest

Contributor

openshift-ci bot commented Jan 8, 2025

@anirudhAgniRedhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-openstack-manila-csi | 77f5f8c | link | true | /test e2e-openstack-manila-csi |
| ci/prow/hypershift-e2e-openstack-csi-cinder | 13d121b | link | true | /test hypershift-e2e-openstack-csi-cinder |
| ci/prow/hypershift-e2e-openstack-csi-manila | 13d121b | link | true | /test hypershift-e2e-openstack-csi-manila |
| ci/prow/smb-win2019-operator-e2e | 6c87a0e | link | false | /test smb-win2019-operator-e2e |
| ci/prow/smb-win2022-operator-e2e | 6c87a0e | link | false | /test smb-win2022-operator-e2e |
| ci/prow/e2e-aws-csi-extended | 6c87a0e | link | false | /test e2e-aws-csi-extended |
| ci/prow/e2e-azurestack-csi | f63f0ef | link | false | /test e2e-azurestack-csi |
| ci/prow/smb-operator-e2e | f63f0ef | link | false | /test smb-operator-e2e |
| ci/prow/okd-scos-e2e-aws-ovn | f63f0ef | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/hypershift-aws-e2e-external | f63f0ef | link | true | /test hypershift-aws-e2e-external |
| ci/prow/e2e-azure-csi | f63f0ef | link | true | /test e2e-azure-csi |
| ci/prow/e2e-azure | f63f0ef | link | true | /test e2e-azure |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
3 participants