Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle smartnic cleanup when container net namespace is empty #129

Merged
merged 3 commits into from
Sep 29, 2020

Conversation

pperiyasamy
Copy link
Collaborator

This change attempts to cleanup smartVFs net representor from
ovs bridge and rename smartVF back to its original name when
container network namespace is no more available due to errors
like OOMKilled, CrashLoopBackOff etc.

Signed-off-by: Periyasamy Palanisamy periyasamy.palanisamy@est.tech

This change attempts to cleanup smartVFs net representor from
ovs bridge and rename smartVF back to its original name when
container network namespace is no more available due to errors
like OOMKilled, CrashLoopBackOff etc.

Signed-off-by: Periyasamy Palanisamy <periyasamy.palanisamy@est.tech>
@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/L labels Sep 28, 2020
@pperiyasamy
Copy link
Collaborator Author

/release-note-none

@kubevirt-bot kubevirt-bot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Sep 28, 2020
@pperiyasamy pperiyasamy changed the title Handle smartnic cleanup when container net namespace is empty handle smartnic cleanup when container net namespace is empty Sep 28, 2020
Copy link
Collaborator

@rhrazdil rhrazdil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, few comments

// SR-IOV Case - The sriov device is moved into host network namespace when args.Netns is empty.
// This happens container is killed due to an error (example: CrashLoopBackOff, OOMKilled)
var rep string
if rep, err = sriov.GetNetRepresentor(netconf.DeviceID); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, use := assignment instead of var rep string

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't use := as rep variable is used at line#447 (outside if scope).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks

}

return rep, nil

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra newline

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

// Make sure we have 1 netdevice per pci address
if len(vfNetdevices) != 1 {
// This would happen if netdevice is not yet visible in default network namespace.
// so return ErrLinkNotFound error so that multus can attempt multiple times
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this behaviour specific for multus only?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can happen with any meta plugins that use ovs-cni, but I've tested only with multus. now changed it into generic term meta plugin.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

}
if err = removeOvsPort(ovsDriver, rep); err != nil {
// Don't throw err as delete can be called multiple times because of error in ResetVF and ovs
// port is already deleted in a previous invocation. so log it and proceed further.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so log it and proceed further. seems superfluous

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was needed so that ResetVF is called at line#452 to rename the smartvf back to original name in case ResetVF thrown ErrLinkNotFound error in previous CmdDel call (noticed this behaviour in my tests).

Copy link
Collaborator

@rhrazdil rhrazdil Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry for not being clear enough. I meant the sentence in the comment: so log it and proceed further., I think it's not necessary as it's obvious from the next two lines.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah! yes. is it ok now ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good to me

// Don't throw err as delete can be called multiple times because of error in ResetVF and ovs
// port is already deleted in a previous invocation. so log it and proceed further.
log.Printf("removal of ovs port %s is failed for device %s , it may be removed already"+
" by previous delete call, err %v", rep, netconf.DeviceID, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other places in this module, we just log.Printf("Error: %v\n", err) in this situation, it may be better to stick with that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, good. done now.

Signed-off-by: Periyasamy Palanisamy <periyasamy.palanisamy@est.tech>
@pperiyasamy
Copy link
Collaborator Author

/cc @JanScheurich

@kubevirt-bot
Copy link
Collaborator

@pperiyasamy: GitHub didn't allow me to request PR reviews from the following users: JanScheurich.

Note that only kubevirt members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @JanScheurich

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Signed-off-by: Periyasamy Palanisamy <periyasamy.palanisamy@est.tech>
@rhrazdil
Copy link
Collaborator

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 29, 2020
Copy link
Member

@phoracek phoracek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

🚀

@kubevirt-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: phoracek, pperiyasamy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 29, 2020
@kubevirt-bot kubevirt-bot merged commit 5e9b39a into k8snetworkplumbingwg:master Sep 29, 2020
@phoracek
Copy link
Member

@pperiyasamy do you need a new release to consume this?

@pperiyasamy
Copy link
Collaborator Author

do you need a new release to consume this?

yes @phoracek, it would be great if we have a release with this fix.

@phoracek
Copy link
Member

phoracek commented Oct 2, 2020

Sure thing sir! Here you go https://github.com/kubevirt/ovs-cni/releases/tag/v0.14.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants