Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support fix containerrecreaterequest #1182

Merged

Conversation

BH4AWS
Copy link
Collaborator

@BH4AWS BH4AWS commented Feb 20, 2023

Ⅰ. Describe what this PR does

add the forceRecreate to container recreate request feature. When a crr resource is created with forceRecreate = true, the container will be kill once by forcibly and immediately.

Ⅱ. Does this pull request fix one issue?

yes, issue: #1177

Ⅲ. Describe how to verify it

using the container recreate request and the forceRecreate is set true.

Ⅳ. Special notes for reviews

@kruise-bot kruise-bot requested review from FillZpp and zmberg February 20, 2023 01:53
@kruise-bot kruise-bot added the size/M size/M: 30-99 label Feb 20, 2023
@codecov-commenter
Copy link

codecov-commenter commented Feb 20, 2023

Codecov Report

Merging #1182 (5e15dc1) into master (53fb7e1) will increase coverage by 0.32%.
The diff coverage is n/a.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@            Coverage Diff             @@
##           master    #1182      +/-   ##
==========================================
+ Coverage   49.73%   50.05%   +0.32%     
==========================================
  Files         137      143       +6     
  Lines       19331    19906     +575     
==========================================
+ Hits         9614     9964     +350     
- Misses       8667     8848     +181     
- Partials     1050     1094      +44     
Flag Coverage Δ
unittests 50.05% <ø> (+0.32%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/controller/cloneset/cloneset_event_handler.go 59.34% <0.00%> (-18.73%) ⬇️
pkg/webhook/sidecarset/validating/utils.go 56.66% <0.00%> (-18.34%) ⬇️
pkg/controller/cloneset/sync/cloneset_update.go 57.72% <0.00%> (-2.21%) ⬇️
pkg/webhook/pod/mutating/persistent_pod_state.go 64.21% <0.00%> (-1.73%) ⬇️
...sistentpodstate/persistent_pod_state_controller.go 31.18% <0.00%> (-1.68%) ⬇️
pkg/control/pubcontrol/pub_control.go 25.00% <0.00%> (-1.22%) ⬇️
pkg/control/pubcontrol/pub_control_utils.go 50.99% <0.00%> (-1.04%) ⬇️
pkg/controller/statefulset/stateful_set_control.go 63.32% <0.00%> (-0.59%) ⬇️
pkg/control/sidecarcontrol/util.go 53.48% <0.00%> (-0.39%) ⬇️
pkg/controller/sidecarset/sidecarset_processor.go 68.61% <0.00%> (-0.13%) ⬇️
... and 30 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@zmberg zmberg added this to the v1.4 milestone Feb 20, 2023
@@ -362,10 +363,25 @@ func (c *Controller) manage(crr *appsv1alpha1.ContainerRecreateRequest) error {
}
return c.patchCRRContainerRecreateStates(crr, newCRRContainerRecreateStates)
}

if crr.Spec.Strategy.ForceRecreate {
newForceKilledContainerStatuses = append(newForceKilledContainerStatuses, appsv1alpha1.ContainerRecreateRequestForceKilledContainerStatus{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have to patch "forcekill" status if the container do not have to be recreated forcefully

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if crr.Spec.Strategy.ForceRecreate && state.ShouldBeKilledByForce {
state.IsForceKilled = true
state.ContainerID = kubeContainerStatus.ID.String()
}

@@ -38,6 +38,7 @@ rules:
- get
- list
- watch
- patch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is recommended to avoid assign object patch privilege to daemons, so the container id of killed container can be saved in crr status

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Name of the container.
Name string `json:"name"`
// Container is killed by force
IsForceKilled bool `json:"restartCount"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct tag for IsForceKilled should be isForceKilled or something similar. restartCount is considered as something different.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch from 303851d to 5e15dc1 Compare March 7, 2023 17:35
@@ -37,6 +37,9 @@ const (
// ContainerRecreateRequestUnreadyAcquiredKey indicates the Pod has been forced to not-ready.
// It is required if the unreadyGracePeriodSeconds is set in ContainerRecreateRequests.
ContainerRecreateRequestUnreadyAcquiredKey = "crr.apps.kruise.io/unready-acquired"

// It is a status key for containers which are be killed by crr
ContainerRecreateRequestForceKilledContainerStatusesKey = "crr.apps.kruise.io/force-killed-container-statuses"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the unused key?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@@ -106,6 +109,8 @@ type ContainerRecreateRequestStrategy struct {
FailurePolicy ContainerRecreateRequestFailurePolicyType `json:"failurePolicy,omitempty"`
// OrderedRecreate indicates whether to recreate the next container only if the previous one has recreated completely.
OrderedRecreate bool `json:"orderedRecreate,omitempty"`
// ForceRecreate indicates whether to force kill the container even if the previous container is not running.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the previous container is not running -> if the previous container is starting

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -158,6 +163,12 @@ type ContainerRecreateRequestContainerRecreateState struct {
Phase ContainerRecreateRequestPhase `json:"phase"`
// A human readable message indicating details about this state.
Message string `json:"message,omitempty"`
// Container should be killed by force
ShouldBeKilledByForce bool `json:"shouldBeKilledByForce"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use crr.Spec.Strategy.ForceRecreate instead? it seems this status field unnecessary

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ShouldBeKilledByForce has been removed in this solution

@@ -172,6 +183,16 @@ type ContainerRecreateRequestSyncContainerStatus struct {
ContainerID string `json:"containerID,omitempty"`
}

// ContainerRecreateRequestKilledContainerStatus contains the state of the last killed container.
type ContainerRecreateRequestForceKilledContainerStatus struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove unused code

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@@ -69,6 +69,8 @@ func getCurrentCRRContainersRecreateStates(
}

syncContainerStatuses := getCRRSyncContainerStatuses(crr)
// the statuses of force killed containers for this crr resource
forceKillContainerStatuses := getCRRForceKillContainersStatuses(crr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getCRRForceKillContainersStatuses -> getKilledContainersStatuses

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -128,6 +136,16 @@ func getCurrentCRRContainersRecreateStates(
return statuses
}

func containerIsNotForceKilled(recreateState *appsv1alpha1.ContainerRecreateRequestContainerRecreateState) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

containerIsNotForceKilled -> containerIsForceKilled
plz revert the func logic to make code more easy to understand


if crr.Spec.Strategy.ForceRecreate && (len(forceKillContainerStatuses) == 0 ||
containerIsNotForceKilled(killedContainerStatus)) {
currentState.ShouldBeKilledByForce = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ShouldBeKilledByForce -> IsForceKilled

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new logic is changed in this code.

@@ -362,6 +362,11 @@ func (c *Controller) manage(crr *appsv1alpha1.ContainerRecreateRequest) error {
}
return c.patchCRRContainerRecreateStates(crr, newCRRContainerRecreateStates)
}

if crr.Spec.Strategy.ForceRecreate && state.ShouldBeKilledByForce {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should populate ContainerID even if the container is not killed forcely

@@ -114,7 +118,11 @@ func getCurrentCRRContainersRecreateStates(
syncContainerStatus.Ready {
currentState.Phase = appsv1alpha1.ContainerRecreateRequestSucceeded
}

if crr.Spec.Strategy.ForceRecreate && (len(forceKillContainerStatuses) == 0 ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len(forceKillContainerStatuses) checking is not necessary, containerIsNotForceKilled will check whether recreateState == nil , plz remove len(forceKillContainerStatuses) checking

@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch 2 times, most recently from d367244 to 88baf65 Compare March 10, 2023 09:40
if recreateState.IsForceKilled {
return true
}
return false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return recreateState != nil && recreateState.IsForceKilled

@@ -158,6 +160,8 @@ type ContainerRecreateRequestContainerRecreateState struct {
Phase ContainerRecreateRequestPhase `json:"phase"`
// A human readable message indicating details about this state.
Message string `json:"message,omitempty"`
// Container has been killed by force
IsForceKilled bool `json:"isForceKilled,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsRealKilled bool ?

@@ -93,13 +97,21 @@ func getCurrentCRRContainersRecreateStates(
Message: "not found container on Node",
}

} else if crr.Spec.Strategy.ForceRecreate && !containerIsForceKilled(killedContainerStatus) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if crr.Spec.Strategy.ForceRecreate {
	if previousContainerRecreateState == nil || !previousContainerRecreateState.IsRealKilled {
		currentState = appsv1alpha1.ContainerRecreateRequestContainerRecreateState{
			Name:    c.Name,
			Phase:   appsv1alpha1.ContainerRecreateRequestPending,
			Message: "No real kill container yet",
		}
		statuses = append(statuses, currentState)
		continue
	}
}

@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch from 88baf65 to ab2d8e9 Compare March 13, 2023 09:06
@kruise-bot kruise-bot added size/S size/S 10-29 and removed size/M size/M: 30-99 labels Mar 13, 2023
@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch from ab2d8e9 to 7e43eb2 Compare March 13, 2023 17:05
@kruise-bot kruise-bot added size/M size/M: 30-99 and removed size/S size/S 10-29 labels Mar 13, 2023
@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch from 7e43eb2 to 7a252af Compare March 14, 2023 16:41
}
}

// current: previousContainerRecreateState.IsRealKilled = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsRealKilled -> IsKilled

@@ -85,40 +85,56 @@ func getCurrentCRRContainersRecreateStates(
kubeContainerStatus := podStatus.FindContainerStatusByName(c.Name)

var currentState appsv1alpha1.ContainerRecreateRequestContainerRecreateState

if crr.Spec.Strategy.ForceRecreate {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we skip crr with completed phase in L77.

@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch from 7a252af to d522f61 Compare March 15, 2023 17:38
@kruise-bot kruise-bot added size/L size/L: 100-499 and removed size/M size/M: 30-99 labels Mar 15, 2023
@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch from d522f61 to 6f60438 Compare March 15, 2023 17:49
Signed-off-by: jicheng.sk <jicheng.sk@alibaba-inc.com>
@BH4AWS BH4AWS force-pushed the containerrecreate_add_forcekill branch from 6f60438 to f275bd3 Compare March 16, 2023 09:49
@zmberg
Copy link
Member

zmberg commented Mar 16, 2023

/lgtm

Copy link
Member

@furykerry furykerry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@kruise-bot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: furykerry

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kruise-bot kruise-bot merged commit eefcbb2 into openkruise:master Mar 16, 2023
diannaowa pushed a commit to diannaowa/kruise that referenced this pull request Jun 2, 2023
Signed-off-by: jicheng.sk <jicheng.sk@alibaba-inc.com>

Co-authored-by: jicheng.sk <jicheng.sk@alibaba-inc.com>
diannaowa pushed a commit to diannaowa/kruise that referenced this pull request Jun 2, 2023
Signed-off-by: jicheng.sk <jicheng.sk@alibaba-inc.com>

Co-authored-by: jicheng.sk <jicheng.sk@alibaba-inc.com>
ppbits pushed a commit to ppbits/kruise that referenced this pull request Apr 4, 2024
Co-authored-by: jicheng.sk <jicheng.sk@alibaba-inc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants